MediaCodec and OpenMAX: Feature-based Video Codec API Comparison

Friday 6/21/13 07:53am
|
Posted By David Sainte-Claire

A new video codec API called MediaCodec arrived with the Jelly Bean release of Android (API level 16). With it, you can access hardware-level encode / decode functionality from within the Java application space. Because you are given direct access to the input and output buffers of the hardware components, MediaCodec also provides greater flexibility in input sources: the image data can come from anywhere you choose, whether individual image frames or an actual video stream similar to MediaRecorder. When using camera preview frames, for example, you can manipulate the images before handing the data off to MediaCodec for encoding. In contrast, the MediaRecorder API puts the camera into video capture mode, feeds the image data directly to the encoder, and allows no post-processing.

While this enhancement is a welcome addition, there are important features that MediaCodec does not currently support, some of which are accessible via the Khronos OpenMAX framework directly in the native C++ layer. In this post, I’ll discuss some of these differences to help you decide which route to take when implementing your application.
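To illustrate the buffer-level access described above, here is a minimal encode-loop sketch against the API level 16 MediaCodec interface. The format object, frameData buffer, presentationTimeUs value, and the writeToStream() helper are placeholders for your own application's configuration, frame source, and output sink:

```java
import java.nio.ByteBuffer;

import android.media.MediaCodec;
import android.media.MediaFormat;

// Assumes 'format' is a MediaFormat you have already populated, and that
// getNextFrame()/writeToStream() stand in for your app's source and sink.
final long TIMEOUT_US = 10000;

MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
encoder.start();

ByteBuffer[] inputBuffers = encoder.getInputBuffers();
ByteBuffer[] outputBuffers = encoder.getOutputBuffers();
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();

// Feed one raw frame from any source -- camera preview, a decoded file,
// or synthetic image data -- into an encoder input buffer.
int inIndex = encoder.dequeueInputBuffer(TIMEOUT_US);
if (inIndex >= 0) {
    ByteBuffer buf = inputBuffers[inIndex];
    buf.clear();
    buf.put(frameData);  // raw YUV bytes from your chosen source
    encoder.queueInputBuffer(inIndex, 0, frameData.length, presentationTimeUs, 0);
}

// Drain any encoded output the hardware component has produced.
int outIndex = encoder.dequeueOutputBuffer(info, TIMEOUT_US);
if (outIndex >= 0) {
    ByteBuffer encoded = outputBuffers[outIndex];
    writeToStream(encoded, info);  // hand the bitstream to your muxer or socket
    encoder.releaseOutputBuffer(outIndex, false);
}
```

In a real application this feed/drain pair runs in a loop, and because you own the input side, any per-frame manipulation can happen before queueInputBuffer() is called.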

Static Configuration Features

Configuration of the hardware encoder components falls into two categories: parameters that are set statically when the component is initialized, and those that can be changed dynamically while the component is in use. For static configuration, neither MediaCodec nor direct OpenMAX calls holds any functional advantage, as both frameworks support the following features:

  • Codec and profile / level selection
  • Frame Rate
  • Bit Rate
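With MediaCodec, these static parameters are set on a MediaFormat object before the encoder is configured. A minimal sketch follows; the resolution, bit rate, and color format values are illustrative only, and supported profiles/levels can be queried via MediaCodecInfo.CodecProfileLevel:

```java
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;

// Codec selection happens via the MIME type; frame rate and bit rate
// are ordinary MediaFormat keys. Values below are illustrative.
MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 720);
format.setInteger(MediaFormat.KEY_BIT_RATE, 4000000);   // 4 Mbps target
format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);      // 30 fps
format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
        MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420SemiPlanar);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1); // key frame every second

MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
```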

In contrast, neither MediaCodec nor Qualcomm Innovation Center’s (QuIC) implementation of the OpenMAX Integration Layer (IL) client currently support the following static configuration features:

  • Rate Control Mode (Constant Bit Rate, Variable Bit Rate)
  • Slice Size
  • Group of Picture (GOP) Size
  • Long Term Reference Picture (LTRP) Signals
  • Hierarchical P Coding

Dynamic Configuration Features

While both frameworks mirror each other when it comes to static configuration, only the OpenMAX IL client currently supports dynamic IDR frame insertion during encoding. Depending on your specific use-case, this may or may not be important. In video telephony applications, for example, periodically inserting IDR frames into the data stream helps to reestablish a baseline for the decoder in the event that previous frames were lost in transit.

Finally, the following is a list of dynamic configuration features not currently supported by either MediaCodec or the QuIC OpenMAX IL client:

  • Bit Rate
  • Frame Rate
  • LTRP Signal
  • Hierarchical P Bit Rate Per Layer Reconfiguration

On the decode side of things, neither MediaCodec nor the OpenMAX IL client currently has any functional advantage over the other. Both support port reconfiguration events to respond to sequence parameter sets (SPS) and picture parameter sets (PPS) embedded asynchronously in the video stream. Also, neither framework currently supports selecting an error concealment mode while decoding.
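In MediaCodec, the port reconfiguration events mentioned above surface as special return values from dequeueOutputBuffer(). A minimal sketch of the decode-side drain loop, where decoder is an already-started MediaCodec instance:

```java
import java.nio.ByteBuffer;

import android.media.MediaCodec;
import android.media.MediaFormat;

// Assumes 'decoder' has been configured and started, and TIMEOUT_US is
// your chosen dequeue timeout in microseconds.
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
ByteBuffer[] outputBuffers = decoder.getOutputBuffers();

int outIndex = decoder.dequeueOutputBuffer(info, TIMEOUT_US);
if (outIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
    // New SPS/PPS arrived in the stream; resolution, crop, or color
    // format may have changed.
    MediaFormat newFormat = decoder.getOutputFormat();
} else if (outIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
    // The output buffer set was reallocated; re-fetch the array.
    outputBuffers = decoder.getOutputBuffers();
} else if (outIndex >= 0) {
    // A decoded frame is ready in outputBuffers[outIndex].
    decoder.releaseOutputBuffer(outIndex, true /* render to surface */);
}
```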

Get Started

If you’re looking to build an application to take advantage of the low-level multimedia components on devices featuring Qualcomm® Snapdragon™ 800 processors, including the Snapdragon 800 Mobile Development Platforms, using MediaCodec and the OpenMAX IL client are both good options. If you need the ability to manually trigger an IDR frame, only the latter will presently meet your needs.

There is video codec sample code for Android available that can help get you up and running. And if you have questions, you can visit the Multimedia Optimization Forum and we’ll be happy to help you.

Comments

Re: MediaCodec and OpenMAX: Feature-based Video Codec API...

This document does not mention static keyframe insertion.

The MediaCodec API specifies this behaviour (number of seconds between each keyframe can be specified).

However, in my tests, none of the phones available to me with Qualcomm chipsets follows this: even if I specify a keyframe interval of 1 second, only one keyframe (the first frame) is created with the MediaCodec API. Is this intentional?

Re: MediaCodec and OpenMAX: Feature-based Video Codec API...

Hello Balint,

The number of seconds between key frames has to be specified on the MediaFormat when configuring the encoder:

void configure(MediaFormat format, Surface surface, MediaCrypto crypto, int flags)

I am wondering if, when you specified a value for KEY_I_FRAME_INTERVAL, that value was a large number, since you are only getting the first frame.
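For reference, a typical encoder setup with the key frame interval looks something like this (values illustrative):

```java
import android.media.MediaCodec;
import android.media.MediaFormat;

MediaFormat format = MediaFormat.createVideoFormat("video/avc", 640, 480);
format.setInteger(MediaFormat.KEY_BIT_RATE, 1000000);
format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1); // one key frame per second

MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
```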

Regards,

Cary

Re: MediaCodec and OpenMAX: Feature-based Video Codec API...

Hi Cary,

As I already mentioned, I do set KEY_I_FRAME_INTERVAL to 1 and expect a key frame approximately every second. But only the first frame of the encoding is a key frame.

This can easily be checked with the reference AVC decoder (ldecod) and is reproducible on various Qualcomm chipsets, for example the ZTE Blade V, Samsung Galaxy S2, and HTC Desire X.

This issue may be related to an official Android ticket, see:

http://code.google.com/p/android/issues/detail?id=61977

The bug may be a parameter-passing problem between the Java and JNI/native code for the MediaFormat component, since the built-in media recorder, which uses OMXCodec directly, produces a good H.264 video stream. I cannot confirm this, however, since I do not have source code for the official ROMs of the devices listed above.

With this bug, creating a seekable or streamable video is impossible.


Re: MediaCodec and OpenMAX: Feature-based Video Codec API...

Are you sure KEY_I_FRAME_INTERVAL = 1 means one key frame per second? I am new to Snapdragon, but I have experience with other camera platforms. In my experience the right value could be KEY_I_FRAME_INTERVAL = 30, meaning one key frame followed by 29 (or maybe 30) non-IDR frames, which results in one key frame per second if the stream is 30 fps.