Category: LINUX

2005-03-04 09:18:43

What is MPEG-4?

MPEG-4 (ISO 14496) is a broad open standard developed by the Moving Picture Experts Group (MPEG), a working group of the International Organization for Standardization (ISO) that also produced the well-known MPEG-1 (MP3, VCD) and MPEG-2 (DVD, SVCD) standards. It standardizes all sorts of audio/video compression formats and much more.
By its nature the MPEG-4 Standard doesn't aim at standardizing one particular product (e.g. something comparable to DVD). Instead it covers a broad range of sub-standards, from which product providers can pick the ones they need for their product.

The MPEG-4 Standard, as mentioned, is divided into many different sub-standards. For us users on Doom9 the following parts are probably of most interest:
- ISO 14496-1 (Systems), Animation/Interactivity (like DVD Menus)
- ISO 14496-2 (Video), e.g. Advanced Simple Profile (ASP), as followed by XviD, DivX5, 3ivx...
- ISO 14496-3 (Audio), Advanced Audio Coding (AAC)
- ISO 14496-10 (Video), Advanced Video Coding (AVC), also known as H.264
- ISO 14496-14 (Container), MP4 container format (uses the .mp4 extension)
- ISO 14496-17 (Subtitles), MPEG-4 Timed Text subtitle format

This information thread aims to provide some useful info on most of these parts, with a focus on MPEG-4 ASP and AVC/H.264.


What are the possible advantages of an open standard like MPEG-4 compared to closed formats like Microsoft's Windows Media?

The good thing about an open standard is that it is open for everyone to follow when creating a product. We therefore already have a lot of different products which are compatible with the MPEG-4 Standard, and therefore also compatible with each other.
Besides interoperability and a big range of products to choose from, an open standard leads to competition. For the consumer this means that products in a competitive market will most likely improve faster in quality, have lower prices and focus better on the consumer's needs.

And not to forget, maybe the most important point for me:
an open standard allows open source development, as we all know from XviD for example.

ISO 14496-1 (Systems) - MP4

As already mentioned, the MPEG-4 Standard defines its own container format: MP4 (other container formats, not covered by the Standard, are for example AVI, OGM, Matroska, etc.). MP4 allows the storage not only of audio and video content but also of animated/interactive content (also known as BIFS), as defined in MPEG-4 Systems and as needed for DVD-like menus, for example. (MP4 has since been moved from ISO 14496-1 to its own part, ISO 14496-14.)

Interactivity/Animation

Without getting into technical details, the MPEG-4 Systems Standard defines a broad range of powerful tools which allow all sorts of animations (similar to what we know from Flash) or interactivity (for example as known from DVD menus).
These animations/interactivity can be done in both 2D and 3D.

To check out some nice samples of what MPEG-4 Systems can provide have a look
Note that to playback systems files you also of course need a systems decoder/player, where the most popular ones for Systems 2D content are 's Osmo4 (supports now 3D too) or - for Systems 3D also have a look

Interoperability

The MP4 container is a very important part of the MPEG-4 Standard: if you want to reach 100% interoperability between different MPEG-4 A/V implementations, there is no way around using a standardised container too.
In practice, however, the most popular container format is still AVI, also for MPEG-4 video content, and the AVI container is also the major source of incompatibilities between existing MPEG-4 products (e.g. on hardware players).

further documentation

If you want to read more about the MP4 container have a look at the in the forum
some documentation on MP4 is available from and
A FAQ especially about is available from the
also if you are interested in the described interactive content also have a look at this document from the project
A draft of the MPEG-4 Systems Specs can be downloaded
The Specs for the ISO base media file format (14496-12), on which MP4 is based on, can be found

ISO 14496-2 (Video) - Advanced Simple Profile (ASP)

The MPEG-4 Standard defines a broad range of video coding tools. The most widely used ones at the moment are defined under ISO 14496-2, which is why this part of the Standard is often called MPEG-4 "Part 2". From now on I will call it MPEG-4 ASP, as described below.

MPEG-4 Part 2 Profiles

As already mentioned, the MPEG-4 Standard aims at multiple applications. Of course different usages need different coding tools (e.g. if you want to stream video content at very low bitrates you will use different tools than if you want to make a DVD backup copy at medium or high bitrates).
To cover all these different needs the MPEG-4 Standard defines a lot of different Profiles and Levels. Each profile/level is an interoperability/conformance point, ensuring that all products following a specific profile/level can work together, even if they come from different vendors.
These profiles/levels standardise not only which encoding tools can be used but also, for example, the allowed bitrate range, image sizes, frame rates, etc.

For an overview on available MPEG-4 profiles have a look


Advanced Simple Profile (ASP)

When it comes to DVD backups, the Advanced Simple Profile @ Level 5 (ASP@L5) is the one to follow.
It allows frame sizes up to 720x576 and frame rates up to 30fps, and offers advanced coding tools like B-Frames (B-VOPs), Quarter Pixel Motion Estimation (QPEL), Global Motion Compensation (GMC) and MPEG/Custom Quantization. The Simple Profile, in contrast, lacks these tools and also only allows, for example, a max. frame size of 352x288 and 15fps.

the most important Advanced Simple Profile tools (not available in Simple Profile) are:

B-Frames/B-VOPs/Bi-directional encoding/prediction:
In contrast to I-Frames/Keyframes (which contain the entire image and don't depend on other frames) or P-Frames (which contain only the parts of the image that changed from the previous I- or P-Frame), B-Frames are constructed using data from both the previous and the next I- or P-Frame. B-Frames can be compressed much more than other frame types, which overall should help quality and compressibility.
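
To make the idea of bi-directional prediction a bit more tangible, here is a minimal sketch in Python. It is not how any real codec is written: the one-dimensional "blocks", the sample values and the function names are all made up for illustration, and motion compensation is left out completely. It only shows why predicting from both neighbours can leave a much smaller residual than predicting from the previous frame alone.

```python
def residual_energy(actual, predicted):
    """Sum of squared differences between a block and its prediction."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def predict_forward(prev_ref):
    """P-Frame style: predict from the previous reference only."""
    return list(prev_ref)

def predict_bidirectional(prev_ref, next_ref):
    """B-Frame style: average the previous and the next reference samples."""
    return [(p + n + 1) // 2 for p, n in zip(prev_ref, next_ref)]

# Hypothetical 1-D block of luma samples that brightens steadily over time.
prev_ref = [100, 102, 104, 106]
current = [110, 112, 114, 116]
next_ref = [120, 122, 124, 126]

p_cost = residual_energy(current, predict_forward(prev_ref))
b_cost = residual_energy(current, predict_bidirectional(prev_ref, next_ref))
print("forward-only residual:", p_cost)
print("bi-directional residual:", b_cost)
```

The smaller the residual, the fewer bits are needed to code it, which is where the extra compressibility of B-Frames comes from in this simplified picture.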

Quarter Pixel Motion Search Precision (QPEL):
By default most MPEG-4 codecs detect motion between two frames down to half a pixel (HalfPel).
With QuarterPel you can detect motion that is only a quarter of a pixel per frame, effectively doubling the precision!
In practice this means that you will get a much sharper image with QPEL. Try it and you will love it (that's my opinion of course).
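
The difference between HalfPel and QuarterPel search can be sketched on a one-dimensional signal. This is only an illustration under simplifying assumptions: sub-pixel positions are produced with plain linear interpolation (real codecs use longer interpolation filters), and the signal, the function names and the search range are hypothetical.

```python
def sample(signal, pos):
    """Linearly interpolate the signal at a fractional position (simplified filter)."""
    i = int(pos)
    frac = pos - i
    return signal[i] * (1 - frac) + signal[i + 1] * frac

def best_shift(reference, current, step, max_shift=1.0):
    """Try shifts 0, step, 2*step, ... up to max_shift; return (shift, SAD) of the best one."""
    best = None
    for k in range(int(max_shift / step) + 1):
        shift = k * step
        sad = sum(abs(current[x] - sample(reference, x + shift))
                  for x in range(len(current)))
        if best is None or sad < best[1]:
            best = (shift, sad)
    return best

reference = [0, 8, 16, 24, 32, 40, 48]  # a ramp rising by 8 per pixel
current = [2, 10, 18, 26, 34]           # the same ramp, moved by a quarter pixel

half = best_shift(reference, current, step=0.5)
quarter = best_shift(reference, current, step=0.25)
print("half-pel search:   ", half)
print("quarter-pel search:", quarter)
```

With the half-pixel grid the search cannot hit the true displacement and is left with a residual error, while the quarter-pixel grid matches it exactly. The price is that twice as many positions per axis have to be tested.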

Global Motion Compensation (GMC):
GMC detects whether big parts of the frame have an amount of motion in common. If that's the case GMC kicks in, using a single motion vector for all similar parts of the frame instead of multiple ones.
In practice this helps save bits when panning, zooming or rotation occurs (depending on how good the GMC implementation is and how many warppoints it offers), bits which can then be used somewhere else, for example where they bring more sharpness.
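
A rough sketch of the bit-saving idea behind GMC, in Python. It assumes the simplest possible case, a pure translation shared by most macroblocks, and simply counts how many vectors would still have to be coded. Real GMC describes the global motion with warppoints and warps the reference picture, which this toy example does not attempt; the function name and the threshold are hypothetical.

```python
from collections import Counter

def vectors_to_code(block_vectors, min_share=0.5):
    """Return (global_vector, exceptions).

    If at least min_share of the blocks share one motion vector, that vector
    is treated as the global one and only the differing blocks are listed.
    Otherwise there is no global vector and every block keeps its own.
    """
    most_common, count = Counter(block_vectors).most_common(1)[0]
    if count / len(block_vectors) < min_share:
        return None, list(block_vectors)
    return most_common, [v for v in block_vectors if v != most_common]

# Hypothetical camera pan: ten blocks move 4 pixels right, two contain moving objects.
pan = [(4, 0)] * 10 + [(0, 0), (1, -2)]
global_mv, exceptions = vectors_to_code(pan)
print("global vector:", global_mv)
print("vectors to code:", 1 + len(exceptions), "instead of", len(pan))
```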

MPEG/Custom Quantization:
While with the MPEG-4 Simple Profile you can only use the h.263 quantization type, the ASP also allows you to use the MPEG type and custom matrices.
While the h.263 type gives you a softer image (good for 1CD encodes), the default MPEG matrix is better at higher bitrates, preserving more details.
A popular custom matrix is for example hvs_good, which is also nice for lower bitrates, but there are many more.
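
The mechanical difference between the two quantization types can be sketched as follows. This is a strongly simplified illustration, not the formulas from the specification (rounding, dead zones and the special handling of intra DC coefficients are ignored), and the four-entry "matrix" is a made-up example, not the default MPEG matrix or hvs_good. The point is only that the h.263 type uses one step size for all coefficients, while a matrix gives every frequency its own step size.

```python
def quantize_flat(coeffs, quant):
    """h.263-style sketch: every coefficient shares the same step size (2 * quant)."""
    return [int(c / (2 * quant)) for c in coeffs]

def quantize_matrix(coeffs, quant, matrix):
    """MPEG-style sketch: the step size is scaled per coefficient (weight 16 = neutral)."""
    return [int(16 * c / (w * 2 * quant)) for c, w in zip(coeffs, matrix)]

# Hypothetical DCT coefficients, ordered from low to high frequency.
coeffs = [240, 96, -48, 20]
# Hypothetical weights: neutral for low frequencies, coarser for high ones.
matrix = [16, 16, 24, 32]

print("flat:  ", quantize_flat(coeffs, quant=4))
print("matrix:", quantize_matrix(coeffs, quant=4, matrix=matrix))
```

Which of the two looks better at a given bitrate depends on the matrix and on the material, so the sketch says nothing about quality, only about where the encoder gets its per-coefficient step sizes from.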

Adaptive Quantization:
When encoding with a variable bitrate (e.g. in 2pass) each frame can be compressed with a different quant (the higher the quant, the smaller the size/bitrate of the frame). Which frame gets which quant (e.g. compress high motion more) is decided by the "Rate Control".
With Adaptive Quantization (also available in Simple Profile) the quant can additionally differ inside each frame (e.g. high motion/dark parts of the frame get a higher quant and are compressed more, faces get a lower quant than the background, etc.).
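
As a small illustration of quants differing inside a frame, the following Python sketch raises the quant for dark macroblocks only. The darkness threshold, the offset and the function name are assumptions picked for the example; real encoders use their own (often psychovisual) heuristics. The only thing taken as given is that MPEG-4 ASP quants range from 1 to 31.

```python
def adaptive_quants(frame_quant, block_luma, dark_threshold=40, offset=2, max_quant=31):
    """Return one quant per macroblock: dark blocks get a higher quant,
    all other blocks keep the quant chosen for the frame by the rate control."""
    return [min(max_quant, frame_quant + offset) if luma < dark_threshold else frame_quant
            for luma in block_luma]

# Hypothetical average luma values of four macroblocks in one frame.
block_luma = [120, 30, 35, 200]
print(adaptive_quants(frame_quant=4, block_luma=block_luma))
```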


MPEG-4 ASP became popular with the famous DivX5 codec, whose name is also often used to describe any content following ASP@L5 (like people calling all sorts of cola, even Pepsi, "Coca-Cola"). But it's important to realise that there are also other MPEG-4 ASP codecs, which are no more and no less compatible with MPEG-4 than DivX5:


available MPEG-4 ASP Codecs

ASP codecs are available atm from (), , , /ffvfw/ffdshow, , , , , , , and many more...
doom9's quality comparisons:
(note that DivX3.11 aka MS MPEG-4, RV9, VP6 and WMV9 are not MPEG-4 compliant!)

XviD
maybe one of the most advanced MPEG-4 ASP codecs, highly tuned for DVD backups and offering a very broad range of encoding tools (including more than 1 B-Frame, QPEL, GMC (3 warppoints), h.263/MPEG/Custom Quants, Adaptive Quants, Trellis and much more)
XviD is open source (GPL) and THE codec of the Doom9 community, where it also has its own , where the XviD developers are also often around (as i am sure you know already )
For more infos about XviD visit the official Homepage , read and of course the on doom9

DivX5
maybe the most popular and most widely used MPEG-4 ASP codec, though mainly living off its name. It offers fewer MPEG-4 ASP features (up to 2 B-Frames, h.263/MPEG Quant, a weak GMC (1 warppoint), QPEL) and some people claim also lower quality than XviD
Still, it's THE codec in the business world and the codec which made MPEG-4 ASP popular
For more infos about DivX5 visit the official Homepage and the on doom9

ffmpeg
ffmpeg (aka libavcodec/libavformat) is surely THE implementation of MPEG-4 when it comes to completeness. ffmpeg offers just about every tool imaginable (e.g. error resilience) and is open source (LGPL)
still when it comes to encoding it often stands in the shadows of XviD, but it provides good quality and is a very important implementation many other projects are heavily based on (for example the famous or use it)
For more infos about ffmpeg visit the official Homepage and the forum on doom9

3ivx
Though one of the oldest MPEG-4 codecs around (the devs claim even older than DivX5), 3ivx only became popular in the last few months. 3ivx's video codec offers h.263/MPEG Quants, Adaptive Quant. and 4MV (but not B-Frames, GMC or QPEL) and was the first to set a PAR (pixel aspect ratio)
3ivx doesn't only offer video encoding but is an all-round implementation of the MPEG-4 Standard, also including AAC encoding (FAAC) and maybe one of the best MP4 container tools available
For more infos about 3ivx visit the official Homepage and the / forum on doom9 (where you can also find some of the devs sometimes)

Nero Digital
the MPEG-4 ASP codec from Nero is maybe the youngest one, but Nero is very ambitious about making it popular
atm their codec is only available inside (together with a good AAC encoders (HE-AAC, Multichannel...)). ND offers only 1 B-Frame, GMC (3 warppoints), QPEL, h.263/MPEG/Custom Quants, Adaptive Quant. (Psy High) and is one of the fastest codecs around
For more infos about NeroDigital visit the official Homepage and the forum on doom9 (where you can also find some of the devs sometimes)


MPEG-4 ASP on Hardware - DivX/NeroDigital Certification / Private MPEG-4 Profiles

Some first-generation hardware decoder chips weren't able to handle important tools the ASP offers (e.g. QPEL and GMC). Today's chips are more powerful and already support, for example, QPEL and 1-warppoint GMC (none supports 3-warppoint GMC so far).
To also support players which use even the oldest chips, DivXNetworks and Nero created what can be called private MPEG-4 profiles, namely the DXN Home Theater Profile (DXN HTP) and the ND Standard Profile (ND StP). Every player which can handle at least the DXN HTP or the ND StP (next to other requirements) can get a certification from DivXNetworks and/or Nero.
When encoding according to the HTP/StP you can't use QPEL or GMC, for example, and can use only 1 B-Frame. The HTP/StP are therefore basically a tradeoff between quality and usability with old hardware decoder chips.
Of course these private certifications also help DivXNetworks and Nero to establish their brand names even more.

Still, the correct term for what we need a player to support is MPEG-4 ASP@L5. If a player offers this, you should be able to play your encodes following MPEG-4 ASP (no matter which encoder was used) without problems.


further documentation

If you want to read more about MPEG-4 Video you can have a look at the page of , providing an overview of available and
The Specs for the Amendment1 to 14496-2 can be found
Also important to mention is the Site of the , offering a FAQ especially about or an overview of the (including many Infos on MPEG-4 Video too)
also if you simply search for "MPEG-4" on you will find more than a lot of usefull sites too

ISO 14496-3 (Audio) - Advanced Audio Coding (AAC)

The MPEG-4 Standard defines maybe one of the best audio formats available at the moment: AAC (Advanced Audio Coding).
AAC can carry 48 full-bandwidth (up to 96 kHz) audio channels in one stream, plus 15 low frequency enhancement (LFE, limited to 120 Hz) channels, up to 15 data streams and much more.

AAC Profiles

As with MPEG-4 Video, AAC comes in different Profiles, of which the Low Complexity (LC AAC) Profile (aka MAIN @ Level 2) is the one most widely used in the consumer market (for example in Apple's very popular music store).
Other profiles are for example the Long Term Prediction Profile (LTP), Scalable Sampling Rate (SSR) or Low Delay (LD).

of LC AAC with other good formats @ 128kbps (thanks to rjamorim):

note that lame (the available) and vorbis provide in their latest versions much better quality (as you can see )
Also note that the wma codec used in this test is wma9 pro, which is a totally different, higher-quality codec than the standard wma9 codec (the one used in music stores and CD players), and the two are not backwards compatible.


When it comes to low bitrates and multichannel encoding, AAC offers the High Efficiency extension (HE AAC), making it one of the best formats in the low bitrate range too:

of HE AAC with other popular formats @ 64kbps (thanks to rjamorim):

Note that "QT" is the LC AAC codec offered in QuickTime and "He" is the HE AAC codec offered in Nero.

When it comes to very low bitrates, the Parametric Stereo extension (PS AAC), which is used together with HE AAC, also has to be mentioned (Nero is working on an implementation). How it does compared to other codecs at 32kbps can be seen

available AAC Codecs

AAC codecs are available atm from /, (offers HE AAC), (), , , , (offers HE AAC), , and
rjamorim's quality comparisons:

further documentation

If you want to read more about the AAC Audio Format have a look at the in the forum or at the
A FAQ especially about is available from the
Drafts of the MPEG-4 AAC Specs can be downloaded , here and here
also the offers interesting links regarding AAC (Specs TS 26.401 - TS 26.411) including SBR (HE AAC), source code aso...

ISO 14496-10 (Video) - Advanced Video Coding (AVC)

With AVC/H.264 the MPEG-4 Standard defines one of the newest and technically best state-of-the-art video coding formats available.

The AVC/H.264 video coding standard was finalized jointly and specified identically in 2003 by two groups: MPEG (Moving Picture Experts Group) from ISO and VCEG (Video Coding Experts Group) from the ITU (International Telecommunication Union), a sub-organisation of the United Nations (UN), which also standardised the H.263 format (now mainly used in video conferencing software).
The AVC/H.264 Standard itself was developed by the Joint Video Team (JVT), which included experts from both MPEG and VCEG.

From the MPEG side the standard is called MPEG-4 Part 10 (ISO 14496-10); from the ITU side it is called H.264 (the ITU document number), the name by which the format is already widely known.
As "official" title for the new standard MPEG chose Advanced Video Coding (AVC), as the video counterpart to the Advanced Audio Coding (AAC) audio format.


AVC/H.264 Profiles

AVC/H.264 defines four different Profiles: Baseline, Main, Extended and High Profile (which themselves are also again subdivided into Levels):

- Baseline Profile offers I/P-Frames, supports progressive and CAVLC only
- Main Profile offers I/P/B-Frames, supports progressive and interlaced, and offers CAVLC or CABAC
- Extended Profile offers I/P/B/SP/SI-Frames, supports progressive and CAVLC only
- High Profile (aka FRExt) adds to Main Profile: 8x8 intra prediction, custom quants, lossless video coding, more yuv formats (4:4:4...)
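
The profile list can also be written down as a small lookup table, which makes it easy to check which profiles cover a given stream. The following Python sketch copies only the frame types and entropy coders from the list; the dictionary layout and the function name are hypothetical, and levels as well as the progressive/interlaced distinction are left out.

```python
# High Profile adds tools on top of Main, so it keeps Main's frame types and entropy coders.
MAIN = {"frames": {"I", "P", "B"}, "entropy": {"CAVLC", "CABAC"}}

PROFILES = {
    "Baseline": {"frames": {"I", "P"}, "entropy": {"CAVLC"}},
    "Main": MAIN,
    "Extended": {"frames": {"I", "P", "B", "SP", "SI"}, "entropy": {"CAVLC"}},
    "High": MAIN,
}

def profiles_for(frame_types, entropy_coder):
    """Return the profiles whose tool set covers the given stream features."""
    return [name for name, tools in PROFILES.items()
            if set(frame_types) <= tools["frames"] and entropy_coder in tools["entropy"]]

print(profiles_for({"I", "P", "B"}, "CABAC"))
print(profiles_for({"I", "P"}, "CAVLC"))
```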

Only the future will tell which Profile and which tools will be the most usable for DVD backups, but I assume it will be the Main and/or High Profile, maybe with the following tools (also check out the tool descriptions for MPEG-4 ASP, as all of them except GMC are available in AVC too):

CAVLC/CABAC:
AVC/H.264 defines two tools for entropy coding of the bitstream syntax (macroblock type, motion vectors + reference index...), both more advanced than what MPEG-4 ASP offers: Context-Adaptive Variable Length Coding (CAVLC) and Context-Adaptive Binary Arithmetic Coding (CABAC).
Compared to CAVLC (aka UVLC), which is the default method in AVC/H.264, CABAC is a more powerful compression method, said to bring down the bitrate by about a further 10-15% (especially at high bitrates). CABAC (like CAVLC) is a lossless method and therefore will never hurt the quality, but it will slow down encoding and decoding.

Loop/Deblocking Filter:
In contrast to prefiltering (for example via AviSynth, done on the input) or postprocessing/filtering (via the decoder, done on the final output), loop filtering is applied during the encoding process to every single frame, after it has been encoded but before it is used as a reference for the following frames. This helps avoid blocking artifacts, especially at low bitrates, but will slow down encoding and decoding.

Variable Block Sizes/Macroblock Partitions:
In contrast to MPEG-4 ASP (where the block size can vary between 16x16 and 8x8 pixels, and only with Inter4V/4MV), AVC/H.264 allows a macroblock to be divided for motion search into partitions as small as 4x4 pixels (including intermediate steps like 8x4...). The block size is adaptive/variable; a good encoder will be smart enough to decide which block size is the most efficient in every specific macroblock.
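
To get a feeling for what smaller partitions cost, this Python sketch counts the motion vectors a 16x16 macroblock would need for each partition size. It assumes, for simplicity, that the whole macroblock uses a single partition size; in a real encoder the sizes can be mixed within one macroblock. The 16x8, 8x16 and 4x8 entries are not named in the text above and are filled in from the H.264 partition scheme.

```python
# Luma partition sizes as (width, height).
PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]

def vectors_per_macroblock(width, height, mb_size=16):
    """Number of motion vectors if the whole macroblock used one partition size."""
    return (mb_size // width) * (mb_size // height)

for w, h in PARTITIONS:
    print(f"{w}x{h}: {vectors_per_macroblock(w, h)} motion vector(s)")
```

More partitions can follow the motion more precisely, but every extra vector costs bits, which is why the choice has to be made per macroblock.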

Multiple Reference Frames:
In contrast to MPEG-4 ASP (which only allows the frame before the current frame to be used as reference), AVC/H.264 lets the encoder choose from multiple frames for inter motion search. This means the codec can decide whether to simply refer to the previous frame (like in ASP) or to an even earlier frame. Because of that (e.g. a P-Frame can refer to a frame before the latest I-Frame) a new frame type had to be introduced: IDR-Frames, which are I-Frames that act as a barrier, i.e. no frame after an IDR-Frame is allowed to refer to a frame before it. Allowing multiple reference frames will slow down encoding and decoding, and cutting will only be possible at IDR-Frames.
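
Why an older reference frame can be the better choice is easy to show with a toy example in Python. The blocks, the sample values and the function names are hypothetical, and motion search inside each reference is skipped: the sketch only compares the same block position in each candidate reference using the sum of absolute differences (SAD).

```python
def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))

def best_reference(current_block, reference_blocks):
    """Return (index, SAD) of the best matching reference.

    Index 0 is the most recent frame, index 1 the frame before that, and so on.
    """
    costs = [sad(current_block, ref) for ref in reference_blocks]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]

# Hypothetical case: a bright object was covered in the previous frame
# but was visible two frames ago and is visible again now.
current = [200, 200, 200, 200]
references = [
    [50, 50, 50, 50],      # previous frame (object hidden)
    [200, 198, 201, 199],  # two frames back (object visible)
]

print("previous frame only:", sad(current, references[0]))
print("best of all refs:   ", best_reference(current, references))
```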

Weighted Prediction:
With Weighted Prediction, weights can be applied to a reference frame (e.g. you can scale a previous picture brightness-wise). This helps especially with fades, where the subsequent picture is very similar to the previous one except that it is darker. WP will not help with cross-fades (e.g. a fade from one scene to another).
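
A fade to black makes a nice minimal example of weighted prediction. The Python sketch below uses a floating-point weight and made-up sample values purely for illustration; the standard itself signals weights and offsets as integers, and the function names here are hypothetical.

```python
def weighted_prediction(reference, weight, offset=0):
    """Scale the reference samples by a weight, add an offset, clip to 8 bit."""
    return [max(0, min(255, round(s * weight + offset))) for s in reference]

def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical fade: the current picture is the reference at half brightness.
reference = [200, 160, 120, 80]
current = [100, 80, 60, 40]

plain_cost = sad(current, reference)
weighted_cost = sad(current, weighted_prediction(reference, weight=0.5))
print("unweighted residual:", plain_cost)
print("weighted residual:  ", weighted_cost)
```

In a cross-fade the new picture is a mix of two different scenes, so no single weight applied to one reference removes the residual, which matches the remark above.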

Rate Distortion Optimisation (RDO):
RDO allows the encoder to make the most efficient coding decision whenever it has to choose between different options (for example inter/intra decisions, motion search...).
RDO is not a tool defined by the AVC/H.264 specs; it is a decision-making approach which was first introduced by the H.264 reference software. Other codecs can also make use of RDO; XviD's VHQ mode, for example, already enables it.
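
Rate distortion optimisation is commonly formulated as minimising a cost J = D + lambda * R, where D is the distortion, R the number of bits and lambda a factor that sets how expensive a bit is. The text above does not spell this formula out, so take it as an assumption of this sketch; the candidate modes and their numbers are made up as well.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian cost: distortion plus lambda times rate."""
    return distortion + lam * bits

def choose_mode(candidates, lam):
    """candidates maps a mode name to (distortion, bits); return the cheapest mode."""
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))

# Hypothetical measurements for one macroblock: (distortion, bits).
candidates = {
    "intra": (40, 300),
    "inter_16x16": (120, 60),
    "inter_8x8": (70, 140),
}

for lam in (0.1, 0.5, 1.0):
    print(f"lambda={lam}: {choose_mode(candidates, lam)}")
```

With a small lambda (bits are cheap, i.e. high bitrate) the mode with the lowest distortion wins, and as lambda grows the encoder moves towards modes that need fewer bits.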


An overview of AVC/H.264 compared to other popular video coding formats has been kindly set up by akupenguin:


available AVC/H.264 Codecs

AVC/H.264 implementations are available atm already from (), , , , , , , , , (), , (reference software) (), ,
(announced codecs: , , )

Encoders

- x264: opensource (GPL) encoder (Source), available as VFW codec: or , as commandline: () and inside the tool (available for Linux, MacOS and BeOS)
x264 supports 2pass, CABAC, Loop, multiple B-Frames, multiple Reference Frames, 4x4 P-Frame Blocksizes, 8x8 B-Frame Blocksizes
- NeroDigital AVC: useable in , outputs .mp4
ND AVC supports 2pass, CABAC, (adaptive) Loop, multiple B-Frames, multiple Reference Frames, weighted prediction, 8x8 P-Frame Blocksizes, 16x16 B-Frame Blocksizes, Adaptive Quant. (Psy High)
- mpegable: available as free VFW Encoder (not based on the reference), doesn't handle YV12
mpegable supports 1pass (fixed quants), uses P-Frames only, 8x8 P-Frame Blocksizes, CAVLC only, Loop
- MainConcept: available in a free unlimited preview encoder app. (adds a watermark), outputs .mpg (TS output is buggy)
MainConcept supports 1pass (CBR/VBR), P-Frame Reordering, CABAC, Loop, 1 B-Frame, Multiple Ref, 4x4 P-Frame Sizes and RDO
- Sorenson: useable in , outputs .mp4,
Sorenson supports 2pass and B-Frames (seems to use the MainConcept AVC implementation (decoder?))
- Moonlight: useable in Moonlight's and CyberLink's , outputs .mpg
Moonlight supports 1pass (VBR/CBR/Fixed Quants), CABAC, Loop, 2 B-Frames, 8x8 P-Frame Sizes, Adapt. Quant, PAR, Interlacing
- JM: The AVC Reference Software offers in Version 9.3 already Main and High Profile: B/SP-Frames, CABAC, Loop Filter, 4x4 Blocksizes, multiple Reference Frames, Adaptive Quant, Error Resilience, RDO, Lossless Coding, Custom Quants, Rate Control aso...
- Hdot264: opensource (GPL) VFW version of the reference software by doom9 member charact3r, still based on a very old version of the reference (JM 4.0c)
- VSS: free preview VFW Encoder (limited to 5 days), based on the reference encoder
- Envivio: useable in , outputs .mp4

Decoders

- ffmpeg: opensource (LGPL), used already for example in (VFW and DShow decoder), and
ffmpeg supports B-Frames, CABAC, Loop, Weighted Prediction...
- VSS: free preview VFW Decoder (limited to 5 days) and an unlimited
VSS DShow supports .avi (with VSSH and H264 fourcc), CABAC, Loop, B-Frames
- NeroDigital AVC: DShow Decoder and .mp4 Parser coming with Recode2
ND AVC supports B-Frames, CABAC, Loop, Weighted Prediction...
- Moonlight: offers a free DShow AVC decoder (adds a watermark) together with Parsers handling AVC as .mpg, .mp4 and .264
- mpegable: free VFW decoder (usable also in DShow), supports .avi (with DAVC fourcc)
- MainConcept: the preview offers a free DShow AVC decoder (adds a watermark) and a Parser handling AVC as .mpg and .264
- Envivio: not freely available AVC DShow decoder called EnvivioTV, handling AVC in .mp4 (since 2.0, current version: 2-1-181)
- Philips: DShow AVC decoder freely available in the player (handles raw AVC only)
- Pegasus: not really compliant DShow AVC decoder available


Sample content

small MPEG-4 ASP vs. AVC comparison @ 460kbps:

MPEG-4 ASP (XviD 1.0 RC2 - h.263, QPEL, VHQ4, ChromaMotion, Trellis, 2 B-Frames, other settings on default):

current issues with AVC/H.264

If you dig through the available AVC implementations you will surely soon find out that there are some issues:

- interoperability: the implementations currently support different container formats:
.mp4: the container for AVC defined in the MPEG-4 Standard (ISO 14496-15), currently supported by Nero, Sorenson, Envivio and Moonlight
.mpg: the container for AVC defined in the MPEG-2 Standard (ISO 13818-1, AMD3), currently supported by MainConcept and Moonlight (Blu-ray's BD-ROM will use it too)
.avi: AVC-in-AVI is not standardized anywhere and therefore already causes incompatibilities. The limitations of AVI and VFW, together with the hacks these two formats make necessary, hinder the full implementation of all the features AVC offers (and might even prevent it, as some things, like advanced AVC frame coding orders, are simply not possible in AVI and VFW). They therefore harm the achievable quality, or at least the speed of development, the interoperability and with it the competition. AVI is currently supported by VSS, x264 (both mencoder and vfw) and mpegable
.264/.h264: the raw bitstream, as output by the reference encoder for example (x264 in mencoder can output it too, and mp4creator can demux from .mp4 to raw)

- speed: some current encoder implementations are pretty slow (mostly the commercial previews)
Still, x264 and NeroDigital's AVC encoder already seem to offer nice speed and quality. But this doesn't change the fact that AVC is a very advanced video coding format, so encoding on old CPUs can be very time-consuming


MPEG-4 AVC/H.264 on Hardware - HD-DVD/Blu-ray

Two organisations (the DVD Forum and the Blu-ray Disc Association) are currently working on successors to the popular DVD format which will support so-called High Definition content (larger picture sizes than current DVD): HD-DVD and BD-ROM

As reported, the DVD Forum has already decided that MPEG-4 AVC/H.264 will be a mandatory video codec for HD-DVD
Also the Blu-ray Disc Association has announced the inclusion of MPEG-4 AVC/H.264 as can be read

It is therefore very likely that AVC/H.264 will be THE upcoming video format, widely used and supported, as is the case with MPEG-2 (used in DVD) today


further documentation

If you want to read more about the MPEG-4 AVC/H.264 Format have a look for a detailed overview about the Format (also covering the technical side)
For some more summarized infos look or
The AVC Verification Test Results can be found here or (html)
The whole specs of AVC/H.264 can be downloaded (Draft from the 7-14 March 2003)
Technical Info about Blu-ray is available

Reposted from