2006-11-21 14:03:57

    MPEG Audio Frame Header

An MPEG audio file is built up from smaller parts called frames. Generally, frames are independent items: each frame has its own header and audio information, and there is no file header. Therefore, you can cut out any part of an MPEG file and play it correctly (this should be done on frame boundaries, but most applications will handle incorrect headers). For Layer III, this is not 100% correct. Due to the internal data organization of MPEG version 1 Layer III files, frames often depend on each other and cannot be cut apart just like that.

When you want to read information about an MPEG file, it is usually enough to find the first frame, read its header, and assume that the other frames are the same. This may not always be the case. Variable bitrate MPEG files may use so-called bitrate switching, which means that the bitrate changes according to the content of each frame. This way, lower bitrates can be used in frames where they will not reduce sound quality, which allows better compression while keeping sound quality high.

The frame header consists of the very first four bytes (32 bits) of a frame. The first eleven bits (or first twelve bits, see the note on frame sync below) of a frame header are always set and are called the "frame sync". Therefore, you can search through the file for the first occurrence of frame sync (that is, find a byte with a value of 255 followed by a byte with its three (or four) most significant bits set). Then you read the whole header and check whether the values are correct. The following table gives the exact meaning of each bit in the header and shows which values can be checked for validity. Any value specified as reserved, invalid, bad, or not allowed should indicate an invalid header. Remember, this is not enough: frame sync can easily (and very frequently) be found in any binary file. It is also likely that an MPEG file contains garbage at its beginning, which may also contain a false sync. Thus, you have to check two or more frames in a row to be sure you are really dealing with an MPEG audio file.
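The search for a candidate frame sync described above could be sketched in Python as follows. The function name `find_frame_sync` is made up for this illustration, and the sketch uses the eleven-bit sync. It only locates a candidate; the header checks and the "two or more frames in a row" test still have to follow.

```python
def find_frame_sync(data: bytes, start: int = 0) -> int:
    """Return the offset of the first candidate frame sync at or after
    `start`, or -1 if none is found (hypothetical helper)."""
    pos = data.find(b"\xff", start)
    while pos != -1 and pos + 1 < len(data):
        # 0xE0 selects the three most significant bits of the second byte.
        if data[pos + 1] & 0xE0 == 0xE0:
            return pos
        pos = data.find(b"\xff", pos + 1)
    return -1


# The first 0xFF is followed by 0x00, so it is skipped.
print(find_frame_sync(b"\xff\x00\xff\xfb\x90\x64"))  # 2
```

To require the twelve-bit sync instead, the mask and the compared value would both become 0xF0.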

Frames may have a CRC check. The CRC is 16 bits long and, if it exists, it follows the frame header. After the CRC comes the audio data. You may calculate the length of the frame and use it if you need to read other headers too or just want to calculate the CRC of the frame, to compare it with the one you read from the file. This is actually a very good method to check the MPEG header validity.

Here is a "graphical" presentation of the header content. The characters A to M indicate the different fields. The table that follows gives details about the content of each field.

AAAAAAAA AAABBCCD EEEEFFGH IIJJKLMM

Sign  Length  Position  Description
      (bits)  (bits)
A     11      (31-21)   Frame sync (all bits set)
B     2       (20,19)   MPEG Audio version ID
                          00 - MPEG Version 2.5
                          01 - reserved
                          10 - MPEG Version 2 (ISO/IEC 13818-3)
                          11 - MPEG Version 1 (ISO/IEC 11172-3)

Note: MPEG Version 2.5 is not an official standard. Bit No 20 in the frame header is used to indicate version 2.5. Applications that do not support this MPEG version expect this bit always to be set, meaning that the frame sync (A) is twelve bits long, not eleven as stated here. Accordingly, B is one bit long (it represents only bit No 19). I recommend using the methodology presented here, since it allows you to distinguish all three versions and keep full compatibility.

C     2       (18,17)   Layer description
                          00 - reserved
                          01 - Layer III
                          10 - Layer II
                          11 - Layer I
D     1       (16)      Protection bit
                          0 - Protected by CRC (16-bit CRC follows header)
                          1 - Not protected
E     4       (15-12)   Bitrate index

bits   V1,L1   V1,L2   V1,L3   V2,L1   V2,L2 & L3
0000   free    free    free    free    free
0001   32      32      32      32      8
0010   64      48      40      48      16
0011   96      56      48      56      24
0100   128     64      56      64      32
0101   160     80      64      80      40
0110   192     96      80      96      48
0111   224     112     96      112     56
1000   256     128     112     128     64
1001   288     160     128     144     80
1010   320     192     160     160     96
1011   352     224     192     176     112
1100   384     256     224     192     128
1101   416     320     256     224     144
1110   448     384     320     256     160
1111   bad     bad     bad     bad     bad
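As a rough sketch, the bitrate table can be turned into a lookup in Python. The names `BITRATES_KBPS` and `bitrate_kbps`, the version strings "1", "2" and "2.5", and the choice of returning 0 for the free format are assumptions of this example, not part of any specification.

```python
# One list per table column, indexed by the 4-bit bitrate index.
# 0 stands for "free", None for the "bad" index 1111.
BITRATES_KBPS = {
    ("V1", 1): [0, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448, None],
    ("V1", 2): [0, 32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384, None],
    ("V1", 3): [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, None],
    ("V2", 1): [0, 32, 48, 56, 64, 80, 96, 112, 128, 144, 160, 176, 192, 224, 256, None],
    ("V2", 2): [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, None],
}


def bitrate_kbps(version: str, layer: int, index: int) -> int:
    """Look up the bitrate in kbps; version is "1", "2" or "2.5"."""
    if version not in ("1", "2", "2.5") or layer not in (1, 2, 3):
        raise ValueError("unknown version or layer")
    if not 0 <= index <= 15:
        raise ValueError("bitrate index must fit in 4 bits")
    if version == "1":
        column = ("V1", layer)
    else:
        # Versions 2 and 2.5 share one column for Layer II and Layer III.
        column = ("V2", 1 if layer == 1 else 2)
    rate = BITRATES_KBPS[column][index]
    if rate is None:
        raise ValueError("bitrate index 1111 is not allowed")
    return rate


print(bitrate_kbps("1", 3, 0b1001))    # 128
print(bitrate_kbps("2.5", 3, 0b1001))  # 80
```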

NOTES: All values are in kbps
V1 - MPEG Version 1
V2 - MPEG Version 2 and Version 2.5
L1 - Layer I
L2 - Layer II
L3 - Layer III
"free" means free format. If the correct fixed bitrate (such files cannot use variable bitrate) differs from those in the table above, it must be determined by the application. This may be implemented only for internal purposes, since third-party applications have no direct means of finding out the correct bitrate. However, doing so is not impossible; it just demands a lot of effort.
"bad" means that this is not an allowed value

MPEG files may have variable bitrate (VBR). This means that the bitrate in the file may change. I know of two methods in use:

  • bitrate switching. Each frame may be created with a different bitrate. It may be used in all layers. Layer III decoders must support this method. Layer I & II decoders may support it.
  • bit reservoir. Bitrate may be borrowed (within limits) from previous frames in order to provide more bits to demanding parts of the input signal. As a result, however, the frames are no longer independent, which means you should not cut these files. This is supported only in Layer III.

    More about VBR you may find on

    For Layer II, some combinations of bitrate and mode are not allowed. Here is a list of the allowed combinations.

    bitrate (kbps)   allowed modes
    free             all
    32               single channel
    48               single channel
    56               single channel
    64               all
    80               single channel
    96               all
    112              all
    128              all
    160              all
    192              all
    224              stereo, intensity stereo, dual channel
    256              stereo, intensity stereo, dual channel
    320              stereo, intensity stereo, dual channel
    384              stereo, intensity stereo, dual channel
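A header validator could encode the list above roughly like this. It is only a sketch: the function name, the mode strings (taken from the list) and the use of 0 for the free format are choices made for this example.

```python
ALL_MODES = {"stereo", "intensity stereo", "dual channel", "single channel"}
SINGLE_CHANNEL_ONLY = {32, 48, 56, 80}    # kbps
NO_SINGLE_CHANNEL = {224, 256, 320, 384}  # kbps
ANY_MODE = {64, 96, 112, 128, 160, 192}   # kbps


def layer2_combination_allowed(bitrate_kbps: int, mode: str) -> bool:
    """Check a Layer II bitrate/mode pair against the list above.
    A bitrate of 0 stands for the free format (assumption of this sketch)."""
    if mode not in ALL_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if bitrate_kbps == 0 or bitrate_kbps in ANY_MODE:
        return True
    if bitrate_kbps in SINGLE_CHANNEL_ONLY:
        return mode == "single channel"
    if bitrate_kbps in NO_SINGLE_CHANNEL:
        return mode != "single channel"
    return False  # not a bitrate from the list at all


print(layer2_combination_allowed(32, "stereo"))           # False
print(layer2_combination_allowed(384, "single channel"))  # False
print(layer2_combination_allowed(128, "dual channel"))    # True
```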

    F     2       (11,10)   Sampling rate frequency index (values are in Hz)

    bits   MPEG1      MPEG2      MPEG2.5
    00     44100      22050      11025
    01     48000      24000      12000
    10     32000      16000      8000
    11     reserved   reserved   reserved

    G     1       (9)       Padding bit
                              0 - frame is not padded
                              1 - frame is padded with one extra slot

    Padding is used to fit the bitrate exactly. For example, 128 kbps 44.1 kHz Layer II uses many 418-byte frames and some 417-byte frames to get the exact 128 kbps bitrate. For Layer I a slot is 32 bits long; for Layer II and Layer III a slot is 8 bits long.
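To see where the mix of 417-byte and 418-byte frames comes from, the exact average frame length can be computed as a fraction. The sketch below assumes the Layer II/III frame length of 144 * BitRate / SampleRate bytes; the function name is invented for the example.

```python
from fractions import Fraction


def padded_frame_share(bitrate_bps: int, sample_rate_hz: int) -> Fraction:
    """Share of Layer II/III frames that need the extra padding slot."""
    exact_length = Fraction(144 * bitrate_bps, sample_rate_hz)
    # The fractional part of the exact length is the share of frames
    # that must be one slot longer for the average to come out right.
    return exact_length % 1


print(padded_frame_share(128000, 44100))  # 47/49
print(padded_frame_share(128000, 48000))  # 0
```

At 128 kbps and 44.1 kHz the exact length is 417 + 47/49 bytes, so 47 of every 49 frames are padded to 418 bytes and 2 stay at 417. At 48 kHz the length is exactly 384 bytes and no padding is needed.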

    How to calculate frame length

    First, let's distinguish two terms: frame size and frame length. Frame size is the number of samples contained in a frame. It is constant: always 384 samples for Layer I and 1152 samples for Layer II and Layer III (except Layer III in MPEG Version 2 and 2.5, where a frame holds 576 samples). Frame length is the length of a frame when compressed. It is calculated in slots. One slot is 4 bytes long for Layer I and one byte long for Layer II and Layer III. When you are reading an MPEG file, you must calculate this to be able to find each consecutive frame. Remember, frame length may change from frame to frame due to padding or bitrate switching.

    Read the BitRate, SampleRate and Padding from the frame header.

    For Layer I files use this formula:

    FrameLengthInBytes = (12 * BitRate / SampleRate + Padding) * 4

    For Layer II & III files use this formula:

    FrameLengthInBytes = 144 * BitRate / SampleRate + Padding

    The division result is truncated to an integer. For Layer III in MPEG Version 2 and 2.5 (576 samples per frame), the coefficient is 72 instead of 144.

    Example:
    Layer III, BitRate=128000, SampleRate=44100, Padding=0
          ==>  FrameLength=417 bytes
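The two formulas translate directly into Python with integer division. This is a minimal sketch with a made-up function name; it covers only the two formulas as written above (bitrate in bits per second, sample rate in Hz) and does not handle Layer III in MPEG Version 2 or 2.5, which is commonly computed with a coefficient of 72 instead of 144.

```python
def frame_length_bytes(layer: int, bitrate_bps: int,
                       sample_rate_hz: int, padding: int) -> int:
    """Frame length in bytes; `padding` is the padding bit (0 or 1)."""
    if layer == 1:
        # Layer I slots are 4 bytes long.
        return (12 * bitrate_bps // sample_rate_hz + padding) * 4
    if layer in (2, 3):
        # Layer II and III slots are 1 byte long.
        return 144 * bitrate_bps // sample_rate_hz + padding
    raise ValueError("layer must be 1, 2 or 3")


print(frame_length_bytes(3, 128000, 44100, 0))  # 417
print(frame_length_bytes(3, 128000, 44100, 1))  # 418
```

Adding the result to the offset of the current frame gives the offset where the next frame header is expected, which is what makes the check of several frames in a row possible.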

    H     1       (8)       Private bit. It may be freely used for the specific needs of an application, e.g. to trigger application-specific events.
    I     2       (7,6)     Channel Mode
                              00 - Stereo
                              01 - Joint stereo (Stereo)
                              10 - Dual channel (Stereo)
                              11 - Single channel (Mono)
    J     2       (5,4)     Mode extension (Only if Joint stereo)

    Mode extension is used to join information that is of no use for the stereo effect, thus reducing the needed resources. These bits are dynamically determined by an encoder in Joint stereo mode.

    The complete frequency range of an MPEG file is divided into 32 subbands. For Layer I & II these two bits determine the frequency range (bands) where intensity stereo is applied. For Layer III these two bits determine which type of joint stereo is used (intensity stereo or M/S stereo). The frequency range is determined within the decompression algorithm.

    value   Layer I & II      Layer III
                              Intensity stereo   MS stereo
    00      bands 4 to 31     off                off
    01      bands 8 to 31     on                 off
    10      bands 12 to 31    off                on
    11      bands 16 to 31    on                 on

    K     1       (3)       Copyright
                              0 - Audio is not copyrighted
                              1 - Audio is copyrighted
    L     1       (2)       Original
                              0 - Copy of original media
                              1 - Original media
    M     2       (1,0)     Emphasis
                              00 - none
                              01 - 50/15 µs
                              10 - reserved
                              11 - CCITT J.17
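Putting the field table together, the four header bytes can be split into fields A to M with shifts and masks. The following Python sketch uses invented dictionary keys and rejects the values the table marks as reserved or bad; it does not replace the check of several consecutive frames.

```python
def parse_frame_header(header: bytes) -> dict:
    """Split a 4-byte MPEG audio frame header into its raw bit fields."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes long")
    word = int.from_bytes(header, "big")
    if word >> 21 != 0x7FF:                           # A: 11 sync bits
        raise ValueError("frame sync not found")
    fields = {
        "version_id": (word >> 19) & 0x3,             # B
        "layer_bits": (word >> 17) & 0x3,             # C
        "protection_bit": (word >> 16) & 0x1,         # D
        "bitrate_index": (word >> 12) & 0xF,          # E
        "sampling_rate_index": (word >> 10) & 0x3,    # F
        "padding_bit": (word >> 9) & 0x1,             # G
        "private_bit": (word >> 8) & 0x1,             # H
        "channel_mode": (word >> 6) & 0x3,            # I
        "mode_extension": (word >> 4) & 0x3,          # J
        "copyright": (word >> 3) & 0x1,               # K
        "original": (word >> 2) & 0x1,                # L
        "emphasis": word & 0x3,                       # M
    }
    # Values marked reserved or bad in the table mean an invalid header.
    if (fields["version_id"] == 0b01 or fields["layer_bits"] == 0b00
            or fields["bitrate_index"] == 0b1111
            or fields["sampling_rate_index"] == 0b11
            or fields["emphasis"] == 0b10):
        raise ValueError("reserved or bad value in header")
    return fields


# FF FB 90 64: MPEG Version 1, Layer III, no CRC, bitrate index 1001,
# 44100 Hz, joint stereo.
header = parse_frame_header(bytes.fromhex("fffb9064"))
print(header["version_id"], header["layer_bits"], header["bitrate_index"])  # 3 1 9
```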

     

    MPEG Audio Tag ID3v1

    The TAG is used to describe the MPEG Audio file. It contains information about the artist, title, album, publishing year and genre. There is some extra space for comments. It is exactly 128 bytes long and is located at the very end of the audio data. You can get it by reading the last 128 bytes of the MPEG audio file.

    AAABBBBB BBBBBBBB BBBBBBBB BBBBBBBB
    BCCCCCCC CCCCCCCC CCCCCCCC CCCCCCCD
    DDDDDDDD DDDDDDDD DDDDDDDD DDDDDEEE
    EFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFG

    Sign  Length   Position   Description
          (bytes)  (bytes)
    A     3        (0-2)      Tag identification. Must contain 'TAG' if tag exists and is correct.
    B     30       (3-32)     Title
    C     30       (33-62)    Artist
    D     30       (63-92)    Album
    E     4        (93-96)    Year
    F     30       (97-126)   Comment
    G     1        (127)      Genre

    The specification asks for all fields to be padded with null characters (ASCII 0). However, not all applications respect this (an example is WinAmp, which pads fields with spaces, ASCII 32).

    ID3v1.1 proposes a small change to this structure. The last byte of the Comment field may be used to specify the track number of a song in an album. It should contain a null character (ASCII 0) if the information is unknown.
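Reading the tag could look roughly like the Python sketch below. The function names and dictionary keys are invented for the example. Two points go beyond the text and are assumptions: text is decoded as Latin-1 (the layout above does not name an encoding), and a track number is only accepted when the byte before it is zero, which is how ID3v1.1 tags are commonly told apart from a 30-character comment.

```python
import os


def parse_id3v1(tag: bytes):
    """Parse a 128-byte ID3v1/ID3v1.1 block; return None if it is not a tag."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Cut at the first null, then drop space padding (seen in practice).
        return raw.split(b"\x00", 1)[0].rstrip(b" ").decode("latin-1")

    comment = tag[97:127]
    track = None
    # Assumed ID3v1.1 convention: zero byte followed by a non-zero track.
    if comment[28] == 0 and comment[29] != 0:
        track = comment[29]
        comment = comment[:28]
    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(comment),
        "track": track,
        "genre": tag[127],
    }


def read_id3v1(path: str):
    """Read the last 128 bytes of a file and try to parse them as a tag."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < 128:
            return None
        f.seek(-128, os.SEEK_END)
        return parse_id3v1(f.read(128))
```

The genre is returned as the raw number so that it can be looked up in the genre list.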

    Genre is a numeric field which may have one of the following values:

    0 'Blues' 20 'Alternative' 40 'AlternRock' 60 'Top 40'
    1 'Classic Rock' 21 'Ska' 41 'Bass' 61 'Christian Rap'
    2 'Country' 22 'Death Metal' 42 'Soul' 62 'Pop/Funk'
    3 'Dance' 23 'Pranks' 43 'Punk' 63 'Jungle'
    4 'Disco' 24 'Soundtrack' 44 'Space' 64 'Native American'
    5 'Funk' 25 'Euro-Techno' 45 'Meditative' 65 'Cabaret'
    6 'Grunge' 26 'Ambient' 46 'Instrumental Pop' 66 'New Wave'
    7 'Hip-Hop' 27 'Trip-Hop' 47 'Instrumental Rock' 67 'Psychedelic'
    8 'Jazz' 28 'Vocal' 48 'Ethnic' 68 'Rave'
    9 'Metal' 29 'Jazz+Funk' 49 'Gothic' 69 'Showtunes'
    10 'New Age' 30 'Fusion' 50 'Darkwave' 70 'Trailer'
    11 'Oldies' 31 'Trance' 51 'Techno-Industrial' 71 'Lo-Fi'
    12 'Other' 32 'Classical' 52 'Electronic' 72 'Tribal'
    13 'Pop' 33 'Instrumental' 53 'Pop-Folk' 73 'Acid Punk'
    14 'R&B' 34 'Acid' 54 'Eurodance' 74 'Acid Jazz'
    15 'Rap' 35 'House' 55 'Dream' 75 'Polka'
    16 'Reggae' 36 'Game' 56 'Southern Rock' 76 'Retro'
    17 'Rock' 37 'Sound Clip' 57 'Comedy' 77 'Musical'
    18 'Techno' 38 'Gospel' 58 'Cult' 78 'Rock & Roll'
    19 'Industrial' 39 'Noise' 59 'Gangsta' 79 'Hard Rock'

    WinAmp expanded this table with the following codes:
    80 'Folk' 92 'Progressive Rock' 104 'Chamber Music' 116 'Ballad'
    81 'Folk-Rock' 93 'Psychedelic Rock' 105 'Sonata' 117 'Power Ballad'
    82 'National Folk' 94 'Symphonic Rock' 106 'Symphony' 118 'Rhythmic Soul'
    83 'Swing' 95 'Slow Rock' 107 'Booty Bass' 119 'Freestyle'
    84 'Fast Fusion' 96 'Big Band' 108 'Primus' 120 'Duet'
    85 'Bebop' 97 'Chorus' 109 'Porn Groove' 121 'Punk Rock'
    86 'Latin' 98 'Easy Listening' 110 'Satire' 122 'Drum Solo'
    87 'Revival' 99 'Acoustic' 111 'Slow Jam' 123 'A Cappella'
    88 'Celtic' 100 'Humour' 112 'Club' 124 'Euro-House'
    89 'Bluegrass' 101 'Speech' 113 'Tango' 125 'Dance Hall'
    90 'Avantgarde' 102 'Chanson' 114 'Samba'
    91 'Gothic Rock' 103 'Opera' 115 'Folklore'

    Any other value should be considered as 'Unknown'.

     

    MPEG Audio Tag ID3v2

    This is a newly proposed TAG format which is different from ID3v1 and ID3v1.1. Complete tech specs for it may be found at .
