
2015-05-29 00:35:15

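From the previous post we already know the basic information of the MKV file: its duration, the audio and video codec types, the resolution, the sample rate, and so on. Next comes the actual audio and video data.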

4.Clusters

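All audio and video frame data is stored in this part.
A Cluster may contain many BlockGroups, and each BlockGroup wraps a single Block (or BlockVirtual) together with information specific to that Block. The Blocks hold the audio and video frame data.
A Cluster is not necessarily audio-only or video-only. It is made up of interleaved audio and video BlockGroups, because the audio and video data in a multimedia file are interleaved in the first place.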
Clusters
  Cluster
    Timecode
    BlockGroup
      Block
    BlockGroup
      Block
      ReferenceBlock
    BlockGroup
      Block
  Cluster
    Timecode
    BlockGroup
      Block
    BlockGroup
      Block
    BlockGroup
      Block
    BlockGroup
      Block
      BlockDuration
Cluster

Column key: Level = nesting level, Mandatory/Multiple = "*" if yes, Type = m (master), u (unsigned integer), i (signed integer), b (binary), v1/v2/v3 = Matroska versions, WebM = allowed in WebM.

| Element Name | Level | EBML ID | Mandatory | Multiple | Range | Default | Type | v1 | v2 | v3 | WebM | Description |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster | 1 | [1F][43][B6][75] | - | * | - | - | m | * | * | * | * | The lower level element containing the (monolithic) Block structure. |
| Timecode | 2 | [E7] | * | - | - | - | u | * | * | * | * | Absolute timecode of the cluster (based on TimecodeScale). |
| SilentTracks | 2 | [58][54] | - | - | - | - | m | * | * | * |  | The list of tracks that are not used in that part of the stream. It is useful when using overlay tracks on seeking. Then you should decide what track to use. |
| SilentTrackNumber | 3 | [58][D7] | - | * | - | - | u | * | * | * |  | One of the track numbers that are not used from now on in the stream. It could change later if not specified as silent in a further Cluster. |
| Position | 2 | [A7] | - | - | - | - | u | * | * | * |  | The position of the Cluster in the segment (0 in live broadcast streams). It might help to resynchronise offset on damaged streams. |
| PrevSize | 2 | [AB] | - | - | - | - | u | * | * | * | * | Size of the previous Cluster, in octets. Can be useful for backward playing. |
| SimpleBlock | 2 | [A3] | - | * | - | - | b |  | * | * | * | Similar to Block but without all the extra information, mostly used to reduce overhead when no extra feature is needed. |
| BlockGroup | 2 | [A0] | - | * | - | - | m | * | * | * | * | Basic container of information containing a single Block or BlockVirtual, and information specific to that Block/VirtualBlock. |
| Block | 3 | [A1] | * | - | - | - | b | * | * | * | * | Block containing the actual data to be rendered and a timecode relative to the Cluster Timecode. |
| BlockVirtual | 3 | [A2] | - | - | - | - | b |  |  |  |  | A Block with no data. It must be stored in the stream at the place the real Block should be in display order. |
| BlockAdditions | 3 | [75][A1] | - | - | - | - | m | * | * | * |  | Contain additional blocks to complete the main one. An EBML parser that has no knowledge of the Block structure could still see and use/skip these data. |
| BlockMore | 4 | [A6] | * | * | - | - | m | * | * | * |  | Contain the BlockAdditional and some parameters. |
| BlockAddID | 5 | [EE] | * | - | not 0 | 1 | u | * | * | * |  | An ID to identify the BlockAdditional level. |
| BlockAdditional | 5 | [A5] | * | - | - | - | b | * | * | * |  | Interpreted by the codec as it wishes (using the BlockAddID). |
| BlockDuration | 3 | [9B] | - | - | - | TrackDuration | u | * | * | * | * | The duration of the Block (based on TimecodeScale). This element is mandatory when DefaultDuration is set for the track (but can be omitted as other default values). When not written and with no DefaultDuration, the value is assumed to be the difference between the timecode of this Block and the timecode of the next Block in "display" order (not coding order). This element can be useful at the end of a Track (as there is no other Block available), or when there is a break in a track like for subtitle tracks. When set to 0 that means the frame is not a keyframe. |
| ReferencePriority | 3 | [FA] | * | - | - | 0 | u | * | * | * |  | This frame is referenced and has the specified cache priority. In cache only a frame of the same or higher priority can replace this frame. A value of 0 means the frame is not referenced. |
| ReferenceBlock | 3 | [FB] | - | * | - | - | i | * | * | * | * | Timecode of another frame used as a reference (ie: B or P frame). The timecode is relative to the block it's attached to. |
| ReferenceVirtual | 3 | [FD] | - | - | - | - | i |  |  |  |  | Relative position of the data that should be in position of the virtual block. |
| CodecState | 3 | [A4] | - | - | - | - | b |  | * | * |  | The new codec state to use. Data interpretation is private to the codec. This information should always be referenced by a seek entry. |
| Slices | 3 | [8E] | - | - | - | - | m | * | * | * | * | Contains slices description. |
| TimeSlice | 4 | [E8] | - | * | - | - | m | * | * | * | * | Contains extra time information about the data contained in the Block. While there are a few files in the wild with this element, it is no longer in use and has been deprecated. Being able to interpret this element is not required for playback. |
| LaceNumber | 5 | [CC] | - | - | - | 0 | u | * | * | * | * | The reverse number of the frame in the lace (0 is the last frame, 1 is the next to last, etc). While there are a few files in the wild with this element, it is no longer in use and has been deprecated. Being able to interpret this element is not required for playback. |
| FrameNumber | 5 | [CD] | - | - | - | 0 | u |  |  |  |  | The number of the frame to generate from this lace with this delay (allow you to generate many frames from the same Block/Frame). |
| BlockAdditionID | 5 | [CB] | - | - | - | 0 | u |  |  |  |  | The ID of the BlockAdditional element (0 is the main Block). |
| Delay | 5 | [CE] | - | - | - | 0 | u |  |  |  |  | The (scaled) delay to apply to the element. |
| SliceDuration | 5 | [CF] | - | - | - | 0 | u |  |  |  |  | The (scaled) duration to apply to the element. |
| ReferenceFrame | 3 | [C8] | - | - | - | - | m |  |  |  |  | DivX trick track extensions |
| ReferenceOffset | 4 | [C9] | * | - | - | - | u |  |  |  |  | DivX trick track extensions |
| ReferenceTimeCode | 4 | [CA] | * | - | - | - | u |  |  |  |  | DivX trick track extensions |
| EncryptedBlock | 2 | [AF] | - | * | - | - | b |  |  |  |  | Similar to SimpleBlock but the data inside the Block are Transformed (encrypt and/or signed). |

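Let's continue with the same example file as before.

Cluster ID = [1F][43][B6][75]
size = 0x12468f (1197711)
The 1197711 bytes that follow are the data of this Cluster.
The first EBML element is Timecode: ID = E7, size = 1, value = 0 (the red box in the screenshot).
The second element has ID = A0. Looking it up in the table shows that this EBML element is a BlockGroup, size = 96042.
Immediately after it comes ID = A1, the level-3 EBML element Block, size = 96038.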
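The ID and size lookups above can be sketched in a few lines of Python. This is only an illustration, not the parser used for the screenshots: the function names are made up for this post, and the `sample` bytes are reconstructed from the sizes quoted above (assuming each size is stored in the shortest possible form), not copied from the file.

```python
def vint_length(first_byte, max_len):
    """Number of bytes of an EBML variable-length integer, from its first byte."""
    mask = 0x80
    for length in range(1, max_len + 1):
        if first_byte & mask:
            return length
        mask >>= 1
    raise ValueError("not a valid EBML variable-length integer")


def read_id(buf, pos):
    """Read an element ID (the length marker bits stay part of the ID)."""
    n = vint_length(buf[pos], 4)
    return int.from_bytes(buf[pos:pos + n], "big"), pos + n


def read_size(buf, pos):
    """Read an element data size (the length marker bit is stripped)."""
    n = vint_length(buf[pos], 8)
    value = buf[pos] & (0xFF >> n)
    for byte in buf[pos + 1:pos + n]:
        value = (value << 8) | byte
    return value, pos + n


# Master elements whose payload we descend into: Cluster and BlockGroup.
MASTER_IDS = {0x1F43B675, 0xA0}


def walk(buf):
    """List (id, size) for every element header found in buf."""
    pos, headers = 0, []
    while pos < len(buf):
        element_id, pos = read_id(buf, pos)
        size, pos = read_size(buf, pos)
        headers.append((element_id, size))
        if element_id not in MASTER_IDS:
            pos += size  # skip the payload of non-master elements
    return headers


# Assumed bytes: Cluster header, Timecode = 0, BlockGroup header, Block header.
sample = bytes.fromhex("1F43B675" "32468F" "E78100" "A021772A" "A1217726")
for element_id, size in walk(sample):
    print(f"ID = {element_id:X}, size = {size}")
```

The element ID keeps its leading marker bits (which is why Cluster reads as 1F43B675), while the size has the marker bit removed.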
The Block structure is shown in the table below.
Block Header

| Offset | Player | Description |
|---|---|---|
| 0x00+ | must | Track Number (Track Entry). It is coded in EBML like form (1 octet if the value is < 0x80, 2 if < 0x4000, etc) (most significant bits set to increase the range). |
| 0x01+ | must | Timecode (relative to Cluster timecode, signed int16) |
| 0x03+ | - | Flags (see the bit table below) |

Flags (bit 0 is the most significant bit of the byte, so bits 5-6 correspond to the mask 0x06)

| Bit | Player | Description |
|---|---|---|
| 0-3 | - | Reserved, set to 0 |
| 4 | - | Invisible, the codec should decode this frame but not display it |
| 5-6 | must | Lacing: 00 = no lacing, 01 = Xiph lacing, 11 = EBML lacing, 10 = fixed-size lacing |
| 7 | - | not used |

Lace (when lacing bit is set)

| Offset | Player | Description |
|---|---|---|
| 0x00 | must | Number of frames in the lace-1 (uint8) |
| 0x01 / 0xXX | must* | Lace-coded size of each frame of the lace, except for the last one (multiple uint8). *This is not used with Fixed-size lacing as it is calculated automatically from (total size of lace) / (number of frames in lace). |

(possibly) Laced Data

| Offset | Player | Description |
|---|---|---|
| 0x00 | must | Consecutive laced frames |

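Byte 1 is the Track Number (as long as the track number fits in one byte).
Bytes 2-3 are the Timecode.
Byte 4 is the flags.
In the example above:
The first byte of the Block data is 0x81. Decoded the EBML way, Track Number = 1, so from the previous post we know this Block belongs to track 1. Track 1 is the video track and its codec is H.264, so this Block holds H.264 frame data.
Timecode is 0000.
flags = 0

Whether a Lace is present is determined by the flags value. Here bits 5-6 of flags are both 0, so there is no lacing. The remaining 96038 - 4 = 96034 bytes are all video frame data.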
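The Block header decoding described above might look like this in Python. It is a sketch: `parse_block_header` and the returned dictionary keys are names chosen for this post. The bit positions follow the table, where bit 0 is the most significant bit, so the lacing bits 5-6 are selected with `(flags >> 1) & 0x03`.

```python
import struct

LACING_NAMES = {0b00: "none", 0b01: "Xiph", 0b11: "EBML", 0b10: "fixed-size"}


def parse_block_header(data):
    """Decode track number, relative timecode and flags of a Block payload."""
    # Track number: EBML-style variable-length integer.
    mask, n = 0x80, 1
    while not data[0] & mask:
        mask >>= 1
        n += 1
        if n > 8:
            raise ValueError("invalid track number")
    track = data[0] & (mask - 1)
    for byte in data[1:n]:
        track = (track << 8) | byte

    # Timecode: signed 16-bit big-endian, relative to the Cluster Timecode.
    (timecode,) = struct.unpack_from(">h", data, n)

    flags = data[n + 2]
    return {
        "track": track,
        "timecode": timecode,
        "invisible": bool(flags & 0x08),            # bit 4
        "lacing": LACING_NAMES[(flags >> 1) & 0x03],  # bits 5-6
        "payload_offset": n + 3,                    # where lace/frame data starts
    }


# Header bytes of the video Block discussed above: 0x81, timecode 0000, flags 0.
print(parse_block_header(bytes([0x81, 0x00, 0x00, 0x00])))
```

For this Block the result is track 1, timecode 0, no lacing, and a payload offset of 4, which is where the 96034 bytes of frame data begin.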

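Convert the 96034 bytes of Block data to NALU format, prepend the SPS and PPS parsed from the CodecPrivate data in the Track section, and save the result to a local file. It should be exactly one frame of H.264 data.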
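A minimal sketch of that conversion is given here, under two assumptions that the post does not spell out: that "NALU format" means the Annex B form with 00 00 00 01 start codes, and that the Block payload stores each NAL unit behind a 4-byte big-endian length prefix (the real prefix width comes from CodecPrivate). The function names and the `sps` / `pps` arguments are illustrative only.

```python
START_CODE = b"\x00\x00\x00\x01"


def length_prefixed_to_annexb(frame, length_size=4):
    """Replace each NAL unit length prefix with an Annex B start code."""
    out = bytearray()
    pos = 0
    while pos < len(frame):
        nalu_len = int.from_bytes(frame[pos:pos + length_size], "big")
        pos += length_size
        nalu = frame[pos:pos + nalu_len]
        if len(nalu) != nalu_len:
            raise ValueError("truncated NAL unit")
        out += START_CODE + nalu
        pos += nalu_len
    return bytes(out)


def build_keyframe(sps, pps, frame, length_size=4):
    """Prepend SPS and PPS (already extracted from CodecPrivate) to one frame."""
    return (START_CODE + sps + START_CODE + pps
            + length_prefixed_to_annexb(frame, length_size))


# Tiny made-up payload: two NAL units of 2 bytes and 1 byte.
demo = b"\x00\x00\x00\x02\x65\x88" + b"\x00\x00\x00\x01\x06"
print(length_prefixed_to_annexb(demo).hex())
```

Writing the result of `build_keyframe` to a file gives a raw H.264 stream that an analyzer can open directly.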

Opening the file in Elecard confirms that it is indeed a single I-frame.


Following the same approach, let's look at the next one.


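BlockGroup ID = A0, size = 0x2808

The first Block: ID = A1, size = 0x2805 (10245)

Byte 1 is the Track Number = 2, so this is data for track 2, which is the AC3 audio track.

Bytes 2-3 are the Timecode = 0x0005.

Byte 4 is the flags = 0x04.
This time the Lace has to be parsed. Bits 5-6 of flags are 10 (counting from the most significant bit as bit 0), so this is fixed-size lacing.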

Fixed-size lacing has the following structure:

Fixed-size lacing

In this case only the number of frames in the lace is saved, the size of each frame is deduced from the total size of the Block. For example, for 3 frames of 800 octets each :

  • Block head (with lacing bits set to 10)
  • Lacing head: Number of frames in the lace -1, i.e. 2
  • Data in frame 1
  • Data in frame 2
  • Data in frame 3
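The lace head byte is 0x07, so the Block contains 07 + 1 = 8 frames, each (10245 - 4 - 1) / 8 = 1280 bytes long. Since this is AC3 audio, the AC3 sync word 0x0b77 follows immediately after the 07 (the green box in the screenshot).
Compare this with the corresponding parse result shown by a tool.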
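Splitting a fixed-size lace can be sketched as follows. The function name is made up, and the sketch assumes a 4-byte Block header (one-byte track number), as in this example. The test Block is synthetic: it has the same header and size as the audio Block above, but its frames are zero-filled after the sync word.

```python
def split_fixed_size_lace(block, header_len=4):
    """Return the equally sized frames stored in a fixed-size laced Block."""
    flags = block[header_len - 1]
    if (flags >> 1) & 0x03 != 0b10:
        raise ValueError("Block does not use fixed-size lacing")
    frame_count = block[header_len] + 1      # lace head stores count - 1
    payload = block[header_len + 1:]
    if len(payload) % frame_count:
        raise ValueError("payload is not a multiple of the frame count")
    size = len(payload) // frame_count
    return [payload[i * size:(i + 1) * size] for i in range(frame_count)]


# Synthetic Block: track 2, timecode 5, flags 0x04, lace head 0x07,
# then 8 frames of 1280 bytes that each start with the AC3 sync word.
fake_frame = b"\x0b\x77" + bytes(1278)
block = bytes([0x82, 0x00, 0x05, 0x04, 0x07]) + fake_frame * 8

frames = split_fixed_size_lace(block)
print(len(block), len(frames), len(frames[0]))  # 10245 8 1280
```

Each returned frame can be appended to an output file as is to rebuild the AC3 elementary stream.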


With this logic and method we can demux the audio and video streams out of an MKV file.


5.Cueing Data

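The Cueing Data part is essentially an index of keyframes. Without a keyframe index, seeking, fast-forward and rewind are very difficult, because you have to search packet by packet. As mentioned before, the FLV format has no official specification for an I-frame index, although the community has added one unofficially. MKV does have an official index specification, and that is Cueing Data.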
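To make the benefit concrete, here is a small sketch of how a player might use such an index when seeking. It assumes the cues have already been parsed into a time-sorted list of (CueTime, CueClusterPosition) pairs. The function name and the values in `cues` are illustrative, not taken from a real file.

```python
import bisect


def find_cue(cues, target_time):
    """Return the last cue at or before target_time (cues sorted by time)."""
    times = [time for time, _ in cues]
    index = bisect.bisect_right(times, target_time) - 1
    return cues[max(index, 0)]


# Illustrative index: (CueTime, CueClusterPosition) pairs.
cues = [(0, 5615), (2000, 1203400), (4000, 2400000)]

print(find_cue(cues, 3500))  # (2000, 1203400)
```

The player then jumps to the returned Cluster position and starts decoding from the keyframe there, instead of scanning every Block from the start of the file.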

The structure is shown below.


Cueing Data
  Cues
    CuePoint
      CueTime
      CuePosition
    CuePoint
      CueTime
      CuePosition
Cueing Data

| Element Name | Level | EBML ID | Mandatory | Multiple | Range | Default | Type | v1 | v2 | v3 | WebM | Description |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cues | 1 | [1C][53][BB][6B] | - | - | - | - | m | * | * | * | * | A top-level element to speed seeking access. All entries are local to the segment. Should be mandatory for non "live" streams. |
| CuePoint | 2 | [BB] | * | * | - | - | m | * | * | * | * | Contains all information relative to a seek point in the segment. |
| CueTime | 3 | [B3] | * | - | - | - | u | * | * | * | * | Absolute timecode according to the segment time base. |
| CueTrackPositions | 3 | [B7] | * | * | - | - | m | * | * | * | * | Contain positions for different tracks corresponding to the timecode. |
| CueTrack | 4 | [F7] | * | - | not 0 | - | u | * | * | * | * | The track for which a position is given. |
| CueClusterPosition | 4 | [F1] | * | - | - | - | u | * | * | * | * | The position of the Cluster containing the required Block. |
| CueRelativePosition | 4 | [F0] | - | - | - | - | u |  |  |  |  | The relative position of the referenced block inside the cluster with 0 being the first possible position for an element inside that cluster. |
| CueDuration | 4 | [B2] | - | - | - | - | u |  |  |  |  | The duration of the block according to the segment time base. If missing the track's DefaultDuration does not apply and no duration information is available in terms of the cues. |
| CueBlockNumber | 4 | [53][78] | - | - | not 0 | 1 | u | * | * | * | * | Number of the Block in the specified Cluster. |
| CueCodecState | 4 | [EA] | - | - | - | 0 | u |  | * | * |  | The position of the Codec State corresponding to this Cue element. 0 means that the data is taken from the initial Track Entry. |
| CueReference | 4 | [DB] | - | * | - | - | m |  | * | * |  | The Clusters containing the required referenced Blocks. |
| CueRefTime | 5 | [96] | * | - | - | - | u |  | * | * |  | Timecode of the referenced Block. |


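Continuing with the example above, we locate the Cues element.

ID = [1C][53][BB][6B] identifies Cues, size = 0x7f

Each element that follows with ID = 0xBB is a CuePoint; the green box in the screenshot marks one of them. size = 0xC

CueTime: ID = 0xB3, size = 1, data = 0

CueTrackPositions: ID = 0xB7, size = 7, data = 0xf78101f18215ef

       CueTrack: ID = F7, size = 1, data = 1, meaning the track number for this position is 1, which for this stream is the video track

       CueClusterPosition: ID = F1, size = 2, data = 15ef, so the position is 0x15ef (5615), relative to the Segment

Going to that position, we find the first Cluster. As analysed in the section above, the video content of this Cluster is exactly a keyframe.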
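The CuePoint decoding above can be reproduced with a short Python sketch. The helper names are made up for this post. The 7 data bytes of CueTrackPositions are the ones quoted above, while the size bytes 0x81 and 0x87 are assumed to be one-byte encodings of the sizes 1 and 7.

```python
def read_vint(buf, pos, max_len, strip_marker):
    """Read an EBML variable-length integer (ID if strip_marker is False)."""
    first = buf[pos]
    mask, n = 0x80, 1
    while n <= max_len and not first & mask:
        mask >>= 1
        n += 1
    if n > max_len:
        raise ValueError("invalid variable-length integer")
    value = first & (mask - 1) if strip_marker else first
    for byte in buf[pos + 1:pos + n]:
        value = (value << 8) | byte
    return value, pos + n


def children(buf):
    """Yield (id, payload) for each element directly inside buf."""
    pos = 0
    while pos < len(buf):
        element_id, pos = read_vint(buf, pos, 4, strip_marker=False)
        size, pos = read_vint(buf, pos, 8, strip_marker=True)
        yield element_id, buf[pos:pos + size]
        pos += size


def parse_cue_point(payload):
    """Extract CueTime, CueTrack and CueClusterPosition from a CuePoint."""
    cue = {}
    for element_id, data in children(payload):
        if element_id == 0xB3:                      # CueTime
            cue["time"] = int.from_bytes(data, "big")
        elif element_id == 0xB7:                    # CueTrackPositions
            for sub_id, sub in children(data):
                if sub_id == 0xF7:                  # CueTrack
                    cue["track"] = int.from_bytes(sub, "big")
                elif sub_id == 0xF1:                # CueClusterPosition
                    cue["cluster_position"] = int.from_bytes(sub, "big")
    return cue


# 12-byte CuePoint payload: CueTime = 0, then the CueTrackPositions above.
cue_point = bytes.fromhex("B38100" "B787" "F78101F18215EF")
print(parse_cue_point(cue_point))
```

This yields time 0, track 1 and cluster position 5615, matching the values read by hand. The position is an offset from the start of the Segment data, so the Segment's own file offset has to be added before seeking in the file.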

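Proceeding the same way, we find that this file contains 8 CuePoint entries in total.


Demuxing the H.264 video from this file and inspecting it with a tool shows that there are exactly 8 keyframes as well.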

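6.Summary

The main parts of MKV have now been described in detail. Here is a short summary of the MKV file format.
1. The basic building block of MKV is the EBML element. Every element has a level, and elements nest level by level to form the different parts of an MKV file. The structure diagrams lay these out in the columns Level 0, Grouping, Level 1, Level 2 and Level 3.

2. An MKV file consists of two major parts: the EBML header and the Segment. The Segment is further divided into Meta Seek Information, Segment Information, Track, Chapters, Clusters, Cueing Data, Attachment and Tagging.
3. The EBML header contains the information that identifies the file as MKV.
4. Meta Seek Information contains the positions of the other parts.
5. Segment Information contains information that identifies the file, including Title and SegmentUID. The Duration, which is often the value of most interest, is also in this part.
6. Track contains the basic audio and video information, such as the codec types, the video resolution and the audio sample rate.
7. The actual audio and video data is stored interleaved in Clusters.
8. Cueing Data is the keyframe index and is essential for seeking.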