2011-03-25 18:44:55

http://blog.chinaunix.net/space.php?uid=287570&do=blog&id=162681

Some issues with A/V synchronization:

First, a repost of an email from A'rpi:
I've "developed" a new a-v sync engine in g2 code, which produces A-V: 0.0000
for most mpeg1/vob streams i have.

The video part is relatively easy, but a bit tricky: when a PS packet has a
PTS timestamp, that timestamp belongs to the next complete frame.
(not to the one which ends in that packet!)

The audio, however, is very tricky.
The old audio PTS calculation (used in mplayer-g1) assumed that the
timestamps received from the demuxer belong to the first byte of that packet.
So, after decoding an audio frame/block, it increased the PTS by the compressed
frame size divided by the compressed byterate.
This is very inaccurate for mpeg. Now I've found why: in mpeg containers, the
audio timestamps behave like the video ones: they belong to the _next_ complete
frame/block. As AC3 frames are big and usually span multiple
packets, this error may be big.
But after fixing this, I got a stable A-V but a non-zero ct (correction total).
After experimenting with several streams, I've found that the ct: value is the
time length of an audio frame. Strange, isn't it?
It means that the PTS doesn't even belong to the next audio frame, but to
the one after it. Or in other words: it's the timestamp for the last
byte/sample of the next frame, instead of the first one:

                                        v-- the PTS belongs to the
PTS                                         end of f2 (or start of f3)
                                        v
frames:  [.....f1......][......f2......][.....f3......]
packets: |    p1   |    p2    |   p3   |   p4   |   p5  |
                   ^PTS^
                   ^- the PTS is coded in p2's header

Using this logic I got <5ms ct: times for all mpeg streams!

But quoting the mpeg-system.pdf:
"
In the T-STD in figure 2-6 on page 11 the display of a video presentation
unit (a picture) occurs instantaneously at its presentation time, tp_n(k).

In the T-STD the output of an audio presentation unit starts at its presentation
time, tp_n(k), when the decoder instantaneously presents the first sample.
Subsequent samples in the presentation unit are presented in sequence at the audio sampling rate.
"

This suggests that the PTS is the beginning of that audio block, not the end!

Does anyone have accurate info about the meaning/calculation of the audio PTS
in the mpeg container?


A'rpi / Astral & ESP-team
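
The three readings of the audio PTS discussed in the email can be compared with a small Python sketch. Everything in it is hypothetical and chosen only for illustration: the `Frame` type, the function names, and the sample numbers (48 kHz AC3 at 384 kbit/s, so 1536-byte frames of 32 ms, and a packet p2 that starts 1000 bytes into the stream with a PTS of 10.0 s). It is not MPlayer code, and it assumes a constant byterate.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    start: int        # byte offset of the frame's first byte in the audio stream
    size: int         # compressed size in bytes
    duration: float   # decoded length in seconds


def next_complete_frame(frames, packet_start):
    """Return the first frame that begins at or after the packet's first byte."""
    for frame in frames:
        if frame.start >= packet_start:
            return frame
    raise ValueError("no frame starts in or after this packet")


def start_time_old(frames, packet_start, pts, byterate):
    """Old assumption: the PTS belongs to the first byte of the packet."""
    frame = next_complete_frame(frames, packet_start)
    return pts + (frame.start - packet_start) / byterate


def start_time_next_frame(frames, packet_start, pts):
    """The PTS belongs to the first sample of the next complete frame."""
    next_complete_frame(frames, packet_start)  # only checks that one exists
    return pts


def start_time_end_of_next_frame(frames, packet_start, pts):
    """Empirical reading from the email: the PTS marks the end of that frame."""
    frame = next_complete_frame(frames, packet_start)
    return pts - frame.duration


# Hypothetical stream: three AC3 frames, 1536 samples each at 48 kHz.
BYTERATE = 48000                 # bytes per second (384 kbit/s)
FRAME_SIZE = 1536                # bytes per compressed frame
FRAME_DURATION = 1536 / 48000    # 0.032 s
frames = [Frame(i * FRAME_SIZE, FRAME_SIZE, FRAME_DURATION) for i in range(3)]

P2_START = 1000                  # p2 begins inside f1, as in the diagram
P2_PTS = 10.0                    # seconds

if __name__ == "__main__":
    print("old:              ", start_time_old(frames, P2_START, P2_PTS, BYTERATE))
    print("next frame start: ", start_time_next_frame(frames, P2_START, P2_PTS))
    print("next frame end:   ", start_time_end_of_next_frame(frames, P2_START, P2_PTS))
```

With these numbers the last two readings place the start of f2 exactly one frame duration (32 ms) apart, which corresponds to the constant ct offset described in the email.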
