A Brief Analysis of MP3 Decoding with the libmad Library


2012-07-01 19:50:27

Original post: A Brief Analysis of MP3 Decoding with the libmad Library. Author: JGFNTU

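MAD (libmad) is an open-source, high-precision MPEG audio decoding library that supports MPEG-1 Layer I, Layer II, and Layer III (that is, MP3). libmad produces 24-bit PCM output and uses fixed-point arithmetic throughout, which makes it well suited to platforms without floating-point support. With the API that libmad provides, MP3 decoding is simple to implement. Most of the library's data structures and API declarations can be found in mad.h in the libmad source directory.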
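To make the fixed-point format concrete, here is a minimal sketch of how a high-resolution fixed-point sample can be reduced to a 16-bit PCM value. It assumes libmad's layout of 28 fractional bits in a 32-bit signed integer; the names FRACBITS, fixed_t, FIXED_ONE, to_fixed and fixed_to_s16 are stand-ins invented for this sketch so that it compiles without mad.h. It also assumes that right-shifting a negative value is an arithmetic shift, as it is on common compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins mirroring libmad's fixed-point layout (an assumption of this
 * sketch): 28 fractional bits, so 1.0 is represented as 1 << 28. */
#define FRACBITS 28
typedef int32_t fixed_t;
#define FIXED_ONE ((fixed_t)1 << FRACBITS)

/* Convert a floating-point value in [-8, 8) to the fixed-point format. */
static fixed_t to_fixed(double x)
{
    assert(x >= -8.0 && x < 8.0);
    return (fixed_t)(x * (double)FIXED_ONE);
}

/* Round, clip to [-1.0, 1.0), and keep the 16 most significant bits. */
static int16_t fixed_to_s16(fixed_t sample)
{
    sample += (fixed_t)1 << (FRACBITS - 16);   /* round */
    if (sample >= FIXED_ONE)
        sample = FIXED_ONE - 1;                /* clip high */
    else if (sample < -FIXED_ONE)
        sample = -FIXED_ONE;                   /* clip low */
    return (int16_t)(sample >> (FRACBITS + 1 - 16));
}
```

With these definitions, fixed_to_s16(to_fixed(0.5)) yields 16384, and any value at or above 1.0 is clipped to 32767.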

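There are many libmad usage examples online. Building on them, I have summarized, organized, and extended a few points. The references are linked at the end of the post, with thanks to their authors.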

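I. libmad source code

The source can be downloaded from the project website and then compiled or ported for your platform; the details are omitted here.

II. Key data structures and function interfaces

1. struct mad_decoder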

struct mad_decoder {
    enum mad_decoder_mode mode;
    int options;
    struct {
        long pid;
        int in;
        int out;
    } async;
    struct {
        struct mad_stream stream;
        struct mad_frame frame;
        struct mad_synth synth;
    } *sync;
    void *cb_data;
    enum mad_flow (*input_func)(void *, struct mad_stream *);
    enum mad_flow (*header_func)(void *, struct mad_header const *);
    enum mad_flow (*filter_func)(void *,
                                 struct mad_stream const *, struct mad_frame *);
    enum mad_flow (*output_func)(void *,
                                 struct mad_header const *, struct mad_pcm *);
    enum mad_flow (*error_func)(void *, struct mad_stream *, struct mad_frame *);
    enum mad_flow (*message_func)(void *, void *, unsigned int *);
};

2. struct mad_stream

struct mad_stream {
    unsigned char const *buffer;      /* input bitstream buffer */
    unsigned char const *bufend;      /* end of buffer */
    unsigned long skiplen;            /* bytes to skip before next frame */
    int sync;                         /* stream sync found */
    unsigned long freerate;           /* free bitrate (fixed) */
    unsigned char const *this_frame;  /* start of current frame */
    unsigned char const *next_frame;  /* start of next frame */
    struct mad_bitptr ptr;            /* current processing bit pointer */
    struct mad_bitptr anc_ptr;        /* ancillary bits pointer */
    unsigned int anc_bitlen;          /* number of ancillary bits */
    unsigned char (*main_data)[MAD_BUFFER_MDLEN];
                                      /* Layer III main_data() */
    unsigned int md_len;              /* bytes in main_data */
    int options;                      /* decoding options (see below) */
    enum mad_error error;             /* error code (see above) */
};

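III. Overview of the MP3 decoding flow

libmad can decode in synchronous or asynchronous mode, and it decodes MP3 data frame by frame. In synchronous mode the decoding function runs in the caller's context, decodes frame after frame, reports errors through the callbacks, and returns only when decoding has stopped. In asynchronous mode the decoding function returns right after it is called, and decoding status is passed back through messages.

1. First create a decoder, struct mad_decoder decoder, then call mad_decoder_init(...). Its prototype and definition are: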

/*
 * NAME: decoder->init()
 * DESCRIPTION: initialize a decoder object with callback routines
 */
void mad_decoder_init(struct mad_decoder *decoder, void *data,
                      enum mad_flow (*input_func)(void *,
                                                  struct mad_stream *),
                      enum mad_flow (*header_func)(void *,
                                                   struct mad_header const *),
                      enum mad_flow (*filter_func)(void *,
                                                   struct mad_stream const *,
                                                   struct mad_frame *),
                      enum mad_flow (*output_func)(void *,
                                                   struct mad_header const *,
                                                   struct mad_pcm *),
                      enum mad_flow (*error_func)(void *,
                                                  struct mad_stream *,
                                                  struct mad_frame *),
                      enum mad_flow (*message_func)(void *,
                                                    void *, unsigned int *))
{
    decoder->mode = -1;
    decoder->options = 0;
    decoder->async.pid = 0;
    decoder->async.in = -1;
    decoder->async.out = -1;
    decoder->sync = 0;
    decoder->cb_data = data;
    decoder->input_func = input_func;
    decoder->header_func = header_func;
    decoder->filter_func = filter_func;
    decoder->output_func = output_func;
    decoder->error_func = error_func;
    decoder->message_func = message_func;
}

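User code can call it as shown below. From the third argument onward, every argument is a function pointer. The purpose of this initialization is to register with the decoder the callbacks that you are about to implement yourself; libmad calls them back during decoding: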

mad_decoder_init(&decoder, &buffer,
                 input, 0 /* header */, 0 /* filter */, output,
                 error, 0 /* message */);

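The first argument is the decoder object you defined.

The second argument is a void * data pointer. libmad stores it in cb_data and passes it unchanged to every callback, so it is the place for your own private data structure; the example below shows how it is used.

The third argument, input_func, is the function that reads your MP3 data.

The fourth argument, header_func, processes MP3 frame header information, as the name suggests. Implement it or leave it out as needed.

The fifth argument, filter_func, is one I have not studied in depth; it does not have to be implemented.

The sixth argument, output_func, writes the decoded data to an output buffer or to the audio device node.

The seventh argument, error_func, prints the decoding errors that are reported.

The eighth argument, message_func, does not have to be implemented.

2. Call mad_decoder_run(&decoder, MAD_DECODER_MODE_SYNC) to start decoding. The libmad source shows that this function selects a function pointer, run, according to the requested mode: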

/*
 * NAME: decoder->run()
 * DESCRIPTION: run the decoder thread either synchronously or asynchronously
 */
int mad_decoder_run(struct mad_decoder *decoder, enum mad_decoder_mode mode)
{
    int result;
    int (*run)(struct mad_decoder *) = 0;

    switch (decoder->mode = mode) {
    case MAD_DECODER_MODE_SYNC:
        run = run_sync;
        break;
    case MAD_DECODER_MODE_ASYNC:
# if defined(USE_ASYNC)
        run = run_async;
# endif
        break;
    }
    if (run == 0)
        return -1;
    decoder->sync = malloc(sizeof(*decoder->sync));
    if (decoder->sync == 0)
        return -1;
    result = run(decoder);
    free(decoder->sync);
    decoder->sync = 0;
    return result;
}

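Inside run_sync(struct mad_decoder *decoder) there is a large loop that repeatedly calls

decoder->input_func(decoder->cb_data, stream) to fetch MP3 data, which is then handed to the library's decoding routines.

After each frame is decoded, decoder->output_func(decoder->cb_data, &frame->header, &synth->pcm) is called to output the decoded data.

3. Finally, call mad_decoder_finish(&decoder) to end decoding and release the decoder's resources.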
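The init, run, and finish steps above follow a common C pattern: register callbacks together with a private data pointer, then let a driver loop invoke them until one asks to stop. The toy sketch below illustrates only that pattern. It is not libmad code, and every name in it (toy_decoder, toy_flow, toy_init, toy_run, toy_state, and so on) is hypothetical; the "decoding" step is a placeholder that doubles an integer.

```c
#include <assert.h>
#include <stddef.h>

enum toy_flow { TOY_CONTINUE, TOY_STOP };

struct toy_decoder {
    void *cb_data;                                    /* private user data */
    enum toy_flow (*input_func)(void *data, int *chunk);
    enum toy_flow (*output_func)(void *data, int decoded);
};

/* Register the callbacks and the private data pointer. */
static void toy_init(struct toy_decoder *d, void *data,
                     enum toy_flow (*input_func)(void *, int *),
                     enum toy_flow (*output_func)(void *, int))
{
    assert(d != NULL);
    d->cb_data = data;
    d->input_func = input_func;
    d->output_func = output_func;
}

/* Driver loop: returns 0 when input ends, -1 if no input callback is set. */
static int toy_run(struct toy_decoder *d)
{
    int chunk;

    if (d->input_func == NULL)
        return -1;
    while (d->input_func(d->cb_data, &chunk) == TOY_CONTINUE) {
        int decoded = chunk * 2;          /* placeholder for real decoding */
        if (d->output_func != NULL &&
            d->output_func(d->cb_data, decoded) == TOY_STOP)
            break;
    }
    return 0;
}

/* Private state that both callbacks reach through the void * pointer. */
struct toy_state {
    int remaining;   /* chunks still to be delivered */
    int sum;         /* accumulated "decoded" output */
};

static enum toy_flow toy_input(void *data, int *chunk)
{
    struct toy_state *st = data;

    if (st->remaining == 0)
        return TOY_STOP;
    *chunk = st->remaining--;
    return TOY_CONTINUE;
}

static enum toy_flow toy_output(void *data, int decoded)
{
    struct toy_state *st = data;

    st->sum += decoded;
    return TOY_CONTINUE;
}
```

A caller would declare a struct toy_state, pass its address as the second argument of toy_init(), and call toy_run(); the callbacks never need global variables because all state travels through the data pointer.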

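4. The input_func callback calls one very important function:

mad_stream_buffer(stream, buffer->start, buffer->length). The first argument points to a struct mad_stream, which is defined in stream.h and records the address of the data and the current processing position. The second and third arguments are the start address and the length of the MP3 data in memory. mad_stream_buffer() associates that data with the mad_stream structure.

IV. MP3 decoding example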

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>
#include "mad.h"
#define BUFSIZE 8192
/*
 * This is a private message structure. A generic pointer to this structure
 * is passed to each of the callback functions. Put here any data you need
 * to access from within the callbacks.
 */
struct buffer {
    FILE *fp;                    /* file pointer */
    unsigned int flen;           /* file length */
    unsigned int fpos;           /* current position in the file */
    unsigned char fbuf[BUFSIZE]; /* input buffer */
    unsigned int fbsize;         /* number of valid bytes in fbuf */
};
typedef struct buffer mp3_file;

int soundfd;              /* sound card (or output file) descriptor */
unsigned int prerate = 0; /* the previous sample rate */

int writedsp(int c)
{
    return write(soundfd, (char *)&c, 1);
}

void set_dsp(void)
{
#if 0
    int format = AFMT_S16_LE;
    int channels = 2;
    int rate = 44100;

    soundfd = open("/dev/dsp", O_WRONLY);
    ioctl(soundfd, SNDCTL_DSP_SPEED, &rate);
    ioctl(soundfd, SNDCTL_DSP_SETFMT, &format);
    ioctl(soundfd, SNDCTL_DSP_CHANNELS, &channels);
#else
    /* O_CREAT requires a mode argument; O_TRUNC discards any old output */
    if ((soundfd = open("test.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) {
        fprintf(stderr, "can't open output file!\n");
        exit(-1);
    }
#endif
}
/*
 * This is a simple example use of the MAD high-level API. The MP3 file is
 * read in BUFSIZE chunks by the input callback, and the high-level API is
 * invoked with three callbacks: input, output, and error. The output
 * callback converts MAD's high-resolution PCM samples to 16 bits, then
 * writes them to soundfd in stereo-interleaved format.
 */
static int decode(mp3_file *mp3fp);

int main(int argc, char *argv[])
{
    long flen;
    mp3_file *mp3fp;

    if (argc != 2)
        return 1;
    mp3fp = malloc(sizeof(mp3_file));
    if (mp3fp == NULL)
        return 2;
    if ((mp3fp->fp = fopen(argv[1], "rb")) == NULL) {
        printf("can't open source file.\n");
        free(mp3fp);
        return 2;
    }
    /* determine the file length, then rewind to the start */
    fseek(mp3fp->fp, 0, SEEK_END);
    flen = ftell(mp3fp->fp);
    fseek(mp3fp->fp, 0, SEEK_SET);
    /* The buffer starts empty: input() performs every read, including the
     * first. Pre-reading a block here would be overwritten by the first
     * input() call, and the start of the file would never be decoded. */
    mp3fp->fbsize = 0;
    mp3fp->fpos = 0;
    mp3fp->flen = flen > 0 ? (unsigned int)flen : 0;
    set_dsp();
    decode(mp3fp);
    close(soundfd);
    fclose(mp3fp->fp);
    free(mp3fp);
    return 0;
}
static enum mad_flow input(void *data, struct mad_stream *stream)
{
    mp3_file *mp3fp;
    int ret_code;
    int unproc_data_size; /* size of the unprocessed data */
    int copy_size;

    mp3fp = (mp3_file *)data;
    if (mp3fp->fpos < mp3fp->flen) {
        unproc_data_size = stream->bufend - stream->next_frame;
        //printf("%d, %d, %d\n", unproc_data_size, mp3fp->fpos, mp3fp->fbsize);
        /* move the unprocessed tail to the front of the buffer; source and
         * destination may overlap, so memmove() is used instead of memcpy() */
        memmove(mp3fp->fbuf, mp3fp->fbuf + mp3fp->fbsize - unproc_data_size, unproc_data_size);
        copy_size = BUFSIZE - unproc_data_size;
        if (mp3fp->fpos + copy_size > mp3fp->flen) {
            copy_size = mp3fp->flen - mp3fp->fpos;
        }
        /* use the byte count actually read in case of a short read */
        copy_size = (int)fread(mp3fp->fbuf + unproc_data_size, 1, copy_size, mp3fp->fp);
        if (copy_size == 0)
            return MAD_FLOW_STOP;
        mp3fp->fbsize = unproc_data_size + copy_size;
        mp3fp->fpos += copy_size;
        /* Hand off the buffer to the mp3 input stream */
        mad_stream_buffer(stream, mp3fp->fbuf, mp3fp->fbsize);
        ret_code = MAD_FLOW_CONTINUE;
    } else {
        ret_code = MAD_FLOW_STOP;
    }
    return ret_code;
}
/*
 * The following utility routine performs simple rounding, clipping, and
 * scaling of MAD's high-resolution samples down to 16 bits. It does not
 * perform any dithering or noise shaping, which would be recommended to
 * obtain any exceptional audio quality. It is therefore not recommended to
 * use this routine if high-quality output is desired.
 */
static inline signed int scale(mad_fixed_t sample)
{
    /* round */
    sample += (1L << (MAD_F_FRACBITS - 16));
    /* clip */
    if (sample >= MAD_F_ONE)
        sample = MAD_F_ONE - 1;
    else if (sample < -MAD_F_ONE)
        sample = -MAD_F_ONE;
    /* quantize */
    return sample >> (MAD_F_FRACBITS + 1 - 16);
}

/*
 * This is the output callback function. It is called after each frame of
 * MPEG audio data has been completely decoded. The purpose of this callback
 * is to output (or play) the decoded PCM audio.
 */
/* The output function collects a whole frame into one buffer and writes it
 * with a single write() call, which fixes stuttering during playback. */
static enum mad_flow output(void *data, struct mad_header const *header,
                            struct mad_pcm *pcm)
{
    unsigned int nchannels, nsamples;
    mad_fixed_t const *left_ch, *right_ch;

    /* pcm->samplerate contains the sampling frequency */
    nchannels = pcm->channels;
    nsamples = pcm->length;
    left_ch = pcm->samples[0];
    right_ch = pcm->samples[1];

    short buf[nsamples * 2];
    int i = 0;

    //printf(">>%d\n", nsamples);
    while (nsamples--) {
        signed int sample;

        /* output sample(s) as 16-bit signed PCM in host byte order */
        sample = scale(*left_ch++);
        buf[i++] = sample & 0xFFFF;
        if (nchannels == 2) {
            sample = scale(*right_ch++);
            buf[i++] = sample & 0xFFFF;
        }
    }
    //fprintf(stderr, ".");
    write(soundfd, &buf[0], i * 2);
    return MAD_FLOW_CONTINUE;
}
/*
 * This is the error callback function. It is called whenever a decoding
 * error occurs. The error is indicated by stream->error; the list of
 * possible MAD_ERROR_* errors can be found in the mad.h (or stream.h)
 * header file.
 */
static enum mad_flow error(void *data,
                           struct mad_stream *stream,
                           struct mad_frame *frame)
{
    mp3_file *mp3fp = data;

    /* the offset is relative to the start of the current input buffer */
    fprintf(stderr, "decoding error 0x%04x (%s) at byte offset %u\n",
            stream->error, mad_stream_errorstr(stream),
            (unsigned int)(stream->this_frame - mp3fp->fbuf));
    /* return MAD_FLOW_BREAK here to stop decoding (and propagate an error) */
    return MAD_FLOW_CONTINUE;
}

/*
 * This is the function called by main() above to perform all the decoding.
 * It instantiates a decoder object and configures it with the input,
 * output, and error callback functions above. A single call to
 * mad_decoder_run() continues until a callback function returns
 * MAD_FLOW_STOP (to stop decoding) or MAD_FLOW_BREAK (to stop decoding and
 * signal an error).
 */
static int decode(mp3_file *mp3fp)
{
    struct mad_decoder decoder;
    int result;

    /* configure input, output, and error functions */
    mad_decoder_init(&decoder, mp3fp,
                     input, 0 /* header */, 0 /* filter */, output,
                     error, 0 /* message */);
    /* start decoding */
    result = mad_decoder_run(&decoder, MAD_DECODER_MODE_SYNC);
    /* release the decoder */
    mad_decoder_finish(&decoder);
    return result;
}

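Notes:

1. The example was originally written for the OSS audio framework. In the embedded world, ALSA also offers an OSS-compatible interface.

2. To make debugging on Ubuntu easier, the program does not write to the audio device directly. It creates a file and writes the decoded data into it.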

3. The input callback in the code above deserves a closer look:

static enum mad_flow input(void *data, struct mad_stream *stream)
{
    mp3_file *mp3fp;
    int ret_code;
    int unproc_data_size; /* size of the unprocessed data */
    int copy_size;

    mp3fp = (mp3_file *)data;
    if (mp3fp->fpos < mp3fp->flen) {
        unproc_data_size = stream->bufend - stream->next_frame;
        //printf("%d, %d, %d\n", unproc_data_size, mp3fp->fpos, mp3fp->fbsize);
        /* source and destination may overlap, hence memmove() */
        memmove(mp3fp->fbuf, mp3fp->fbuf + mp3fp->fbsize - unproc_data_size, unproc_data_size);
        copy_size = BUFSIZE - unproc_data_size;
        if (mp3fp->fpos + copy_size > mp3fp->flen) {
            copy_size = mp3fp->flen - mp3fp->fpos;
        }
        fread(mp3fp->fbuf + unproc_data_size, 1, copy_size, mp3fp->fp);
        mp3fp->fbsize = unproc_data_size + copy_size;
        mp3fp->fpos += copy_size;
        /* Hand off the buffer to the mp3 input stream */
        mad_stream_buffer(stream, mp3fp->fbuf, mp3fp->fbsize);
        ret_code = MAD_FLOW_CONTINUE;
    } else {
        ret_code = MAD_FLOW_STOP;
    }
    return ret_code;
}

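The input buffer is 8192 bytes, but 8192 bytes of an MP3 file do not necessarily hold a whole number of frames: the last few bytes may belong to the next frame. Two pointers in struct mad_stream mark the frame boundaries inside the buffer:

unsigned char const *this_frame; /* start of current frame */
unsigned char const *next_frame; /* start of next frame */

So

unproc_data_size = stream->bufend - stream->next_frame;

gives the number of leftover bytes that belong to the next frame. These bytes must be copied from the tail of the buffer to its head. Enough bytes are then read from the MP3 file to fill the buffer up to 8192 bytes again, and the buffer is handed to the library for decoding. The cycle repeats until the file is exhausted.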
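The carry-over step can be tried in isolation. The sketch below is a simplified model of it, not the code of the program above: an in-memory array stands in for the MP3 file, and the names mem_source, source_read, and refill are hypothetical. It keeps the last unproc bytes of the buffer, moves them to the front, and tops the buffer up from the source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory source standing in for the MP3 file. */
struct mem_source {
    const unsigned char *data;
    size_t len;
    size_t pos;
};

/* Copy up to n bytes from the source; returns the number actually copied. */
static size_t source_read(struct mem_source *src, unsigned char *dst, size_t n)
{
    size_t left = src->len - src->pos;

    if (n > left)
        n = left;
    memcpy(dst, src->data + src->pos, n);
    src->pos += n;
    return n;
}

/*
 * Refill buf (capacity cap, currently holding fill valid bytes) after the
 * consumer has used everything except the last unproc bytes.
 * Returns the new number of valid bytes in buf.
 */
static size_t refill(unsigned char *buf, size_t cap, size_t fill,
                     size_t unproc, struct mem_source *src)
{
    assert(unproc <= fill && fill <= cap);
    /* memmove, not memcpy: the tail and the head of buf may overlap */
    memmove(buf, buf + fill - unproc, unproc);
    return unproc + source_read(src, buf + unproc, cap - unproc);
}
```

With a 4-byte buffer over the source "ABCDEFGHIJ", the first refill yields "ABCD"; if the consumer then leaves one byte unprocessed, the next refill yields "DEFG", with the leftover "D" at the head.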

4. If a 44-byte WAV header is prepended to the PCM data decoded by this code, an ordinary player can play the result.
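As an illustration of that header, here is a minimal sketch that fills in the canonical 44-byte header for uncompressed PCM. The helper names put_le16, put_le32, and wav_header are made up for this example, and the caller is assumed to supply a sample rate and channel count that match the decoded stream.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_SIZE 44

/* Store integers in little-endian byte order regardless of the host. */
static void put_le16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)(v >> 8);
}

static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)(v >> 24);
}

/* Fill hdr with a RIFF/WAVE header describing data_bytes of PCM audio. */
static void wav_header(unsigned char hdr[WAV_HEADER_SIZE],
                       uint32_t sample_rate, uint16_t channels,
                       uint16_t bits, uint32_t data_bytes)
{
    uint16_t block_align;

    assert(channels > 0 && bits % 8 == 0);
    block_align = (uint16_t)(channels * (bits / 8));

    memcpy(hdr, "RIFF", 4);
    put_le32(hdr + 4, 36 + data_bytes);            /* RIFF chunk size */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_le32(hdr + 16, 16);                        /* fmt chunk size */
    put_le16(hdr + 20, 1);                         /* format tag: PCM */
    put_le16(hdr + 22, channels);
    put_le32(hdr + 24, sample_rate);
    put_le32(hdr + 28, sample_rate * block_align); /* bytes per second */
    put_le16(hdr + 32, block_align);
    put_le16(hdr + 34, bits);
    memcpy(hdr + 36, "data", 4);
    put_le32(hdr + 40, data_bytes);                /* data chunk size */
}
```

Because the total size of the PCM data is only known once decoding has finished, one option is to write the header after decoding, or to write a placeholder first and overwrite it at the end. Note that the example program writes samples in host byte order, which matches the little-endian WAV layout only on little-endian machines.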

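V. Fetching data from a network socket and decoding while playing

My project implements a remote player: an MP3 file on a phone is sent over Wi-Fi to a development board, which decodes and plays it. The input buffer therefore cannot be managed the way it is for a file, where the FILE structure gives precise control over the read position, so I made a few changes.

Two threads can be used, one to receive socket data and one to decode and play. The main issue is buffer management, which can be handled as follows. Set the receive buff[] to 8192*10 bytes and the buff[] used by the decoder's input function to 8192*11 bytes. The extra 8192 bytes hold the leftover bytes of the next frame (an MP3 frame never exceeds 8192 bytes). Unlike the approach above, the socket buff[] always receives a fixed 8192*10 bytes. If the decode buff[] has leftover data after a decoding pass, that data is still copied to the head of the decode buff[], but the full 8192*10 bytes from the socket buff[] are then appended right after it. The bsize passed to mad_stream_buffer(stream, buf, bsize) is therefore 8192*10 plus the size of the leftover frame data.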
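A minimal sketch of that buffer arithmetic follows, under the sizes stated above. The constants FRAME_MAX, NET_CHUNK, and DEC_CAP and the function merge_chunk are hypothetical names chosen for this example. The hand-off between the receiving thread and the decoding thread (locking and signalling) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_MAX 8192               /* room reserved for leftover frame data */
#define NET_CHUNK (8192 * 10)        /* bytes received from the socket per pass */
#define DEC_CAP   (8192 * 11)        /* decode buffer: chunk plus leftover */

/*
 * Move the last unproc bytes of dec (which holds fill valid bytes) to its
 * head, then append a freshly received chunk after them.
 * Returns the new number of valid bytes, i.e. the bsize to hand to the
 * decoder, or SIZE_MAX if the data would not fit.
 */
static size_t merge_chunk(unsigned char *dec, size_t fill, size_t unproc,
                          const unsigned char *chunk, size_t chunk_len)
{
    assert(unproc <= fill && fill <= DEC_CAP);
    if (unproc > FRAME_MAX || chunk_len > NET_CHUNK)
        return SIZE_MAX;
    /* the leftover tail may overlap its destination, so use memmove */
    memmove(dec, dec + fill - unproc, unproc);
    memcpy(dec + unproc, chunk, chunk_len);
    return unproc + chunk_len;
}
```

The returned value corresponds to the bsize described above: the size of the received chunk plus the leftover bytes. Returning an error when the leftover exceeds 8192 bytes makes the assumption about the maximum frame size explicit rather than silently overflowing the buffer.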