zoukankan      html  css  js  c++  java
  • FFMpeg笔记(三) 音频处理基本概念及音频重采样

        Android放音的采样率固定为44.1KHz,录音的采样率固定为8KHz,因此底层的音频设备驱动需要设置好这两个固定的采样率。如果上层传过来的采样率不符的话,需要进行resample重采样处理。

        几个名词:

    1. 采样率

        采样设备每秒抽取样本的次数

    2. 音频格式及量化精度(位宽)

        每种音频格式有不同的量化精度(位宽),位数越多,表示值就越精确,声音表现自然就越精准。FFMpeg中音频格式有以下几种,每种格式有其占用的字节数信息:

    enum AVSampleFormat {
        AV_SAMPLE_FMT_NONE = -1,
        AV_SAMPLE_FMT_U8,          ///< unsigned 8 bits
        AV_SAMPLE_FMT_S16,         ///< signed 16 bits
        AV_SAMPLE_FMT_S32,         ///< signed 32 bits
        AV_SAMPLE_FMT_FLT,         ///< float
        AV_SAMPLE_FMT_DBL,         ///< double
    
        AV_SAMPLE_FMT_U8P,         ///< unsigned 8 bits, planar
        AV_SAMPLE_FMT_S16P,        ///< signed 16 bits, planar
        AV_SAMPLE_FMT_S32P,        ///< signed 32 bits, planar
        AV_SAMPLE_FMT_FLTP,        ///< float, planar
        AV_SAMPLE_FMT_DBLP,        ///< double, planar
        AV_SAMPLE_FMT_S64,         ///< signed 64 bits
        AV_SAMPLE_FMT_S64P,        ///< signed 64 bits, planar
    
        AV_SAMPLE_FMT_NB           ///< Number of sample formats. DO NOT USE if linking dynamically
    };

     3. 分片(plane)和打包(packed)

        以双声道为例,带P(plane)的数据格式在存储时,其左声道和右声道的数据是分开存储的,左声道的数据存储在data[0],右声道的数据存储在data[1],每个声道的所占用的字节数为linesize[0]和linesize[1];

        不带P(packed)的音频数据在存储时,是按照LRLRLR...的格式交替存储在data[0]中,linesize[0]表示总的数据量。

    4. 声道分布(channel_layout)

        声道分布在FFmpeglibavutilchannel_layout.h中有定义,一般来说用的比较多的是AV_CH_LAYOUT_STEREO(双声道)和AV_CH_LAYOUT_SURROUND(三声道),这两者的定义如下:

    #define AV_CH_LAYOUT_STEREO            (AV_CH_FRONT_LEFT|AV_CH_FRONT_RIGHT)
    #define AV_CH_LAYOUT_SURROUND          (AV_CH_LAYOUT_STEREO|AV_CH_FRONT_CENTER)

    5. 音频帧的数据量计算

        一帧音频的数据量=channel数 * nb_samples样本数 * 每个样本占用的字节数

        如果该音频帧是FLTP格式的PCM数据,包含1024个样本,双声道,那么该音频帧包含的音频数据量是2*1024*4=8192字节。

    6. 音频播放时间计算

        以采样率44100Hz来计算,每秒44100个sample,而正常一帧为1024个sample,可知每帧播放时间/1024=1000ms/44100,得到每帧播放时间=1024*1000/44100=23.2ms。

    7. 音频重采样(resample)

        FFMpeg自带的resample例子:FFmpegdocexamples esampling_audio.c,这里把最核心的resample代码贴一下,在工程中使用时,注意设置的各种参数,给定的输入数据都不能错。

    int main(int argc, char **argv)
    {
    // 设置数据源src和dst声道布局 int64_t src_ch_layout
    = AV_CH_LAYOUT_STEREO, dst_ch_layout = AV_CH_LAYOUT_SURROUND;
    // 设置src和dst采样率
    int src_rate = 48000, dst_rate = 44100; uint8_t **src_data = NULL, **dst_data = NULL; int src_nb_channels = 0, dst_nb_channels = 0; int src_linesize, dst_linesize; int src_nb_samples = 1024, dst_nb_samples, max_dst_nb_samples;
    // 设置src和dst音频格式
    enum AVSampleFormat src_sample_fmt = AV_SAMPLE_FMT_DBL, dst_sample_fmt = AV_SAMPLE_FMT_S16; const char *dst_filename = NULL; FILE *dst_file; int dst_bufsize; const char *fmt;
    // 重采样上下文,包含resample信息
    struct SwrContext *swr_ctx; double t; int ret; if (argc != 2) { fprintf(stderr, "Usage: %s output_file " "API example program to show how to resample an audio stream with libswresample. " "This program generates a series of audio frames, resamples them to a specified " "output format and rate and saves them to an output file named output_file. ", argv[0]); exit(1); }
    // resample后的数据保存到本地文件 dst_filename
    = argv[1]; dst_file = fopen(dst_filename, "wb"); if (!dst_file) { fprintf(stderr, "Could not open destination file %s ", dst_filename); exit(1); } /* create resampler context */ swr_ctx = swr_alloc(); if (!swr_ctx) { fprintf(stderr, "Could not allocate resampler context "); ret = AVERROR(ENOMEM); goto end; } /* set options */ // 将resample信息写入resample上下文
    av_opt_set_int(swr_ctx,
    "in_channel_layout", src_ch_layout, 0); av_opt_set_int(swr_ctx, "in_sample_rate", src_rate, 0); av_opt_set_sample_fmt(swr_ctx, "in_sample_fmt", src_sample_fmt, 0); av_opt_set_int(swr_ctx, "out_channel_layout", dst_ch_layout, 0); av_opt_set_int(swr_ctx, "out_sample_rate", dst_rate, 0); av_opt_set_sample_fmt(swr_ctx, "out_sample_fmt", dst_sample_fmt, 0); /* initialize the resampling context */ if ((ret = swr_init(swr_ctx)) < 0) { fprintf(stderr, "Failed to initialize the resampling context "); goto end; } /* allocate source and destination samples buffers */ src_nb_channels = av_get_channel_layout_nb_channels(src_ch_layout); ret = av_samples_alloc_array_and_samples(&src_data, &src_linesize, src_nb_channels, src_nb_samples, src_sample_fmt, 0); if (ret < 0) { fprintf(stderr, "Could not allocate source samples "); goto end; } /* compute the number of converted samples: buffering is avoided * ensuring that the output buffer will contain at least all the * converted input samples */ max_dst_nb_samples = dst_nb_samples = av_rescale_rnd(src_nb_samples, dst_rate, src_rate, AV_ROUND_UP); /* buffer is going to be directly written to a rawaudio file, no alignment */ dst_nb_channels = av_get_channel_layout_nb_channels(dst_ch_layout); ret = av_samples_alloc_array_and_samples(&dst_data, &dst_linesize, dst_nb_channels, dst_nb_samples, dst_sample_fmt, 0); if (ret < 0) { fprintf(stderr, "Could not allocate destination samples "); goto end; } t = 0; do { /* generate synthetic audio */
    // 这里是自行生成源数据帧,实际工程中应该将解码后的PCM数据填入src_data中 fill_samples((double *)src_data[0], src_nb_samples, src_nb_channels, src_rate, &t); /* compute destination number of samples */ dst_nb_samples = av_rescale_rnd(swr_get_delay(swr_ctx, src_rate) + src_nb_samples, dst_rate, src_rate, AV_ROUND_UP); if (dst_nb_samples > max_dst_nb_samples) { av_freep(&dst_data[0]); ret = av_samples_alloc(dst_data, &dst_linesize, dst_nb_channels, dst_nb_samples, dst_sample_fmt, 1); if (ret < 0) break; max_dst_nb_samples = dst_nb_samples; } /* convert to destination format */ // 重采样操作
    ret
    = swr_convert(swr_ctx, dst_data, dst_nb_samples, (const uint8_t **)src_data, src_nb_samples); if (ret < 0) { fprintf(stderr, "Error while converting "); goto end; } dst_bufsize = av_samples_get_buffer_size(&dst_linesize, dst_nb_channels, ret, dst_sample_fmt, 1); if (dst_bufsize < 0) { fprintf(stderr, "Could not get sample buffer size "); goto end; } printf("t:%f in:%d out:%d ", t, src_nb_samples, ret); fwrite(dst_data[0], 1, dst_bufsize, dst_file); } while (t < 10); if ((ret = get_format_from_sample_fmt(&fmt, dst_sample_fmt)) < 0) goto end; fprintf(stderr, "Resampling succeeded. Play the output file with the command: " "ffplay -f %s -channel_layout %"PRId64" -channels %d -ar %d %s ", fmt, dst_ch_layout, dst_nb_channels, dst_rate, dst_filename); end: fclose(dst_file); if (src_data) av_freep(&src_data[0]); av_freep(&src_data); if (dst_data) av_freep(&dst_data[0]); av_freep(&dst_data); swr_free(&swr_ctx); return ret < 0; }
  • 相关阅读:
    5-1 CSS命名规范
    npm
    Maven的安装与配置
    Emmet
    计算机常识——IP/TCP协议
    判别分析——距离判别
    R语言创建空向量、矩阵
    Rstudio——基本功能及操作
    R语言——source函数
    R语言关于warning问题——关于options函数
  • 原文地址:https://www.cnblogs.com/jiayayao/p/8724663.html
Copyright © 2011-2022 走看看