  • AAC Background Notes (repost)

    http://fk323.blog.hexun.com/11967222_d.html



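    AAC stands for Advanced Audio Coding. Currently only Apple's hard-disk MP3 players support this format. AAC is an audio format developed jointly by Fraunhofer IIS-A, Dolby, and AT&T, and it is part of the MPEG-2 specification. Its algorithm differs from MP3's: AAC combines additional coding tools to improve coding efficiency, and its compression capability far exceeds that of earlier algorithms such as MP3. It also supports up to 48 audio channels, up to 15 low-frequency channels, a wider range of sample rates and bit rates, multilingual capability, and higher decoding efficiency. In short, AAC can deliver better sound quality in files about 30% smaller than MP3.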
     
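    AAC (Advanced Audio Coding)
    AAC (Advanced Audio Coding) appeared in 1997 as an audio coding technology based on MPEG-2. It was developed jointly by Fraunhofer IIS, Dolby, Apple, AT&T, Sony, and other companies as a replacement for the MP3 format. When the MPEG-4 standard was released in 2000, AAC was reworked to integrate its features, so it is now also called MPEG-4 AAC, that is, m4a.
    As a high-compression-ratio audio algorithm, AAC typically achieves a ratio of 18:1, and some sources say 20:1, well ahead of MP3. In terms of sound quality, its multichannel support and low-complexity profile put it ahead of almost all traditional codecs at comparable settings. However, as late as 2006 few people stored audio in this format, and very few portable MP3 players could play it; the only one known at the time was the Apple iPod. Support was somewhat more common on mobile phones, and many audio players on the PC, such as Apple iTunes, supported AAC.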
    AAC is billed as able to "accommodate up to 48 audio channels at sample rates up to 96 kHz, and to deliver quality equivalent to ITU-R broadcast for 5.1-channel music programmes at a data rate of 320 kbps".
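    In short, AAC can deliver better sound quality while saving roughly 30% of the storage space and bandwidth of an MP3 file. That said, AAC and MP3 encodes differ somewhat in their sense of space and structure, and which one you prefer is a matter of taste.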


    Origins and characteristics of AAC audio

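      The AAC technology actually took shape as early as 1997, when it was called MPEG-2 AAC. With the release of the MPEG-4 audio standard in 2000, MPEG-2 AAC was adopted into that standard and extended with new coding features, and it was renamed MPEG-4 AAC. Unlike MP3, AAC technology is held by several vendors, so there are many AAC encoders, both purely commercial and completely free. Commercial encoders include FhG from Fraunhofer IIS and Dolby AAC from Dolby; free ones include Free AAC and Apple's iTunes; Nero also provides Nero AAC through Nero 6.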

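      AAC is a high-compression-ratio audio algorithm whose ratio can reach 20:1, far beyond older algorithms such as AC-3 and MP3. AAC at 96 kbps is generally considered to outperform MP3 at 128 kbps. Another notable feature is its multichannel support: 1 to 48 full-bandwidth channels plus 15 low-frequency channels. In addition, AAC supports sample rates up to 96 kHz, a resolution comparable to the PCM coding of DVD-Audio; as a result it gained the support of the DVD Forum and became the standard audio codec for next-generation DVD.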
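
      The figures above can be turned into rough bit rates. The following is a minimal sketch, assuming the compression ratio is measured against CD-quality PCM (44.1 kHz, 16 bits, stereo); the text does not state the baseline, and the function names are illustrative only.

```python
# Assumed baseline (not stated in the text): CD-quality PCM audio.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_bitrate_kbps(sample_rate_hz=SAMPLE_RATE_HZ,
                     bits_per_sample=BITS_PER_SAMPLE,
                     channels=CHANNELS):
    """Bit rate of uncompressed PCM in kbps (1 kbps = 1000 bit/s)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def compressed_bitrate_kbps(ratio):
    """Bit rate after compressing the assumed PCM baseline by ratio:1."""
    return pcm_bitrate_kbps() / ratio


def savings_percent(smaller_kbps, larger_kbps):
    """Space and bandwidth saved by using the smaller bit rate."""
    return (1 - smaller_kbps / larger_kbps) * 100


print("PCM baseline:", pcm_bitrate_kbps(), "kbps")
print("at 18:1:", compressed_bitrate_kbps(18), "kbps")
print("at 20:1:", compressed_bitrate_kbps(20), "kbps")
print("96 kbps vs 128 kbps saves", savings_percent(96, 128), "%")
```

      Under that assumption, 18:1 to 20:1 corresponds to roughly 70 to 80 kbps, and replacing a 128 kbps MP3 with a 96 kbps AAC file saves 25%, the same order as the "about 30%" quoted earlier.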

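      The AAC family is large, with 9 profiles to suit different applications. Among them, the LC (Low Complexity) profile drops the prediction and gain control modules, which lowers complexity and improves coding efficiency; it is currently the most widely used profile.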


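    AAC naming and algorithmic complexity
    AAC often leaves people confused. Tools name the AAC versions in all sorts of ways, and some
      encoders and players are outright misleading. For example, some identify HE AAC as AAC-LC, which
      is not wrong but is very imprecise. The following clarifies the names used for the AAC family: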
      
       AAC = MPEG2 AAC ~= MP3 + TNS + TP (It is not an upgrade of MP3 since it is not backward compatible but uses all MP3's features in a better way).
      
      MPEG4 AAC = MPEG2 AAC + LTP + PNS
      There are several profiles depending on the decoding/encoding complexity, required power, delay, bandwidth characteristics, error resilience characteristics, etc. The most used profile in the PC arena is AAC LC (Low Complexity) = MPEG4 AAC without LTP.
      
      HE-AAC = SBR + AAC LC
       Coding Technologies, developers of SBR, named this coding aacPlus™, also known as AAC+, HE-AAC, AACP, AAC-LC+SBR, etc. SBR technology was previously introduced in the MP3pro codec.
      
      HE-AAC v2= PS + HE-AAC
       Coding Technologies, developers of the MPEG Parametric Stereo, named this coding aacPlus™ v2 as a new revision of the previous release. It is also known as AAC++, EAAC+, Enhanced HE-AAC, EAACP, HE-AAC+PS, etc. It was recently standardized by ISO as HE-AAC v2.
      
      S-AAC...(Just guessing, not yet released but in Reference Model 0 stage)
       Since MPEG is focusing on multichannel, the next standard will be something based on the Spatial Audio Coding tool standardized as MPEG Surround, which allows something similar to PS but aimed at 5.1ch or 7.1ch content. This could be named S-AAC, AAC Surround or AACS, Surround HE-AAC, [Put your favorite name here]. There isn't an official name for it yet.
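
      The naming equations above can be written down as a small lookup table. This is only an illustrative sketch: it mirrors the simplified equations in this list (for example "AAC LC = MPEG4 AAC without LTP"), not the normative tool lists in the MPEG standards, and the names `FAMILY`, `tools`, and "MP3-style core" are made up for the example.

```python
# Each entry records its base and what it adds or removes, following the
# simplified equations in the text. "MP3-style core" is a placeholder label
# for the coding ideas AAC shares with MP3.
FAMILY = {
    "MPEG-2 AAC": {"base": None, "adds": ["MP3-style core", "TNS", "TP"]},
    "MPEG-4 AAC": {"base": "MPEG-2 AAC", "adds": ["LTP", "PNS"]},
    "AAC LC": {"base": "MPEG-4 AAC", "adds": [], "removes": ["LTP"]},
    "HE-AAC": {"base": "AAC LC", "adds": ["SBR"]},
    "HE-AAC v2": {"base": "HE-AAC", "adds": ["PS"]},
}


def tools(name):
    """Expand a family name into the list of tools it implies."""
    entry = FAMILY[name]
    inherited = [] if entry["base"] is None else tools(entry["base"])
    removed = entry.get("removes", [])
    kept = [tool for tool in inherited if tool not in removed]
    return kept + entry["adds"]


for family_name in FAMILY:
    print(family_name, "=", " + ".join(tools(family_name)))
```

      Expanding "HE-AAC v2" this way makes it clear why a player that reports such a stream as plain AAC-LC is not wrong, only imprecise: the LC core is there, but SBR and PS are layered on top of it.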
      
      Terms and acronyms:
      
      AAC Advanced Audio Coding, developed jointly by Fraunhofer IIS, Dolby Laboratories, AT&T, Sony, and others.
      
       TNS Temporal Noise Shaping is a tool designed to control the location, in time, of the quantization noise by transmission of filtering coefficients.
      
      TP Temporal Prediction is a tool designed to enhance compressibility of stationary signals.
      
       LTP Long Term Prediction is once again a prediction tool. This one requires far less computation power than the prediction tool used in MPEG-2 AAC, while providing comparable coding performance.
      
       PNS Perceptual Noise Substitution allows the coding of noise-like parts of the signal to be replaced by noise generated on the decoder side, so the decoding result is not deterministic among multiple decoding processes of the same encoded data.
      
      SBR Spectral Band Replication is a tool that creates associated higher frequency content based on the lower frequencies and coding it as statistical information: level, distribution and ranges. Each of these parameters is encoded separately, taking account of their distinctive characteristics. It involves reconstruction of a noise-like frequency spectrum by employing a noise generator with some statistical information (level, distribution, ranges), so the decoding result is not deterministic among multiple decoding processes of the same encoded data. Both ideas are based on the principle that the human brain tends to consider high frequencies to be either harmonic phenomena associated with lower frequencies or noise, and is thus less sensitive to the exact content of high frequencies in audio signals.
      
      PS Parametric Stereo: the stereo image information is separated from the mono signal and represented as a small amount of high quality parametric stereo information. The scheme relies on dissecting the incoming audio signal into three 'objects' that are a common constituent of all audio signals: transients, sinusoids and noise. The stereo information is efficiently parameterized. Each of these objects is encoded separately, taking account of their distinctive characteristics. Like PNS and SBR, the decoding result is not deterministic among multiple decoding processes of the same encoded data.
      
      SAC Spatial Audio Coding exploits inter-channel differences in level, phase and coherence to capture the spatial image of a multi-channel audio signal relative to a transmitted downmix signal. It encodes each of these cues separately taking account of their distinctive characteristics such that the cues, and the transmitted signal, can be decoded to synthesize a high quality multi-channel representation allowing higher compression than separate channel coding.
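
      The non-deterministic decoding mentioned for PNS can be illustrated with a toy example. This is a minimal sketch of the idea only, assuming the encoder transmits just an energy value for a noise-like band; the function name `pns_decode_band` and the uniform noise source are assumptions for illustration, not the bitstream syntax or the noise generator of any real decoder.

```python
import math
import random


def pns_decode_band(target_energy, num_bins, rng=None):
    """Fill one band with random values rescaled to carry target_energy.

    The encoder is assumed to have sent only the band energy, so the
    decoder invents the spectral values itself.
    """
    rng = rng or random.Random()
    noise = [rng.uniform(-1.0, 1.0) for _ in range(num_bins)]
    energy = sum(x * x for x in noise)
    if energy == 0.0:
        return [0.0] * num_bins
    scale = math.sqrt(target_energy / energy)
    return [x * scale for x in noise]


# Two decodes of the same "encoded" band: same energy, different samples.
first = pns_decode_band(4.0, 16, random.Random(1))
second = pns_decode_band(4.0, 16, random.Random(2))
print(sum(x * x for x in first), sum(x * x for x in second))
print(first == second)
```

      Both decodes carry the requested energy but differ sample by sample, which is why comparing decoder outputs bit for bit is not meaningful once PNS, SBR, or PS is in use.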
      
      Complexity analysis of a fixed-point AAC decoder
      
      [Test environment]: SPARC V8 CPU @ 128 MHz, no hardware multiplier, 32 KB D-cache.
      [Test vector]: AAC LC + SBR, encoded with FAAC
      [Test results] (cycle counts in hexadecimal):
      Stage               Cycles      Share
      AAC decoder total   00003c21b
      requant             000004225    6.8%
      spectral_data       00000618b   10.1%
      scalefactors        000000a21    1.0%
      output              000000000    0.0%
      FB                  000022dd0   58.0%
      MDCT[0256]          000005fd5    9.9%
      CFFT[0064]          0000047e6    7.4%
      MDCT[2048]          000008663   13.9%
      CFFT[0512]          0000091fd   15.2%
      MDCT[1024]          000000000    0.0%
      CFFT[0256]          000000000    0.0%
      
      In summary, the filter bank (FB), at 58.0% of the total cycles, is the most complex algorithm in SBR decoding.
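
      The percentages in the results can be re-derived from the raw counts. The sketch below assumes the cycle counts are hexadecimal (they contain the digits a to f) and copies a few lines of the log into a string; `parse_cycles` and `share_percent` are hypothetical helper names, not part of any profiler.

```python
# A few lines copied from the test results above (counts are hexadecimal).
PROFILE = """\
AAC decoder total: 00003c21b cycles
requant: 000004225 cycles
spectral_data: 00000618b cycles
scalefactors: 000000a21 cycles
FB: 000022dd0 cycles
"""


def parse_cycles(text):
    """Map each stage name to its cycle count as a decimal integer."""
    counts = {}
    for line in text.strip().splitlines():
        name, rest = line.split(":", 1)
        counts[name.strip()] = int(rest.split()[0], 16)
    return counts


def share_percent(counts, name, total_name="AAC decoder total"):
    """Share of the total cycle count spent in one stage, in percent."""
    return 100.0 * counts[name] / counts[total_name]


counts = parse_cycles(PROFILE)
for stage in counts:
    print(f"{stage:20s} {counts[stage]:8d} {share_percent(counts, stage):6.1f}%")
```

      Read this way, the total is 246299 cycles and FB accounts for 142800 of them, which matches the 58.0% reported in the log.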
    loop's blog
  • Original article: https://www.cnblogs.com/goodloop/p/1308032.html