  • Repost: a University of Miami project on an MPEG Layer II encoding scheme for 28.8 kbps

    Title: Methods to Reduce the Bandwidth Requirements of MPEG-1 Layer II Audio Data for Transmission Speeds of Less Than 28.8 kbps

    http://www.music.miami.edu/programs/mue/research/klampert/cover.html

    Author: Kirk Lampert


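    The paper only proposes some design approaches and does not arrive at dependable results, but several of its ideas for compressing MPEG Layer II audio further are worth thinking about, notably its Delta Coding (the VLBR algorithms).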

    The conclusions are quoted below for reflection.

    7. Conclusions

    The VLBR algorithms were tested against Progressive Networks' Real Audio encoder and VocalTec's I-Wave encoder. Within the limits of MPEG-1 Layer II the encoder managed to reach a level of performance almost equivalent to, and sometimes matching that of the I-Wave encoder while using less bandwidth. Earlier in this project the VLBR files were tested against Real Audio version 2.0. The quality of almost all of the VLBR files is equivalent to a 28.8 kbps Real Audio 2.0 file and much better than a 14.4 kbps Real Audio 2.0 file. Real Audio 3.0 provides much greater sound quality, surpassing both Real Audio 2.0 and, as was shown in this project, the VLBR encoding scheme.

    This project demonstrated that, for certain music types, further bandwidth reduction is possible while retaining an acceptable quality level; meaning the quality is good enough to allow enjoyment by the listener without becoming too distracting. This is done by using the three controls developed in this project to tweak the encoding process until the resulting files sound acceptable at the desired bitrate. In most cases the preferences of the listening panel (shown in Figure 7) did not match that of the author. This indicates that a better method of testing is to have the panel listen to the samples a number of times throughout the process until the quality results are as high as possible. Figure 7 shows that the panel preferred high frequency content with more noise over the alternative of less noise with less high frequencies. Higher frequencies could have been added (at the cost of more noise) with the three controls and may have yielded higher quality ratings.

    The MPEG-1 Layer II compatible tools that were created are unavailable in other encoders. These tools give the user more control over the very low bitrate encoding process, allowing the user to ensure the encoded files sound as good as possible. Without these tools it is neither possible to encode MPEG-1 files with a bandwidth of less than 32 kbps, nor is it possible to actively control how the encoder chooses to allocate bits.

    The most useful of these tools was the four sets of subband bit allocation scalars. The encoder typically keeps the quantization noise at a constant level across the eight subbands. These scalars allow the noise level of a particular subband to be raised or lowered by a constant amount. Without fluctuations in the noise level among individual subbands the listener is more readily able to begin ignoring it and focusing on the music instead.
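
    As a rough illustration of the idea (not code from the thesis), the sketch below models the scalars as per-subband offsets in dB applied inside a simplified greedy bit allocator. The function name `allocate_bits`, the 6.02 dB-per-bit rule of thumb, the abstract "bits", and the signal-to-mask values are all assumptions made for the example; a real Layer II encoder allocates from fixed quantizer tables.

```python
DB_PER_BIT = 6.02  # approximate SNR gain per extra quantizer bit


def allocate_bits(smr_db, offsets_db, pool_bits, max_bits=15):
    """Greedy allocation: each bit goes to the subband with the worst
    noise-to-mask ratio (NMR), after the user's offset is applied."""
    bits = [0] * len(smr_db)
    for _ in range(pool_bits):
        # Subbands already at max_bits are not candidates.
        open_bands = [i for i in range(len(bits)) if bits[i] < max_bits]
        if not open_bands:
            break
        # NMR = SMR - SNR; a positive offset tolerates more noise in that band.
        worst = max(
            open_bands,
            key=lambda i: smr_db[i] - DB_PER_BIT * bits[i] - offsets_db[i],
        )
        bits[worst] += 1
    return bits


smr = [30.0] * 8  # made-up signal-to-mask ratios, in dB
flat = allocate_bits(smr, [0.0] * 8, pool_bits=16)
tilted = allocate_bits(smr, [0.0] * 7 + [12.0], pool_bits=16)
print(flat)    # even split across the eight subbands
print(tilted)  # the last subband tolerates 12 dB more noise, so it gets fewer bits
```

    With no offsets the allocator keeps the noise level even across the subbands; the 12 dB offset on the last subband moves one of its bits to a lower subband.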

    The temporal capabilities of the four sets of subband bit allocation scalars were not as useful as anticipated. Because one MPEG-1 Layer II frame is 24 ms to 28 ms long it is easy for the listener to perceive the frequency changes imposed by a subband that is essentially being turned on and off at a fixed frequency. To remedy this the "on / off" frequency must be raised or lowered so as not to be perceived by the listener. This frequency cannot be raised because a subband cannot be turned on and off more than one frame at a time. It can be lowered by leaving a subband on for a specified number of frames and off for the same number of frames. The slowness by which the subbands are switched in and out, however, prevents the ear from being tricked into not noticing the missing subbands. In some cases, such as the voice sample, the switching in and out of a subband seemed to match up nicely with the source material. The temporal timing can be set somewhat so that the "on cycle" of certain subbands coincides with syllables where that subband is preferable. Obviously this takes much patience and luck and is not an effective way to encode audio files.
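
    The arithmetic behind the switching rate can be sketched as follows. This is an illustration rather than the thesis code: the helper names are hypothetical, and the 44.1 kHz sample rate is an assumption (the quoted text does not state one). The only fixed fact used is that an MPEG-1 Layer II frame carries 1152 samples per channel.

```python
SAMPLES_PER_FRAME = 1152  # samples per channel in one MPEG-1 Layer II frame


def frame_ms(sample_rate_hz):
    """Duration of one frame in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz


def switching_hz(frames_on, sample_rate_hz):
    """Rate of a full on/off cycle: frames_on frames on, then the same off."""
    cycle_s = 2 * frames_on * SAMPLES_PER_FRAME / sample_rate_hz
    return 1.0 / cycle_s


def subband_on(frame_index, frames_on):
    """True while the subband is in the 'on' half of its cycle."""
    return (frame_index // frames_on) % 2 == 0


for n in (1, 2, 4):
    print(n, "frame(s) on/off:", round(switching_hz(n, 44100), 2), "Hz")
print([subband_on(i, 2) for i in range(6)])
```

    Under these assumptions the fastest possible toggle (one frame on, one frame off) is about 19 Hz, and holding each state longer only lowers that rate, which is consistent with the limitation described above.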

    The maximum bit allocation control was useful in certain cases. Unlike the four sets of subband bit allocation scalars this control can hard limit the number of bits that are allocated to a subband. This "headroom" setting may or may not be effective depending on whether the encoder is actually allocating more than the number of bits the limit is set to. The advantage to this control is that if the limit for a subband is reached then the remaining bits in the bit pool can be allocated to another subband. In this way a certain subband can be "doped" where it wouldn't be otherwise. Thus if the user knows a certain subband should receive more bits, the user has the ability to force the encoder to do so.
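
    A minimal sketch of the capping behaviour described above, under stated assumptions: the function `apply_caps`, the `priority` list naming which subbands should receive the freed bits, and the example allocation are all hypothetical and are not taken from the thesis.

```python
def apply_caps(alloc, caps, priority):
    """Clamp each subband to its cap, then hand the freed bits, one at a
    time, to the subbands listed in `priority` that still have headroom."""
    capped = [min(a, c) for a, c in zip(alloc, caps)]
    freed = sum(alloc) - sum(capped)
    while freed > 0:
        gave_any = False
        for sb in priority:
            if freed == 0:
                break
            if capped[sb] < caps[sb]:
                capped[sb] += 1
                freed -= 1
                gave_any = True
        if not gave_any:
            break  # every priority subband is at its cap; leftover bits stay unused
    return capped, freed


encoder_choice = [6, 5, 3, 2, 1, 1, 0, 0]  # made-up bits per subband
caps = [4, 4, 15, 15, 15, 15, 15, 15]      # hard limits on subbands 1 and 2
doped, leftover = apply_caps(encoder_choice, caps, priority=[4, 5, 6])
print(doped, leftover)
```

    If the encoder never exceeds a cap, nothing is freed and the allocation is returned unchanged, which mirrors the remark that the setting may have no effect.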

    The least useful of all the controls was the bit pool scalar. This was unexpected at first since the MPEG encoder is supposed to use whatever bits it has as efficiently as possible. In every case where the bit pool was scaled the files contained a large number of artifacts making them unusable. Closer inspection, however, reveals why this happens. As noted, a 32 kbps MPEG-1 file only uses the lower eight subbands because there are not enough bits to efficiently encode more. The same applies for the VLBR files. If the number of bits in the bit pool is reduced to half, the encoder is not able to efficiently encode all eight subbands. This provides some validation for the other two controls which can be used to more intelligently allocate the bits. One could simply remove the upper five subbands to achieve 18 kbps. Five must be removed rather than four because the bit allocation is not even; subbands one and two typically receive the most bits. While this would work, the resulting files would lack sufficient high frequency content. The algorithms developed in this project use more than the lower three subbands to gain more high frequencies. The cost of this is audible quantization noise in the lower subbands (where bits were taken from) and in the upper subbands (where only a few bits could be added).
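
    To make the size of the bit budget concrete, the frame budget at a given bitrate can be computed as below. This is illustrative arithmetic only: the 48 kHz sample rate is an assumption chosen for round numbers, the helper name is hypothetical, and the figure is the whole frame budget (header and side information included), so the pool actually left for subband samples is smaller.

```python
SAMPLES_PER_FRAME = 1152  # samples per channel in one MPEG-1 Layer II frame


def bits_per_frame(bitrate_bps, sample_rate_hz):
    """Total bits available to one frame at a constant bitrate."""
    return bitrate_bps * SAMPLES_PER_FRAME / sample_rate_hz


full = bits_per_frame(32000, 48000)    # lowest standard MPEG-1 Layer II rate
target = bits_per_frame(18000, 48000)  # the 18 kbps figure mentioned above
print(full, target, target / full)
```

    Under these assumptions an 18 kbps stream has a little over half the per-frame budget of a 32 kbps stream to spread over the same subbands, which gives a sense of why uniform scaling of the pool leaves every subband short of bits.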

    In summary this project created three new tools that allow standard MPEG-1 Layer II stream decoders to receive moderate quality low bandwidth files over standard 28.8 kbps modem connections. All three tools, when used in conjunction with the coolmpeg.txt output file, can also be used as a learning tool to gain a greater understanding of the MPEG-1 encoding process.

  • Original post: https://www.cnblogs.com/gaozehua/p/2322704.html