  • SiCAS: a candidate standard for MPEG4 SSC

    Sinusoidal coding for audio and speech (siCAS) (DEL.4625)

    Project number: del4625

    Description of the research

    Sinusoidal coding, which models a signal as a sum of sinusoids, has proven to be a promising basis for new compression strategies for speech signals, which have a strongly periodic behaviour. Audio coders (speech + music), on the other hand, do not exploit signal features to the same extent and therefore need different coding strategies. However, because sinusoidal coding itself does not depend on data-specific properties, it can be applied to audio signals as well. In fact, after some modifications of the classical sinusoidal model, this coding technique yields an efficient and robust representation of audio signals.
    Recently, the proponents have extended the classical sinusoidal model to allow the amplitudes of the sinusoids to evolve exponentially. This extended model results in a more efficient representation of the given data and, even more importantly, has the potential to represent so-called "attacks" or "transients", the bottleneck in state-of-the-art audio coders, very compactly. The ability to encode this class of signals makes this scheme a very promising candidate for compressing audio signals.
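    The extended model can be illustrated with a minimal sketch that synthesizes a sum of sinusoids whose amplitudes evolve exponentially. The function name, the tuple layout and all parameter values below are assumptions made for illustration; they do not come from the project description.

```python
import math

def damped_sinusoid_sum(components, num_samples):
    """Synthesize x[n] = sum_k a_k * exp(d_k * n) * cos(w_k * n + phi_k).

    Each component is a tuple (a, d, w, phi):
      a    initial amplitude
      d    damping factor per sample; d = 0 gives the classical
           constant-amplitude model, d < 0 decays, d > 0 grows
      w    frequency in radians per sample
      phi  initial phase in radians
    """
    signal = []
    for n in range(num_samples):
        sample = sum(a * math.exp(d * n) * math.cos(w * n + phi)
                     for a, d, w, phi in components)
        signal.append(sample)
    return signal

# Hypothetical parameter values, chosen only for illustration.
classical = damped_sinusoid_sum([(1.0, 0.0, 0.3, 0.0)], 64)    # constant envelope
decaying = damped_sinusoid_sum([(1.0, -0.05, 0.3, 0.0)], 64)   # exponentially decaying envelope
```

    Setting the damping factor to zero recovers the classical model, so the extended model is a strict generalization: a rapidly changing envelope can be captured by one extra parameter per component instead of many constant-amplitude sinusoids.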
    The non-constant-modulus exponentials cannot be determined with the traditional FFT-based algorithms that are common in current sinusoidal speech coding schemes, because these algorithms assume that the sinusoidal components have an almost constant envelope.
    The generalized model, in contrast, calls for more advanced signal estimation and modeling algorithms. In the past decade, the Circuits and Systems Section at the Department of Electrical Engineering of Delft University of Technology (DUT) has contributed significantly to the development of novel techniques and algorithms for high-resolution signal estimation and spectral analysis. The approach is deterministic and extracts signal parameters by decomposing certain matrices that are built from the observed or given data. In a number of preliminary experiments, the proponents have shown that these novel algorithms are appropriate alternatives to the FFT-based algorithms when extending the sinusoidal model to the above-mentioned generalized model.
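    The description does not name the specific algorithms. As a hedged illustration of what "decomposing matrices built from the data" can look like, the sketch below uses one well-known technique of this kind: an SVD of a Hankel data matrix followed by a shift-invariance step (an ESPRIT/HSVD-type method). The choice of this technique, the function name `estimate_poles` and the test signal are assumptions, not the project's method.

```python
import numpy as np

def estimate_poles(x, order, rows=None):
    """Estimate the complex poles z_k = exp(d_k + 1j * w_k) of a sum of
    exponentially damped sinusoids from its samples (noise-free sketch)."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    rows = rows or n // 2
    cols = n - rows + 1
    # Hankel data matrix: H[i, j] = x[i + j]
    H = np.array([x[i:i + cols] for i in range(rows)])
    # The leading left singular vectors span the signal subspace.
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    Us = U[:, :order]
    # Shift invariance: Us without its first row equals Us without its
    # last row times a matrix Z whose eigenvalues are the signal poles.
    Z, *_ = np.linalg.lstsq(Us[:-1], Us[1:], rcond=None)
    return np.linalg.eigvals(Z)

# Hypothetical test signal: one real damped sinusoid, i.e. two conjugate poles.
n = np.arange(64)
x = np.exp(-0.05 * n) * np.cos(0.3 * n + 0.4)
poles = estimate_poles(x, order=2)
damping = np.log(np.abs(poles))        # per-sample damping factors
frequency = np.abs(np.angle(poles))    # radians per sample
```

    Unlike picking peaks in an FFT, this estimate recovers the damping factor directly from the pole magnitude, which is why such methods fit the exponentially evolving amplitude model. With noisy data the model order and the matrix dimensions would have to be chosen with more care than in this sketch.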
    Starting from this perspective, we propose to investigate the merits and practical usefulness of some candidates among the high-resolution signal estimation algorithms that are currently believed to be promising from both a conceptual and a practical coding point of view.

    Specifically, the project aims at the following main results:

    1. the specification of a viable application scenario. This includes the specification of the complete system, benchmarks and performance requirements
    2. the classification and characterization of audio and speech signals in terms of the optimal modeling of these signals, in particular as sums of sinusoids with exponentially evolving envelopes
    3. the selection of one or more algorithms appropriate for the segmentation of the audio and speech signals to be modeled, and one or more algorithms for the subsequent determination of the model order and the model parameters
    4. the analysis of the complexity of the envisioned algorithms. This includes the analysis of the numerical stability of the algorithms, as well as their sensitivity in the presence of noise. It also includes modifications and simplifications of the algorithms, if implementation constraints so require
    5. the design of a coder. This includes the design of the quantization and reconstruction algorithms, the lossless encoding and decoding algorithms and the bit-rate control algorithm
    6. the development and inclusion of psychoacoustic models for quality aspects of above-threshold noise, which will be used in the coding scheme to guide bit distribution
    7. the incorporation of the sinusoidal coder in a hybrid coding structure.

    This project brings together highly qualified partners with expertise in the domains of high-resolution signal modeling and signal parameter estimation (Delft University of Technology), speech and audio coding (Philips Research Eindhoven) and speech processing (Royal Institute of Technology, Stockholm). Dr. P. Kroon (AT&T) and prof. B. Kleijn (KTH Stockholm) received their Ph.D. degrees at DUT in the area of speech coding. The speech coding algorithm used in GSM was developed at the Circuits and Systems Section at DUT (dr. Kroon, with support of STW and Philips). A standardized coding algorithm used in the US for mobile communication was developed by prof. Kleijn. Because of the strong interaction between model complexity, removal of irrelevancy and encoding, the work at Delft University, KTH and Philips Research has to be done in close collaboration.

    Users

    One company and two other (foreign) universities are involved in this project.

    Project leader

    Dr.ir. E.F.A. Deprettere
    Technische Universiteit Delft
    Fac. Informatietechnologie en Systemen
    Elektrotechniek
    Postbus 5031
    2600 GA Delft

    Project status

    Started: 30-06-1998
    End date: 16-01-2004

    Keywords

    Audio; Signal processing; Speech; Speech coding.

  • Original article: https://www.cnblogs.com/gaozehua/p/2443549.html