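This post is updated on an ongoing basis.
Related post on audio processing in Python: librosa and its applications in audio processing.
Introduction
aubio is a Python library for annotating music and sounds; its core is written in C. It can read any media file, extract features, and detect events. (aubio is a collection of tools for music and audio analysis.)
It works with both Python 2 and Python 3; the code in this post targets Python 3. Its main features: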
read audio from any media file, including videos and remote streams
high quality phase vocoder, spectral filterbanks, and linear filters
Mel-Frequency Cepstrum Coefficients and standard spectral descriptors
detection of note attacks (onset)
pitch tracking (fundamental frequency estimation)
beat detection and tempo tracking
Reading and writing audio
class aubio.source(path, samplerate=0, hop_size=512, channels=0)
import aubio
src = aubio.source('test01.wav')
src.uri, src.samplerate, src.channels, src.duration  # ('test01.wav', 16000, 2, 86833); duration is counted in frames
snk = aubio.sink('out.wav')  # create a new sink at 44100 Hz, mono
snk = aubio.sink('out.wav', samplerate=16000, channels=3)  # create a new sink at 16000 Hz with 3 channels
snk(aubio.fvec(100), 100)  # write 100 samples into it
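Pitch
Pitch is related to the fundamental frequency (F0) of a sound and reflects how high or low the sound is, i.e. its tone. Algorithms that estimate F0 are known as pitch detection algorithms (PDAs).
References for the underlying computation:
test-pitch C source
aubiopitch C source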
# Supported methods: yinfft, yin, yinfast, fcomb, mcomb, schmitt, specacf, default (yinfft).
class aubio.pitch(method="default", buf_size=1024, hop_size=512, samplerate=44100)
Among these, the default method yinfft is an improved variant of yin. The yin method is described in:
De Cheveigné, A., Kawahara, H. (2002) "YIN, a fundamental frequency estimator for speech and music", J. Acoust. Soc. Am. 111, 1917-1930.
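To make the idea behind yin more concrete, here is a simplified pure-Python sketch of its core steps: a squared difference function, its cumulative mean normalization, and a threshold search over lags. It is not aubio's implementation; parabolic interpolation and other refinements are left out, and the function name `yin_f0`, the threshold of 0.1, and the 200 Hz test tone are assumptions made for this example.

```python
import math


def yin_f0(x, samplerate, tau_max, threshold=0.1):
    """Estimate the F0 (in Hz) of frame x with a simplified YIN."""
    window = len(x) - tau_max  # number of samples compared at every lag
    # Step 1: squared difference function d(tau).
    diff = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff[tau] = sum((x[j] - x[j + tau]) ** 2 for j in range(window))
    # Step 2: cumulative mean normalized difference d'(tau), with d'(0) = 1.
    cmnd = [1.0] * (tau_max + 1)
    running = 0.0
    for tau in range(1, tau_max + 1):
        running += diff[tau]
        cmnd[tau] = diff[tau] * tau / running if running > 0 else 1.0
    # Step 3: take the first dip below the threshold and follow it
    # down to its local minimum.
    for tau in range(2, tau_max):
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return samplerate / tau
    # Fallback: no dip below the threshold, use the global minimum.
    best = min(range(1, tau_max + 1), key=cmnd.__getitem__)
    return samplerate / best


# Test tone (assumption): 400 samples of a 200 Hz sine sampled at 8000 Hz.
samplerate = 8000
frame = [math.sin(2 * math.pi * 200 * n / samplerate) for n in range(400)]
estimate = yin_f0(frame, samplerate, tau_max=200)
print(estimate)
```

Because the sketch only considers integer lags, its resolution is limited to `samplerate / tau` for whole values of `tau`; the test tone was chosen so that its period is a whole number of samples.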
Yinfft algorithm was derived from the YIN algorithm. In this implementation, a Fourier transform is used to compute a tapered square difference function, which allows spectral weighting. Because the difference function is tapered, the selection of the period is simplified.
For details of the method, see:
Paul Brossier, Automatic annotation of musical audio for interactive systems, Chapter 3, Pitch Analysis, PhD thesis, Centre for Digital music, Queen Mary University of London, London, UK, 2006.
For more information on the pitch detection methods, see the pitch.h File Reference.
MFCC
mfcc creates a callable which takes a cvec as input. cvec is a container holding spectral data.
class aubio.cvec(size)
class aubio.mfcc(buf_size=1024, n_filters=40, n_coeffs=13, samplerate=44100)
If n_filters = 40, the filterbank will be initialized with filterbank.set_mel_coeffs_slaney(). Otherwise, if n_filters is greater than 0, it will be initialized with filterbank.set_mel_coeffs() using fmin = 0, fmax = samplerate.
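For readers unfamiliar with the mel scale, the sketch below shows how band edges for a mel filterbank can be placed at equal steps on that scale. It uses the widely quoted HTK-style formula, which is an assumption here; aubio's own `set_mel_coeffs` and `set_mel_coeffs_slaney` routines may use a different variant. The helper names and the 0 to 8000 Hz range are illustrative only.

```python
import math


def hz_to_mel(f_hz):
    """HTK-style conversion from Hz to mel (assumed variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_filters, fmin, fmax):
    """Return n_filters + 2 frequencies in Hz, equally spaced in mel."""
    lo, hi = hz_to_mel(fmin), hz_to_mel(fmax)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_filters + 2)]


# Triangular filter k would span edges[k] .. edges[k + 2], peaking at edges[k + 1].
edges = mel_band_edges(n_filters=40, fmin=0.0, fmax=8000.0)
print(len(edges), edges[1], edges[-2])
```

Equal steps in mel translate into narrow bands at low frequencies and wide bands at high frequencies, which is the property the MFCC filterbank relies on.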
buf_size = 2048; n_filters = 128; n_coeffs = 13; samplerate = 44100
mf = aubio.mfcc(buf_size, n_filters, n_coeffs, samplerate)
fftgrain = aubio.cvec(buf_size)
mf(fftgrain).shape #(13,)