1. Install the library
pip install python_speech_features
2. Code:
#!/usr/bin/env python
from python_speech_features import logfbank
from python_speech_features import mfcc
from python_speech_features import delta
import scipy.io.wavfile as wav
import matplotlib.pyplot as plt
import numpy as np
(rate, sig) = wav.read("example.wav")
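# log Mel filterbank energies
fbank_feat = logfbank(sig, rate)
# MFCC features (one row per frame)
mfcc_feat = mfcc(sig, rate)
# delta (first-order) coefficients, using N=2 frames on each side
d_mfcc_feat = delta(mfcc_feat, 2)
# delta-delta (second-order) coefficients
dd_mfcc_feat = delta(d_mfcc_feat, 2)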
# plot the first 4096 samples (or fewer if the signal is shorter)
n = min(4096, len(sig))
x = np.arange(1, n + 1)
plt.figure('original signal')
plt.plot(x, sig[:n])
plt.show()
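# mfcc_feat has one row per frame and one column per cepstral coefficient
# first coefficient across all frames (log frame energy, since appendEnergy=True by default)
mfcc_column1 = mfcc_feat[:, 0]
# all coefficients of the first frame
mfcc_row1 = mfcc_feat[0]
# delta and delta-delta coefficients of the first frame
d_mfcc_row1 = d_mfcc_feat[0]
dd_mfcc_row1 = dd_mfcc_feat[0]
plt.figure()
plt.plot(mfcc_column1)
plt.show()
plt.figure()
plt.plot(mfcc_row1)
plt.show()
plt.figure()
plt.plot(d_mfcc_row1)
plt.show()
plt.figure()
plt.plot(dd_mfcc_row1)
plt.show()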
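The plots above index mfcc_feat by column and by row, so it helps to know how many rows (frames) to expect for a given recording. The following is a rough sketch only: it assumes the frame length and step are rounded to whole samples and that a trailing partial frame is zero-padded and kept. The helper name estimate_num_frames is hypothetical and is not part of python_speech_features.

```python
import math


def estimate_num_frames(num_samples, samplerate=16000, winlen=0.025, winstep=0.01):
    """Estimate how many analysis frames a signal is split into."""
    # Convert window length and step from seconds to whole samples.
    frame_len = int(winlen * samplerate + 0.5)
    frame_step = int(winstep * samplerate + 0.5)
    # A signal shorter than one window still yields a single (padded) frame.
    if num_samples <= frame_len:
        return 1
    # One full frame, plus one frame per step needed to cover the rest.
    return 1 + math.ceil((num_samples - frame_len) / frame_step)


# One second of audio at 16 kHz with the default 25 ms window and 10 ms step.
print(estimate_num_frames(16000))  # 99
```

Under these assumptions, mfcc_feat for a one-second 16 kHz file would have about 99 rows and 13 columns, so mfcc_feat[:, 0] has one value per frame and mfcc_feat[0] has 13 values.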
3. More settings
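Four functions from the library are involved in computing the filterbank energies, MFCC, delta and delta-delta coefficients: fbank, logfbank, mfcc and delta.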
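(1) Function fbank
This function computes the energies of the signal after it has passed through the Mel filterbank.
It is defined as follows:
def fbank(signal,samplerate=16000,winlen=0.025,winstep=0.01,
          nfilt=26,nfft=512,lowfreq=0,highfreq=None,preemph=0.97)
It takes 9 parameters; their default values are shown in the definition above. In practice, set them according to your actual input (for example, pass the real sample rate of the audio file).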
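To make the roles of nfilt, lowfreq and highfreq more concrete, here is a minimal sketch of how the band edges of a Mel filterbank can be laid out. It assumes the common Mel formula mel = 2595 * log10(1 + f / 700) and an upper limit of samplerate / 2 (8000 Hz for 16 kHz audio). The function names are hypothetical and this is not the library's own code.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the Mel scale (assumed formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Convert a Mel value back to Hz (inverse of hz_to_mel)."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(nfilt=26, lowfreq=0.0, highfreq=8000.0):
    """Return nfilt + 2 frequencies (Hz) spaced evenly on the Mel scale."""
    low_mel = hz_to_mel(lowfreq)
    high_mel = hz_to_mel(highfreq)
    step = (high_mel - low_mel) / (nfilt + 1)
    # Filter k spans edges[k] .. edges[k + 2], centred on edges[k + 1].
    return [mel_to_hz(low_mel + i * step) for i in range(nfilt + 2)]


edges = mel_band_edges()
print(len(edges))  # 28
print([round(f) for f in edges[:4]])
```

The points are evenly spaced in Mel but get wider apart in Hz as frequency rises, which is why raising highfreq or lowering nfilt makes each filter cover a broader band.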
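(2) Function mfcc
This function computes the MFCC features.
It is defined as follows:
def mfcc(signal,samplerate=16000,winlen=0.025,winstep=0.01,numcep=13,
         nfilt=26,nfft=512,lowfreq=0,highfreq=None,preemph=0.97,
         ceplifter=22,appendEnergy=True)
It takes 12 parameters; their default values are shown in the definition above. In practice, set them according to your actual input.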
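One parameter that often needs adjusting is nfft. Both definitions above default to nfft=512, which covers a 25 ms window at 16 kHz (400 samples) but not at 44.1 kHz (about 1103 samples). The sketch below picks the smallest power of two that is at least as long as the analysis window. suggest_nfft is a hypothetical helper written for illustration, and the rounding of the window length is an assumption.

```python
def suggest_nfft(samplerate, winlen=0.025):
    """Smallest power of two that is >= the window length in samples."""
    # Window length in whole samples (rounded half up; an assumption).
    frame_len = int(winlen * samplerate + 0.5)
    nfft = 1
    while nfft < frame_len:
        nfft *= 2
    return nfft


for rate in (8000, 16000, 44100):
    print(rate, suggest_nfft(rate))
# 8000 256
# 16000 512
# 44100 2048
```

The result could then be passed on, for example mfcc(sig, rate, nfft=suggest_nfft(rate)), so that frames are not truncated when the sample rate is higher than 16 kHz.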
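(3) Function delta
This function computes the velocity (delta) coefficients; applying it again to the result gives the acceleration (delta-delta) coefficients.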
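As a rough illustration of what a delta computation does, the sketch below uses the widely used regression formula d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2), repeating the first and last frames as padding. This is an assumption about the method, not a copy of the library's code. delta_sketch is a hypothetical name and the synthetic feat array stands in for real MFCC features.

```python
import numpy as np


def delta_sketch(feat, n=2):
    """Regression-based delta over a (num_frames, num_coeffs) array."""
    if n < 1:
        raise ValueError("n must be an integer >= 1")
    feat = np.asarray(feat, dtype=float)
    num_frames = feat.shape[0]
    # Repeat the edge frames so every frame has n neighbours on each side.
    padded = np.pad(feat, ((n, n), (0, 0)), mode="edge")
    denominator = 2 * sum(i * i for i in range(1, n + 1))
    out = np.zeros_like(feat)
    for i in range(1, n + 1):
        ahead = padded[n + i : n + i + num_frames]    # c_{t+i}
        behind = padded[n - i : n - i + num_frames]   # c_{t-i}
        out += i * (ahead - behind)
    return out / denominator


# Stand-in for mfcc_feat: 10 frames, 13 coefficients, rising by 1 per frame.
feat = np.tile(np.arange(10.0)[:, None], (1, 13))
d_feat = delta_sketch(feat, 2)     # velocity
dd_feat = delta_sketch(d_feat, 2)  # acceleration
stacked = np.hstack([feat, d_feat, dd_feat])
print(stacked.shape)  # (10, 39)
```

For this linear ramp the interior delta values are 1 (the slope per frame), and stacking the three matrices side by side gives the 39-dimensional feature vector per frame that is often used with 13 MFCCs.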
4. How MFCC, delta and delta-delta coefficients are computed
See:
https://blog.csdn.net/qq_23869697/article/details/79280182
Reference (GitHub): https://github.com/jameslyons/python_speech_features