MATLAB code for a large number of Gaussian Mixture Models

Submitted by 泪湿孤枕 on 2019-12-04 15:17:49

Do you want to use it for speech processing? If so, the best option is the MSR Identity Toolbox, written by Dr. Omid Sadjadi as a Microsoft researcher. He guided me on how to use it (you also need Voicebox). Here is an example code snippet that you may use to extract MFCCs from speech stored in WAV files (assuming a 16 kHz sample rate):

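addpath('path_to_voicebox');
addpath('path_to_identity_toolbox');
[s, fs] = audioread(speechFilename);  % wavread was removed in R2015b; use it only on older releases
fL = 100.0/fs;        % lower filterbank edge (100 Hz) as a fraction of fs
fH = 8000.0/fs;       % upper filterbank edge (8 kHz) as a fraction of fs
fRate = 0.010 * fs;   % frame shift: 10 ms, in samples
fSize = 0.025 * fs;   % frame length: 25 ms, in samples
nChan = 27;           % number of mel filterbank channels
nCeps = 12;           % cepstral coefficients (the '0' flag below adds c0)
premcoef = 0.97;      % pre-emphasis coefficient
s = rm_dc_n_dither(s, fs);            % Identity Toolbox: DC removal and dithering, per its name
s = filter([1 -premcoef], 1, s);      % pre-emphasis filter
mfc = melcepst(s, fs, '0dD', nCeps, nChan, fSize, fRate, fL, fH);  % 13 static + deltas + delta-deltas
mfc = cmvn(mfc', true);               % Identity Toolbox: mean and variance normalization
% 100000 is a 10 ms frame period in HTK units of 100 ns; 9 is the HTK "USER" type code.
% Check your writehtk version: if it expects the period in seconds, pass 0.010 instead.
writehtk(featureFilename, mfc', 100000, 9);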

The code above extracts 39-dimensional MFCCs (13 static coefficients including c0, plus their deltas and delta-deltas) from the pre-emphasized speech signal, applies mean and variance normalization to the features, and finally writes them to disk in HTK format. This is only an example, and you may modify it to suit your needs and resources. The two functions "rm_dc_n_dither" and "cmvn" come from the Identity Toolbox. Both Voicebox and the Identity Toolbox must be on the MATLAB path (see the first two lines of the code above). For voice activity detection (VAD), you can use the "vadsohn" function from Voicebox, which outputs frame-level decisions (0 for silence and 1 for speech) at a 10 ms frame skip rate.
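To make the normalization step concrete, here is a minimal sketch of per-dimension mean and variance normalization in plain MATLAB. It is not the Identity Toolbox's "cmvn" implementation, which may differ in details. The random matrix is a hypothetical stand-in for real MFCCs, and the variable names are my own.

```matlab
% Feature dimension implied by the '0dD' flags with 12 cepstral coefficients.
nCeps   = 12;
nDims   = (nCeps + 1) * 3;            % (c0 + 12 cepstra) x (static, delta, delta-delta) = 39
nFrames = 500;                        % hypothetical number of frames

% Hypothetical stand-in for real MFCCs: one column per frame.
feats = 5 + 3 * randn(nDims, nFrames);

% Per-dimension statistics computed over all frames of the utterance.
mu    = mean(feats, 2);
sigma = std(feats, 0, 2);
sigma(sigma == 0) = 1;                % guard against constant dimensions

% Subtract the mean and divide by the standard deviation, row by row.
normFeats = bsxfun(@rdivide, bsxfun(@minus, feats, mu), sigma);
```

After this step every feature dimension has zero mean and unit variance over the utterance, which reduces the effect of channel and recording differences between files.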

After you extract the features from your database, you may follow the procedure in gmm_ubm_demo, provided with the Identity Toolbox, to train a UBM (universal background model).
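For readers who want to see what training such a model involves, the following is a minimal sketch of EM for a Gaussian mixture model in plain MATLAB. It is not the toolbox's training code. It assumes diagonal covariances (a common choice for UBMs), and the two-cluster toy data, the variable names, and the variance floor are all my own assumptions, standing in for pooled MFCC frames.

```matlab
% Toy data standing in for pooled feature frames: two 2-D clusters (hypothetical).
nPerCluster = 300;
X = [randn(2, nPerCluster) - 3, randn(2, nPerCluster) + 3];   % dims x frames
[nDims, nFrames] = size(X);

nMix  = 2;                                 % number of Gaussian components
nIter = 20;                                % number of EM iterations

% Initialise: means from evenly spaced frames, global variances, equal weights.
idx = round(linspace(1, nFrames, nMix));
mu  = X(:, idx);                           % dims x nMix
sg  = repmat(var(X, 0, 2), 1, nMix);       % dims x nMix diagonal variances
w   = ones(1, nMix) / nMix;                % mixture weights
logLik = zeros(1, nIter);

for it = 1:nIter
    % E-step: log(w_k * N(x | mu_k, diag(sg_k))) for every component and frame.
    logP = zeros(nMix, nFrames);
    for k = 1:nMix
        d = bsxfun(@minus, X, mu(:, k));
        logP(k, :) = log(w(k)) - 0.5 * (nDims * log(2 * pi) + sum(log(sg(:, k))) ...
                     + sum(bsxfun(@rdivide, d .^ 2, sg(:, k)), 1));
    end
    m      = max(logP, [], 1);                                % log-sum-exp for stability
    logSum = m + log(sum(exp(bsxfun(@minus, logP, m)), 1));
    post   = exp(bsxfun(@minus, logP, logSum));               % responsibilities, nMix x nFrames
    logLik(it) = sum(logSum);                                 % log-likelihood before this update

    % M-step: re-estimate weights, means and diagonal variances.
    Nk = sum(post, 2)';                                       % soft counts, 1 x nMix
    w  = Nk / nFrames;
    mu = bsxfun(@rdivide, X * post', Nk);
    sg = bsxfun(@rdivide, (X .^ 2) * post', Nk) - mu .^ 2;
    sg = max(sg, 1e-3);                                       % variance floor
end
```

On real data you would replace X with the stacked feature frames of many speakers and use far more components. The stored log-likelihood values should not decrease from one iteration to the next, which is a quick sanity check on any EM implementation.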

If you would like to replicate our demo results on TIMIT, you can download the list files (not included in the toolbox) from the address below:

http://www.utdallas.edu/~sadjadi/lists.tar.gz

It is very easy, and you can do it on an ordinary PC.

Regards, Mohammad Karaminejad (karaminejad@gmail.com)
