mfcc | 易学教程

Error while importing scikits.talkbox

Question: I want to use scikits.talkbox, but I get the following error when importing it:

Traceback (most recent call last):
  File "/home/seref/Desktop/machine learning codes/MFCC/main.py", line 3, in <module>
    from scikits.talkbox.features.mfcc import mfcc
  File "/usr/local/lib/python3.5/dist-packages/scikits/talkbox/__init__.py", line 3, in <module>
    from tools import *
ImportError: No module named 'tools'

Code sample:

import scipy.io.wavfile
from scikits.talkbox.features.mfcc import mfcc
sample_rate, X = scipy.io
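
The traceback shows a package `__init__.py` running `from tools import *` under Python 3.5. Python 3 no longer resolves such implicit relative imports, so a sibling module inside the package is not found unless it is imported as `from .tools import *`. The sketch below reproduces that behaviour with two throwaway packages; the package and module names (`demo_implicit_pkg`, `demo_explicit_pkg`, `talkbox_demo_tools`) are hypothetical and have nothing to do with the real scikits.talkbox layout.

```python
import importlib
import os
import sys
import tempfile


def build_package(root, name, init_line):
    """Create a throwaway package with one helper module and a one-line __init__.py."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "talkbox_demo_tools.py"), "w") as f:
        f.write("VALUE = 42\n")
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(init_line + "\n")


def try_import(name):
    """Return (True, module) on success or (False, error message) on ImportError."""
    try:
        return True, importlib.import_module(name)
    except ImportError as exc:
        return False, str(exc)


root = tempfile.mkdtemp()
sys.path.insert(0, root)

# Python 2 style: implicit relative import of a sibling module.
build_package(root, "demo_implicit_pkg", "from talkbox_demo_tools import *")
# Python 3 style: explicit relative import of the same sibling module.
build_package(root, "demo_explicit_pkg", "from .talkbox_demo_tools import *")
importlib.invalidate_caches()

implicit_ok, implicit_result = try_import("demo_implicit_pkg")
explicit_ok, explicit_result = try_import("demo_explicit_pkg")

print("implicit import worked:", implicit_ok)
print("explicit import worked:", explicit_ok)
```

If this is indeed the cause, the installed package was written for Python 2, and either its imports need converting or a Python 2 interpreter is needed to run it.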

Matching two series of Mfcc coefficients

Question: I have extracted two series of MFCC coefficients from two audio files of roughly 30 seconds each, both containing the same speech content. The files were recorded at the same location from different sources. The goal is to estimate whether the audio contains the same conversation or a different one. So far I have tried computing the correlation of the two MFCC series, but the result is not very reasonable. Are there best practices for this scenario? Answer 1: I had the same problem and the

MFCC with Java Linear and Logarithmic Filters

Question: I am implementing the MFCC algorithm in Java. There is sample code for triangular filters and MFCC in Java. Here is the link: MFCC Java. However, I should follow this code written in Matlab: MFCC Matlab. My question is that the Matlab code talks about linear and logarithmic filters, but there is nothing about them in the Java code. I should measure the performance of logarithmic and linear filters, but I implemented that Java code and there is nothing about that. Also I didn't understand what
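
One possible reading of "linear and logarithmic filters" is a filter bank whose first triangular filters have center frequencies spaced evenly in Hz, while the remaining ones are spaced by a constant ratio. The sketch below computes center frequencies for such a layout. The function name and all default values are illustrative assumptions, not taken from the linked Matlab or Java code.

```python
def filter_center_frequencies(lowest_hz=133.3333, n_linear=13,
                              linear_step_hz=66.6667, n_log=27,
                              log_ratio=1.0711703):
    """Return center frequencies: n_linear evenly spaced, then n_log ratio-spaced.

    All defaults are illustrative assumptions for this sketch.
    """
    # Linear part: a constant step in Hz between neighbouring centers.
    centers = [lowest_hz + i * linear_step_hz for i in range(n_linear)]
    # Logarithmic part: each center is the previous one times a constant ratio.
    last_linear = centers[-1]
    centers += [last_linear * log_ratio ** (i + 1) for i in range(n_log)]
    return centers


centers = filter_center_frequencies()
print(len(centers), "filters")
print("first linear centers:", [round(c, 1) for c in centers[:3]])
print("first log centers:", [round(c, 1) for c in centers[13:16]])
```

Comparing the two filter types would then amount to building the bank with only linear spacing, only ratio spacing, or the mix, and measuring recognition performance for each.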

Having different results every run with GMM Classifier

Question: I'm currently doing a speech recognition and machine learning project. I have two classes now, and I create two GMM classifiers, one for each class, for the labels 'happy' and 'sad'. I want to train the GMM classifiers with MFCC vectors. (Previously it was one GMM per file.) But every time I run the script I get different results. What might be the cause of that with the same test and train samples? In the outputs below please note that I have 10 test
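
A frequent cause of run-to-run differences with GMMs is random initialization of the EM algorithm; whether that applies here cannot be confirmed from the excerpt. The sketch below is a minimal one-dimensional EM fit (not the asker's code) in which the initial means are drawn from a seeded random generator, so the same seed gives the same model. The function name `fit_gmm_1d` and the toy data are assumptions for illustration.

```python
import math
import random


def fit_gmm_1d(data, n_components, seed, n_iter=50):
    """Fit a 1-D Gaussian mixture with EM; initial means come from a seeded RNG."""
    rng = random.Random(seed)              # fixed seed -> fixed initialization
    means = rng.sample(data, n_components)
    variances = [1.0] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate parameters from the responsibilities.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) or 1e-300
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # floor keeps the variance positive
            weights[k] = nk / len(data)
    return means, variances, weights


data = [0.1, 0.2, 0.0, 5.0, 5.1, 4.9]
run_a = fit_gmm_1d(data, 2, seed=0)
run_b = fit_gmm_1d(data, 2, seed=0)
print("identical across runs:", run_a == run_b)
```

In scikit-learn's `GaussianMixture`, the `random_state` parameter plays the same role as `seed` here.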

Simple word detector using MFCC

Question: I am implementing software for speech recognition using Mel Frequency Cepstrum Coefficients. In particular, the system must recognize a single specified word. From the audio file I get the MFCCs in a matrix with 12 rows (the MFCCs) and as many columns as there are voice frames. I average each row, so I get a vector with only 12 elements (the i-th element is the average of the i-th MFCC over all frames). My question is how to train a classifier to detect the word? I have a training

How to Merge MFCCs

Question: I am working on extracting MFCC features from some audio files. The program I have currently extracts a series of MFCCs for each file and uses a buffer size of 1024. I saw the following in a paper: "The feature vectors extracted within a second of audio data are combined by computing the mean and the variance of each feature vector element (merging)." My current code uses TarsosDSP to extract the MFCCs, but I'm not sure how to split the data into "a second of audio data" in order
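
The merging step quoted from the paper can be sketched as follows: group consecutive MFCC frames so that each group covers roughly one second, then compute the mean and variance of every coefficient within the group. This is a plain Python sketch rather than TarsosDSP code. The function name, the assumption that frames do not overlap (hop size equal to the buffer size), and the sample rate used in the usage lines are all hypothetical.

```python
def merge_per_second(frames, sample_rate, hop_size):
    """Merge MFCC frames into one [means..., variances...] vector per second.

    frames: list of equal-length MFCC vectors, one per analysis frame.
    hop_size: samples between frame starts (assumed equal to the buffer size
    when frames do not overlap).
    """
    frames_per_second = max(1, round(sample_rate / hop_size))
    merged = []
    for start in range(0, len(frames), frames_per_second):
        chunk = frames[start:start + frames_per_second]  # last chunk may be shorter
        n = len(chunk)
        dims = len(chunk[0])
        means = [sum(f[d] for f in chunk) / n for d in range(dims)]
        variances = [sum((f[d] - means[d]) ** 2 for f in chunk) / n
                     for d in range(dims)]
        merged.append(means + variances)
    return merged


# Toy usage: 4 frames of 2 coefficients, 2 frames per "second".
toy_frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
toy_merged = merge_per_second(toy_frames, sample_rate=2048, hop_size=1024)
print(toy_merged)
```

With a buffer size of 1024 and, say, a 44100 Hz sample rate, each merged vector would cover 43 frames.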

compute mfcc for varying time intervals based on time stamps

Question: I came across this nice tutorial https://github.com/manashmndl/DeadSimpleSpeechRecognizer where the data is trained based on samples separated by folders and all MFCCs are calculated at once. I am trying to achieve something similar but in a different way. Based on https://librosa.github.io/librosa/generated/librosa.feature.mfcc.html, librosa can compute MFCCs for any audio, as follows:

import librosa
y, sr = librosa.load('test.wav')
mymfcc = librosa.feature.mfcc(y=y, sr=sr)

but I want
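
One way to work from time stamps, assuming they are given as (start, end) pairs in seconds, is to convert each pair to sample indices and slice the loaded signal before computing features. The helper below is a hypothetical sketch (the name `slice_by_timestamps` and the interval format are assumptions); it works on any sliceable sequence, including the array returned by `librosa.load`.

```python
def slice_by_timestamps(samples, sample_rate, intervals):
    """Cut a signal into segments given (start_seconds, end_seconds) pairs."""
    segments = []
    for start_s, end_s in intervals:
        # Convert seconds to sample indices and clamp to the signal bounds.
        start = max(0, int(round(start_s * sample_rate)))
        end = min(len(samples), int(round(end_s * sample_rate)))
        segments.append(samples[start:end])
    return segments


# Toy usage: a 10-sample "signal" at 2 samples per second.
toy_signal = list(range(10))
toy_segments = slice_by_timestamps(toy_signal, 2, [(0.0, 1.0), (1.5, 4.0)])
print(toy_segments)
```

Each segment could then be passed on its own to `librosa.feature.mfcc(y=segment, sr=sr)`, as in the snippet from the question.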

Applying neural network to MFCCs for variable-length speech segments

Question: I'm currently trying to create and train a neural network to perform simple speech classification using MFCCs. At the moment, I'm using 26 coefficients for each sample and a total of 5 different classes: five different words with varying numbers of syllables. While each sample is 2 seconds long, I am unsure how to handle cases where the user can pronounce words either very slowly or very quickly. E.g., the word 'television' spoken within 1 second yields different coefficients than
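
A simple, commonly used way to feed variable-length segments to a fixed-size network input is to pad or truncate every MFCC sequence to the same number of frames. It does not by itself compensate for speaking rate, for which time warping or recurrent models are alternatives. The sketch below shows only the padding step; the function name and the zero pad value are assumptions.

```python
def fix_length(frames, target_frames, pad_value=0.0):
    """Pad with constant frames or truncate so the result has target_frames frames."""
    dims = len(frames[0]) if frames else 0
    # Keep at most target_frames frames (copying each so the input is untouched).
    trimmed = [list(f) for f in frames[:target_frames]]
    # Append constant-valued frames until the target length is reached.
    padding = [[pad_value] * dims for _ in range(target_frames - len(trimmed))]
    return trimmed + padding


short_word = [[1.0, 2.0]]
long_word = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(fix_length(short_word, 3))
print(fix_length(long_word, 2))
```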

How to perform DTW on an array of MFCC coefficients?

Question: Currently I'm working on a speech recognition project in MATLAB. I've taken two voice signals and extracted their MFCC coefficients. As far as I know, I should now calculate the Euclidean distance between the two and then apply the DTW algorithm. That's why I calculated the distance between the two and got an array of distances. So my question is how to implement DTW on the resulting array? Here's my MATLAB code:

clear all; close all; clc;
% Define variables
Tw = 25; % analysis
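
The usual formulation of DTW uses the Euclidean distance between individual frames as the local cost and accumulates it with dynamic programming, rather than being applied to a precomputed array afterwards. The sketch below is written in Python rather than MATLAB and is not the asker's code; it assumes each signal is a list of MFCC vectors of equal dimension, and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two MFCC frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of MFCC frames."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(seq_a[i - 1], seq_b[j - 1])
            # Allowed steps: advance in seq_a, advance in seq_b, or both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


fast = [[0.0], [1.0], [2.0]]
slow = [[0.0], [1.0], [1.0], [2.0]]   # same content, one frame repeated
print(dtw_distance(fast, slow))
print(dtw_distance([[0.0]], [[3.0]]))
```

A smaller accumulated cost indicates that the two sequences are more similar after time alignment.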