Question
I came across this nice tutorial, https://github.com/manashmndl/DeadSimpleSpeechRecognizer, where the model is trained on samples separated into folders and all the MFCCs are calculated at once.
I am trying to achieve something similar, but in a different way.
According to https://librosa.github.io/librosa/generated/librosa.feature.mfcc.html, librosa can compute the MFCC of any audio signal as follows:
import librosa
y, sr = librosa.load('test.wav')
mymfcc = librosa.feature.mfcc(y=y, sr=sr)
However, I want to calculate the MFCC of the audio part by part, based on timestamps read from a file.
The file contains timestamps and labels as follows:
0.0 2.0 sound1
2.0 4.0 sound2
4.0 7.0 silence
7.0 11.0 sound1
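For illustration, lines in this format could be parsed into (start, end, label) tuples along the lines of the sketch below. The helper name parse_annotations is hypothetical, and the sketch assumes each line holds a start time, an end time and a label separated by spaces or tabs.

```python
def parse_annotations(text):
    """Parse 'start end label' lines into (start, end, label) tuples."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on any whitespace (spaces or tabs) at most twice, so the
        # label keeps whatever text remains, including multi-word labels.
        start, end, label = line.split(None, 2)
        segments.append((float(start), float(end), label))
    return segments


sample = """0.0 2.0 sound1
2.0 4.0 sound2
4.0 7.0 silence
7.0 11.0 sound1"""

for segment in parse_annotations(sample):
    print(segment)
```

Splitting on arbitrary whitespace is a deliberate choice here, since it is not clear whether the real annotation file uses tabs or spaces.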
I want to calculate the MFCC of each time range. My hope is to arrive at labelled training data in which each MFCC is paired with its corresponding label:
mfcc_1, sound1
mfcc_2, sound2
and so on.
How do I achieve this?
I looked at "generate mfcc's for audio segments based on annotated file". That question is similar, but I found both the question and the answer somewhat hard to follow because I'm very new to this field.
TIA
UPDATE: my code:
import librosa
from subprocess import call

def ListDir():
    call(["ls", "-l"])

def main():
    ListDir()
    readfile_return_segmentsmfcc()

my_segments = []

# Read the annotation file and compute the MFCC of each labelled segment
def readfile_return_segmentsmfcc():
    pat = '000.mp3'
    y, sr = librosa.load(pat)
    print("\n sample rate :")
    print(sr)
    # Open in text mode so each line is a str, not bytes
    with open("000.txt", "r") as f:
        for line in f:
            start_time, end_time, label = line.split('\t')
            start_time = float(start_time)
            end_time = float(end_time)
            label = label.strip()
            my_segments.append((start_time, end_time, label))
            # Pass sr so the conversion matches the loaded audio
            start_index = librosa.time_to_samples(start_time, sr=sr)
            end_index = librosa.time_to_samples(end_time, sr=sr)
            required_slice = y[start_index:end_index]
            required_mfcc = librosa.feature.mfcc(y=required_slice, sr=sr)
            print("Mfcc size is {} ".format(required_mfcc.shape))
            print(start_time, end_time, label)
    return my_segments

main()
Answer 1:
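Read the start and end times:
start = 2.0
end = 4.0
Convert them to sample indices using librosa.time_to_samples (it assumes sr=22050 by default, which matches the default of librosa.load; pass sr=sr if the audio was loaded at a different rate):
start_index = librosa.time_to_samples(start)
end_index = librosa.time_to_samples(end)
Use Python's [:] slice operator to get the relevant slice of the data:
slice = y[int(start_index):int(end_index)]
Compute the MFCC on slice, and repeat for each remaining segment.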
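Put together, the steps above might look like the following minimal sketch. The helper names (segment_bounds, slice_segments, labelled_mfccs) are hypothetical and not part of librosa. The time-to-index conversion is written as plain arithmetic standing in for librosa.time_to_samples, and segments is assumed to be a list of (start, end, label) tuples already read from the annotation file.

```python
def segment_bounds(start, end, sr):
    """Convert a (start, end) pair in seconds to integer sample indices."""
    # Plain-arithmetic stand-in for librosa.time_to_samples(t, sr=sr).
    return int(start * sr), int(end * sr)


def slice_segments(y, sr, segments):
    """Yield (samples, label) for each (start, end, label) annotation."""
    for start, end, label in segments:
        lo, hi = segment_bounds(start, end, sr)
        yield y[lo:hi], label


def labelled_mfccs(path, segments):
    """Return a list of (mfcc, label) pairs for one audio file."""
    import librosa  # imported lazily; only this function needs it

    y, sr = librosa.load(path)  # librosa resamples to 22050 Hz by default
    return [
        (librosa.feature.mfcc(y=chunk, sr=sr), label)
        for chunk, label in slice_segments(y, sr, segments)
    ]
```

Calling labelled_mfccs('000.mp3', segments) would then give one (mfcc, label) pair per annotated range. Since the segments differ in duration, the MFCC arrays will generally have different numbers of frames.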
Source: https://stackoverflow.com/questions/48513824/compute-mfcc-for-varying-time-intervals-based-on-time-stamps