Question
I am trying to implement Mel spectrogram feature extraction:
import librosa

n_mels = 128

# Extract the Mel spectrogram for every file
def extract_features(file_name):
    try:
        audio, sample_rate = librosa.load(file_name, res_type='kaiser_fast')
        mely = librosa.feature.melspectrogram(y=audio, sr=sample_rate, n_mels=n_mels)
    except Exception as e:
        # Report the file that failed (the original printed the undefined name `file`)
        print("Error encountered while parsing file: ", file_name)
        return None
    # Transpose so the result has shape (n_frames, n_mels)
    return mely.T
It appears that I am implementing this feature extraction incorrectly: when I check the arrays, x_test has shape (353,) and x_train has shape (1408,). The data is not being parsed correctly, and an error is raised.
ERROR BEGIN
v = format % tuple(row) + newline
TypeError: only size-1 arrays can be converted to Python scalars
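One likely cause, assuming the audio clips differ in length: each clip then produces a different number of frames, so the per-file arrays cannot form one rectangular array and end up as a 1-D array of objects. The sketch below uses random arrays as stand-ins for the melspectrogram output (the frame counts 44, 87 and 173 are made up) to reproduce an (N,) shape like the one reported.

```python
import numpy as np

n_mels = 128
# Hypothetical frame counts for three clips of different duration.
frame_counts = [44, 87, 173]

rng = np.random.default_rng(0)
# Stand-ins for mely.T: each array has shape (n_frames, n_mels).
features = [rng.random((n_frames, n_mels)) for n_frames in frame_counts]

# Arrays with different shapes cannot be stacked into a rectangular array,
# so they can only be held in a 1-D array whose elements are arrays.
ragged = np.empty(len(features), dtype=object)
for i, feat in enumerate(features):
    ragged[i] = feat

print(ragged.shape)     # (3,)
print(ragged[0].shape)  # (44, 128)
```

Code that expects one number per cell would then receive a whole array per cell, which is consistent with the "only size-1 arrays" message.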
When I modify extract_features to:
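import numpy as np

def extract_features(file_name):
    try:
        audio, sample_rate = librosa.load(file_name, res_type='kaiser_fast')
        mely = librosa.feature.melspectrogram(y=audio, sr=sample_rate, n_mels=n_mels)
        # Average over the time axis: one value per Mel band, shape (n_mels,)
        melyscaled = np.mean(mely.T, axis=0)
    except Exception as e:
        # Report the file that failed (the original printed the undefined name `file`)
        print("Error encountered while parsing file: ", file_name)
        return None
    return melyscaled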
The program works.
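A minimal sketch of why the averaged version behaves differently, again using random stand-ins for mely.T rather than real audio (the frame counts are made up): taking the mean over axis 0 collapses the time axis, so every clip yields a vector of length n_mels regardless of its duration.

```python
import numpy as np

n_mels = 128
rng = np.random.default_rng(0)
# Stand-ins for mely.T from two clips of different length: (n_frames, n_mels).
clip_a = rng.random((44, n_mels))
clip_b = rng.random((173, n_mels))

# Averaging over axis 0 (time) leaves one value per Mel band.
mean_a = np.mean(clip_a, axis=0)
mean_b = np.mean(clip_b, axis=0)
print(mean_a.shape, mean_b.shape)  # (128,) (128,)

# Equal-length vectors stack into a regular 2-D array.
x = np.stack([mean_a, mean_b])
print(x.shape)  # (2, 128)
```

The trade-off is that the averaged vector no longer contains any information about how the spectrum changes over time.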
How can I get the correct dimensions from the first definition without doing any scaling? What does np.mean do to the extracted features?
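One common way to keep the time axis while still getting a regular array is to cut or zero-pad every spectrogram to a fixed number of frames. The sketch below is only an illustration under that assumption: pad_or_truncate and max_frames are hypothetical names, and the value 100 is arbitrary.

```python
import numpy as np

def pad_or_truncate(feat, max_frames):
    """Return feat (n_frames, n_mels) cut or zero-padded to max_frames rows."""
    n_frames = feat.shape[0]
    if n_frames >= max_frames:
        return feat[:max_frames]
    pad_rows = max_frames - n_frames
    # Pad only at the end of the time axis; leave the Mel axis untouched.
    return np.pad(feat, ((0, pad_rows), (0, 0)), mode="constant")

n_mels = 128
max_frames = 100  # hypothetical target length, in frames
rng = np.random.default_rng(0)
# Stand-ins for mely.T from a short clip and a long clip.
clips = [rng.random((n, n_mels)) for n in (44, 173)]

x = np.stack([pad_or_truncate(c, max_frames) for c in clips])
print(x.shape)  # (2, 100, 128)
```

Every sample then has the same (max_frames, n_mels) shape, so the stacked array is three-dimensional rather than an array of objects.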
Also, how do you determine the correct value for n_mels?
Source: https://stackoverflow.com/questions/65328571/how-do-you-determing-the-correct-dimension-of-mel-spectrogram-feature-extraction