Compare the similarity of 2 sounds using Python Librosa

Question

I have about 30 sound clips, each a preset from a synthesizer. I want to compare these sounds to find out which ones are similar, and then sort them so that each sound sits next to 2 similar sounds in a list. Frequency is not the only thing I want to look at. I would rather have 2 saw waves that are a tone apart be considered similar than a saw wave and a sine wave that are the same note.

For example, these sounds would be considered similar.

Using librosa, I have been able to apply a short-time Fourier transform (STFT) to each of the sounds and create a spectrogram from each one. Just by looking at the spectrograms, I can guess which sounds might be similar and then confirm that guess by listening to them, for example sample_12 and sample_20 in the picture below.

In the sorted list, samples 12 and 20 would be close together.

But I want to automate this process.

From what I've read about librosa, it looks like I can calculate a few features, such as the RMS energy, MFCCs, and spectral centroids, to determine similarity. But I don't know how to compare the values that I calculate.

rms = [librosa.feature.rms(S=s) for s in S]
centroids = [librosa.feature.spectral_centroid(y=y, sr=sr) for y in midiSamples]
mfccs = [librosa.feature.mfcc(y=y, sr=sr) for y in midiSamples]

At a very low level of abstraction, I don't even know how to compare the original STFT values (2D arrays) other than turning them into spectrograms and manually deciding which ones look similar. How can I write code to sort my STFTs, RMS values, MFCCs, and so on, so that similar sounds end up beside each other in the sort?

I also have a Jupyter notebook where I run through my program and explain where I don't know what to do with the data.

Answer 1:

Sam, I think you can compare the two pictures with machine learning, or compare them directly as NumPy arrays of data.

This is just an idea for a solution (not a full answer): if you can convert the two spectrograms to flat, equal-sized arrays with numpy.ndarray.flatten, you can subtract one from the other and reduce the differences to a single number:

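import numpy as np

array1 = np.array([1.1, 2.2, 3.3])
array2 = np.array([1, 2, 3])
diffs = array1 - array2  # approximately array([0.1, 0.2, 0.3])
# Sum absolute differences so positive and negative errors do not cancel; 0 means identical
similarity_coefficient = np.sum(np.abs(diffs))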
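To take the same idea a step further, here is a rough sketch (not a tested solution for your clips) of how the comparison and the sort could be wired together. Flattened spectrograms only line up when every clip has the same number of frames, so this sketch assumes you first collapse each feature matrix, such as the output of librosa.feature.mfcc, to one fixed-length vector by averaging over time. The helper names summarize, pairwise_distances and greedy_order are made up for illustration, and the random matrices are toy stand-ins for real MFCCs.

```python
import numpy as np

def summarize(feature):
    """Collapse a (n_coeffs, n_frames) matrix to a fixed-length vector (mean over time)."""
    feature = np.asarray(feature, dtype=float)
    return feature.mean(axis=1)

def pairwise_distances(vectors):
    """Euclidean distance between every pair of rows, after z-scoring each column."""
    X = np.asarray(vectors, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    X = (X - X.mean(axis=0)) / std
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def greedy_order(dist, start=0):
    """Visit every item once, always stepping to the nearest unvisited item."""
    order = [start]
    remaining = set(range(dist.shape[0])) - {start}
    while remaining:
        last = order[-1]
        nxt = min(remaining, key=lambda j: dist[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Toy stand-ins for MFCC matrices with 4 coefficients and different lengths.
# Real input is assumed to come from librosa.feature.mfcc(y=y, sr=sr),
# which returns an array of shape (n_mfcc, n_frames).
rng = np.random.default_rng(0)
base_a = np.array([10.0, -5.0, 2.0, 0.0])[:, None]
base_b = np.array([-8.0, 6.0, -3.0, 4.0])[:, None]
clips = [
    base_a + 0.1 * rng.standard_normal((4, 50)),
    base_b + 0.1 * rng.standard_normal((4, 60)),
    base_a + 0.1 * rng.standard_normal((4, 40)),
    base_b + 0.1 * rng.standard_normal((4, 55)),
]

vectors = np.stack([summarize(c) for c in clips])
dist = pairwise_distances(vectors)
order = greedy_order(dist)
print(order)  # indices of the clips, arranged so neighbours tend to be similar
```

The greedy walk is only a heuristic: it puts each clip next to its nearest unvisited neighbour, so the result depends on the starting clip and is not guaranteed to be the best possible ordering. Averaging over time also throws away how the sound evolves, which may or may not matter for your presets.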

Source: https://stackoverflow.com/questions/64580500/compare-the-similarity-of-2-sounds-using-python-librosa

Tags

python

audio

librosa