Question
I have about 30 sound clips, each of which is a preset from a synthesizer. I want to compare these sounds to find out which ones are similar, and then sort them so that each sound sits next to 2 similar sounds in a list. Frequency is not the only thing I want to look at: I would rather have 2 saw waves that are a tone apart be considered similar than a saw wave and a sine wave that are the same note.
These sounds would be considered similar, for example.
Using librosa, I have been able to apply a short-time Fourier transform (STFT) to each of the sounds and create a spectrogram from each of them. Just by looking at the spectrograms, I can guess which sounds might be similar and then confirm that guess by listening to the actual sounds, for example sample_12 and sample_20 in the picture below.
In a sorted list of these sounds, 12 and 20 would be close together.
But I want to automate this process.
From what I've looked up about librosa, it looks like I can calculate a few features, such as the RMS energy, MFCCs, and spectral centroids, to determine similarity. But I don't know how to compare the values that I calculate.
rms = [librosa.feature.rms(S=s) for s in S]
centroids = [librosa.feature.spectral_centroid(y=y, sr=sr) for y in midiSamples]
mfccs = [librosa.feature.mfcc(y=y, sr=sr) for y in midiSamples]
At a very low level of abstraction, I don't even know how to compare the original STFT values (2D arrays) other than by turning them into spectrograms and manually deciding which ones look similar. How can I write code to sort my STFTs, RMS values, MFCCs, and so on, so that similar sounds end up beside each other in the sort?
I also have a Jupyter notebook where I run through my program and explain how I don't know what to do with the data.
Answer 1:
Sam, I think you can compare the two pictures with machine learning, or maybe with numpy by treating them as arrays of data.
This is just an idea for a solution (not a full answer):
If it is possible to convert the two spectrograms to flat, equal-sized arrays with numpy.ndarray.flatten, they can be compared element by element:
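import numpy as np

array1 = np.array([1.1, 2.2, 3.3])
array2 = np.array([1, 2, 3])
diffs = array1 - array2  # approximately array([0.1, 0.2, 0.3])
# Sum the absolute differences so positive and negative values cannot cancel out.
# A smaller value means the two arrays are more alike (0 means identical).
similarity_coefficient = np.sum(np.abs(diffs))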
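The element-wise comparison above only works when both arrays have the same size, which clips of different lengths will not give you. One possible way around that, sketched below under some assumptions, is to collapse each 2D feature matrix (for example the output of librosa.feature.mfcc) into a fixed-length vector by taking the mean and standard deviation over time, compute all pairwise distances, and then build the sorted list by repeatedly stepping to the nearest clip that has not been used yet. The helper names summarize, distance_matrix and greedy_order are made up for this illustration, and the random matrices are only stand-ins for real per-clip features. This is not a tested recipe for synth presets, just a starting point.

```python
import numpy as np

def summarize(features):
    """Collapse a (n_features, n_frames) matrix into one fixed-length vector."""
    # Mean and standard deviation over the time axis remove the dependence
    # on how many frames (how long) each clip is.
    return np.concatenate([features.mean(axis=1), features.std(axis=1)])

def distance_matrix(vectors):
    """Pairwise Euclidean distances between the rows of `vectors`."""
    X = np.asarray(vectors, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def greedy_order(dist, start=0):
    """Nearest-neighbour chain: keep jumping to the closest unvisited clip."""
    order = [start]
    remaining = set(range(len(dist))) - {start}
    while remaining:
        last = order[-1]
        nxt = min(remaining, key=lambda j: dist[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Hypothetical stand-ins for per-clip MFCC matrices:
# 20 coefficients each, with a different number of frames per clip.
rng = np.random.default_rng(0)
clips = [
    rng.normal(0.0, 1.0, size=(20, 80)),
    rng.normal(5.0, 1.0, size=(20, 120)),
    rng.normal(0.1, 1.0, size=(20, 95)),
    rng.normal(5.1, 1.0, size=(20, 60)),
]

dist = distance_matrix([summarize(c) for c in clips])
order = greedy_order(dist)
print(order)  # indices of the clips, arranged so neighbours are close
```

With real data you would replace the random matrices with the MFCC (or other feature) matrix of each clip. The greedy chain is not guaranteed to give the best possible ordering, and if you mix features on very different scales (RMS next to centroids in Hz, say) you would probably want to standardize each column before computing distances.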
Source: https://stackoverflow.com/questions/64580500/compare-the-similarity-of-2-sounds-using-python-librosa