librosa | 易学教程

STFT understanding using librosa

Question: I have an audio sample of about 14 seconds at an 8 kHz sample rate. I'm using librosa to extract some features from this audio file.

    y, sr = librosa.load(file_name)
    stft = np.abs(librosa.stft(y, n_fft=n_fft))
    # file_length = 14.650022675736961  # sec
    # defaults:
    # n_fft = 2048
    # hop_length = 512  # win_length / 4 = n_fft / 4 = 512 (win_length = n_fft by default)
    # windowsTime = n_fft * Ts  # Ts = 1 / sr
    stft.shape  # (1025, 631)

Specshow:

    librosa.display.specshow(stft, x_axis='time', y_axis='log')

(image: stft sr = 22050 ...)
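
The (1025, 631) shape in the question is consistent with the clip having been loaded at 22050 Hz, the default target rate of librosa.load(), rather than at 8 kHz. The following is a minimal sketch of that arithmetic, assuming librosa's default centered framing (center=True) and an even n_fft. The helper name stft_shape is hypothetical and not part of librosa.

```python
def stft_shape(n_samples, n_fft=2048, hop_length=512):
    """Expected (frequency bins, frames) of a centered STFT.

    Assumes center=True style padding and an even n_fft, in which
    case the frame count is 1 + n_samples // hop_length.
    """
    n_bins = 1 + n_fft // 2
    n_frames = 1 + n_samples // hop_length
    return n_bins, n_frames


duration = 14.650022675736961        # seconds, from the question
n_samples = round(duration * 22050)  # assumption: resampled to 22050 Hz on load
shape = stft_shape(n_samples)
print(shape)  # (1025, 631)
```

Under the same assumptions, a clip that really was loaded at 8 kHz would produce far fewer frames for the same duration, which is one quick way to spot an unintended resample.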

Compare the similarity of 2 sounds using Python Librosa

Question: I have about 30 sound clips, each a preset from a synthesizer. I want to compare these sounds to find out which ones are similar, and then sort them so that each sound sits next to two similar sounds in a list. Frequency is not the only thing I want to look at: I would rather have two saw waves a tone apart be considered similar than a saw wave and a sine wave playing the same note. These sounds, for example, would be considered similar. Using librosa, I have ...
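
One possible way to approach the sorting part, sketched under stated assumptions: each clip has already been reduced to one fixed-length feature vector (for instance a time-averaged timbre feature), clips are compared by cosine similarity, and a greedy chain places each clip next to its most similar unvisited neighbour. The function names and the toy vectors below are hypothetical, and the excerpt does not say which method the asker or any answer actually used.

```python
import numpy as np


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity for an (n_clips, n_dims) array."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T


def greedy_order(similarity, start=0):
    """Chain clips: always append the unvisited clip most similar to the last one."""
    order = [start]
    remaining = set(range(similarity.shape[0])) - {start}
    while remaining:
        last = order[-1]
        nearest = max(remaining, key=lambda j: similarity[last, j])
        order.append(nearest)
        remaining.remove(nearest)
    return order


# Toy stand-ins for per-clip feature vectors (not real audio features).
toy_features = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
similarity = cosine_similarity_matrix(toy_features)
order = greedy_order(similarity)
print(order)  # [0, 2, 3, 1]
```

A greedy chain is simple but not optimal; it only guarantees that each step picks the closest remaining clip, not that the overall ordering is the best one.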

Sampling rate issue with Librosa

Question: When doing an STFT and then an inverse STFT (iSTFT) on a 16-bit, 44.1 kHz audio file with the librosa library:

    import librosa
    y, sr = librosa.load('test.wav', mono=False)
    y1 = y[0,]
    S = librosa.core.stft(y1)
    z1 = librosa.core.istft(S, dtype=y1.dtype)
    librosa.output.write_wav('test2.wav', z1, sr)

the output is only a 22 kHz audio file. Why? Where does the sampling rate change happen in librosa?

Answer 1: The librosa.load() function enables target sampling, wherein the audio file you import can be re...
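
To confirm what rate a WAV file actually has, independently of librosa, the header can be read with the standard-library wave module. The sketch below writes a short silent 44.1 kHz stereo file to a temporary directory and reads its rate back; the helper names are hypothetical. Passing sr=None to librosa.load() keeps the file's native rate instead of resampling to the 22050 Hz default, as noted in the closing comment.

```python
import os
import tempfile
import wave


def write_silence(path, sample_rate, seconds=0.1, channels=2):
    """Write a short 16-bit silent WAV file, used here only as test input."""
    n_frames = int(sample_rate * seconds)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * channels * n_frames)


def native_sample_rate(path):
    """Read the sample rate stored in the WAV header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


with tempfile.TemporaryDirectory() as tmp_dir:
    wav_path = os.path.join(tmp_dir, "test.wav")
    write_silence(wav_path, 44100)
    rate = native_sample_rate(wav_path)

print(rate)  # 44100

# With librosa installed, a load that skips resampling would be:
#   y, sr = librosa.load('test.wav', sr=None, mono=False)
```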

calculating FFT in frames and writing to a file

Question: I'm new to Python. I'm trying to get the FFT of an uploaded WAV file and write the FFT of each frame to its own line of a text file (using GCP), using scipy or librosa. The frame rate I require is 30 fps, and the WAV file will have a 48 kHz sample rate. My questions are:

- How do I divide the samples of the whole WAV file into per-frame samples?
- How do I add empty samples to make the length of each frame a power of 2 (as 48000/30 = 1600, add 448 empty samples to make it 2048)?
- How do I normalize the resulting ...
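
The framing and zero-padding steps asked about above can be sketched with NumPy alone. This is one possible approach, not an answer taken from the source: the signal is cut into 1600-sample frames, np.fft.rfft pads each frame to 2048 samples through its n argument, and the magnitudes are divided by the frame length. That normalization is an assumption, since the question is cut off before saying what kind is wanted. The function name frame_spectra and the 3 kHz test tone are hypothetical.

```python
import os
import tempfile

import numpy as np

SAMPLE_RATE = 48000
FPS = 30
N_FFT = 2048


def frame_spectra(samples, sample_rate=SAMPLE_RATE, fps=FPS, n_fft=N_FFT):
    """Return one magnitude spectrum per video frame, shape (n_frames, n_fft // 2 + 1)."""
    frame_len = sample_rate // fps        # 48000 // 30 = 1600 samples per frame
    n_frames = len(samples) // frame_len  # a trailing partial frame is dropped
    frames = np.asarray(samples[: n_frames * frame_len], dtype=float)
    frames = frames.reshape(n_frames, frame_len)
    # n=n_fft zero-pads each 1600-sample row to 2048 samples before the FFT.
    spectra = np.fft.rfft(frames, n=n_fft, axis=1)
    # Assumed normalization: divide by the number of real samples per frame.
    return np.abs(spectra) / frame_len


# One second of a 3 kHz test tone stands in for the uploaded WAV data.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
tone = np.sin(2 * np.pi * 3000 * t)
magnitudes = frame_spectra(tone)
print(magnitudes.shape)  # (30, 1025)

# Write one frame per line of a text file.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "fft_frames.txt")
    np.savetxt(out_path, magnitudes)
    with open(out_path) as handle:
        n_lines = sum(1 for _ in handle)
```

With this normalization a full-scale sine that falls exactly on a bin shows up with magnitude 0.5, because its energy is split between the positive and negative frequency.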

What is the second number in the MFCCs array?

Question: When I extract MFCCs from an audio file, the output is (13, 22). What does the number represent? Is it time frames? I use librosa. The code I use is:

    mfccs = librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=13, hop_length=256)
    mfccs
    print(mfccs.shape)

And the output is (13, 22).

Answer 1: Yes, it is time frames, and it mainly depends on how many samples you provide via y and what hop_length you choose. Example: say you have 10 s of audio sampled at 44.1 kHz (CD quality). When you load it with librosa, it ...
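
The relationship between input length, hop_length and the second dimension can be worked through in a few lines. This sketch assumes the centered framing librosa uses by default, for which the frame count is 1 + len(y) // hop_length, and it assumes a 22050 Hz rate for the 10-second case. Both helper names are hypothetical.

```python
def frame_count(n_samples, hop_length):
    """Frames produced by centered framing (assumes center=True behaviour)."""
    return 1 + n_samples // hop_length


def sample_range(n_frames, hop_length):
    """Smallest and largest input lengths that yield exactly n_frames."""
    lowest = (n_frames - 1) * hop_length
    highest = n_frames * hop_length - 1
    return lowest, highest


# Which input lengths give the (13, 22) shape from the question?
low, high = sample_range(22, 256)
print(low, high)  # 5376 5631

# Frames for 10 s of audio, assuming it was resampled to 22050 Hz.
ten_seconds = 10 * 22050
n_frames_10s = frame_count(ten_seconds, 256)
print(n_frames_10s)  # 862
```

So 22 frames at hop_length=256 corresponds to an input of only a few thousand samples, a fraction of a second at common sample rates.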

MFCC Python: completely different result from librosa vs python_speech_features vs tensorflow.signal

Source: https://stackoverflow.com/questions/60492462/mfcc-python-completely-different-result-from-librosa-vs-python-speech-features