librosa

How to read audio file from google cloud storage bucket and play with ipd in a datalab notebook

Submitted by 旧巷老猫 on 2020-01-04 13:44:32
Question: I want to play a sound file in a Datalab notebook which I read from a Google Cloud Storage bucket. How do I do this?

Answer 1:

```python
import numpy as np
import IPython.display as ipd
import librosa
import soundfile as sf
import io
from google.cloud import storage

BUCKET = 'some-bucket'

# Create a Cloud Storage client.
gcs = storage.Client()

# Get the bucket that the file will be uploaded to.
bucket = gcs.get_bucket(BUCKET)

# specify a filename
file_name = 'some_dir/some_audio.wav'

# read a blob
blob =
```

How to use a context window to segment a whole log Mel-spectrogram (ensuring the same number of segments for all the audios)?

Submitted by 筅森魡賤 on 2020-01-03 01:55:14
Question: I have several audio files with different durations, so I don't know how to ensure the same number N of segments for every audio file. I'm trying to implement an existing paper, which says that a log Mel-spectrogram is first computed over the whole audio with 64 Mel filter banks from 20 to 8000 Hz, using a 25 ms Hamming window and a 10 ms overlap. To get that, I have the following lines of code:

```python
y, sr = librosa.load(audio_file, sr=None)
# sr = 22050
# len(y) = 237142
# duration = 5
```
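The excerpt stops before the segmentation step. One way to guarantee the same number of segments for every file is to zero-pad the log Mel-spectrogram along the time axis and slice it with a fixed context window; the window, hop, and segment counts below are illustrative, not the paper's values.

```python
import numpy as np

def segment_spectrogram(logmel, win=96, hop=48, n_segments=10):
    """Slice a (n_mels, T) log Mel-spectrogram into exactly n_segments
    context windows of win frames, hop frames apart, zero-padding the
    time axis when the audio is too short to fill them all."""
    n_mels, T = logmel.shape
    needed = win + (n_segments - 1) * hop
    if T < needed:
        logmel = np.pad(logmel, ((0, 0), (0, needed - T)))
    return np.stack([logmel[:, i * hop:i * hop + win]
                     for i in range(n_segments)])  # (n_segments, n_mels, win)
```

For audio longer than `needed` frames this keeps only the leading windows; an alternative is to stride the hop so the windows cover the whole file.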

Why is the stored image different from the one displayed with imshow?

Submitted by 丶灬走出姿态 on 2019-12-25 09:45:30
Question: I currently cannot understand why I am not able to recreate the plot after I store the data.

```python
import os
import sys
from os import listdir
from os.path import isfile, join
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import seaborn as sb
from matplotlib.colors import Normalize
import matplotlib
from matplotlib import cm
from PIL import Image
import librosa
import librosa.display
import ast

def make_plot_store_data(name, interweaved):
    librosa
```
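A likely culprit in cases like this is saving the rendered image rather than the data: `imshow` applies a colormap and per-image normalization, so a round trip through an image file cannot reproduce the original array. A small sketch of the safer route, storing the raw array with `np.save` (the array here is a stand-in for the real spectrogram data):

```python
import numpy as np

# Stand-in for the computed spectrogram data.
S = np.random.rand(64, 100).astype(np.float32)

# Store the array itself, not the rendered figure: a saved PNG has been
# colormapped and rescaled, while np.save keeps the exact float values.
np.save('spec.npy', S)
S_loaded = np.load('spec.npy')

# Re-plotting S_loaded with the same colormap, vmin and vmax as the
# original call then recreates the figure exactly, e.g.:
#   plt.imshow(S_loaded, origin='lower', aspect='auto', vmin=0, vmax=1)
```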

compute mfcc for varying time intervals based on time stamps

Submitted by 巧了我就是萌 on 2019-12-23 05:08:13
Question: I came across this nice tutorial, https://github.com/manashmndl/DeadSimpleSpeechRecognizer, where the data is trained on samples separated into folders and all the MFCCs are calculated at once. I am trying to achieve something similar, but in a different way. Based on https://librosa.github.io/librosa/generated/librosa.feature.mfcc.html, librosa can compute the MFCCs of any audio as follows:

```python
import librosa
y, sr = librosa.load('test.wav')
mymfcc = librosa.feature.mfcc(y=y, sr=sr)
```

but I want

python librosa package - How can I extract audio from spectrum

Submitted by 北战南征 on 2019-12-22 09:07:50
Question: In the case of vocal separation using librosa, the vocals and the background music can be plotted separately, but I want to extract the audio of the vocal part, whose spectrum is stored in a variable named 'S_foreground' (please visit the above link for the demonstration). How can I get the foreground (vocal) audio?

Answer 1: You may have noticed that S_foreground comes from S_full, which comes from a function called magphase. According to the documentation for this function, it can separate a

Unable to use Multithread for librosa melspectrogram

Submitted by 家住魔仙堡 on 2019-12-22 00:31:00
Question: I have over 1000 audio files (this is just initial development; in the future there will be even more), and I would like to convert them to mel spectrograms. Since my workstation has an Intel® Xeon® Processor E5-2698 v3, which has 32 threads, I would like to use multithreading to do the job. My code:

```python
import os
import librosa
from librosa.display import specshow
from natsort import natsorted
import numpy as np
import sys

# Libraries for multi thread
from multiprocessing.dummy import Pool as
```

librosa.load: file not found error on loading a file

Submitted by 試著忘記壹切 on 2019-12-18 09:42:00
Question: I am trying to use librosa to analyze .wav files. I started by creating a list that stores the names of all the .wav files it detected:

```python
data_dir = '/Users/raghav/Desktop/FSU/summer research'
audio_file = glob(data_dir + '/*.wav')
```

I can see the names of all the files in the list 'audio_file', but when I load any of the audio files I get a file-not-found error:

```python
audio, sfreq = lr.load(audio_file[0])
```

error output:

```
Traceback (most recent call last):
  File "read_audio.py", line 10, in <module
```
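A useful first step with this kind of error is to validate the globbed paths before handing them to `librosa.load`, so a mistyped directory or empty match fails with a clear message. A small helper sketch (`find_wavs` is a hypothetical name):

```python
import os
from glob import glob

def find_wavs(data_dir):
    """Return the .wav paths under data_dir, failing early with a clear
    message when the glob matches nothing (e.g. a mistyped directory),
    instead of letting librosa.load raise file-not-found later."""
    paths = sorted(glob(os.path.join(data_dir, '*.wav')))
    if not paths:
        raise FileNotFoundError(f'no .wav files found under {data_dir!r}')
    return paths

# usage (path from the question):
#   audio_files = find_wavs('/Users/raghav/Desktop/FSU/summer research')
#   y, sr = librosa.load(audio_files[0], sr=None)
```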

How can you load a spectrogram from file using librosa?

Submitted by 纵然是瞬间 on 2019-12-17 21:24:00
Question: At the moment I have a bunch of mp3 files and their features from the dataset here. All of the spectrograms are precomputed, so I wanted to know how to load a given spectrogram from file and display it, at the very least. Ideally I would like to be able to skip to a point in the spectrogram with a given time code.

Answer 1: The generation script for the features is posted here. It states that the features are saved using np.savetxt, which means that you can load them using np.loadtxt. Once the
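The answer's `np.savetxt`/`np.loadtxt` round trip, plus the time-code lookup, can be sketched as follows. The sample rate and hop length are assumed values, since the excerpt does not state the dataset's parameters, and the saved array is a stand-in for a real precomputed spectrogram:

```python
import numpy as np

# Stand-in for a precomputed spectrogram saved by the generation script.
S_orig = np.random.rand(128, 1290)
np.savetxt('spec.txt', S_orig)

# Features written with np.savetxt load back with np.loadtxt.
S = np.loadtxt('spec.txt')

# Skipping to a time code: columns are STFT frames, so a time in seconds
# maps to a column index via the (assumed) sr and hop_length.
sr, hop_length = 22050, 512
frame = int(10.0 * sr / hop_length)  # frame index for t = 10 s
segment = S[:, frame:]
```

`librosa.display.specshow(S, x_axis='time')` can then display the loaded array with a time axis, given the same `sr` and `hop_length`.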

generate mfcc's for audio segments based on annotated file

Submitted by こ雲淡風輕ζ on 2019-12-13 12:36:33
Question: My main goal is to feed MFCC features to an ANN. However, I am stuck at the data pre-processing step, and my question has two parts.

BACKGROUND: I have an audio file, and a txt file with annotations and time stamps like this:

```
0.0 2.5 Music
2.5 6.05 silence
6.05 8.34 notmusic
8.34 12.0 silence
12.0 15.5 music
```

I know that for a single audio file I can calculate the MFCCs using librosa like this:

```python
import librosa
y, sr = librosa.load('abcd.wav')
mfcc = librosa.feature.mfcc(y=y, sr=sr)
```

Part 1: I'm

How to combine mfcc vector with labels from annotation to pass to a neural network

Submitted by 人走茶凉 on 2019-12-12 08:58:06
Question: Using librosa, I created MFCCs for my audio file as follows:

```python
import librosa
y, sr = librosa.load('myfile.wav')
print(y)
print(sr)
mfcc = librosa.feature.mfcc(y=y, sr=sr)
```

I also have a text file containing manual annotations [start, stop, tag] corresponding to the audio, as follows:

```
0.0 2.0 sound1
2.0 4.0 sound2
4.0 6.0 silence
6.0 8.0 sound1
```

QUESTION: How do I combine the MFCCs generated by librosa with the annotations from the text file? The final goal is, I want to combine mfcc