librosa

How to read audio file from google cloud storage bucket and play with ipd in a datalab notebook

Submitted by 旧巷老猫 on 2020-01-04 13:44:32
Question: I want to play a sound file in a Datalab notebook which I read from a Google Cloud Storage bucket. How do I do this?

Answer 1:

```python
import numpy as np
import IPython.display as ipd
import librosa
import soundfile as sf
import io
from google.cloud import storage

BUCKET = 'some-bucket'

# Create a Cloud Storage client.
gcs = storage.Client()

# Get the bucket that the file will be uploaded to.
bucket = gcs.get_bucket(BUCKET)

# specify a filename
file_name = 'some_dir/some_audio.wav'

# read a blob
blob =
```

How to use a context window to segment a whole log Mel-spectrogram (ensuring the same number of segments for all the audios)?

Submitted by 筅森魡賤 on 2020-01-03 01:55:14
Question: I have several audio files with different durations, so I don't know how to ensure the same number N of segments for every audio file. I'm trying to implement an existing paper, which says that a log Mel-spectrogram is first computed over the whole audio with 64 Mel filter banks from 20 to 8000 Hz, using a 25 ms Hamming window and a 10 ms overlap. To get that, I have the following lines of code:

```python
y, sr = librosa.load(audio_file, sr=None)
# sr = 22050
# len(y) = 237142
# duration = 5
```
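The excerpt stops before the segmentation step. One way to guarantee the same number of segments for every file is to zero-pad the log Mel-spectrogram along the time axis and slice it with a fixed context window; the window, hop, and segment counts below are illustrative, not the paper's values.

```python
import numpy as np

def segment_spectrogram(logmel, win=96, hop=48, n_segments=10):
    """Slice a (n_mels, T) log Mel-spectrogram into exactly n_segments
    context windows of win frames, hop frames apart, zero-padding the
    time axis when the audio is too short to fill them all."""
    n_mels, T = logmel.shape
    needed = win + (n_segments - 1) * hop
    if T < needed:
        logmel = np.pad(logmel, ((0, 0), (0, needed - T)))
    return np.stack([logmel[:, i * hop:i * hop + win]
                     for i in range(n_segments)])  # (n_segments, n_mels, win)
```

For audio longer than `needed` frames this keeps only the leading windows; an alternative is to stride the hop so the windows cover the whole file.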

Why is the stored image different from the one displayed with imshow?

Submitted by 丶灬走出姿态 on 2019-12-25 09:45:30
Question: I currently cannot understand why I am not able to recreate the plot after I store the data.

```python
import os
import sys
from os import listdir
from os.path import isfile, join
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import seaborn as sb
from matplotlib.colors import Normalize
import matplotlib
from matplotlib import cm
from PIL import Image
import librosa
import librosa.display
import ast

def make_plot_store_data(name, interweaved):
    librosa
```
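A likely culprit in cases like this is saving the rendered image rather than the data: `imshow` applies a colormap and per-image normalization, so a round trip through an image file cannot reproduce the original array. A small sketch of the safer route, storing the raw array with `np.save` (the array here is a stand-in for the real spectrogram data):

```python
import numpy as np

# Stand-in for the computed spectrogram data.
S = np.random.rand(64, 100).astype(np.float32)

# Store the array itself, not the rendered figure: a saved PNG has been
# colormapped and rescaled, while np.save keeps the exact float values.
np.save('spec.npy', S)
S_loaded = np.load('spec.npy')

# Re-plotting S_loaded with the same colormap, vmin and vmax as the
# original call then recreates the figure exactly, e.g.:
#   plt.imshow(S_loaded, origin='lower', aspect='auto', vmin=0, vmax=1)
```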

compute mfcc for varying time intervals based on time stamps

Submitted by 巧了我就是萌 on 2019-12-23 05:08:13
Question: I came across this nice tutorial, https://github.com/manashmndl/DeadSimpleSpeechRecognizer, where the data is trained on samples separated into folders and all the MFCCs are calculated at once. I am trying to achieve something similar, but in a different way. Based on https://librosa.github.io/librosa/generated/librosa.feature.mfcc.html, librosa can compute the MFCCs of any audio as follows:

```python
import librosa
y, sr = librosa.load('test.wav')
mymfcc = librosa.feature.mfcc(y=y, sr=sr)
```

but I want

python librosa package - How can I extract audio from spectrum

Submitted by 北战南征 on 2019-12-22 09:07:50
Question: In the case of vocal separation using librosa, the vocals and the background music can be plotted separately, but I want to extract the audio of the vocal part, whose spectrum is stored in a variable named 'S_foreground' (please visit the above link for the demonstration). How can I get the foreground (vocal) audio?

Answer 1: You may have noticed that S_foreground comes from S_full, which comes from a function called magphase. According to the documentation for this function, it can separate a

Unable to use Multithread for librosa melspectrogram

Submitted by 家住魔仙堡 on 2019-12-22 00:31:00
Question: I have over 1000 audio files (this is just initial development; in the future there will be even more), and I would like to convert them to mel spectrograms. Since my workstation has an Intel® Xeon® Processor E5-2698 v3, which has 32 threads, I would like to use multithreading to do the job. My code:

```python
import os
import librosa
from librosa.display import specshow
from natsort import natsorted
import numpy as np
import sys

# Libraries for multi thread
from multiprocessing.dummy import Pool as
```

librosa.load: file not found error on loading a file

Submitted by 試著忘記壹切 on 2019-12-18 09:42:00
Question: I am trying to use librosa to analyze .wav files. I started by creating a list that stores the names of all the .wav files it detected:

```python
data_dir = '/Users/raghav/Desktop/FSU/summer research'
audio_file = glob(data_dir + '/*.wav')
```

I can see the names of all the files in the list 'audio_file', but when I load any of the audio files I get a file-not-found error:

```python
audio, sfreq = lr.load(audio_file[0])
```

error output:

```
Traceback (most recent call last):
  File "read_audio.py", line 10, in <module
```
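A useful first step with this kind of error is to validate the globbed paths before handing them to `librosa.load`, so a mistyped directory or empty match fails with a clear message. A small helper sketch (`find_wavs` is a hypothetical name):

```python
import os
from glob import glob

def find_wavs(data_dir):
    """Return the .wav paths under data_dir, failing early with a clear
    message when the glob matches nothing (e.g. a mistyped directory),
    instead of letting librosa.load raise file-not-found later."""
    paths = sorted(glob(os.path.join(data_dir, '*.wav')))
    if not paths:
        raise FileNotFoundError(f'no .wav files found under {data_dir!r}')
    return paths

# usage (path from the question):
#   audio_files = find_wavs('/Users/raghav/Desktop/FSU/summer research')
#   y, sr = librosa.load(audio_files[0], sr=None)
```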

How can you load a spectrogram from file using librosa?

Submitted by 纵然是瞬间 on 2019-12-17 21:24:00
Question: At the moment I have a bunch of mp3 files and their features from the dataset here. All of the spectrograms are precomputed, so I wanted to know how to load a given spectrogram from file and display it, at the very least. Ideally I would like to be able to skip to a point in the spectrogram with a given time code.

Answer 1: The generation script for the features is posted here. It states that the features are saved using np.savetxt, which means that you can load them using np.loadtxt. Once the
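The answer's `np.savetxt`/`np.loadtxt` round trip, plus the time-code lookup, can be sketched as follows. The sample rate and hop length are assumed values, since the excerpt does not state the dataset's parameters, and the saved array is a stand-in for a real precomputed spectrogram:

```python
import numpy as np

# Stand-in for a precomputed spectrogram saved by the generation script.
S_orig = np.random.rand(128, 1290)
np.savetxt('spec.txt', S_orig)

# Features written with np.savetxt load back with np.loadtxt.
S = np.loadtxt('spec.txt')

# Skipping to a time code: columns are STFT frames, so a time in seconds
# maps to a column index via the (assumed) sr and hop_length.
sr, hop_length = 22050, 512
frame = int(10.0 * sr / hop_length)  # frame index for t = 10 s
segment = S[:, frame:]
```

`librosa.display.specshow(S, x_axis='time')` can then display the loaded array with a time axis, given the same `sr` and `hop_length`.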

generate mfcc's for audio segments based on annotated file

Submitted by こ雲淡風輕ζ on 2019-12-13 12:36:33
Question: My main goal is to feed MFCC features to an ANN. However, I am stuck at the data pre-processing step, and my question has two parts.

BACKGROUND: I have an audio file, and a txt file with annotations and time stamps like this:

```
0.0 2.5 Music
2.5 6.05 silence
6.05 8.34 notmusic
8.34 12.0 silence
12.0 15.5 music
```

I know that for a single audio file I can calculate the MFCCs using librosa like this:

```python
import librosa
y, sr = librosa.load('abcd.wav')
mfcc = librosa.feature.mfcc(y=y, sr=sr)
```

Part 1: I'm

How to combine mfcc vector with labels from annotation to pass to a neural network

Submitted by 人走茶凉 on 2019-12-12 08:58:06
Question: Using librosa, I created MFCCs for my audio file as follows:

```python
import librosa
y, sr = librosa.load('myfile.wav')
print(y)
print(sr)
mfcc = librosa.feature.mfcc(y=y, sr=sr)
```

I also have a text file containing manual annotations [start, stop, tag] corresponding to the audio, as follows:

```
0.0 2.0 sound1
2.0 4.0 sound2
4.0 6.0 silence
6.0 8.0 sound1
```

QUESTION: How do I combine the MFCCs generated by librosa with the annotations from the text file? The final goal is, I want to combine mfcc