Isolating audio foreground and converting back to audio stream using librosa

时间秒杀一切 提交于 2020-02-06 09:07:57

问题


I'm trying to isolate the foreground of an audio stream and then save it as a standalone audio stream using librosa.

Starting with this seemingly relevant example.

I have the full, foreground and background data isolated as the example does in S_full, S_foreground and S_background but I'm unsure as to what to do to use those as audio.

I attempted to use librosa.istft(...) to convert those and then save that as a .wav file using soundfile.write(...) but I'm left with a file of roughly the right size but unusable(?) data.

Can anyone describe or point me at an example?

Thanks.


回答1:


in putting together the minimal example, istft() with the original sampling rate does in fact work.

I'll find my bug, somewhere. FWIW here's the working code

import numpy as np
import librosa
from librosa import display
import soundfile
import matplotlib.pyplot as plt

y, sr = librosa.load('audio/rb-testspeech.mp3', duration=5)
S_full, phase = librosa.magphase(librosa.stft(y))

S_filter = librosa.decompose.nn_filter(S_full,
                                       aggregate=np.median,
                                       metric='cosine',
                                       width=int(librosa.time_to_frames(2, sr=sr)))
S_filter = np.minimum(S_full, S_filter)

margin_i, margin_v = 2, 10
power = 2

mask_v = librosa.util.softmask(S_full - S_filter,
                               margin_v * S_filter,
                               power=power)

S_foreground = mask_v * S_full

full = librosa.amplitude_to_db(S_full, ref=np.max)
librosa.display.specshow(full, y_axis='log', sr=sr)

plt.title('Full spectrum')
plt.colorbar()

plt.tight_layout()
plt.show()

print("y({}): {}".format(len(y),y))
print("sr: {}".format(sr))

full_audio = librosa.istft(S_full)
foreground_audio = librosa.istft(S_foreground)
print("full({}): {}".format(len(full_audio), full_audio))

soundfile.write('orig.WAV', y, sr) 
soundfile.write('full.WAV', full_audio, sr) 
soundfile.write('foreground.WAV', foreground_audio, sr) 


来源:https://stackoverflow.com/questions/59059196/isolating-audio-foreground-and-converting-back-to-audio-stream-using-librosa

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!