Can I convert spectrograms generated with librosa back to audio?

旧巷少年郎 2021-01-19 05:02

I converted some audio files to spectrograms and saved them to files using the following code:

import os
from matplotlib import pyplot as plt
import librosa
         


        
1 Answer
  • 2021-01-19 05:24

    Yes, it is possible to recover most of the signal by estimating the phase with, e.g., the Griffin-Lim algorithm (GLA). A "fast" implementation for Python is available in librosa. Here's how you can use it:

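    import numpy as np
    import librosa
    
    # Load up to 10 s of a bundled example clip (older releases used
    # librosa.util.example_audio_file(), which has since been removed).
    y, sr = librosa.load(librosa.ex('trumpet'), duration=10)
    # Keep only the STFT magnitude; the phase is discarded here.
    S = np.abs(librosa.stft(y))
    # Estimate the missing phase and invert back to a waveform.
    y_inv = librosa.griffinlim(S)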
    

    And this is what the original and the reconstruction look like:

    [figure: original vs. reconstructed signal]

    By default, the algorithm initialises the phases randomly and then alternates between forward and inverse STFT operations to refine the phase estimate.

    Looking at your code, to reconstruct the signal, you'd just need to do:

    import numpy as np
    import librosa
    
    # X comes from your code; only its magnitude is passed on.
    X_inv = librosa.griffinlim(np.abs(X))
    

    This is just an example, of course. As pointed out by @PaulR, in your case you'd first need to load the data from the JPEG file (which is lossy!) and then undo amplitude_to_db (librosa provides db_to_amplitude for this) before running Griffin-Lim.
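
    As a rough sketch of that inverse step: the code below assumes the image is 8-bit grayscale and that pixel value 0 corresponds to -80 dB and 255 to 0 dB. That mapping, the helper name pixels_to_magnitude, and the tiny stand-in array are all assumptions, since the actual scaling depends on how the image was written.

```python
import numpy as np


def pixels_to_magnitude(pixels, db_min=-80.0, db_max=0.0):
    """Map 8-bit grayscale pixels back to linear magnitudes (hypothetical helper)."""
    pixels = np.asarray(pixels, dtype=np.float64)
    # Undo the assumed linear pixel scaling: 0 -> db_min, 255 -> db_max.
    S_db = db_min + (pixels / 255.0) * (db_max - db_min)
    # Undo the decibel conversion: amplitude = 10 ** (dB / 20).
    return 10.0 ** (S_db / 20.0)


# Stand-in for pixel values read from an image file.
pixels = np.array([[0, 255], [128, 64]], dtype=np.uint8)
S_est = pixels_to_magnitude(pixels)
```

    The resulting array could then be passed to librosa.griffinlim. If the image came from a colormapped plot rather than raw grayscale values, the colormap would have to be inverted first, and the frequency axis may need flipping.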

    The algorithm, especially the phase estimation, can be further improved thanks to advances in artificial neural networks. Here is one paper that discusses some enhancements.
