How can I create a waveform image of an MP3 in Linux?

旧时难觅i 2021-02-01 21:03

Given an MP3, I would like to render the file's waveform as an image (.png).

Is there a package that can do this?

7 answers
  • 2021-02-01 21:33

    I would do something like this:

    • Find a tool that converts MP3 to PCM, i.e. binary data with one 8- or 16-bit value per sample. I believe mplayer can do that.

    • Pipe the result to a utility that converts binary data to an ASCII representation of the numbers in decimal format.

    • Use gnuplot to turn this list of values into a PNG graph.

    And voilà, the power of piping between Unix tools. Step 2 might be optional if gnuplot can read its data from a binary format.
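
    The three steps above can be sketched as shell functions. This is a rough sketch rather than a tested recipe: it assumes ffmpeg (instead of mplayer) as the decoder, GNU od for the binary-to-decimal step, a little-endian machine, and hypothetical file names (audio.mp3, samples.txt, audio.png).

```shell
#!/bin/sh
# Step 1 (assumption: ffmpeg instead of mplayer): decode to raw signed
# 16-bit little-endian PCM, mixed down to one channel, on stdout.
decode_to_pcm() {
    ffmpeg -loglevel error -i "$1" -f s16le -ac 1 -
}

# Step 2: turn raw 16-bit samples on stdin into "index value" text lines.
# od reads in host byte order, so this assumes a little-endian machine;
# -w2 (one sample per output line) is a GNU od extension.
pcm_to_ascii() {
    od -An -v -t d2 -w2 | awk '{ print NR - 1, $1 }'
}

# Step 3: plot the two-column text file $1 into the PNG file $2.
plot_png() {
    gnuplot -e "set terminal png size 800,200; set output '$2'; unset key; plot '$1' with lines"
}

# Hypothetical usage:
#   decode_to_pcm audio.mp3 | pcm_to_ascii > samples.txt
#   plot_png samples.txt audio.png
```

    Keeping each step in its own function makes it easy to swap the decoder or to drop step 2 if a binary-capable plotting path is used instead.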

  • 2021-02-01 21:42

    You might want to consider audiowaveform from the BBC.

    audiowaveform is a C++ command-line application that generates waveform data from either MP3, WAV, or FLAC format audio files. Waveform data can be used to produce a visual rendering of the audio, similar in appearance to audio editing applications.

    Waveform data files are saved in either binary format (.dat) or JSON (.json). Given an input waveform data file, audiowaveform can also render the audio waveform as a PNG image at a given time offset and zoom level.

    The waveform data is produced from an input stereo audio signal by first combining the left and right channels to produce a mono signal. The next stage is to compute the minimum and maximum sample values over groups of N input samples (where N is controlled by the --zoom command-line option), such that each N input samples produces one pair of minimum and maximum points in the output.

    https://github.com/bbcrd/audiowaveform
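
    To illustrate the min/max reduction described above, here is a small awk sketch. It is not audiowaveform's own code: it assumes the audio has already been mixed to mono and dumped as text with one sample value per line, and the function name minmax_per_block is made up for this example.

```shell
#!/bin/sh
# Reduce a stream of sample values (one number per line on stdin) to one
# "min max" pair per block of N samples. N plays the role of --zoom here.
minmax_per_block() {
    awk -v n="$1" '
        { v = $1 + 0 }                            # force numeric comparison
        (NR - 1) % n == 0 { min = v; max = v }    # first sample of a block
        v < min { min = v }
        v > max { max = v }
        NR % n == 0 { print min, max; pending = 0; next }  # block complete
        { pending = 1 }
        END { if (pending) print min, max }       # trailing partial block
    '
}

# Hypothetical usage, with samples.txt holding one value per line:
#   minmax_per_block 256 < samples.txt > peaks.txt
```

    Each output line can then be drawn as one vertical bar from min to max, which is why the result looks like the waveform view in an audio editor.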

  • 2021-02-01 21:42

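    SoX (a command-line sound tool for Windows and Linux) has a closely related feature built in: see the 'spectrogram' effect at http://sox.sourceforge.net/sox.html. Note that a spectrogram shows frequency content over time rather than the amplitude waveform itself.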

    "The spectrogram is rendered in a Portable Network Graphic (PNG) file, and shows time in the X-axis, frequency in the Y-axis, and audio signal magnitude in the Z-axis. Z-axis values are represented by the colour (or optionally the intensity) of the pixels in the X-Y plane. If the audio signal contains multiple channels then these are shown from top to bottom starting from channel 1 (which is the left channel for stereo audio)."
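
    A minimal invocation might look like the sketch below. The wrapper function and the file names are made up for this example, and reading MP3 input assumes SoX was installed with MP3 support (libsox-fmt-mp3 on Debian/Ubuntu).

```shell
#!/bin/sh
# Render a spectrogram PNG for an audio file with SoX.
# $1: input audio file, $2: output PNG (defaults to spectrogram.png).
make_spectrogram() {
    in=$1
    out=${2:-spectrogram.png}
    if [ ! -f "$in" ]; then
        echo "make_spectrogram: no such file: $in" >&2
        return 1
    fi
    # -n: no audio output file; the spectrogram effect writes the PNG itself.
    sox "$in" -n spectrogram -o "$out"
}

# Hypothetical usage:
#   make_spectrogram audio.mp3 audio_spectrogram.png
```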

  • 2021-02-01 21:44

    Building on qubodup's answer:

    # install stuff
    apt install gnuplot
    apt install sox
    apt install libsox-fmt-mp3
    
    # create a plaintext file of amplitude values
    sox sound.mp3 sound.dat
    
    # run the script saved in audio.gpi
    gnuplot audio.gpi
    

    You can also comment out the "set output ..." line in the script file and run:

    gnuplot audio.gpi > my_sound.png
    

    The script file, audio.gpi in this case, contains:

    #!/usr/bin/env gnuplot
    
    set datafile commentschars ";"
    
    set terminal png #size 800,400
    set output "sound.png"
    
    unset border
    unset xtics
    unset ytics
    
    set key off
    
    plot "sound.dat" with lines
    

    This produces a PNG of the waveform with no axes and no legend, which is what I wanted (PNG is also much smaller than SVG).

  • 2021-02-01 21:49

    Using sox and gnuplot you can create basic waveform images:

    sox audio.mp3 audio.dat #create plaintext file of amplitude values
    tail -n+3 audio.dat > audio_only.dat #remove comments
    
    # write script file for gnuplot
    echo set term png size 320,180 > audio.gpi #set output format
    echo set output \"audio.png\" >> audio.gpi #set output file
    echo plot \"audio_only.dat\" with lines >> audio.gpi #plot data
    
    gnuplot audio.gpi #run script
    


    To create something simpler and prettier, use the following gnuplot script as a template (save it as audio.gpi):

    #set output format and size
    set term png size 320,180
    
    #set output file
    set output "audio.png"
    
    # set y range
    set yr [-1:1]
    
    # we want just the data
    unset key
    unset tics
    unset border
    set lmargin 0             
    set rmargin 0
    set tmargin 0
    set bmargin 0
    
    # draw rectangle to change background color
    set obj 1 rectangle behind from screen 0,0 to screen 1,1
    set obj 1 fillstyle solid 1.0 fillcolor rgbcolor "#222222"
    
    # draw data with foreground color
    plot "audio_only.dat" with lines lt rgb 'white'
    

    and just run:

    sox audio.mp3 audio.dat #create plaintext file of amplitude values
    tail -n+3 audio.dat > audio_only.dat #remove comments
    
    gnuplot audio.gpi #run script
    


    This is based on an answer to a similar question, one that is more general about file formats but more specific about the software used.
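
    If you need this for more than one file, the commands and the template above can be wrapped in a small script. The sketch below is one possible way to do it: the function names write_gpi and waveform_png are made up, and the wrapper assumes sox (with MP3 support), gnuplot and mktemp are available.

```shell
#!/bin/sh
# Print a gnuplot script that plots data file $1 into PNG file $2.
# Settings mirror the template above; $3 is an optional "width,height".
write_gpi() {
    data=$1
    png=$2
    size=${3:-320,180}
    cat <<EOF
set term png size $size
set output "$png"
set yrange [-1:1]
unset key
unset tics
unset border
set lmargin 0
set rmargin 0
set tmargin 0
set bmargin 0
set obj 1 rectangle behind from screen 0,0 to screen 1,1
set obj 1 fillstyle solid 1.0 fillcolor rgbcolor "#222222"
plot "$data" with lines lt rgb 'white'
EOF
}

# Hypothetical wrapper: audio file in ($1), PNG out ($2), temp files removed.
waveform_png() {
    tmp=$(mktemp -d) || return 1
    sox "$1" "$tmp/audio.dat" &&
        tail -n +3 "$tmp/audio.dat" > "$tmp/audio_only.dat" &&
        write_gpi "$tmp/audio_only.dat" "$2" > "$tmp/audio.gpi" &&
        gnuplot "$tmp/audio.gpi"
    status=$?
    rm -rf "$tmp"
    return $status
}

# Hypothetical usage:
#   waveform_png audio.mp3 audio.png
```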

  • 2021-02-01 21:50

    If you have a GUI environment, you can load the MP3 in the Audacity audio editor and use its Print command to generate a PDF of the waveform. Then convert the PDF to PNG.
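
    The PDF-to-PNG step can be done from the command line. The sketch below assumes pdftoppm from poppler-utils is installed (ImageMagick would be another option); the function name, the file names and the 150 dpi default are arbitrary choices for this example.

```shell
#!/bin/sh
# Convert the first page of a PDF (e.g. one printed from Audacity) to PNG.
# $1: input PDF, $2: resolution in dpi (defaults to 150).
pdf_to_png() {
    pdf=$1
    dpi=${2:-150}
    if [ ! -f "$pdf" ]; then
        echo "pdf_to_png: no such file: $pdf" >&2
        return 1
    fi
    # -singlefile writes <prefix>.png instead of <prefix>-1.png.
    pdftoppm -png -r "$dpi" -singlefile "$pdf" "${pdf%.pdf}"
}

# Hypothetical usage (writes waveform.png next to waveform.pdf):
#   pdf_to_png waveform.pdf
```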
