Reading *.wav files in Python

问题

I need to analyze sound written in a .wav file. For that I need to transform this file into set of numbers (arrays, for example). I think I need to use the wave package. However, I do not know how exactly it works. For example I did the following:

import wave
w = wave.open('/usr/share/sounds/ekiga/voicemail.wav', 'r')
for i in range(w.getnframes()):
    frame = w.readframes(i)
    print frame

As a result of this code I expected to see sound pressure as function of time. In contrast I see a lot of strange, mysterious symbols (which are not hexadecimal numbers). Can anybody, pleas, help me with that?

回答1:

Per the sources, scipy.io.wavfile.read(somefile) returns a tuple of two items: the first is the sampling rate in samples per second, the second is a numpy array with all the data read from the file. Looks pretty easy to use!

e.g:

from scipy.io import wavfile
fs, data = wavfile.read('./output/audio.wav')

回答2:

I did some research this evening and figured this out:

import wave, struct

waveFile = wave.open('sine.wav', 'r')

length = waveFile.getnframes()
for i in range(0,length):
    waveData = waveFile.readframes(1)
    data = struct.unpack("<h", waveData)
    print(int(data[0]))

Hopefully this snippet helps someone. Details: using the struct module, you can take the wave frames (which are in 2s complementary binary between -32768; 0x8000 and 32767; 0x7FFF) This reads a MONO, 16-BIT, WAVE file. I found this webpage quite useful in formulating this.

This snippet reads 1 frame. To read more than one frame (e.g., 13), use

waveData = waveFile.readframes(13)
data = struct.unpack("<13h", waveData)

回答3:

Different python modules to read wav:

There is at least these following libraries to read wave audio file:

PySoundFile
scipy.io.wavfile (from scipy)
wave (to read streams. Included in python 2 and 3)
scikits.audiolab (that seems unmaintained)
sounddevice (play and record sounds, good for streams and real-time)
pyglet

The most simple example:

This is a simple example with Pysoundfile:

import soundfile as sf
data, samplerate = sf.read('existing_file.wav')

Format of the output:

Warning, the data are not always in the same format, that depends on the library. For instance:

from scikits import audiolab
from scipy.io import wavfile
from sys import argv
for filetest in argv[1:]:
    [x, fs, nbBits] = audiolab.wavread(filePath)
    print '\nReading with scikits.audiolab.wavread: ', x
    [fs, x] = wavfile.read(filetest)
    print '\nReading with scipy.io.wavfile.read: ', x

Reading with scikits.audiolab.wavread: [ 0. 0. 0. ..., -0.00097656 -0.00079346 -0.00097656] Reading with scipy.io.wavfile.read: [ 0 0 0 ..., -32 -26 -32]

PySoundFile and Audiolab return float between -1 and 1 (as matab does, that is the convention for audio signal). Scipy and wave return integers, that can be converted in float according to the number of bit of encoding.

For example:

from scipy.io.wavfile import read as wavread
[samplerate, x] = wavread(audiofilename) # x is a numpy array of integer, representing the samples 
# scale to -1.0 -- 1.0
if x.dtype == 'int16':
    nb_bits = 16 # -> 16-bit wav files
elif x.dtype == 'int32':
    nb_bits = 32 # -> 32-bit wav files
max_nb_bit = float(2 ** (nb_bits - 1))
samples = x / (max_nb_bit + 1.0) # samples is a numpy array of float representing the samples

回答4:

IMHO, the easiest way to get audio data from a sound file into a NumPy array is PySoundFile:

import soundfile as sf
data, fs = sf.read('/usr/share/sounds/ekiga/voicemail.wav')

This also supports 24-bit files out of the box.

There are many sound file libraries available, I've written an overview where you can see a few pros and cons. It also features a page explaining how to read a 24-bit wav file with the wave module.

回答5:

You can accomplish this using the scikits.audiolab module. It requires NumPy and SciPy to function, and also libsndfile.

Note, I was only able to get it to work on Ubunutu and not on OSX.

from scikits.audiolab import wavread

filename = "testfile.wav"

data, sample_frequency,encoding = wavread(filename)

Now you have the wav data

回答6:

If you want to procces an audio block by block, some of the given solutions are quite awful in the sense that they imply loading the whole audio into memory producing many cache misses and slowing down your program. python-wavefile provides some pythonic constructs to do NumPy block-by-block processing using efficient and transparent block management by means of generators. Other pythonic niceties are context manager for files, metadata as properties... and if you want the whole file interface, because you are developing a quick prototype and you don't care about efficency, the whole file interface is still there.

A simple example of processing would be:

import sys
from wavefile import WaveReader, WaveWriter

with WaveReader(sys.argv[1]) as r :
    with WaveWriter(
            'output.wav',
            channels=r.channels,
            samplerate=r.samplerate,
            ) as w :

        # Just to set the metadata
        w.metadata.title = r.metadata.title + " II"
        w.metadata.artist = r.metadata.artist

        # This is the prodessing loop
        for data in r.read_iter(size=512) :
            data[1] *= .8     # lower volume on the second channel
            w.write(data)

The example reuses the same block to read the whole file, even in the case of the last block that usually is less than the required size. In this case you get an slice of the block. So trust the returned block length instead of using a hardcoded 512 size for any further processing.

回答7:

If you're going to perform transfers on the waveform data then perhaps you should use SciPy, specifically scipy.io.wavfile.

回答8:

I needed to read a 1-channel 24-bit WAV file. The post above by Nak was very useful. However, as mentioned above by basj 24-bit is not straightforward. I finally got it working using the following snippet:

from scipy.io import wavfile
TheFile = 'example24bit1channelFile.wav'
[fs, x] = wavfile.read(TheFile)

# convert the loaded data into a 24bit signal

nx = len(x)
ny = nx/3*4    # four 3-byte samples are contained in three int32 words

y = np.zeros((ny,), dtype=np.int32)    # initialise array

# build the data left aligned in order to keep the sign bit operational.
# result will be factor 256 too high

y[0:ny:4] = ((x[0:nx:3] & 0x000000FF) << 8) | \
  ((x[0:nx:3] & 0x0000FF00) << 8) | ((x[0:nx:3] & 0x00FF0000) << 8)
y[1:ny:4] = ((x[0:nx:3] & 0xFF000000) >> 16) | \
  ((x[1:nx:3] & 0x000000FF) << 16) | ((x[1:nx:3] & 0x0000FF00) << 16)
y[2:ny:4] = ((x[1:nx:3] & 0x00FF0000) >> 8) | \
  ((x[1:nx:3] & 0xFF000000) >> 8) | ((x[2:nx:3] & 0x000000FF) << 24)
y[3:ny:4] = (x[2:nx:3] & 0x0000FF00) | \
  (x[2:nx:3] & 0x00FF0000) | (x[2:nx:3] & 0xFF000000)

y = y/256   # correct for building 24 bit data left aligned in 32bit words

Some additional scaling is required if you need results between -1 and +1. Maybe some of you out there might find this useful

回答9:

if its just two files and the sample rate is significantly high, you could just interleave them.

from scipy.io import wavfile
rate1,dat1 = wavfile.read(File1)
rate2,dat2 = wavfile.read(File2)

if len(dat2) > len(dat1):#swap shortest
    temp = dat2
    dat2 = dat1
    dat1 = temp

output = dat1
for i in range(len(dat2)/2): output[i*2]=dat2[i*2]

wavfile.write(OUTPUT,rate,dat)

回答10:

u can also use simple import wavio library u also need have some basic knowledge of the sound.

来源：https://stackoverflow.com/questions/2060628/reading-wav-files-in-python

标签

python

audio

wav

wave