Question
The documentation of scipy.io.wavfile.read says that it returns the sample rate and data. But what does data actually mean here in the case of .wav files?
Can anyone explain in layman's terms how that data is prepared?
PS. I read somewhere that it means amplitude? Is what I read correct? If yes, how is that amplitude calculated and returned by scipy.io.wavfile.read?
Answer 1:
scipy.io.wavfile.read is a convenience function that splits a .wav file into its header information and the sample data contained in the file.
From the docstring in the source code:
    Returns
    -------
    rate : int
        Sample rate of wav file.
    data : numpy array
        Data read from wav file. Data-type is determined from the file;
        see Notes.
Simplified code from the source:
fid = open(filename, 'rb')
try:
    file_size, is_big_endian = _read_riff_chunk(fid)  # find out how to read the file
    channels = 1  # assume 1 channel and 8 bit depth if there is no format chunk
    bit_depth = 8
    while fid.tell() < file_size:  # walk through the file one chunk at a time
        # read the next chunk
        chunk_id = fid.read(4)
        if chunk_id == b'fmt ':  # retrieve formatting information
            fmt_chunk = _read_fmt_chunk(fid, is_big_endian)
            format_tag, channels, fs = fmt_chunk[1:4]
            bit_depth = fmt_chunk[6]
            if bit_depth not in (8, 16, 32, 64, 96, 128):
                raise ValueError("Unsupported bit depth: the wav file "
                                 "has {}-bit data.".format(bit_depth))
        elif chunk_id == b'data':  # the samples themselves
            data = _read_data_chunk(fid, format_tag, channels, bit_depth,
                                    is_big_endian, mmap)
finally:
    if not hasattr(filename, 'read'):
        fid.close()
    else:
        fid.seek(0)
return fs, data
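The data itself is usually PCM (pulse-code modulation): each value is the amplitude of the sound-pressure signal sampled at one instant, so yes, what you read is correct. The samples are stored in successive frames, with one sample per channel in each frame. The sample rate returned by scipy.io.wavfile.read is needed to determine how many frames make up one second of audio.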
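The frame and channel layout, and the role of the sample rate, can be sketched with a small round trip. This is only an illustration under assumptions: the 8000 Hz rate, the 440 Hz tone, and the file name demo.wav are made up for the demo, and the file is written with scipy.io.wavfile.write so that the example does not depend on any existing recording.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

rate = 8000                           # frames per second (assumed for the demo)
t = np.arange(rate) / rate            # time stamps for one second of audio

# Left channel: 440 Hz tone at half of the int16 full scale. Right channel: silence.
left = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
right = np.zeros_like(left)
stereo = np.column_stack([left, right])   # one row per frame, one column per channel

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.wav")  # hypothetical file name
    wavfile.write(path, rate, stereo)
    fs, data = wavfile.read(path)

# Number of frames divided by frames-per-second gives the length in seconds.
duration = data.shape[0] / fs

print(fs, data.dtype, data.shape, duration)
```

Here data[:, 0] is the left channel and data[:, 1] the right one; a mono file would come back as a one-dimensional array instead.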
A good explanation of the .wav format is offered by this question.
scipy doesn't calculate much on its own: it reads the samples as they are stored in the file and returns them as a NumPy array.
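To see that the numbers really are stored in the file rather than computed, the samples can be decoded by hand with the standard library alone. This is a minimal sketch, assuming a mono, 16-bit PCM file; the sample values and the file name tiny.wav are invented for the illustration. For such a file, scipy.io.wavfile.read should hand back the same integers as an int16 array.

```python
import os
import struct
import tempfile
import wave

samples = [0, 1000, -1000, 32767, -32768]    # made-up 16-bit amplitudes

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tiny.wav")     # hypothetical file name

    # Write a tiny mono, 16-bit, 8 kHz PCM file.
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)                    # 2 bytes = 16 bits per sample
        w.setframerate(8000)
        w.writeframes(struct.pack("<5h", *samples))

    # Read the sample rate and the raw bytes of the data chunk back.
    with wave.open(path, "rb") as r:
        rate = r.getframerate()
        raw = r.readframes(r.getnframes())

# Each sample is a little-endian signed 16-bit integer; no scaling is applied.
decoded = list(struct.unpack("<{}h".format(len(raw) // 2), raw))

print(rate, decoded)
```

The decoded list matches the values that were written, which is the sense in which the data is just the stored amplitudes.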
Source: https://stackoverflow.com/questions/40399930/what-does-the-data-returned-by-scipy-io-wavfile-read-mean