问题
I have been trying to do real-time audio signal processing using 'pyAudio' module in python. What I did was a simple case of reading audio data from microphone and play it via headphones. I tried with the following code(both Python and Cython versions). Thought it works but unfortunately it is stalls and not smooth enough. How can I improve the code so that it will run smoothly. My PC is i7, 8GB RAM.
Python Version
import pyaudio
import numpy as np
RATE = 16000
CHUNK = 256
p = pyaudio.PyAudio()
player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True,
frames_per_buffer=CHUNK)
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)
for i in range(int(20*RATE/CHUNK)): #do this for 10 seconds
player.write(np.fromstring(stream.read(CHUNK),dtype=np.int16))
stream.stop_stream()
stream.close()
p.terminate()
Cython Version
import pyaudio
import numpy as np
cdef int RATE = 16000
cdef int CHUNK = 1024
cdef int i
p = pyaudio.PyAudio()
player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True, frames_per_buffer=CHUNK)
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)
for i in range(500): #do this for 10 seconds
player.write(np.fromstring(stream.read(CHUNK),dtype=np.int16))
stream.stop_stream()
stream.close()
p.terminate()
回答1:
I believe you are missing CHUNK
as second argument to player.write
call.
player.write(np.fromstring(stream.read(CHUNK),dtype=np.int16),CHUNK)
Also, not sure if its formatting error. But player.write
needs to be tabbed into for
loop
And per pyaudio site you need to have RATE / CHUNK * RECORD_SECONDS
and not RECORD *RATE/CHUNK
as python
executes *
multiplication before /
division.
for i in range(int(20*RATE/CHUNK)): #do this for 10 seconds
player.write(np.fromstring(stream.read(CHUNK),dtype=np.int16),CHUNK)
stream.stop_stream()
stream.close()
p.terminate()
Finally, you may want to increase rate
to 44100
, CHUNK
to 1024
and CHANNEL
to 2
for better fidelity.
回答2:
The code below will take the default input device, and output what's recorded into the default output device.
import PyAudio
import numpy as np
p = pyaudio.PyAudio()
CHANNELS = 2
RATE = 44100
def callback(in_data, frame_count, time_info, flag):
# using Numpy to convert to array for processing
# audio_data = np.fromstring(in_data, dtype=np.float32)
return in_data, pyaudio.paContinue
stream = p.open(format=pyaudio.paFloat32,
channels=CHANNELS,
rate=RATE,
output=True,
input=True,
stream_callback=callback)
stream.start_stream()
while stream.is_active():
time.sleep(20)
stream.stop_stream()
print("Stream is stopped")
stream.close()
p.terminate()
This will run for 20 seconds and stop. The method callback is where you can process the signal :
audio_data = np.fromstring(in_data, dtype=np.float32)
return in_data
is where you send back post-processed data to the output device.
Note chunk has a default argument of 1024 as noted in the PyAudio docs: http://people.csail.mit.edu/hubert/pyaudio/docs/#pyaudio.PyAudio.open
回答3:
I am working on a similar project. I modified your code and the stalls now are gone. The bigger the chunk the bigger the delay. That is why I kept it low.
import pyaudio
import numpy as np
CHUNK = 2**5
RATE = 44100
LEN = 10
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)
player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True, frames_per_buffer=CHUNK)
for i in range(int(LEN*RATE/CHUNK)): #go for a LEN seconds
data = np.fromstring(stream.read(CHUNK),dtype=np.int16)
player.write(data,CHUNK)
stream.stop_stream()
stream.close()
p.terminate()
来源:https://stackoverflow.com/questions/46386011/real-time-audio-signal-processing-using-python