Question
So Python is acting like it can't hear anything from my microphone at all.
Here's the problem. I have a Python (2.7) script that is supposed to use Gstreamer to access my microphone and do speech recognition for me via Pocketsphinx. I'm using Pulse Audio and my device is a Raspberry Pi. My microphone is a Playstation 3 Eye.
Right off the bat, I have already gotten pocketsphinx_continuous to run correctly and recognize the words I have defined in my .dic and .lm files. The accuracy is around 85-90% after a couple of trial runs. So I know my microphone is picking up sound normally via pocketsphinx + pulse audio.
FYI I ran the following:
pocketsphinx_continuous -lm /home/pi/dev/scarlettPi/config/speech/lm/scarlett.lm -dict /home/pi/dev/scarlettPi/config/speech/dict/scarlett.dic -hmm /home/pi/dev/scarlettPi/config/speech/model/hmm/en_US/hub4wsj_sc_8k -silprob 0.1 -wip 1e-4 -bestpath 0
In my Python code I'm attempting to do the same thing, but I'm using gstreamer to access the microphone. (Note: I'm a bit new to Python.)
Here is my code (thanks to Josip Lisec for getting me this far):
import pi
from pi.becore import ScarlettConfig
from recorder import Recorder
from brain import Brain
import os
import json
import tempfile
#import sys
import pygtk
pygtk.require('2.0')
import gtk
import gobject
import pygst
pygst.require('0.10')
gobject.threads_init()
import gst

scarlett_config=ScarlettConfig()

class Listener:

    def __init__(self, gobject, gst):
        self.failed = 0
        self.pipeline = gst.parse_launch(' ! '.join(['pulsesrc',
                                                     'audioconvert',
                                                     'audioresample',
                                                     'vader name=vader auto-threshold=true',
                                                     'pocketsphinx lm=' + scarlett_config.get('LM') + ' dict=' + scarlett_config.get('DICT') + ' hmm=' + scarlett_config.get('HMM') + ' name=listener',
                                                     'fakesink']))
        listener = self.pipeline.get_by_name('listener')
        listener.connect('result', self.__result__)
        listener.set_property('configured', True)
        print "KEYWORDS WE'RE LOOKING FOR: " + scarlett_config.get('ourkeywords')
        bus = self.pipeline.get_bus()
        bus.add_signal_watch()
        bus.connect('message::application', self.__application_message__)
        self.pipeline.set_state(gst.STATE_PLAYING)

    def result(self, hyp, uttid):
        if hyp in scarlett_config.get('ourkeywords'):
            self.failed = 0
            self.listen()
        else:
            self.failed += 1
            if self.failed > 4:
                pi.speak("" + scarlett_config.get('scarlett_owner') + ", if you need me, just say my name.")
                self.failed = 0

    def listen(self):
        self.pipeline.set_state(gst.STATE_PAUSED)
        pi.play('pi-listening')
        Recorder(self)

    def cancel_listening(self):
        pi.play('pi-cancel')
        self.pipeline.set_state(gst.STATE_PLAYING)

    # question - sound recording
    def answer(self, question):
        pi.play('pi-cancel')
        print " * Contacting Google"
        destf = tempfile.mktemp(suffix='piresult')
        os.system('wget --post-file %s --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7" --header="Content-Type: audio/x-flac; rate=16000" -O %s -q "https://www.google.com/speech-api/v1/recognize?client=chromium&lang=en-US"' % (question, destf))
        #os.system("speech2text %s > %s" % (question, destf))
        b = open(destf)
        result = b.read()
        b.close()
        os.unlink(question)
        os.unlink(destf)
        if len(result) == 0:
            print " * nop"
            pi.play('pi-cancel')
        else:
            brain = Brain(json.loads(result))
            if brain.think() == False:
                print " * nop2"
                pi.play('pi-cancel')
        self.pipeline.set_state(gst.STATE_PLAYING)

    def __result__(self, listener, text, uttid):
        struct = gst.Structure('result')
        struct.set_value('hyp', text)
        struct.set_value('uttid', uttid)
        listener.post_message(gst.message_new_application(listener, struct))

    def __application_message__(self, bus, msg):
        msgtype = msg.structure.get_name()
        if msgtype == 'result':
            self.result(msg.structure['hyp'], msg.structure['uttid'])
The application is supposed to match on the keyword "Scarlett" and then perform an action.
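A side note on the keyword check itself, separate from the microphone problem: the script concatenates scarlett_config.get('ourkeywords') directly onto a string, so the value appears to be a plain string such as "[ 'scarlett', 'SCARLETT' ]" rather than a list. If that is the case, `hyp in ...` is a substring test, which would also accept an empty hypothesis or a fragment like "scar". The following is only a minimal sketch under that assumption; parse_keywords and is_keyword are hypothetical helper names, not part of the original project.

```python
import ast


def parse_keywords(raw):
    """Turn a value such as "[ 'scarlett', 'SCARLETT' ]" into a set of
    lowercase keywords. Assumes the config stores a Python-style list literal
    (an assumption; adjust if the real config format differs)."""
    if not isinstance(raw, (list, tuple, set)):
        # strip() avoids an indentation error on leading whitespace
        raw = ast.literal_eval(raw.strip())
    return set(word.strip().lower() for word in raw)


def is_keyword(hyp, keywords):
    """Exact, case-insensitive match of a recognizer hypothesis."""
    if not hyp:
        return False
    return hyp.strip().lower() in keywords


RAW = "[ 'scarlett', 'SCARLETT' ]"
KEYWORDS = parse_keywords(RAW)
```

With this, is_keyword(hyp, KEYWORDS) could replace the `hyp in scarlett_config.get('ourkeywords')` test in result(), so that only a whole-word match wakes the listener.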
When I run my application, I get the following output:
pi@scarlettpi ~/dev/scarlettPi/scripts/pi/bin $ ./pi
/usr/lib/python2.7/dist-packages/gtk-2.0/gtk/__init__.py:57: GtkWarning: could not open display
warnings.warn(str(e), _gtk.Warning)
INFO: cmd_ln.c(691): Parsing command line:
gst-pocketsphinx \
-samprate 8000 \
-cmn prior \
-fwdflat no \
-bestpath no \
-maxhmmpf 2000 \
-maxwpf 20
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath no no
-bestpathlw 9.5 9.500000e+00
-bghist no no
-ceplen 13 13
-cmn current prior
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes no
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctl
-lmname default default
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf -1 2000
-maxnewoov 20 20
-maxwpf -1 20
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-5 1.000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 8.000000e+03
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.1 1.000000e-01
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-usewdphones no no
-uw 1.0 1.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 1e-4 1.000000e-04
-wlen 0.025625 2.562500e-02
INFO: cmd_ln.c(691): Parsing command line:
\
-nfilt 20 \
-lowerf 1 \
-upperf 4000 \
-wlen 0.025 \
-transform dct \
-round_filters no \
-remove_dc yes \
-svspec 0-12/13-25/26-38 \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-cmninit 56,-3,1 \
-varnorm no
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 56,-3,1
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.000000e+00
-ncep 13 13
-nfft 512 512
-nfilt 40 20
-remove_dc no yes
-round_filters yes no
-samprate 16000 8.000000e+03
-seed -1 -1
-smoothspec no no
-svspec 0-12/13-25/26-38
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 4.000000e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.500000e-02
INFO: acmod.c(246): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/feat.params
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(167): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(517): Reading model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: mdef.c(528): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: bin_mdef.c(513): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(903): Loading senones from dump file /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/sendump
INFO: s2_semi_mgau.c(927): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(1022): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1296): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: dict.c(317): Allocating 4120 * 20 bytes (80 KiB) for word entries
INFO: dict.c(332): Reading main dictionary: /home/pi/dev/scarlettPi/config/speech/dict/scarlett.dic
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(335): 13 words read
INFO: dict.c(341): Reading filler dictionary: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/noisedict
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(344): 11 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 30200 bytes (29 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 30200 bytes (29 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(477): ngrams 1=12, 2=18, 3=17
INFO: ngram_model_arpa.c(135): Reading unigrams
INFO: ngram_model_arpa.c(516): 12 = #unigrams created
INFO: ngram_model_arpa.c(195): Reading bigrams
INFO: ngram_model_arpa.c(533): 18 = #bigrams created
INFO: ngram_model_arpa.c(534): 3 = #prob2 entries
INFO: ngram_model_arpa.c(542): 3 = #bo_wt2 entries
INFO: ngram_model_arpa.c(292): Reading trigrams
INFO: ngram_model_arpa.c(555): 17 = #trigrams created
INFO: ngram_model_arpa.c(556): 2 = #prob3 entries
INFO: ngram_search_fwdtree.c(99): 12 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 12 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 12 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 152
INFO: ngram_search_fwdtree.c(338): after: 12 root, 24 non-root channels, 11 single-phone words
KEYWORDS WE'RE LOOKING FOR: [ 'scarlett', 'SCARLETT' ]
But it fails to match on anything. I almost think Python cannot hear anything from the microphone; there aren't even any attempts to recognize anything. pocketsphinx_continuous usually prints a READY state when it's prepared to start listening... should I expect the same in Python?
Here are my python packages:
pi@scarlettpi ~/dev/scarlettPi/scripts/pi/bin $ dpkg -l | grep -i python
ii idle 2.7.3-4 all IDE for Python using Tkinter (default version)
ii idle-python2.7 2.7.3-6 all IDE for Python (v2.7) using Tkinter
rc idle3 3.2.3-6 all IDE for Python using Tkinter (default version)
ii libpyside1.1:armhf 1.1.1-3 armhf Python bindings for Qt 4 (base files)
ii libpython2.6 2.6.8-1.1 armhf Shared Python runtime library (version 2.6)
ii libpython2.7 2.7.3-6 armhf Shared Python runtime library (version 2.7)
ii libshiboken1.1:armhf 1.1.1-1 armhf CPython bindings generator for C++ libraries - shared library
ii python 2.7.3-4 all interactive high-level object-oriented language (default version)
ii python-alsaaudio 0.5+svn36-1 armhf Alsa bindings for Python
ii python-cairo 1.8.8-1 armhf Python bindings for the Cairo vector graphics library
ii python-dbg 2.7.3-4 all debug build of the Python Interpreter (version 2.7)
ii python-dbus 1.1.1-1 armhf simple interprocess messaging system (Python interface)
ii python-dbus-dev 1.1.1-1 all main loop integration development files for python-dbus
ii python-dev 2.7.3-4 all header files and a static library for Python (default)
ii python-gi 3.2.2-2 armhf Python 2.x bindings for gobject-introspection libraries
ii python-gi-dbg 3.2.2-2 armhf Python bindings for the GObject library (debug extension)
ii python-gi-dev 3.2.2-2 all development headers for GObject Python bindings
ii python-gobject 3.2.2-2 all Python 2.x bindings for GObject - transitional package
ii python-gobject-2 2.28.6-10 armhf deprecated static Python bindings for the GObject library
ii python-gobject-2-dbg 2.28.6-10 armhf deprecated static Python bindings for the GObject library (debug extension)
ii python-gobject-2-dev 2.28.6-10 all development headers for the static GObject Python bindings
ii python-gobject-dbg 3.2.2-2 all Python 2.x debugging modules for GObject - transitional package
ii python-gobject-dev 3.2.2-2 all Python 2.x development headers for GObject - transitional package
ii python-gst0.10 0.10.22-3 armhf generic media-playing framework (Python bindings)
ii python-gst0.10-dbg 0.10.22-3 armhf generic media-playing framework (Python debug bindings)
ii python-gst0.10-dev 0.10.22-3 armhf generic media-playing framework (Python bindings)
ii python-gst0.10-rtsp 0.10.8-3 armhf GStreamer RTSP server plugin (Python bindings)
ii python-gtk2 2.24.0-3 armhf Python bindings for the GTK+ widget set
ii python-iplib 1.1-3 all Python library to convert amongst many different IPv4 notations
ii python-libxml2 2.8.0+dfsg1-7+nmu1 armhf Python bindings for the GNOME XML library
ii python-minimal 2.7.3-4 all minimal subset of the Python language (default version)
ii python-numpy 1:1.6.2-1.2 armhf Numerical Python adds a fast array facility to the Python language
ii python-pexpect 2.4-1 all Python module for automating interactive applications
ii python-pip 1.1-3 all alternative Python package installer
ii python-pkg-resources 0.6.24-1 all Package Discovery and Resource Access using pkg_resources
ii python-pyalsa 1.0.25-1 armhf Official ALSA Python binding library
ii python-pyside 1.1.1-3 all Python bindings for Qt4 (big metapackage)
ii python-pyside.phonon 1.1.1-3 armhf Qt 4 Phonon module - Python bindings
ii python-pyside.qtcore 1.1.1-3 armhf Qt 4 core module - Python bindings
ii python-pyside.qtdeclarative 1.1.1-3 armhf Qt 4 Declarative module - Python bindings
ii python-pyside.qtgui 1.1.1-3 armhf Qt 4 GUI module - Python bindings
ii python-pyside.qthelp 1.1.1-3 armhf Qt 4 help module - Python bindings
ii python-pyside.qtnetwork 1.1.1-3 armhf Qt 4 network module - Python bindings
ii python-pyside.qtopengl 1.1.1-3 armhf Qt 4 OpenGL module - Python bindings
ii python-pyside.qtscript 1.1.1-3 armhf Qt 4 script module - Python bindings
ii python-pyside.qtsql 1.1.1-3 armhf Qt 4 SQL module - Python bindings
ii python-pyside.qtsvg 1.1.1-3 armhf Qt 4 SVG module - Python bindings
ii python-pyside.qttest 1.1.1-3 armhf Qt 4 test module - Python bindings
ii python-pyside.qtuitools 1.1.1-3 armhf Qt 4 UI tools module - Python bindings
ii python-pyside.qtwebkit 1.1.1-3 armhf Qt 4 WebKit module - Python bindings
ii python-pyside.qtxml 1.1.1-3 armhf Qt 4 XML module - Python bindings
ii python-rpi.gpio 0.5.3a-1 armhf Python GPIO module for Raspberry Pi
ii python-setuptools 0.6.24-1 all Python Distutils Enhancements (setuptools compatibility)
ii python-simplejson 2.5.2-1 armhf simple, fast, extensible JSON encoder/decoder for Python
ii python-support 1.0.15 all automated rebuilding support for Python modules
ii python-tk 2.7.3-1 armhf Tkinter - Writing Tk applications with Python
ii python-yaml 3.10-4 armhf YAML parser and emitter for Python
ii python-yaml-dbg 3.10-4 armhf YAML parser and emitter for Python (debug build)
ii python2.6 2.6.8-1.1 armhf Interactive high-level object-oriented language (version 2.6)
ii python2.6-minimal 2.6.8-1.1 armhf Minimal subset of the Python language (version 2.6)
ii python2.7 2.7.3-6 armhf Interactive high-level object-oriented language (version 2.7)
ii python2.7-dbg 2.7.3-6 armhf Debug Build of the Python Interpreter (version 2.7)
ii python2.7-dev 2.7.3-6 armhf Header files and a static library for Python (v2.7)
ii python2.7-minimal 2.7.3-6 armhf Minimal subset of the Python language (version 2.7)
pi@scarlettpi ~/dev/scarlettPi/scripts/pi/bin $
Also, just to confirm that pocketsphinx is compiled correctly against the right libraries:
pi@scarlettpi ~ $ ldd /usr/local/bin/pocketsphinx_continuous
/usr/lib/arm-linux-gnueabihf/libcofi_rpi.so (0xb6f9b000)
libpocketsphinx.so.1 => /usr/local/lib/libpocketsphinx.so.1 (0xb6f5a000)
libsphinxad.so.0 => /usr/local/lib/libsphinxad.so.0 (0xb6f4e000)
libsphinxbase.so.1 => /usr/local/lib/libsphinxbase.so.1 (0xb6f07000)
libpulse.so.0 => /usr/lib/arm-linux-gnueabihf/libpulse.so.0 (0xb6ea8000)
libpulse-simple.so.0 => /usr/lib/arm-linux-gnueabihf/libpulse-simple.so.0 (0xb6e9c000)
libpthread.so.0 => /lib/arm-linux-gnueabihf/libpthread.so.0 (0xb6e7d000)
libm.so.6 => /lib/arm-linux-gnueabihf/libm.so.6 (0xb6e0c000)
libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0xb6cdd000)
libjson.so.0 => /lib/arm-linux-gnueabihf/libjson.so.0 (0xb6ccd000)
libpulsecommon-2.0.so => /usr/lib/arm-linux-gnueabihf/pulseaudio/libpulsecommon-2.0.so (0xb6c6b000)
libdbus-1.so.3 => /lib/arm-linux-gnueabihf/libdbus-1.so.3 (0xb6c29000)
libcap.so.2 => /lib/arm-linux-gnueabihf/libcap.so.2 (0xb6c1e000)
librt.so.1 => /lib/arm-linux-gnueabihf/librt.so.1 (0xb6c0f000)
libdl.so.2 => /lib/arm-linux-gnueabihf/libdl.so.2 (0xb6c04000)
libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0xb6bdb000)
/lib/ld-linux-armhf.so.3 (0xb6fa8000)
libX11-xcb.so.1 => /usr/lib/arm-linux-gnueabihf/libX11-xcb.so.1 (0xb6bd2000)
libX11.so.6 => /usr/lib/arm-linux-gnueabihf/libX11.so.6 (0xb6abe000)
libxcb.so.1 => /usr/lib/arm-linux-gnueabihf/libxcb.so.1 (0xb6a9f000)
libICE.so.6 => /usr/lib/arm-linux-gnueabihf/libICE.so.6 (0xb6a82000)
libSM.so.6 => /usr/lib/arm-linux-gnueabihf/libSM.so.6 (0xb6a73000)
libXtst.so.6 => /usr/lib/arm-linux-gnueabihf/libXtst.so.6 (0xb6a67000)
libwrap.so.0 => /lib/arm-linux-gnueabihf/libwrap.so.0 (0xb6a57000)
libsndfile.so.1 => /usr/lib/arm-linux-gnueabihf/libsndfile.so.1 (0xb69ee000)
libasyncns.so.0 => /usr/lib/arm-linux-gnueabihf/libasyncns.so.0 (0xb69e2000)
libattr.so.1 => /lib/arm-linux-gnueabihf/libattr.so.1 (0xb69d4000)
libXau.so.6 => /usr/lib/arm-linux-gnueabihf/libXau.so.6 (0xb69ca000)
libXdmcp.so.6 => /usr/lib/arm-linux-gnueabihf/libXdmcp.so.6 (0xb69be000)
libuuid.so.1 => /lib/arm-linux-gnueabihf/libuuid.so.1 (0xb69b1000)
libXext.so.6 => /usr/lib/arm-linux-gnueabihf/libXext.so.6 (0xb699b000)
libXi.so.6 => /usr/lib/arm-linux-gnueabihf/libXi.so.6 (0xb6986000)
libnsl.so.1 => /lib/arm-linux-gnueabihf/libnsl.so.1 (0xb696a000)
libFLAC.so.8 => /usr/lib/arm-linux-gnueabihf/libFLAC.so.8 (0xb691f000)
libvorbisenc.so.2 => /usr/lib/arm-linux-gnueabihf/libvorbisenc.so.2 (0xb67b2000)
libvorbis.so.0 => /usr/lib/arm-linux-gnueabihf/libvorbis.so.0 (0xb6782000)
libogg.so.0 => /usr/lib/arm-linux-gnueabihf/libogg.so.0 (0xb6775000)
libresolv.so.2 => /lib/arm-linux-gnueabihf/libresolv.so.2 (0xb6761000)
pi@scarlettpi ~ $
And if you need to see any information about my microphone ( ps3 eye ):
Had to throw this in pastebin, ran out of room in this post.
http://pastebin.com/gSDZwRHc
Does anyone have any ideas why this isn't working? Please let me know if my question needs any clarification or if I can provide any more information to aid with debugging.
Thanks.
Answer 1:
So I finally got this working.
A couple of key things I needed to realize:
1. Even if you're using Pulseaudio on your Raspberry Pi, as long as Alsa is still installed you can still use it. (This might seem like a no-brainer to others, but I honestly didn't realize I could use both at the same time.) Hint via syb0rg.
2. When it comes to sending large amounts of raw audio data (.wav format in my case) to Pocketsphinx via Gstreamer, queues are your friend.
After messing around with gst-launch-0.10 on the command line for a while, I came across something that actually worked:
gst-launch-0.10 alsasrc device=hw:1 ! queue ! audioconvert ! audioresample ! queue ! vader name=vader auto-threshold=true ! pocketsphinx lm=/home/pi/dev/scarlettPi/config/speech/lm/scarlett.lm dict=/home/pi/dev/scarlettPi/config/speech/dict/scarlett.dic hmm=/usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k name=listener ! fakesink dump=1
So what's happening here?
- Gstreamer is listening to device hw:1 (which is my PS3 Eye USB device). The device number may differ on your system; you can determine it by running:
pi@scarlettpi ~ $ pacmd dump
Welcome to PulseAudio! Use "help" for usage information.
....
load-module module-alsa-card device_id="0" name="platform-bcm2835_AUD0.0" card_name="alsa_card.platform-bcm2835_AUD0.0" namereg_fail=false tsched=yes fixed_latency_range=no ignore_dB=no deferred_volume=yes card_properties="module-udev-detect.discovered=1"
load-module module-udev-detect
load-module module-bluetooth-discover
load-module module-esound-protocol-unix
load-module module-native-protocol-unix
load-module module-gconf
load-module module-default-device-restore
load-module module-rescue-streams
load-module module-always-sink
load-module module-intended-roles
load-module module-console-kit
load-module module-systemd-login
load-module module-position-event-sounds
load-module module-role-cork
load-module module-filter-heuristics
load-module module-filter-apply
load-module module-dbus-protocol
load-module module-switch-on-port-available
load-module module-cli-protocol-unix
load-module module-alsa-card device_id="1" name="usb-OmniVision_Technologies__Inc._USB_Camera-B4.09.24.1-01-CameraB409241" card_name="alsa_card.usb-OmniVision_Technologies__Inc._USB_Camera-B4.09.24.1-01-CameraB409241" namereg_fail=false tsched=yes fixed_latency_range=no ignore_dB=no deferred_volume=yes card_properties="module-udev-detect.discovered=1"
....
The important line to notice is:
load-module module-alsa-card device_id="1" name="usb-OmniVision_Technologies__Inc._USB_Camera-B4.09.24.1-01-CameraB409241" card_name="alsa_card.usb-OmniVision_Technologies__Inc._USB_Camera-B4.09.24.1-01-CameraB409241" namereg_fail=false tsched=yes fixed_latency_range=no ignore_dB=no deferred_volume=yes card_properties="module-udev-detect.discovered=1"
That's my Playstation 3 Eye, and it's on device_id=1, hence hw:1.
- The audio data coming in from the PS3 Eye gets resampled and added to a gstreamer queue, and has to pass through a vader element before moving on to pocketsphinx. By passing the audio through the vader element with the auto-threshold=true flag on, gstreamer can determine the background noise level, which can be important if you have a lousy soundcard or a far-field microphone. This is how the pocketsphinx element knows when an utterance starts and ends.
- Add the regular pocketsphinx arguments (lm, dict, hmm) that we already determined to the pipeline.
- Pass everything into a fakesink, since we don't need to hear anything right now; we only need pocketsphinx to listen to everything. The dump=1 flag provides more debugging information, so you can see what's being processed and whether audio is being accepted at all.
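The device lookup from the first step can also be scripted rather than read off by eye. This is only a sketch: it assumes the `pacmd dump` output contains `load-module module-alsa-card device_id="N" name="..."` entries like the ones shown above, and the helper names alsa_cards and find_device are made up for illustration. The sample text is trimmed from the dump in this answer.

```python
import re

# Matches entries such as:
#   load-module module-alsa-card device_id="1" name="usb-..."
_CARD_RE = re.compile(
    r'load-module module-alsa-card device_id="(\d+)" name="([^"]+)"')


def alsa_cards(pacmd_dump_text):
    """Return a list of (alsa_device, card_name) tuples found in the dump."""
    return [('hw:' + dev_id, name)
            for dev_id, name in _CARD_RE.findall(pacmd_dump_text)]


def find_device(pacmd_dump_text, name_fragment):
    """Return 'hw:N' for the first card whose name contains name_fragment
    (case-insensitive), or None if no card matches."""
    for device, name in alsa_cards(pacmd_dump_text):
        if name_fragment.lower() in name.lower():
            return device
    return None


# Trimmed sample taken from the pacmd dump output in this answer.
SAMPLE = '''load-module module-alsa-card device_id="0" name="platform-bcm2835_AUD0.0" card_name="alsa_card.platform-bcm2835_AUD0.0"
load-module module-udev-detect
load-module module-alsa-card device_id="1" name="usb-OmniVision_Technologies__Inc._USB_Camera-B4.09.24.1-01-CameraB409241" card_name="alsa_card.usb-OmniVision_Technologies__Inc._USB_Camera-B4.09.24.1-01-CameraB409241"
'''

ps3_eye = find_device(SAMPLE, 'OmniVision')
```

On a real system the text would come from running `pacmd dump` (for example via subprocess.check_output) instead of the SAMPLE string, and the result could be stored as the audio input device in the config.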
After getting that to run successfully, the new Python code looks like this:
self.pipeline = gst.parse_launch(' ! '.join(['alsasrc device=' + scarlett_config.gimmie('audio_input_device'),
                                             'queue',
                                             'audioconvert',
                                             'audioresample',
                                             'queue',
                                             'vader name=vader auto-threshold=true',
                                             'pocketsphinx lm=' + scarlett_config.gimmie('LM') + ' dict=' + scarlett_config.gimmie('DICT') + ' hmm=' + scarlett_config.gimmie('HMM') + ' name=listener',
                                             'fakesink dump=1']))
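One way to keep the command-line test and the Python code in sync is to build the pipeline description in a small function and reuse the same string in both places. The sketch below does only the string building, so it runs without Gstreamer installed; build_pipeline and its parameter names are hypothetical, and the element list simply mirrors the pipeline above.

```python
def build_pipeline(device, lm, dic, hmm):
    """Return a gst-launch style pipeline description for the given ALSA
    device and pocketsphinx model files. Mirrors the pipeline in this answer."""
    elements = [
        'alsasrc device=' + device,
        'queue',
        'audioconvert',
        'audioresample',
        'queue',
        'vader name=vader auto-threshold=true',
        'pocketsphinx lm=%s dict=%s hmm=%s name=listener' % (lm, dic, hmm),
        'fakesink dump=1',
    ]
    return ' ! '.join(elements)


description = build_pipeline(
    'hw:1',
    '/home/pi/dev/scarlettPi/config/speech/lm/scarlett.lm',
    '/home/pi/dev/scarlettPi/config/speech/dict/scarlett.dic',
    '/usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k',
)
```

Printing 'gst-launch-0.10 ' + description gives a command that can be pasted into a shell for testing, and the same description string can be handed to gst.parse_launch in the script.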
Hope this helps someone.
NOTE: Please excuse me if my Gstreamer pipeline uses excessive elements. I'm fairly new to Gstreamer, and I'm open to more efficient ways of doing this.
Source: https://stackoverflow.com/questions/18087720/python-having-trouble-accessing-usb-microphone-using-gstreamer-to-perform-speech