Speech recognition - RaspberryPi + Pocketsphinx + ps3eye Error: Failed to open audio device
I just installed pocketsphinx on my Raspberry Pi. I think I'm going crazy, and I'm not sure if I'm providing the correct device.
Whenever I run:
src/programs/pocketsphinx_continuous -adcdev plughw:1,0 -nfft 2048 -samprate 48000
I get the following:
root@scarlettpi:/usr/install/pocketsphinx-0.8# src/programs/pocketsphinx_continuous -adcdev plughw:1,0 -nfft 2048 -samprate 48000
info: cmd_ln.c(691): parsing command line:
/usr/install/pocketsphinx-0.8/src/programs/.libs/lt-pocketsphinx_continuous \
  -adcdev plughw:1,0 \
  -nfft 2048 \
  -samprate 48000
current configuration: [name] [deflt] [value] -adcdev plughw:1,0 -agc none none -agcthresh 2.0 2.000000e+00 -alpha 0.97 9.700000e-01 -argfile
-ascale 20.0 2.000000e+01 -aw 1 1 -backtrace no no -beam 1e-48 1.000000e-48 -bestpath yes yes -bestpathlw 9.5 9.500000e+00 -bghist no no -ceplen 13 13 -cmn current current -cmninit 8.0 8.0 -compallsen no no -debug 0 -dict
-dictcase no no -dither no no -doublebw no no -ds 1 1 -fdict
-feat 1s_c_d_dd 1s_c_d_dd -featparams
-fillprob 1e-8 1.000000e-08 -frate 100 100 -fsg
-fsgusealtpron yes yes -fsgusefiller yes yes -fwdflat yes yes -fwdflatbeam 1e-64 1.000000e-64 -fwdflatefwid 4 4 -fwdflatlw 8.5 8.500000e+00 -fwdflatsfwin 25 25 -fwdflatwbeam 7e-29 7.000000e-29 -fwdtree yes yes -hmm
-infile
-input_endian little little -jsgf
-kdmaxbbi -1 -1 -kdmaxdepth 0 0 -kdtree
-latsize 5000 5000 -lda
-ldadim 0 0 -lextreedump 0 0 -lifter 0 0 -lm
-lmctl
-lmname default default -logbase 1.0001 1.000100e+00 -logfn
-logspec no no -lowerf 133.33334 1.333333e+02 -lpbeam 1e-40 1.000000e-40 -lponlybeam 7e-29 7.000000e-29 -lw 6.5 6.500000e+00 -maxhmmpf -1 -1 -maxnewoov 20 20 -maxwpf -1 -1 -mdef
-mean
-mfclogdir
-min_endfr 0 0 -mixw
-mixwfloor 0.0000001 1.000000e-07 -mllr
-mmap yes yes -ncep 13 13 -nfft 512 2048 -nfilt 40 40 -nwpen 1.0 1.000000e+00 -pbeam 1e-48 1.000000e-48 -pip 1.0 1.000000e+00 -pl_beam 1e-10 1.000000e-10 -pl_pbeam 1e-5 1.000000e-05 -pl_window 0 0 -rawlogdir
-remove_dc no no -round_filters yes yes -samprate 16000 4.800000e+04 -seed -1 -1 -sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03 -smoothspec no no -svspec
-time no no -tmat
-tmatfloor 0.0001 1.000000e-04 -topn 4 4 -topn_beam 0 0 -toprule
-transform legacy legacy -unit_area yes yes -upperf 6855.4976 6.855498e+03 -usewdphones no no -uw 1.0 1.000000e+00 -var
-varfloor 0.0001 1.000000e-04 -varnorm no no -verbose no no -warp_params
-warp_type inverse_linear inverse_linear -wbeam 7e-29 7.000000e-29 -wip 0.65 6.500000e-01 -wlen 0.025625 2.562500e-02
info: cmd_ln.c(691): parsing command line: \
  -nfilt 20 \
  -lowerf 1 \
  -upperf 4000 \
  -wlen 0.025 \
  -transform dct \
  -round_filters no \
  -remove_dc yes \
  -svspec 0-12/13-25/26-38 \
  -feat 1s_c_d_dd \
  -agc none \
  -cmn current \
  -cmninit 56,-3,1 \
  -varnorm no
current configuration: [name] [deflt] [value] -agc none none -agcthresh 2.0 2.000000e+00 -alpha 0.97 9.700000e-01 -ceplen 13 13 -cmn current current -cmninit 8.0 56,-3,1 -dither no no -doublebw no no -feat 1s_c_d_dd 1s_c_d_dd -frate 100 100 -input_endian little little -lda
-ldadim 0 0 -lifter 0 0 -logspec no no -lowerf 133.33334 1.000000e+00 -ncep 13 13 -nfft 512 2048 -nfilt 40 20 -remove_dc no yes -round_filters yes no -samprate 16000 4.800000e+04 -seed -1 -1 -smoothspec no no -svspec 0-12/13-25/26-38 -transform legacy dct -unit_area yes yes -upperf 6855.4976 4.000000e+03 -varnorm no no -verbose no no -warp_params
-warp_type inverse_linear inverse_linear -wlen 0.025625 2.500000e-02
info: acmod.c(246): parsed model-specific feature parameters /usr/local/share/pocketsphinx/model/hmm/en_us/hub4wsj_sc_8k/feat.params
info: feat.c(713): initializing feature stream type: '1s_c_d_dd', ceplen=13, cmn='current', varnorm='no', agc='none'
info: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
info: acmod.c(167): using subvector specification 0-12/13-25/26-38
info: mdef.c(517): reading model definition: /usr/local/share/pocketsphinx/model/hmm/en_us/hub4wsj_sc_8k/mdef
info: mdef.c(528): found byte-order mark bmdf, assuming binary mdef file
info: bin_mdef.c(336): reading binary model definition: /usr/local/share/pocketsphinx/model/hmm/en_us/hub4wsj_sc_8k/mdef
info: bin_mdef.c(513): 50 ci-phone, 143047 cd-phone, 3 emitstate/phone, 150 ci-sen, 5150 sen, 27135 sen-seq
info: tmat.c(205): reading hmm transition probability matrices: /usr/local/share/pocketsphinx/model/hmm/en_us/hub4wsj_sc_8k/transition_matrices
info: acmod.c(121): attempting use schmm computation module
info: ms_gauden.c(198): reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_us/hub4wsj_sc_8k/means
info: ms_gauden.c(292): 1 codebook, 3 feature, size:
info: ms_gauden.c(294): 256x13
info: ms_gauden.c(294): 256x13
info: ms_gauden.c(294): 256x13
info: ms_gauden.c(198): reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_us/hub4wsj_sc_8k/variances
info: ms_gauden.c(292): 1 codebook, 3 feature, size:
info: ms_gauden.c(294): 256x13
info: ms_gauden.c(294): 256x13
info: ms_gauden.c(294): 256x13
info: ms_gauden.c(354): 0 variance values floored
info: s2_semi_mgau.c(903): loading senones dump file /usr/local/share/pocketsphinx/model/hmm/en_us/hub4wsj_sc_8k/sendump
info: s2_semi_mgau.c(927): begin file format description
info: s2_semi_mgau.c(1022): using memory-mapped i/o senones
info: s2_semi_mgau.c(1296): maximum top-n: 4 top-n beams: 0 0 0
info: dict.c(317): allocating 137543 * 20 bytes (2686 kib) word entries
info: dict.c(332): reading main dictionary: /usr/local/share/pocketsphinx/model/lm/en_us/cmu07a.dic
info: dict.c(211): allocated 1010 kib strings, 1664 kib phones
info: dict.c(335): 133436 words read
info: dict.c(341): reading filler dictionary: /usr/local/share/pocketsphinx/model/hmm/en_us/hub4wsj_sc_8k/noisedict
info: dict.c(211): allocated 0 kib strings, 0 kib phones
info: dict.c(344): 11 words read
info: dict2pid.c(396): building pid tables dictionary
info: dict2pid.c(404): allocating 50^3 * 2 bytes (244 kib) word-initial triphones
info: dict2pid.c(131): allocated 30200 bytes (29 kib) word-final triphones
info: dict2pid.c(195): allocated 30200 bytes (29 kib) single-phone word triphones
info: ngram_model_arpa.c(77): no \data\ mark in lm file
info: ngram_model_dmp.c(142): use memory-mapped i/o lm file
info: ngram_model_dmp.c(196): ngrams 1=5001, 2=436879, 3=418286
info: ngram_model_dmp.c(242): 5001 = lm.unigrams(+trailer) read
info: ngram_model_dmp.c(288): 436879 = lm.bigrams(+trailer) read
info: ngram_model_dmp.c(314): 418286 = lm.trigrams read
info: ngram_model_dmp.c(339): 37293 = lm.prob2 entries read
info: ngram_model_dmp.c(359): 14370 = lm.bo_wt2 entries read
info: ngram_model_dmp.c(379): 36094 = lm.prob3 entries read
info: ngram_model_dmp.c(407): 854 = lm.tseg_base entries read
info: ngram_model_dmp.c(463): 5001 = ascii word strings read
info: ngram_search_fwdtree.c(99): 788 unique initial diphones
info: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 60 single-phone words
info: ngram_search_fwdtree.c(186): creating search tree
info: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 60 single-phone words
info: ngram_search_fwdtree.c(326): after: max nonroot chan increased 13428
info: ngram_search_fwdtree.c(338): after: 457 root, 13300 non-root channels, 26 single-phone words
info: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
info: continuous.c(371): /usr/install/pocketsphinx-0.8/src/programs/.libs/lt-pocketsphinx_continuous compiled on: jul 21 2013, at: 14:34:06
mixer load failed: invalid argument
fatal_error: "continuous.c", line 246: failed open audio device
I'm using a ps3eye currently. If I do a simple:
arecord -D plughw:1,0 -d 5 -q -f cd -t wav ~/test.wav
everything works fine (verified by hooking the Raspberry Pi up to a TV via HDMI and running aplay ~/test.wav).
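One way to double-check the device string is to derive it from the hardware list instead of typing it by hand. The sketch below is only an illustration: `list_plughw` is a hypothetical helper name, and it assumes the usual `card N: ..., device M: ...` line layout printed by `arecord -l`.

```shell
#!/usr/bin/env bash
# Turn `arecord -l` style lines into plughw:CARD,DEVICE strings.
# Reads from stdin, so it also works on saved output.
# list_plughw is a hypothetical helper, not part of ALSA or pocketsphinx.
list_plughw() {
  sed -n 's/^card \([0-9][0-9]*\):.*, device \([0-9][0-9]*\):.*/plughw:\1,\2/p'
}

# On a live system (lowercase -l lists hardware capture devices):
#   arecord -l | list_plughw
```

Each printed value can then be tried with both `arecord -D` and `pocketsphinx_continuous -adcdev`, which makes it easier to tell a wrong device name apart from a problem inside the recognizer.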
What am I doing wrong, guys?
Information you might need (based on other posts I've seen):
root@scarlettpi:/usr/install/pocketsphinx-0.8# aplay -l
**** list of playback hardware devices ****
card 0: alsa [bcm2835 alsa], device 0: bcm2835 alsa [bcm2835 alsa]
  subdevices: 8/8
  subdevice #0: subdevice #0
  subdevice #1: subdevice #1
  subdevice #2: subdevice #2
  subdevice #3: subdevice #3
  subdevice #4: subdevice #4
  subdevice #5: subdevice #5
  subdevice #6: subdevice #6
  subdevice #7: subdevice #7
root@scarlettpi:/usr/install/pocketsphinx-0.8#
root@scarlettpi:/usr/install/pocketsphinx-0.8# aplay -L
null
    discard samples (playback) or generate 0 samples (capture)
pulse
    pulseaudio sound server
sysdefault:card=alsa
    bcm2835 alsa, bcm2835 alsa
    default audio device
root@scarlettpi:/usr/install/pocketsphinx-0.8#
root@scarlettpi:/usr/install/pocketsphinx-0.8# dpkg -l | grep "alsa"
ii alsa-base 1.0.25+3~deb7u1 alsa driver configuration files
ii alsa-firmware-loaders 1.0.25-2 armhf alsa software loaders specific hardware
ii alsa-oss 1.0.25-1 armhf alsa wrapper oss applications
ii alsa-tools 1.0.25-2 armhf console based alsa utilities specific hardware
ii alsa-utils 1.0.25-4 armhf utilities configuring , using alsa
ii alsaplayer-alsa 0.99.80-5.1 armhf pcm player designed alsa (alsa output module)
ii alsaplayer-common 0.99.80-5.1 armhf pcm player designed alsa (common files)
ii alsaplayer-gtk 0.99.80-5.1 armhf pcm player designed alsa (gtk+ version)
ii gstreamer0.10-alsa:armhf 0.10.36-1.1 armhf gstreamer plugin alsa
ii libsox-fmt-alsa 14.4.0-3 armhf sox alsa format i/o library
root@scarlettpi:/usr/install/pocketsphinx-0.8#
root@scarlettpi:/usr/install/pocketsphinx-0.8# dpkg -l | grep pulseaudio
ii gstreamer0.10-pulseaudio:armhf 0.10.31-3+nmu1 armhf gstreamer plugin pulseaudio
root@scarlettpi:/usr/install/pocketsphinx-0.8#
Also, in terms of installing pocketsphinx, I did the following:
# uninstall pulse audio if installed
apt-get remove pulseaudio -y
aptitude purge pulseaudio -y

# sphinxbase install
apt-get install bison -y
cd /usr/install
wget http://downloads.sourceforge.net/project/cmusphinx/sphinxbase/0.8/sphinxbase-0.8.tar.gz
tar -xvf sphinxbase-0.8.tar.gz
cd sphinxbase-0.8
./configure
make
make install
cd -

# pocketsphinx install
wget http://sourceforge.net/projects/cmusphinx/files/pocketsphinx/0.8/pocketsphinx-0.8.tar.gz
tar -xvf pocketsphinx-0.8.tar.gz
cd pocketsphinx-0.8
./configure
make
make install
Any ideas or advice in the right direction would be extremely helpful.
Thanks,
Malcolm Jones
Edit:
I forgot to include this information as well:
root@scarlettpi:/usr/install/pocketsphinx-0.8# arecord -L
null
    discard samples (playback) or generate 0 samples (capture)
pulse
    pulseaudio sound server
sysdefault:card=camerab409241
    usb camera-b4.09.24.1, usb audio
    default audio device
front:card=camerab409241,dev=0
    usb camera-b4.09.24.1, usb audio
    front speakers
surround40:card=camerab409241,dev=0
    usb camera-b4.09.24.1, usb audio
    4.0 surround output front , rear speakers
surround41:card=camerab409241,dev=0
    usb camera-b4.09.24.1, usb audio
    4.1 surround output front, rear , subwoofer speakers
surround50:card=camerab409241,dev=0
    usb camera-b4.09.24.1, usb audio
    5.0 surround output front, center , rear speakers
surround51:card=camerab409241,dev=0
    usb camera-b4.09.24.1, usb audio
    5.1 surround output front, center, rear , subwoofer speakers
surround71:card=camerab409241,dev=0
    usb camera-b4.09.24.1, usb audio
    7.1 surround output front, center, side, rear , woofer speakers
iec958:card=camerab409241,dev=0
    usb camera-b4.09.24.1, usb audio
    iec958 (s/pdif) digital audio output
root@scarlettpi:/usr/install/pocketsphinx-0.8#
It took me a while, but with a couple of sources (listed in this answer) and helpful hints from nikolay-shmyrev, I came up with an answer that worked for me.
Key assumptions:
I'm running these commands as the pi user (I was running them as root, which was incorrect).
I'm using continuous recognition and am looking for the ability to "wake up" my Raspberry Pi. Upon waking up, I have other plans for how it should interact.
My setup:
CanaKit Raspberry Pi
HDMI cable to a Toshiba TV
USB WiFi dongle
PlayStation 3 Eye for speech recognition
Moving forward, I ran the following commands on the Raspberry Pi to get PulseAudio + pocketsphinx working with the PlayStation 3 Eye. (If you see places for improvement, please let me know.)
Install PulseAudio and development packages
sudo apt-get install gstreamer0.10-pulseaudio libao4 libasound2-plugins libgconfmm-2.6-1c2 libglademm-2.4-1c2a libpulse-dev libpulse-mainloop-glib0 libpulse-mainloop-glib0-dbg libpulse0 libpulse0-dbg libsox-fmt-pulse paman paprefs pavucontrol pavumeter pulseaudio pulseaudio-dbg pulseaudio-esound-compat pulseaudio-esound-compat-dbg pulseaudio-module-bluetooth pulseaudio-module-gconf pulseaudio-module-jack pulseaudio-module-lirc pulseaudio-module-lirc-dbg pulseaudio-module-x11 pulseaudio-module-zeroconf pulseaudio-module-zeroconf-dbg pulseaudio-utils oss-compat -y
Setting up ALSA
Per the instructions at http://forums.debian.net/viewtopic.php?f=16&t=12497
sudo \cp -pf /etc/asound.conf /etc/asound.conf.orig
echo 'pcm.pulse { type pulse }
ctl.pulse { type pulse }
pcm.!default { type pulse }
ctl.!default { type pulse }' | sudo tee /etc/asound.conf
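Several steps in this answer copy a config file to `.orig` before changing it. Re-running such a step overwrites the pristine backup with the already modified file. A minimal sketch of a guard against that, assuming a hypothetical helper called `backup_once`:

```shell
#!/usr/bin/env bash
# Copy FILE to FILE.orig only if FILE exists and no backup exists yet,
# so that re-running a setup script never overwrites the first backup.
# backup_once is a hypothetical helper name used for illustration.
backup_once() {
  local file="$1"
  [ -e "$file" ] || return 0        # nothing to back up
  [ -e "$file.orig" ] && return 0   # keep the first backup
  cp -p "$file" "$file.orig"
}

# For files under /etc this has to run with root privileges, for example
# from inside a script that is itself started with sudo:
#   backup_once /etc/asound.conf
```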
Make sure the camera device loads on boot
_device_load_on_start=$(grep "snd.bcm2835" /etc/modules | wc -l)
if [[ "${_device_load_on_start}" = "0" ]]; then
  sudo \cp -pf /etc/modules /etc/modules.orig
  echo "snd-bcm2835" | sudo tee -a /etc/modules
fi

# DISALLOW_MODULE_LOADING is a security feature: it disallows additional module
# loading during runtime and on user request. Switch it from 1 to 0 if it is set.
_disallow_module_loading=$(grep "DISALLOW_MODULE_LOADING=1" /etc/default/pulseaudio | wc -l)
if [[ "${_disallow_module_loading}" != "0" ]]; then
  sudo \cp -pf /etc/default/pulseaudio /etc/default/pulseaudio.orig
  sudo sed -i "s,DISALLOW_MODULE_LOADING=1,DISALLOW_MODULE_LOADING=0,g" /etc/default/pulseaudio
fi
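The "append only if missing" pattern used for /etc/modules can be wrapped in a small function. This is just a sketch; `append_line_once` is a hypothetical name, and unlike the `grep "snd.bcm2835"` check it matches the whole line exactly, so a commented-out entry does not count as present.

```shell
#!/usr/bin/env bash
# Append LINE to FILE unless FILE already contains exactly that line.
# -x matches whole lines, -F treats LINE as a fixed string (no regex).
# append_line_once is a hypothetical helper name used for illustration.
append_line_once() {
  local line="$1" file="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Needs root for system files, for example inside a script run with sudo:
#   append_line_once "snd-bcm2835" /etc/modules
```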
Set up the PulseAudio daemon for network connections
# Allow other clients on the network to connect to the PulseAudio daemon
# (only add auth-anonymous=1 if you know every machine on the LAN; it is a security risk otherwise)
sudo \cp -fvp /etc/pulse/system.pa /etc/pulse/system.pa.orig
echo "
# scarlettpi added
load-module module-native-protocol-tcp auth-ip-acl=127.0.0.1;192.168.0.0/24 auth-anonymous=1
load-module module-zeroconf-publish" | sudo tee -a /etc/pulse/system.pa

echo "
# scarlettpi added
#load-module module-native-protocol-tcp
#load-module module-zeroconf-publish
load-module module-native-protocol-tcp auth-ip-acl=127.0.0.1;192.168.0.0/24 auth-anonymous=1
load-module module-zeroconf-publish" | sudo tee -a /etc/pulse/default.pa

# check to make sure it looks okay
cat /etc/pulse/default.pa
Change the default sound driver from ALSA to PulseAudio
sudo \cp -fvp /etc/libao.conf /etc/libao.conf.orig
sudo sed -i "s,default_driver=alsa,default_driver=pulse,g" /etc/libao.conf

# daemon settings according to pi-musicbox ( https://github.com/woutervanwijk/pi-musicbox )
sudo \cp -fvp /etc/pulse/daemon.conf /etc/pulse/daemon.conf.orig
echo "
# scarlettpi added
high-priority = yes
nice-level = 5
exit-idle-time = -1
resample-method = src-sinc-medium-quality
default-sample-format = s16le
default-sample-rate = 48000
default-sample-channels = 2" | sudo tee -a /etc/pulse/daemon.conf
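After appending settings like these, it can help to look at only the lines that are actually active, since the stock files are mostly comments. A rough sketch, assuming comment lines start with `;` or `#` (`active_settings` is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# Print the non-blank, non-comment lines of a config file.
# Lines whose first non-space character is ';' or '#' count as comments.
# active_settings is a hypothetical helper name used for illustration.
active_settings() {
  grep -Ev '^[[:space:]]*([;#]|$)' "$1"
}

# Example:
#   active_settings /etc/pulse/daemon.conf
#   active_settings /etc/pulse/default.pa
```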
Add the pi user to the pulse-access group
sudo adduser pi pulse-access

# Reboot the machine to make sure the settings are loaded correctly
sudo shutdown -r now
Make sure to add /usr/local/lib to the library path
export LD_LIBRARY_PATH=/usr/local/lib
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig

# add these to .bashrc so they are set once you log in
echo "
# scarlettpi added
export LD_LIBRARY_PATH=/usr/local/lib
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig" | tee -a ~/.bashrc
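The exports above replace whatever was in those variables before. If something else on the system already relies on them, a variant that prepends instead of overwriting may be safer. A minimal sketch, with `path_prepend` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Print DIR prepended to the colon-separated LIST.
# Skips the prepend when DIR is already in LIST, and avoids a trailing
# colon when LIST is empty (an empty entry means "current directory").
# path_prepend is a hypothetical helper name used for illustration.
path_prepend() {
  local dir="$1" list="$2"
  case ":$list:" in
    *":$dir:"*) printf '%s\n' "$list" ;;
    *)          printf '%s\n' "$dir${list:+:$list}" ;;
  esac
}

# Example:
#   export LD_LIBRARY_PATH=$(path_prepend /usr/local/lib "$LD_LIBRARY_PATH")
#   export PKG_CONFIG_PATH=$(path_prepend /usr/local/lib/pkgconfig "$PKG_CONFIG_PATH")
```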
Install sphinxbase and pocketsphinx
# install python dev packages
sudo apt-get install python2.7-dev -y

# sphinxbase install (required to install pocketsphinx)
sudo apt-get install bison -y
cd ~pi/
wget http://downloads.sourceforge.net/project/cmusphinx/sphinxbase/0.8/sphinxbase-0.8.tar.gz
tar -xvf sphinxbase-0.8.tar.gz
cd sphinxbase-0.8
./configure
make
sudo make install
cd -

# pocketsphinx install
# set this: LD_LIBRARY_PATH=/path/to/pocketsphinxlibs /usr/local/bin/pocketsphinx_continuous
# http://www.voxforge.org/home/forums/message-boards/speech-recognition-engines/howto-use-pocketsphinx
wget http://sourceforge.net/projects/cmusphinx/files/pocketsphinx/0.8/pocketsphinx-0.8.tar.gz
tar -xvf pocketsphinx-0.8.tar.gz
cd pocketsphinx-0.8
./configure
make
sudo make install
cd -

# install sphinxtrain
wget http://sourceforge.net/projects/cmusphinx/files/sphinxtrain/1.0.8/sphinxtrain-1.0.8.tar.gz
tar -xvf sphinxtrain-1.0.8.tar.gz
cd sphinxtrain-1.0.8
./configure
make
sudo make install
cd -
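The three source installs follow the same download, unpack, configure, make, install sequence, so they can be expressed as one function. This is only a sketch: `run` and `build_from_tarball` are hypothetical helper names, and the `DRY_RUN` switch is an addition so the sequence can be inspected before anything is executed.

```shell
#!/usr/bin/env bash
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Download, unpack, build and install one .tar.gz source release.
# The archive and directory names are derived from the URL, which avoids
# typing the file name a second time.
build_from_tarball() {
  local url="$1"
  local tarball="${url##*/}"       # e.g. sphinxbase-0.8.tar.gz
  local dir="${tarball%.tar.gz}"   # e.g. sphinxbase-0.8
  run wget -O "$tarball" "$url" &&
    run tar -xzf "$tarball" &&
    ( run cd "$dir" && run ./configure && run make && run sudo make install )
}

# Example (prints the commands only):
#   DRY_RUN=1 build_from_tarball http://downloads.sourceforge.net/project/cmusphinx/sphinxbase/0.8/sphinxbase-0.8.tar.gz
```

The build steps run in a subshell, so the caller's working directory is unchanged afterwards and no `cd -` is needed.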
Check whether the PulseAudio daemon is running
ps aux | grep pulse

# if it isn't, start it (I still need to figure out the best way to make it run on boot... an init.d script maybe?)
/usr/bin/pulseaudio --start --log-target=syslog --system=false
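Note that `ps aux | grep pulse` also matches the grep process itself, so it is awkward to use in a script. A possible scripted version of the check, assuming `pgrep` (from procps) is available; `is_running` and `ensure_pulseaudio` are hypothetical helper names:

```shell
#!/usr/bin/env bash
# Succeed if a process whose name is exactly NAME is running.
is_running() {
  pgrep -x "$1" > /dev/null
}

# Start PulseAudio for the current user only when it is not running yet.
ensure_pulseaudio() {
  is_running pulseaudio || pulseaudio --start --log-target=syslog
}

# Example:
#   ensure_pulseaudio
```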
Finally, run sphinx
Important note: you have to be the pi user, and the PulseAudio server needs to be running.
Assuming you have an existing corpus file, a .jsgf file, and .dic and .lm files (generated using lmtool):
cd ~pi/pocketsphinx-0.8
pocketsphinx_continuous \
  -lm /home/pi/scarlettpi/config/speech/lm/scarlett.lm \
  -dict /home/pi/scarlettpi/config/speech/dict/scarlett.dic \
  -hmm /home/pi/scarlettpi/config/speech/model/hmm/en_us/hub4wsj_sc_8k \
  -silprob 0.1 \
  -wip 1e-4 \
  -bestpath 0
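For the "wake up" idea mentioned in the assumptions, one simple approach is to filter the recognizer's standard output for a keyword. The sketch below is an assumption-heavy illustration: it assumes hypothesis lines look like `000000001: some words`, `wake_events` is a hypothetical helper name, and the wake word `scarlett` is only an example taken from the file names above.

```shell
#!/usr/bin/env bash
# Read recognizer output on stdin and print the text of every hypothesis
# that contains WORD as a whole word (case-insensitive).
# Assumes hypothesis lines have the form "<digits>: <text>"; all other
# lines (status messages and so on) are ignored.
wake_events() {
  awk -v w="$1" '
    /^[0-9]+: / {
      sub(/^[0-9]+: /, "")
      if (index(tolower(" " $0 " "), " " tolower(w) " ") > 0) {
        print
        fflush()
      }
    }'
}

# Example (log messages on stderr are discarded):
#   pocketsphinx_continuous -lm ... -dict ... -hmm ... 2>/dev/null | wake_events scarlett
```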
References:
- Advice on how to calibrate pocketsphinx correctly
- How to get pocketsphinx to recognize new words via a corpus
- Best/simplest explanation of how the Java Speech Grammar Format works
I plan on adding more details about why I used each setting and configuration in a blog post I'm writing on my home automation project, but I figured I'd share what I've done so far in case anyone else is stuck like I was, so they can move forward with what they're working on. I hope this helps someone. Thanks for the advice, guys.