try looking at a couple other api's....
speech demo : has source here and is discussed here and operated on CLI here
you could use the full duplex google api ( its rate capped at 50 per day )
Or if you like that general idea check ibm's watson discussed here
IMO - its more complex but not capped .