In this post, I will show you how to use Google's speech recognition engine to build your own speech-to-text program in Python. Speech recognition is a useful tool for many apps, since speaking is often more practical than typing. However, the Google Speech API requires you to be online, because the audio has to be sent to Google's servers for recognition. After this easy tutorial, I will show you how to toggle an LED through speech recognition and serial communication with an Arduino, which is a little more complicated but still straightforward.
Before we get started, we need Python and a few libraries installed in order to get everything working:
Python 2.6+ (I use 2.7.13)
PyAudio 0.2.9+ (required only if you are using microphone input)
SpeechRecognition
You can install these libraries with pip, from Terminal on OS X or from the Command Prompt on Windows.
For PyAudio on OS X, you will have to install PortAudio through Homebrew first
brew install portaudio
and then use
pip install pyaudio (replace pip with pip3 if using Python 3)
For the SpeechRecognition library, just use
pip install SpeechRecognition
Once you have all these libraries, the actual code is pretty simple and the logic is easy to follow.
You can check whether all the libraries listed above are installed by running pip list in Terminal.
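If you would rather check from inside Python, here is a minimal sketch. It assumes the import names speech_recognition and pyaudio (these differ from the pip package names), and missing_modules is just a helper name I made up for this example:

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed import names: SpeechRecognition -> speech_recognition,
    # PyAudio -> pyaudio
    for name in missing_modules(["speech_recognition", "pyaudio"]):
        print("Missing module: " + name)
```

If nothing is printed, both modules imported without problems.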
Another great alternative to the Google Speech Recognition engine is PocketSphinx, which works offline.
Here is the code:
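import speech_recognition as rc

rec = rc.Recognizer()
word = None  # stays None if recognition fails
with rc.Microphone() as source:
    audio = rec.listen(source)  # record until the speaker pauses
try:
    # send the recording to Google's servers and get the text back
    word = rec.recognize_google(audio)
    print(word)
except rc.UnknownValueError:
    print("I cannot understand what you said")
except rc.RequestError as e:
    print("Could not reach Google Speech Recognition: {0}".format(e))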
if (word == 'goodbye'):