kaldi-active-grammar (or KAG) is an open-source system for writing code by voice. It uses a local speech model, which can even be customized for your own voice. KAG has very low recognition latency and good command accuracy because it dynamically adjusts the set of words it expects based on your Python grammar. Try voice coding today!
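To make the idea of an adjustable vocabulary concrete, here is a toy sketch in plain Python. It is an illustration only and not how KAG is implemented: the rule names, phrases, and the active_vocabulary helper are all hypothetical. It simply shows that when fewer rules are active, the recognizer has fewer words to choose between.

```python
# Toy illustration only: hypothetical rule names and phrases, not KAG's API.
RULES = {
    "editor": ["save file", "go to line"],
    "terminal": ["list files", "change directory"],
    "global": ["switch window"],
}


def active_vocabulary(rules, active_rule_names):
    """Return the sorted list of words used by the currently active rules."""
    words = set()
    for name in active_rule_names:
        for phrase in rules[name]:
            words.update(phrase.split())
    return sorted(words)


# With only the "global" and "editor" rules active, terminal words such as
# "list" or "directory" are not among the expected words.
print(active_vocabulary(RULES, ["global", "editor"]))
```

A smaller set of expected words leaves less room for confusion between similar-sounding commands, which is the intuition behind the accuracy claim above.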
Instructions below are for Linux. See here for Mac OS and Windows instructions.
Install dependencies (Python 3.6+ recommended, 2.7 also works):
$ sudo apt install xdotool python3 python3-venv
(Recommended) Create a Python virtual environment to keep dependencies organized:
$ python3 -m venv env && source env/bin/activate && \
python -m pip install --upgrade pip setuptools wheel
Install kaldi-active-grammar and dragonfly integration:
$ python -m pip install kaldi-active-grammar 'dragonfly2[kaldi]'
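If you want to confirm that the install succeeded before continuing, a quick check like the following may help. It is a minimal sketch for Python 3, assuming the two packages are importable as kaldi_active_grammar and dragonfly; the find_missing helper is a made-up name for this example.

```python
import importlib.util

# Assumption: these are the import names of the two packages installed above.
MODULES = ("kaldi_active_grammar", "dragonfly")


def find_missing(module_names):
    """Return the names from module_names that cannot be located for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


missing = find_missing(MODULES)
print("Missing modules: " + ", ".join(missing) if missing else "All modules found.")
```

Run it with the same python that you used for pip (inside the virtual environment, if you created one), since a different interpreter will not see the installed packages.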
Clone the grammar repository and download the latest speech model:
$ git clone firstname.lastname@example.org:dwks/kaldi-grammar-main.git && cd kaldi-grammar-main
$ wget $LATEST_MODEL && unzip kaldi_model_*.zip
(Optional) Make sure the right microphone is used: run
$ python kaldi_module_loader.py -l
Then set input_device_index in kaldi_module_loader.py to the appropriate microphone number. Or, if using PulseAudio, change the microphone while the loader is running.
Run the grammar and wait for the initial compile (subsequent runs will be much faster):
$ python kaldi_module_loader.py
Once you see the message "engine (INFO): Listening...", you can speak commands!
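If you start the loader from another script (for example at login), you may want to detect that message automatically instead of watching the terminal. The sketch below is one way to do it with the standard library only. The helper names wait_until_ready and launch_and_wait are hypothetical, and it assumes the log message may be written to stderr, so stderr is merged into stdout.

```python
import subprocess
import sys

READY_MARKER = "Listening..."  # substring of the message quoted above


def wait_until_ready(lines, marker=READY_MARKER):
    """Echo lines as they arrive; return True once one contains the marker.

    Returns False if the lines run out before the marker is seen.
    """
    for line in lines:
        sys.stdout.write(line)
        if marker in line:
            return True
    return False


def launch_and_wait(script="kaldi_module_loader.py"):
    """Start the loader, report readiness, then relay its output until it exits."""
    proc = subprocess.Popen(
        [sys.executable, "-u", script],   # -u: unbuffered output from the child
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,         # assumption: the log may go to stderr
        universal_newlines=True,
    )
    if wait_until_ready(proc.stdout):
        print("Ready: you can speak commands now.")
        for line in proc.stdout:          # keep relaying output while it runs
            sys.stdout.write(line)
    else:
        print("Loader exited before it started listening.")
    return proc.wait()
```

Call launch_and_wait() from the kaldi-grammar-main directory. The first run can take a while because of the initial compile, so the function deliberately has no timeout.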
The author of kaldi-active-grammar is David Zurow (daanzu). Please support him if you find this project useful.