kaldi-active-grammar: Flexible Coding by Voice

kaldi-active-grammar (KAG) is an open-source system for writing code by voice. It uses a local speech model, which can be customized for your own voice. KAG has very low recognition latency and very good command accuracy because it dynamically adjusts the set of words it expects based on your Python grammar. Try voice coding today!
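
To make the "active grammar" idea concrete, here is a toy sketch in plain Python. It is not how KAG is implemented internally and it uses no KAG or dragonfly APIs; the `Rule` class, the rule names, and the phrases are all hypothetical. It only illustrates how the set of expected words can grow or shrink as rules are switched on and off.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """A hypothetical voice-command rule: a name, its phrases, and an on/off flag."""
    name: str
    phrases: list
    active: bool = True


def expected_words(rules):
    """Return the set of words that appear in any currently active rule."""
    words = set()
    for rule in rules:
        if rule.active:
            for phrase in rule.phrases:
                words.update(phrase.split())
    return words


def match(utterance, rules):
    """Return the name of the first active rule that contains the utterance, else None."""
    for rule in rules:
        if rule.active and utterance in rule.phrases:
            return rule.name
    return None


# Illustrative rules only; real grammars define their own commands.
rules = [
    Rule("editing", ["save file", "undo that"]),
    Rule("python", ["define function", "import module"], active=False),
]

print(sorted(expected_words(rules)))  # only words from the "editing" rule
rules[1].active = True                # e.g. the user focuses a Python editor
print(sorted(expected_words(rules)))  # words from both rules are now expected
```

With fewer candidate words to choose from at any moment, a recognizer has less room to mishear a command, which is the intuition behind the accuracy claim above.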

Setup Instructions

Instructions below are for Linux. See here for Mac OS and Windows instructions.

  1. Install dependencies (Python 3.6+ recommended, 2.7 also works):

    $ sudo apt install xdotool python3 python3-venv
  2. (Recommended) Create a Python virtual environment to keep dependencies organized:

    $ python3 -m venv env && source env/bin/activate && \
         python -m pip install --upgrade pip setuptools wheel
  3. Install kaldi-active-grammar and dragonfly integration:

    $ python -m pip install kaldi-active-grammar 'dragonfly2[kaldi]'
  4. Download a voice grammar such as dwk's kaldi-grammar-main (or daanzu's kaldi-grammar-simple):

    $ git clone https://github.com/dwks/kaldi-grammar-main.git && cd kaldi-grammar-main
  5. Download the latest speech model from here. At the time of writing, the latest is kaldi_model_daanzu_20200905_1ep-biglm.zip. In the command below, $LATEST_MODEL stands for the download URL of that file.

    $ wget $LATEST_MODEL && unzip kaldi_model_*.zip
  6. (Optional) Make sure the right microphone is used: run

    $ python kaldi_module_loader.py -l
    and set input_device_index in kaldi_module_loader.py to the appropriate microphone number. Alternatively, if you use PulseAudio, change the microphone with pavucontrol while the loader is running.

  7. Run the grammar and wait for the initial compile (subsequent runs will be much faster):

    $ python kaldi_module_loader.py
  8. Once you see the message "engine (INFO): Listening...", you can speak commands!
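
If the loader does not start, a quick check of the steps above can help narrow down what is missing. The following is a minimal sketch using only the Python standard library, not part of KAG or either grammar repository. The `preflight` function is hypothetical, and it assumes the unzipped model directory has a name starting with `kaldi_model` and sits next to kaldi_module_loader.py; adjust the pattern if your layout differs.

```python
import glob
import os
import shutil
import sys


def preflight(grammar_dir="."):
    """Return simple True/False checks for the setup steps above."""
    model_candidates = glob.glob(os.path.join(grammar_dir, "kaldi_model*"))
    return {
        # Step 1: Python 3.6+ is recommended.
        "python_3_6_plus": sys.version_info >= (3, 6),
        # Step 1: xdotool should be on the PATH.
        "xdotool_found": shutil.which("xdotool") is not None,
        # Step 2: inside a venv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        # Step 5: ASSUMPTION: the unzipped model directory starts with "kaldi_model".
        "model_unpacked": any(os.path.isdir(p) for p in model_candidates),
        # Steps 6-7: the loader script should be in the grammar directory.
        "loader_present": os.path.isfile(
            os.path.join(grammar_dir, "kaldi_module_loader.py")
        ),
    }


if __name__ == "__main__":
    for check, ok in preflight().items():
        print("{:<16} {}".format(check, "ok" if ok else "MISSING"))
```

Run it from inside the grammar directory. A failed `in_virtualenv` check is not an error on its own, since the virtual environment is only recommended.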

Voice Command Cheat-Sheet

Some commands from kaldi-grammar-simple are shown. See also this PDF cheatsheet.

Learning More

Check out daanzu's live demo on YouTube.

For additional questions or comments on this page, please contact me at dwk at voxhub period io.

For more information, join this messaging room. Many voice coders frequent it, including dwk and daanzu.

Acknowledgments

The author of kaldi-active-grammar is David Zurow (daanzu). Please support him if you find this project useful.