Silvius: Online Programming by Voice

Silvius is an open source system for writing code by voice. Speak characters and words and they are typed for you automatically. You can try it on our server, or set it up yourself.

Setup Instructions

Instructions below are for Linux. See here for Mac OS and Windows instructions.

  1. Install dependencies:

    $ sudo apt-get install git python python-ws4py python-pyaudio libjansson4 xdotool
  2. Download the client (GitHub page):

    $ git clone https://github.com/dwks/silvius.git && cd silvius
  3. Make sure your microphone works (say words and see if they are recognized):

    $ python stream/mic.py -s silvius-server.voxhub.io

    To select a different microphone, run python stream/list-mics.py and note the correct device number. Then pass -d N to mic.py.

  4. Run and speak commands!

    $ python stream/mic.py -s silvius-server.voxhub.io | python grammar/main.py
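
For readers curious how a device number for -d can be found, here is a minimal sketch that lists input-capable audio devices with PyAudio (installed above as python-pyaudio). It is not the code of stream/list-mics.py; the function names input_devices and query_pyaudio are hypothetical, and it assumes the numbers mic.py expects match PyAudio's device indices.

```python
def input_devices(device_infos):
    """Return (index, name) pairs for devices that can record audio.

    device_infos is a list of PyAudio-style device info dicts; a device
    counts as a microphone candidate if it has at least one input channel.
    """
    return [
        (info["index"], info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


def query_pyaudio():
    """Collect device info dicts from PyAudio (needs the pyaudio package)."""
    import pyaudio  # imported lazily so input_devices() works without it

    pa = pyaudio.PyAudio()
    try:
        return [
            pa.get_device_info_by_index(i)
            for i in range(pa.get_device_count())
        ]
    finally:
        pa.terminate()  # release the audio backend
```

Calling input_devices(query_pyaudio()) and printing each pair should give a list of candidate numbers to try with -d.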

Voice Command Cheat-Sheet

Some commands from the default grammar are shown. For instructions on modifying the grammar, see here.

Public Server List

Pass -s server -p port to stream/mic.py to use any server on this list.

Name/Description                                  Server                    Port  Workers
Original Silvius model                            silvius-server.voxhub.io  8019  2+
  English model with command words boosted.
Beta Silvius model                                silvius-server.voxhub.io  8018  2
  Dual command and English model. Better at
  recognizing commands.
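
As an illustration of how the -s, -p and -d options combine, the sketch below builds the mic.py command line for a chosen server and pipes it into grammar/main.py, mirroring the shell pipeline from the setup steps. The helper names mic_command and run_pipeline are hypothetical, and the sketch assumes it is run from the top of the silvius checkout.

```python
import subprocess


def mic_command(server, port=None, device=None):
    """Build the argument list for stream/mic.py.

    Only the -s, -p and -d options described on this page are used.
    """
    cmd = ["python", "stream/mic.py", "-s", server]
    if port is not None:
        cmd += ["-p", str(port)]
    if device is not None:
        cmd += ["-d", str(device)]
    return cmd


def run_pipeline(server, port=None, device=None):
    """Run the equivalent of: mic.py ... | python grammar/main.py"""
    mic = subprocess.Popen(
        mic_command(server, port, device), stdout=subprocess.PIPE
    )
    grammar = subprocess.Popen(["python", "grammar/main.py"], stdin=mic.stdout)
    mic.stdout.close()  # so mic.py sees a closed pipe if the grammar exits
    return grammar.wait()
```

For example, run_pipeline("silvius-server.voxhub.io", port=8018) would stream audio to the beta model listed above.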

Running a Custom Server

To run a local instance of the recognition engine with the original Silvius speech model, download this archive (609MB) and follow the instructions in INSTALL, or use this Docker image. You can also build from source, though this is more difficult. When running, each recognition worker needs about 2.4GB of RAM.
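
The memory figure gives a rough upper bound on how many workers a machine can host. The sketch below is plain arithmetic based on the 2.4GB estimate; the reserve_gb parameter (memory set aside for the operating system and other processes) is an assumption added for illustration, not something the server itself defines.

```python
WORKER_RAM_GB = 2.4  # approximate per-worker memory quoted above


def max_workers(total_ram_gb, reserve_gb=0.0):
    """Estimate how many recognition workers fit in total_ram_gb.

    reserve_gb is a hypothetical allowance for everything else on the box.
    """
    usable = total_ram_gb - reserve_gb
    if usable <= 0:
        return 0
    return int(usable // WORKER_RAM_GB)
```

Under these assumptions, a 16GB machine with 1GB held in reserve fits 6 workers, since 15 / 2.4 = 6.25.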

The beta model is available here.

It is possible to train a new model with custom command words; please refer to this repository. For help with this, or to request that a particular model be created on your behalf, please contact me or join the mailing list (mentioned below).

Learning More

Check out the HOPE XI talk about Silvius, available on Livestream and YouTube.

For more information, join the Silvius mailing list or contact me at dwk at voxhub dot io.

Acknowledgments

Silvius is built on the Kaldi speech recognition toolkit, which does all the hard work. We used VoxForge and TED-LIUM speech models. The client/server code is based on Tanel Alumäe's GStreamer server (see a demo!). This project would not have been possible without the guidance of Professor Homayoon Beigi and the contributions of several collaborators.

Silvius is inspired by a similar voice coding system called Aenea, which is built on the commercial Dragon NaturallySpeaking. In Virgil's epic, Silvius is the son of Aeneas. Aenea currently has superior performance; to learn more, see Tavis Rudd's PyCon presentation and join the excellent Dragonfly mailing list.