Silvius: Online Programming by Voice

Silvius is an open-source system for writing code by voice: speak characters and words, and they are typed for you automatically. You can try it on our server, or set it up yourself.

Silvius is a flexible way to perform speech recognition on remote servers, but kaldi-active-grammar has much better accuracy. Check it out here.

Setup Instructions

Instructions below are for Linux. See here for Mac OS and Windows instructions.

Install dependencies:

$ sudo apt-get install git python python-ws4py python-pyaudio libjansson4 xdotool

Download the client (GitHub page):

$ git clone https://github.com/dwks/silvius.git && cd silvius

Make sure your microphone works (say words and see if they are recognized):

$ python stream/mic.py -s silvius-server.voxhub.io -p 8018

To select a different microphone, run python stream/list-mics.py and note the correct device number. Then pass -d N to mic.py.

Run and speak commands!

$ python stream/mic.py -s silvius-server.voxhub.io -p 8018 | python grammar/main.py

Voice Command Cheat-Sheet

Some commands from the default grammar are shown. For instructions on modifying the grammar, see here.

Letters (26)

arch bravo charlie delta echo fox golf hotel india julia kilo line mike november oscar papa queen romeo sierra tango uniform victor whiskey xray yankee zulu

Uppercase letters (1)

sky arch

Cursor movement (5)

up, down, left, right, up three

Chaining (1)

Say any sequence of commands: charlie delta space dot dot slap

Individual words (1)

word something (types full word in lowercase)

Multiple words (2)

phrase hello there (all lowercase), sentence hello there (capitalizes the first word)

Basic editing (4)

slap (newline), scratch (backspace), act (escape), slap three, etc.

Other characters (23)

act colon single-quote double-quote equal space tab bang hash dollar percent carrot ampersand star late rate minus underscore plus backslash dot slash question
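To make the chaining idea concrete, here is a minimal toy interpreter for a handful of the commands above. It is only a sketch and is not the Silvius grammar (the real one lives under grammar/ in the repository). The names LETTERS, KEYS, COUNTS and interpret are hypothetical, and the rule that a trailing number word repeats the preceding command is an assumption inferred from "up three" and "slap three".

```python
# Toy interpreter for a few cheat-sheet commands (illustrative only;
# this is NOT how grammar/main.py is implemented).

LETTERS = ("arch bravo charlie delta echo fox golf hotel india julia kilo "
           "line mike november oscar papa queen romeo sierra tango uniform "
           "victor whiskey xray yankee zulu").split()

# Each letter word types the letter at its position in the alphabet.
LETTER_KEYS = {word: chr(ord("a") + i) for i, word in enumerate(LETTERS)}

# A few other commands whose output is unambiguous from the cheat-sheet.
KEYS = dict(LETTER_KEYS, space=" ", dot=".", tab="\t", slap="\n")

# Assumed repeat counts, e.g. "slap three".
COUNTS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
          "six": 6, "seven": 7, "eight": 8, "nine": 9}


def interpret(utterance):
    """Return the text that a chained sequence of commands would type."""
    out = []                      # typed characters so far
    tokens = utterance.split()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        i += 1
        if tok == "word" and i < len(tokens):
            # "word something" types the next word in lowercase.
            out.extend(tokens[i].lower())
            i += 1
            continue
        if tok == "sky" and i < len(tokens) and tokens[i] in LETTER_KEYS:
            # "sky arch" types an uppercase letter.
            action = LETTER_KEYS[tokens[i]].upper()
            i += 1
        elif tok in KEYS:
            action = KEYS[tok]
        elif tok == "scratch":
            action = None         # backspace
        else:
            raise ValueError("unknown command: " + tok)
        # Optional trailing count repeats the command just parsed.
        repeat = 1
        if i < len(tokens) and tokens[i] in COUNTS:
            repeat = COUNTS[tokens[i]]
            i += 1
        for _ in range(repeat):
            if action is None:
                if out:
                    out.pop()     # delete the last typed character
            else:
                out.append(action)
    return "".join(out)


print(repr(interpret("charlie delta space dot dot slap")))
```

The chained example from the cheat-sheet, charlie delta space dot dot slap, comes out as the shell command cd .. followed by a newline.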

Public Server List

Pass -s server -p port to stream/mic.py to use any server on this list.

Name	Description	Server	Port	Workers?
Original Silvius model	English model with command words boosted.	silvius-server.voxhub.io	8019	2+
Beta Silvius model	Dual command and English model. Better at recognizing commands.	silvius-server.voxhub.io	8018	2
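If you switch between servers often, a small helper can assemble the stream/mic.py command line from the table above. This is a sketch: the SERVERS dictionary, its "original" and "beta" keys, and the mic_command function are hypothetical names; only the -s, -p and -d options come from the instructions on this page.

```python
# Hosts and ports copied from the public server list above.
# The short labels "original" and "beta" are made up for this example.
SERVERS = {
    "original": ("silvius-server.voxhub.io", 8019),
    "beta": ("silvius-server.voxhub.io", 8018),
}


def mic_command(server, device=None):
    """Build the argument list for stream/mic.py for a listed server."""
    host, port = SERVERS[server]
    cmd = ["python", "stream/mic.py", "-s", host, "-p", str(port)]
    if device is not None:
        # Device number as reported by stream/list-mics.py.
        cmd += ["-d", str(device)]
    return cmd


print(" ".join(mic_command("beta")))
```

The returned list can be passed to subprocess.Popen, or joined with spaces and pasted into a shell.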

Running a Custom Server

To run a local instance of the recognition engine with the original Silvius speech model, download this archive (609 MB) and follow the instructions in INSTALL, or use this Docker image. You can also build from source, although this is more difficult. When running, each recognition worker needs about 2.4 GB of RAM.
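As a rough sizing aid, the per-worker memory figure gives an upper bound on how many workers a machine can host. The sketch below assumes the 2.4 GB figure applies uniformly to every worker; max_workers and reserved_gb are hypothetical names, and real memory use may differ.

```python
RAM_PER_WORKER_GB = 2.4  # approximate figure quoted above


def max_workers(available_ram_gb, reserved_gb=0.0):
    """Estimate how many recognition workers fit in the given RAM.

    reserved_gb is memory set aside for the OS and other processes.
    """
    usable = available_ram_gb - reserved_gb
    if usable <= 0:
        return 0
    return int(usable // RAM_PER_WORKER_GB)


# Example: a 16 GB machine, keeping 2 GB free for everything else.
print(max_workers(16, reserved_gb=2))
```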

The beta model is available here.

It is possible to train a new model with custom command words; please refer to this repository. For help with this, or to request that a particular model be created on your behalf, please contact me or join the mailing list (mentioned below).

Learning More

Check out the HOPE XI talk about Silvius, available on Livestream and YouTube.

For more information, join the Silvius mailing list or contact me at dwk at voxhub dot io.

Acknowledgments

Silvius is built on the Kaldi speech recognition toolkit, which does all the hard work. We used VoxForge and TED-LIUM speech models. The client/server code is based on Tanel Alumäe's GStreamer server (see a demo!). This project would not have been possible without the guidance of Professor Homayoon Beigi, and the contributions of several collaborators.

Silvius is inspired by a similar voice coding system called Aenea, which is built on the commercial Dragon NaturallySpeaking. In Virgil's epic, Silvius is the son of Aeneas. Aenea currently has superior performance; to learn more, see Tavis Rudd's PyCon presentation and join the excellent Dragonfly mailing list.