Getting Started with Voice Coding

A hands-free coding environment has a lot of moving parts, which can be overwhelming at first. This post will teach you how to set up the basic voice recognition environment. I also use eye tracking, but I’ll cover that in a separate post.

To begin with, install Dragon NaturallySpeaking, the voice recognition engine. Sadly, it’s only available for Windows, so you’ll have to do Linux development using a virtual machine or remote access (see my post for advice). I recommend Windows 7, because Dragon NaturallySpeaking still has a lot of bugs in Windows 8.1 (I’ll cover those in another post). I also recommend Dragon 12 over 13: Dragon 13 only supports Select-and-Say in a handful of applications, which is a huge limitation. I used Dragon 13 for several months before downgrading thanks to readers’ advice in the comments, and I don’t miss a single feature. Any edition is fine; I use the Premium edition.

I recommend investing in a good microphone; the usual recommendation is the Sennheiser ME3. It’s not cheap ($200), but it matters a lot: working with Dragon is frustrating enough already, so you want to do everything you can to minimize recognition errors.

Next, install NatLink, an extension to Dragon that makes it possible to add custom Python commands. Follow the instructions here. If everything works, you’ll see a window pop up after starting Dragon titled “Messages from NatLink”. It’s common to run into problems installing NatLink, so read the instructions carefully. For your first installation, I highly recommend using their prepackaged version of Python to avoid trouble.

Finally, install Dragonfly, a cleaner Python interface to NatLink. The prepackaged binaries are several years out of date, so I recommend cloning their git repository and installing from the checkout (typically with `python setup.py install`). It’s just a Python library, so if the installation worked you should now be able to import dragonfly from Python.
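If you want to script that sanity check, here is a small stdlib-only sketch; the `module_available` helper is my own illustration, not part of Dragonfly:

```python
import importlib.util

def module_available(name):
    """Return True if `name` can be imported by the current interpreter."""
    return importlib.util.find_spec(name) is not None

# After a successful Dragonfly install, module_available("dragonfly")
# should return True. "json" is a stdlib stand-in that is always present.
print(module_available("json"))  # -> True
```

This checks importability without actually importing the package, so it won’t trigger any Windows-only initialization.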

To get started with Dragonfly, I recommend looking at some example modules. You can check out the original repository of examples or modules mentioned in the docs. For voice coding purposes, you’ll want to familiarize yourself with the multiedit module. Just drop a module into your NatLink MacroSystem directory, turn your microphone off and on, and NatLink will attempt to load it. If it’s not working, check the messages window to see if there are any error messages.
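To give a feel for what a module like multiedit does, here is a plain-Python analogue of a command mapping. This is a sketch only: a real module would subclass Dragonfly’s MappingRule and emit keystroke/text actions, and the phrases below are made up for illustration.

```python
# Spoken phrase -> text to emit, standing in for Dragonfly actions.
MAPPING = {
    "new line": "\n",
    "print hello": "print('hello')",
}

def recognize(utterance):
    """Return the text a spoken phrase should emit, or None if unrecognized."""
    return MAPPING.get(utterance)

print(recognize("print hello"))  # -> print('hello')
```

The real grammar modules work on the same principle, with Dragonfly handling the speech recognition side and dispatching to actions instead of returning strings.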

Of course, this is just the beginning. The interesting part is extending the Dragonfly modules and writing your own to support a full-featured voice coding environment. I’ll cover that in future posts!

19 thoughts on “Getting Started with Voice Coding”

  1. I have to disagree with you about Dragon 13 being the best. Although the browser plug-ins are nice and it’s a little more accurate out-of-the-box, the decision Nuance made to allow Select-and-Say only in specific applications (as opposed to everywhere, as in previous versions) actually caused me to downgrade.

    1. Wow, I didn’t realize Dragon 12 supported this in every app. I updated my post with this information. I’m considering downgrading too. It’s pretty painful not having this in Chrome, and I find the Chrome extension that adds this causes other problems. Thanks!

    2. I played around with this a bit on an old laptop with Dragon 12. The problem I found, at least in Google Chrome, is that Select-and-Say ceases to work as soon as you use a custom movement or editing command. Contrast this with the dictation box, where you can mix and match custom commands and standard Dragon commands. Do you just try to avoid this mixing, or do you only use Select-and-Say to fix the most recent utterance?

      1. Select-and-say in nonstandard applications only works on the utterances since the last voice command. This is still extremely useful, however. Totally worth skipping DNS 13 for this reason alone.

        1. I bit the bullet this weekend and downgraded to Dragon 12. Suffice it to say I don’t miss a single Dragon 13 feature, and it is wonderful having Select-and-Say working again. I updated the post with a strong recommendation for Dragon 12. Thanks!

  2. There are plenty of other good microphones; which is best for you depends on factors like whether you want mono or stereo headphones, whether you want to avoid wires, how noisy your environment is, and so on.

  3. “Just drop a module into your NatLink MacroSystem directory, turn your microphone off and on, and NatLink will attempt to load it.”

    NatLink should reload changed modules at the next utterance automatically. Vocola works this way.

    1. NatLink doesn’t reload at the next utterance, at least not in my experience. Also, are you the same Mark who made Vocola?

      1. *Sigh.* It used to. Quintijn changed the default for this, I believe in hopes of decreasing command latency. Vocola still provides the old behavior. This is probably fixable without too much trouble.
        (And yes, I am the maintainer of Vocola 2; Rick is the original creator.)

        1. I actually like the current behavior. This gives me more control over when I am ready to reload the module, so I can check for bugs. In principle I could do this by avoiding saving, but this makes it easier to use static analysis tools (plus I just like to save regularly).

          I think modules can also be reloaded with a voice command, although I don’t currently use that.

        2. For automatically reloading the grammar on a file save, you can put `natlinkmain.setCheckForGrammarChanges(1)` into your load function.

    2. I’ve always had to turn my microphone off and on. I just double checked. Here is the version I’m using:
      NatLink version: 4.1lima
      DNS version: 13
      Python version: 27
      Windows Version: 8

    1. Sorry for the slow reply!

      These programs are Windows-only. I haven’t tried to do voice dictation on Mac OS X, but the most complete package available is probably There are also some folks working on open source packages, but they are still in the early stages AFAIK.

      Remember that you can use Linux and Mac from within Windows (see my post). If you decide to go with Windows, I think there are two decent configuration options:
      1) Windows 7, Dragon 12.
      2) Windows 10, Dragon 14.

      The advantage of 1 is that you can use Select-and-Say to edit recent utterances in any app. This means improved ease of editing without having to start a separate text editor. The advantage of 2 is simply that you get newer, shinier software (I haven’t used either yet so I can’t really speak to details).

  4. Hi James,

    I’m following your blog to get started with voice coding in Python. I’m running Dragon 14 on Windows 10. I see in your last comment (2015) that you haven’t tried either. Do you have more information now about how well this config works for voice coding? I see you have mentioned that Dragon 14 doesn’t support Select-and-Say in every single application. I can’t find information online about whether this still is the case, and how this impacts voice coding. In 2017, are things different?

    1. Hi Kariina,

      Indeed, nothing has changed since my post. Dragon 14 should be fine except for this one caveat, which is not actually relevant to voice coding per se, but presumably you also want to do some plain English dictation, which would be affected.

      Definitely avoid Dragon 15 for now. It has showstopper bugs that prevent it from being used effectively with Natlink/Dragonfly. This is unfortunate, to say the least, because apparently it is by far the best version of Dragon otherwise. Here is a thread with more details:

      1. Thanks for your reply. Another question: do you know of any online communities for voice coding? I know about KnowBrainer, I’m wondering if there are others. The reason being: I do not know how to program in Python, or any other programming language. I thought that learning to code with my voice would be a good way to spend the free time I have because of my arm injuries. I have downloaded Vocola, because Dragonfly requires Python knowledge. However, I’m still utterly confused about how to create commands to program in a language that I don’t know. Therefore, I’m wondering if it is possible to download someone else’s voice coding configuration and get started with learning Python. Do you know of any Internet community, for example, on Github?

        1. I know of a couple others:

          VoiceCoder is an active Yahoo group that has been around forever.

          Dragonfly has its own mailing list: the dragonflyspeech Google group.

          If you are looking for repositories to fork, you’ll find several linked off the dragonfly list. Mine is linked in the navbar.

          Python is a great first language to learn, although learning to program for the first time is always an exercise in patience (which I’m sure you are familiar with as someone with arm injuries). I highly recommend starting by just trying to get web browsing commands to work (e.g. based on my repository and my post). Then you can more easily navigate Python documentation and dig deeper.
