Updated on 1/9/2021 to discuss newer software versions.
A hands-free coding environment has a lot of moving parts, which can be overwhelming at first. This post will teach you how to set up the basic voice recognition environment. I also use eye tracking, but I’ll cover that in a separate post.
To begin with, install Dragon NaturallySpeaking, the speech recognition engine. Sadly, it’s only available for Windows, so you’ll have to do Linux development through a virtual machine or remote access (see my post for advice). There are two Dragon/Windows combinations I recommend. Most folks will be happiest using Windows 10 with Dragon 14 or 15, simply because it works well enough and Windows 10 is a major upgrade over 8.1. If you decide to go with Dragon 15, have a look at my recent post for installation and setup advice. If you don’t mind using an older version of Windows, I recommend Windows 7 or 8 with Dragon 12 (of those two, Windows 8 is slightly buggier with Dragon than 7). Later versions of Dragon (13+) don’t support Select-and-Say in most apps, which is a significant limitation: this feature lets you easily edit the last utterance you dictated using built-in commands.
I recommend investing in a good microphone: either the Sennheiser ME3 or a TableMike. These are not cheap (about $200), but the microphone matters a lot. Dragon can be frustrating, so you want to do everything you can to minimize that.
Next, install NatLink, an extension to Dragon that makes it possible to add custom Python commands. I use the experimental Python 3 version because some of my functionality requires Python 3 (e.g. the fastest version of my gaze-ocr package). Unfortunately, this version is still under heavy development and is frequently broken or has out-of-date documentation. You may need to piece together the installation instructions from both the NatLink site and the readme. If you want a simpler installation process and don’t need the latest cutting-edge features, you can follow the instructions for Python 2.7. You can use their prepackaged version of Python or pip install the dependencies yourself with the latest 32-bit Python 2.7.x (64-bit won’t work). When installing Python, be sure to install for all users and add it to the PATH (you’ll have something like “C:\PythonXX\” and “C:\PythonXX\Scripts\” in your PATH if you’ve done this correctly). If everything works, you’ll see a window titled “Messages from NatLink” pop up after starting Dragon. It’s common to run into problems installing NatLink, so read the instructions carefully and look at the unofficial and official troubleshooting advice.
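Since NatLink only works with 32-bit Python, it’s easy to install the wrong interpreter by accident. A quick way to check which one you’re running (a generic Python snippet, nothing NatLink-specific):

```python
# Prints the pointer size of the running interpreter: 32 means 32-bit, 64 means 64-bit.
import struct

bits = struct.calcsize("P") * 8
print(bits)
```

If this prints 64, NatLink won’t be able to load, and you’ll need to install the 32-bit interpreter alongside it.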
Finally, install Dragonfly, a cleaner Python interface to NatLink. The original GitHub repository is no longer maintained, so the community has switched to a fork, dictation-toolbox/dragonfly, which you can install via pip install dragonfly2. It’s just a Python library, so if it worked you should now be able to import dragonfly from Python.
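To verify the install without opening an interactive session, you can check importability programmatically (a small helper of my own, not part of Dragonfly):

```python
# A quick sanity check: report whether a module is importable in this environment.
import importlib.util

def is_installed(name):
    """Return True if the named module can be imported."""
    return importlib.util.find_spec(name) is not None

# After `pip install dragonfly2`, this should report True:
print(is_installed("dragonfly"))
```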
To get started with Dragonfly, I recommend looking at some example modules. You can check out the original repository of examples or modules mentioned in the docs. For voice coding purposes, you’ll want to familiarize yourself with the multiedit module. Just drop a module into your NatLink MacroSystem directory, turn your microphone off and on, and NatLink will attempt to load it. If it’s not working, check the messages window to see if there are any error messages.
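If you’d like a smaller starting point than multiedit, here’s a sketch of a minimal grammar module following the standard Dragonfly module pattern (the file name and command phrases are just examples of mine). Save it as something like _hello.py in MacroSystem; note that it only runs inside a Dragon/NatLink session:

```python
# _hello.py - a minimal example grammar for NatLink/Dragonfly.
from dragonfly import Grammar, MappingRule, Key, Text

class ExampleRule(MappingRule):
    mapping = {
        # Saying "print hello" types this Python snippet.
        "print hello": Text("print('hello, world')"),
        # Saying "new line" presses Enter.
        "new line": Key("enter"),
    }

grammar = Grammar("hello example")
grammar.add_rule(ExampleRule())
grammar.load()

# NatLink calls unload() when the module is reloaded or the engine exits.
def unload():
    global grammar
    if grammar:
        grammar.unload()
    grammar = None
```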
Of course, this is just the beginning. The interesting part is extending the Dragonfly modules and writing your own to support a full-featured voice coding environment. I’ll cover that in future posts!
I have to disagree with you about Dragon 13 being the best. Although the browser plug-ins are nice and it’s a little bit more accurate out-of-the-box, the decision Nuance made, to allow Select-and-Say only in specific applications (as opposed to everywhere in previous versions) actually caused me to downgrade.
Wow, I didn’t realize Dragon 12 supported this in every app. I updated my post with this information. I’m considering downgrading too. It’s pretty painful not having this in Chrome, and I find the Chrome extension that adds this causes other problems. Thanks!
I played around with this a bit on an old laptop with Dragon 12. The problem I found, at least in Google Chrome, is that Select-and-Say ceases to work as soon as you use a custom movement or editing command. Contrast this with the dictation box, where you can mix and match custom commands and standard Dragon commands. Do you just try to avoid this mixing, or do you only use Select-and-Say to fix the most recent utterance?
Select-and-say in nonstandard applications only works on the utterances since the last voice command. This is still extremely useful, however. Totally worth skipping DNS 13 for this reason alone.
I bit the bullet this weekend and downgraded to Dragon 12. Suffice it to say I don’t miss a single Dragon 13 feature, and it is wonderful having Select-and-Say working again. I updated the post with a strong recommendation for Dragon 12. Thanks!
It’s as you say, I only use Select-and-Say to fix the most recent utterance.
There are plenty of other good microphones; which is best for you will depend on things like whether you want mono or stereo headphones, whether you want to avoid wires, how noisy your environment is, and so on.
“Just drop a module into your NatLink MacroSystem directory, turn your microphone off and on, and NatLink will attempt to load it.”
NatLink should reload changed modules at the next utterance automatically. Vocola works this way.
NatLink doesn’t reload at the next utterance, at least not in my experience. Also, are you the same Mark who made Vocola?
Sigh. It used to. Quintijn changed the default for this, I believe, in hopes of decreasing command latency. Vocola still provides the old behavior. This is probably fixable without too much trouble.
(And yes, I am the maintainer of Vocola 2; Rick is the original creator.)
I actually like the current behavior. This gives me more control over when I am ready to reload the module, so I can check for bugs. In principle I could do this by avoiding saving, but this makes it easier to use static analysis tools (plus I just like to save regularly).
I think modules can also be reloaded with a voice command, although I don’t currently use that.
For automatically reloading the grammar on a file save, you can put
natlinkmain.setCheckForGrammarChanges(1)
into your load function.

I’ve always had to turn my microphone off and on. I just double-checked. Here is the version I’m using:
NatLink version: 4.1lima
DNS version: 13
Python version: 27
Windows Version: 8
Hi there. I want to thank you for making this blog post. I’m very interested in setting this up on Mac OS X. Do you still recommend those current versions of all those programs [1]?
[1] http://handsfreecoding.org/?p=9#comment-128
Sorry for the slow reply!
These programs are Windows-only. I haven’t tried to do voice dictation on Mac OS X, but the most complete package available is probably voicecode.io. There are also some folks working on open source packages, but they are still in the early stages AFAIK.
Remember that you can use Linux and Mac from within Windows (see my post). If you decide to go with Windows, I think there are two decent configuration options:
1) Windows 7, Dragon 12.
2) Windows 10, Dragon 14.
The advantage of 1 is that you can use Select-and-Say to edit recent utterances in any app. This means improved ease of editing without having to start a separate text editor. The advantage of 2 is simply that you get newer, shinier software (I haven’t used either yet so I can’t really speak to details).
Hi James,
I’m following your blog to get started with voice coding in Python. I’m running Dragon 14 on Windows 10. I see in your last comment (2015) that you haven’t tried either. Do you have more information now about how well this config works for voice coding? I see you have mentioned that Dragon 14 doesn’t support Select-and-Say in every single application. I can’t find information online about whether this still is the case, and how this impacts voice coding. In 2017, are things different?
Hi Kariina,
Indeed, nothing has changed since my post. Dragon 14 should be fine except for this one caveat, which is not actually relevant to voice coding per se, but presumably you also want to do some plain English dictation, which would be affected.
Definitely avoid Dragon 15 for now. It has showstopper bugs that prevent it from being used effectively with Natlink/Dragonfly. This is unfortunate, to say the least, because apparently it is by far the best version of Dragon otherwise. Here is a thread with more details: http://knowbrainer.com/forums/forum/messageview.cfm?catid=25&threadid=22911
Thanks for your reply. Another question: do you know of any online communities for voice coding? I know about KnowBrainer, I’m wondering if there are others. The reason being: I do not know how to program in Python, or any other programming language. I thought that learning to code with my voice would be a good way to spend the free time I have because of my arm injuries. I have downloaded Vocola, because Dragonfly requires Python knowledge. However, I’m still utterly confused about how to create commands to program in a language that I don’t know. Therefore, I’m wondering if it is possible to download someone else’s voice coding configuration and get started with learning Python. Do you know of any Internet community, for example, on Github?
I know of a couple others:
VoiceCoder is an active Yahoo group that has been around forever:
https://groups.yahoo.com/neo/groups/VoiceCoder/info
Dragonfly has its own mailing list: https://groups.google.com/forum/#!forum/dragonflyspeech
If you are looking for repositories to fork, you’ll find several linked off the dragonfly list. Mine is linked in the navbar (https://github.com/wolfmanstout/dragonfly-commands).
Python is a great first language to learn, although learning to program for the first time is always an exercise in patience (which I’m sure you are familiar with as someone with arm injuries). I highly recommend starting by just trying to get web browsing commands to work (e.g. based on my repository and my post). Then you can more easily navigate Python documentation and dig deeper.
Hello,
I appreciated the quick response I received from the Dragonfly programming community. Thank you for the help. I started replacing my Dragon scripts with Dragonfly code. I’m curious, and a little puzzled, about three cases, though, and I have not found an explanation in the documentation. Might someone point me in the right direction?
I’m trying to use the Mimic command to mimic the HeardWord function in the Dragon scripting. For example, I replaced the Dragon command, “delete previous three words” with “three zapper”.
In the scripting language that looks like:
HeardWord("delete", "previous", ListVar1, "words")
Is there a way to use the Mimic command to achieve the same result as the Dragon script? I can achieve something similar using the Windows hotkeys, but it’s not quite as effective. I’ve tried various combinations on the theme of:
"[] testing": release + Mimic(extra="select [] words back") + Key(del),
but to no avail.
I’ve also tried the well-known keyboard combination:
"three (fingered|finger) salute": Key("alt:down, ctrl:down, del:down"),
"three (fingered|finger) salute": Key("alt:down") + Key("ctrl:down") + Key("del:down"),
… but neither of these two variations on the theme work. Any ideas?
Lastly, it’s not clear to me how the Repeat command operates, as neither of these commands works.
"[] (hashtag|hashtags)": release + Key("hash, enter")Repeat(n),
"remove [] (hashtag|hashtags)": release + Key("del, down")Repeat(count="n"),
I’m certain there’s a simple explanation for these, but I’m a little puzzled by the subtleties of the syntax and I’d be grateful for any pointers.
Thank you so much, Matt
For Mimic, it doesn’t look like it has great built-in support for combining static words with dynamic words as in your example. Most folks don’t use Mimic heavily — it’s only really useful if there’s no good keyboard-based alternative. You can do simple examples using this syntax:
Mimic("delete", "previous", "three", "words")
. If you want to do something more clever you can always create your own custom action, using the source as an example: https://github.com/t4ngo/dragonfly/blob/e35cef2eca226b1fc0570ca9760ca07ca4b3a8a9/dragonfly/actions/action_mimic.py
For ctrl-alt-delete, have you tried
Key("ca-del")
? That’d be the canonical syntax for this sort of thing, although I’m not sure why your examples didn’t work.
Finally, it looks like the formatting on your last example got messed up by Markdown, but I think I know what the issue is. If you want to hardcode a count such as 3, use
count=3
. If you want to dynamically use a count from your command, use
extra="n"
where “n” must also be the extra you define in your grammar.

Hello,
Thank you for this information. I was able to install both dragonfly and natlink successfully after some troubleshooting, but I am now getting an error after placing the multi-edit file into the macro system folder. I am a beginner programmer, and not quite savvy yet on the troubleshooting. Here is the error:
"UnimacroDirectory: C:\NatLink\Unimacro
setting shiftkey to: {shift} (language: enx)
start of natlinkstartup
Vocola not active
Error loading _multiedit from C:\NatLink\NatLink\MacroSystem\_multiedit.py
Traceback (most recent call last):
File "C:\NatLink\NatLink\MacroSystem\core\natlinkmain.py", line 317, in loadFile
imp.load_module(modName,fndFile,fndName,fndDesc)
File "C:\NatLink\NatLink\MacroSystem\_multiedit.py", line 1
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:19:22) [MSC v.1500 32 bit (Intel)] on win32
^
SyntaxError: invalid syntax
-- skip unchanged wrong grammar file: C:\NatLink\NatLink\MacroSystem\_multiedit.py
-- skip unchanged wrong grammar file: C:\NatLink\NatLink\MacroSystem\_multiedit.py
natlinkmain started from C:\NatLink\NatLink\MacroSystem\core:
NatLink version: 4.1tango
DNS version: 13
Python version: 27
Windows Version: 8or10
-- skip unchanged wrong grammar file: C:\NatLink\NatLink\MacroSystem\_multiedit.py
-- skip unchanged wrong grammar file: C:\NatLink\NatLink\MacroSystem\_multiedit.py"
I’d appreciate any suggestions, and they keep your time.
Thanks,
Tiff
Based on the error, it sounds to me like some sort of a copy/paste issue — like the contents of _multiedit.py have got some garbage at the top. I would recommend deleting it and copying it in again, making sure to copy in just the raw contents of the file.
Hello James, thank you for your response. Also, in my previous response, I meant to say “thank you for your time”. I am still quite new to Dragon. I was able to get it to work after removing the syntax errors. Furthermore, I was able to create a very simple mapping! I do have a further question. How do I know how to format the action objects correctly? Is there some sort of online resource? For example, if I want to map “paren” to Key(“()”), I receive an error in NatLink for this line. I was able to map “some words I speak” to Key(“a, b, c”) without error. I’m very new to programming, and just want to make some simple edits to “multiedit.py” for now, but am not sure how to add certain keys. Maybe these need to be spelled out? Also, where is the best place for me to ask questions like this, so I’m not taking up the entire comment section! 🙂
Thanks again,
Tiffany
This is a fine place to ask questions 🙂
Here’s the documentation you are looking for:
https://pythonhosted.org/dragonfly/actions.html#key-action
You could accomplish what you are trying to do with either Text("()") or Key("lparen, rparen"). I’d recommend the Text version so you don’t have to look up key names!
I would like to be able to program in HTML 5 & JavaScript. Will this configuration work for that?
Thanks!
Definitely! My setup is very language-independent. Emacs is a good editor for working with HTML 5 and JavaScript (and anything else, for that matter), although it does have a significant learning curve.
Is Natlink for Python only? I’m learning to code in Java, can I still use Natlink to create Java commands or do I need to use something else?
Natlink is a Python-only interface that lets you configure speech commands, but those speech commands can be used for whatever you can dream up, and that definitely includes helping you program in Java in Eclipse or Emacs or your editor of choice, for example. If you are new to coding in general, though, I would recommend considering Python as your starting language, both because it’s well-suited to learning to code and that way you will learn the language you’ll need to know to edit your grammars.
You could check out Talon to program by voice. It is being actively developed and even has support for the Tobii 4C eye tracker, allowing you to control the mouse hands-free.
https://talonvoice.com/
Here are some videos showing how it works:
Talon Voice – Python Demo
https://youtu.be/ddFI63dgpaI
Talon Eye Tracking – Zoom Mouse:
This is a mousing demo for the https://talonvoice.com project that uses eye tracking with a fast noise recognizer.
https://youtu.be/VMNsU7rrjRI
Hi. I have just started looking into voice coding during the last few weeks, as I have started having a lot of pain in my shoulders this last year when programming. However, I feel the subject is somewhat overwhelming, with many solutions. I would appreciate your input on where to start.
As I understand it, on this site you more or less recommend using Dragonfly directly and building my own grammar, perhaps using other repos as inspiration.
On the other hand, I have looked at solutions like Caster. From my current understanding, this builds on top of Dragonfly and provides some extra functionality, plus some functions specific to different IDEs.
Which approach would you recommend taking, and why?
As a background, I program mostly web applications with Java, Javascript, HTML and CSS (often the frontend framework React). The IDE I normally work in is IntelliJ, or sometimes VSCode.
It’s a good question. I need to take a closer look at some of the new frameworks that have popped up (Caster and Talon in particular). Back when I started this, Dragonfly and Vocola were pretty much the only games in town, so that’s why my system is built directly atop Dragonfly (albeit with some extensions of my own). So, if you’ve looked at Caster and you like what you see, I say run with it!
Hello, Tiffany again from well over a year ago. I have been happily using Dragon/Spyder with the Python 2.7/Natlink/Dragonfly combo for a bit, but ran into memory error issues that are apparent in Python 2.7. Is there a way to write macros with 3.7 yet? I’m kind of in a bind with code that needs more memory.
The short answer is “not yet, but there are folks actively working on getting Dragon to work with Python 3.7”. I’d like to understand the problems you are having, though. Why would Python 2.7 limit your memory usage? Are you referring to the fact that NatLink is restricted to 32-bit Python? Also, is the issue that you need more memory within the Python runtime, or elsewhere? I’ve never heard of folks needing more than 4 GB of memory within the NatLink Python runtime. Perhaps you could do whatever you are doing in a separate service that this runtime communicates with?
Sorry, I realize I’m quite late responding to this post. I actually just updated to the latest version of Python and didn’t have any issues. I am using Spyder though.
I need to use Linux for some astronomy research stuff. I’m probably going to have to do a dual boot to keep using Dragon etc.
The memory issues were from python itself, not the voice coding stuff. I forget what the specific issue was, but my coworkers and I determined it was an issue with that version of python. I’m happily running 3.7 and am still able to make macros!
For what it’s worth, there is now a cross platform (runs natively on Linux) open source speech recognition dragonfly backend: https://github.com/daanzu/kaldi-active-grammar
+1! I have this linked from my homepage now.
Hi James,
I am getting started with hands free programming and it’s all quite overwhelming, so I appreciate your post.
Could you tell me if I need to get Dragon 15 Professional or Home? I suspect it will be Pro, but I don’t want to buy the wrong one.
Thanks,
James
This is a bit tricky. I use Dragon 15 Professional Individual, but that’s because it was the only one available when I purchased it (I didn’t realize Dragon 15 Home even came out until reading your comment). That said, in earlier versions I used Home without any trouble. So I think you are fine with Dragon 15 Home unless they did something with this version to make it worse. I think it is safe to assume that Dragonfly will work fine with Home, but you may want to take a look at the feature list to see if there’s anything else missing that you want.
Thanks for the reply James.
I’ll purchase the home edition and get set up with Dragonfly using that.