Getting Started with Eye Tracking

You can do a lot just using your voice, but there are still a few times you’ll find yourself reaching for a mouse. It’s often for the silliest little things, like clicking in the empty space within a webpage to change the keyboard context. If you’re serious about not using your hands, you can use an eye tracker to eliminate these last few cases. This post will teach you how to get started. Make sure you’ve read my introductory post on voice coding, since we’ll be building upon that.

Eye trackers used to cost several thousand dollars, but now you can grab a cheap one for less than a couple hundred bucks. I use the Tobii EyeX dev kit, which currently retails for $139. Its major competitor is the $100 EyeTribe dev kit, a Kickstarter-funded project. I haven’t played around with that one yet, but if you have, I’d love to hear about it in the comments.

The basic idea behind eye tracker interaction is that you look somewhere on the screen and then use some other method to click or “activate” the item you’re looking at. It’s generally too distracting to have the pointer follow your gaze everywhere, so activation is usually triggered with a keypress rather than a mouse click. For our purposes, of course, we’ll want to use a voice command.

The tricky part is integrating it with Dragonfly. It really ought to be easy, except that right now there’s an outstanding bug where Tobii’s software does not listen for virtual keypresses. There’s a thread in their forums complaining about this, but it sounds like it won’t be fixed until the consumer version, which doesn’t have a release date yet. The workaround is surprisingly elaborate, but the good news is I’ve already done the heavy lifting. The basic idea is that we will call into their C API from Python. The raw API is far more complicated than we need, so I wrote a simple wrapper DLL with a few basic functions to connect to the eye tracker, get position data, and activate the current gaze point. You can get the source code and binary distribution of the wrapper from my GitHub repository.

Python makes it a breeze to call into a DLL. Load the two DLLs with the following lines:

from ctypes import *

# DLL_DIRECTORY should point to the folder containing both DLLs. Loading
# the Tobii client library first lets Tracker.dll find its dependency.
eyex_dll = CDLL(DLL_DIRECTORY + "/Tobii.EyeX.Client.dll")
tracker_dll = CDLL(DLL_DIRECTORY + "/Tracker.dll")

Then you can define some simple wrapper functions that call the DLL functions:

# Mouse comes from dragonfly, which we use to move the pointer.
from dragonfly import Mouse

def connect():
    # Establish a connection to the eye tracker.
    result = tracker_dll.connect()
    print "connect: %d" % result

def disconnect():
    result = tracker_dll.disconnect()
    print "disconnect: %d" % result

def get_position():
    # Return the most recent gaze point as (x, y) screen coordinates.
    x = c_double()
    y = c_double()
    tracker_dll.last_position(byref(x), byref(y))
    return (x.value, y.value)

def print_position():
    print "(%f, %f)" % get_position()

def move_to_position():
    # Move the pointer to the gaze point, clamping to the screen edges.
    position = get_position()
    Mouse("[%d, %d]" % (max(0, int(position[0])),
                        max(0, int(position[1])))).execute()

def activate_position():
    # Activate (click) whatever is under the current gaze point.
    tracker_dll.activate()

With these in place, it’s easy to bind them to voice commands using the handy dragonfly Function action. It’s useful to have separate commands for moving the pointer and clicking, because the eye tracker accuracy isn’t always perfect.
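
Here’s a minimal sketch of what that might look like, assuming the functions above are defined in the same file (the spoken forms are just examples; pick whatever is easy for you to say):

from dragonfly import Function, Grammar, MappingRule

class EyeTrackerRule(MappingRule):
    mapping = {
        "(I|eye) connect": Function(connect),
        "(I|eye) disconnect": Function(disconnect),
        "(I|eye) move": Function(move_to_position),
        "(I|eye) click": Function(activate_position),
    }

grammar = Grammar("eye tracker")
grammar.add_rule(EyeTrackerRule())
grammar.load()

If you load this as a NatLink macro file, you’ll also want the usual unload() function that calls grammar.unload().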

Foot pedals are another alternative to voice commands. I often use a voice command to move the mouse based on my gaze point, then use my foot to click. I recommend the Omnipedal Quad. Pedals are also great for scrolling, which is pretty awkward with dictation.

There’s a lot more you can do with tighter integration with the eye tracking API. The major shortcoming of my simple approach is that it doesn’t work well with small click targets. The full API lets the application describe all of its click targets, so the one closest to your gaze point can be picked automatically. Of course, this usually requires access to the application source code (or at least an extension), so it’s less generic and harder to get up and running. Please post in the comments if you come up with something!

16 thoughts on “Getting Started with Eye Tracking”

  1. Hi James,
    My name is Tim and I am the user on the Tobii forums who started the post concerning voice integration that we both have been updating. I’m very interested in getting this working with VAC but don’t have the coding prowess that you do. I read through your blog and downloaded the Tracker.dll from GitHub, and would greatly appreciate any help you could give in getting the above wrapper to listen for keypresses. Can this be done with an elseif statement or a while loop to trigger off the two or three keys that we want, with SendInput or something similar? I tried to use AHK to send the virtual key code, the hardware scan code, and key down/key up commands, but it still won’t trigger from Tobii’s prebuilt app. If I use SendPlay instead of SendInput, it bypasses Tobii and still sends the key through voice commands, but it doesn’t trigger at a low enough level for what I’m guessing is Tobii’s keyboard hook to see it and trigger the mouse movement.

    Thanks so much for any help!
    Tim

    1. Hi Tim, welcome to the site! I don’t think I’m familiar with the voice command software you’re referring to, but I’ll do my best to help. First, the key thing to understand is that my wrapper does not make software keypresses suddenly start working. Instead, it offers a few simple functions you can call *instead* of the software keypress functions. You can see the list of these functions in the header file: https://github.com/wolfmanstout/simple-eye-tracker/blob/master/Tracker.h

      Hence, it requires that your voice command software be able to call into a DLL. This definitely works in Python using Dragonfly with Dragon NaturallySpeaking, but the process is going to be different if you’re using a different environment (it should be the same in any other Python environment, though). If your voice command software can only send keypresses, but cannot execute arbitrary code, then I’m afraid my solution won’t work. Hopefully that is not the case here!

  2. James,
    I checked per your suggestion and confirmed VAC can’t call into a DLL. So I have downloaded Python and Dragonfly and watched a few tutorials on dragonfly commands, in hopes of giving that a try instead.

    So far I have copied your commands above into a .py file and modified them with the path of the two DLLs below. I read that since backslashes are used in Windows paths, I need a second backslash as an escape.

    --------------------------
    Traceback (most recent call last):
      File "C:\Python27\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 326, in RunScript
        exec codeObject in __main__.__dict__
      File "C:\Python27\Scripts\eyetracker.py", line 2, in <module>
        ctypes.CDLL("c:\Program Files (x86)\Tobii\Tobii EyeX\Tobii.EyeX.Client.dll")
      File "C:\Python27\lib\ctypes\__init__.py", line 365, in __init__
        self._handle = _dlopen(self._name, mode)
    WindowsError: [Error 127] The specified procedure could not be found
    --------------------------

    import ctypes
    ctypes.CDLL("c:\Program Files (x86)\Tobii\Tobii EyeX\Tobii.EyeX.Client.dll")
    ctypes.CDLL("c:\Program Files (x86)\Tobii\Tobii EyeX\Tracker.dll")

    def connect():
        result = tracker_dll.connect()
        print "connect: %d" % result

    def disconnect():
        result = tracker_dll.disconnect()
        print "disconnect: %d" % result

    def get_position():
        x = c_double()
        y = c_double()
        tracker_dll.last_position(byref(x), byref(y))
        return (x.value, y.value)

    def print_position():
        print "(%f, %f)" % get_position()

    def move_to_position():
        position = get_position()
        Mouse("[%d, %d]" % (max(0, int(position[0])), max(0, int(position[1])))).execute()

    def activate_position():
        tracker_dll.activate()

    Then, after I get this working, I assume I need to map the actions. I found a tutorial for mapping keys like this:

    class KeystrokeRule(MappingRule):
        mapping = {
            "enter []": Key("enter:%(n)d"),

    And tried to apply it to the function commands:

    class KeystrokeRule(MappingRule):
        mapping = {
            "connect tracker": def connect(),
    or
            "connect tracker": tracker_dll.connect(),

            "disconnect tracker":
            "move":
            "click":

    If you have any suggestions on the dll path error and/or mappings once again your help is greatly appreciated.

    Thanks again so much for your help,
    Tim

    1. You are really close to getting this to work. The DLL error bit me at first too; there are a couple of things to double-check:
      1) You are right that backslashes would need to be escaped, but note that in my code I use forward slashes, which are perfectly legal in Windows Python and less prone to escaping errors. Try that (see the example below this list).
      2) Try opening the tracker DLL in Dependency Walker, and it will tell you which dependencies you are missing. I will save you the Google search and tell you that you probably need to download this: http://www.microsoft.com/en-us/download/details.aspx?id=30679. I believe you only need the 32-bit version, but it wouldn’t hurt to install both if you have a 64-bit processor.
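
      For example, the first CDLL line you posted would become:

      ctypes.CDLL("c:/Program Files (x86)/Tobii/Tobii EyeX/Tobii.EyeX.Client.dll")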

      Next, you need to use a slightly different syntax for declaring these macros. Here’s what it looks like:
      "(I|eye) connect": Function(connect),
      "(I|eye) disconnect": Function(disconnect),
      "(I|eye) print position": Function(print_position),

      Also, make sure your other dragonfly macros are working in general before trying to get this one working.

      I’m going to post my dragonfly macros within the next couple weeks (hopefully the next couple days), although they will contain a bunch of stuff you won’t want, so it is still worth figuring this out on your own.

      Thanks for powering through this, I promise it’ll be sweet when you get there!

  3. Hello James, thank you for sharing your solution. I was a little disappointed when I found out that the EyeX doesn’t handle virtual keystrokes at the moment.

    I cannot work with Dragonfly, since my native language isn’t English, but I have different software that can do various things. One of them is running an arbitrary executable file on a voice command.

    Would it be possible, in principle, to create a Python script which would call the functions connect() and activate_position() directly?
    So far I haven’t had much luck with this; using your example, I end up with error messages like:
    result = tracker_dll.connect()
    NameError: global name 'tracker_dll' is not defined
    OR
    NameError: global name 'c_double' is not defined

    I’m not very skilled in Python, but I think I can pick up the basics in a couple of days. The main question is whether the solution above can be realized using your DLL.

    1. Hi Michal,

      My first snippet of Python code was broken; please try again now that it is fixed. I guess I must’ve mixed up a couple versions of the code.

      I’m not sure how well your approach will work exactly as described. The trouble is that as soon as the Python process exits, the eye tracker context will be destroyed and disconnect. There are a couple of ways to solve this. The quick and dirty way would be to create a script that does everything you want in one go (connect, activate position, then quit). The problem is that this is likely to be slow due to reconnecting each time. A cleaner approach would be to create a long-running Python server that you start separately one time, and then have your executable file just send a request to that server. You would just need to pick some interprocess communication mechanism to send and receive the request.
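
      Here’s a rough sketch of that server approach, assuming the wrapper functions from the post are defined in the same module (the port number and command names are arbitrary):

      import socket

      # Connect to the eye tracker once, then handle requests forever.
      connect()
      server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      server.bind(("127.0.0.1", 9999))
      server.listen(1)
      while True:
          client, _ = server.accept()
          command = client.recv(1024).strip()
          if command == "move":
              move_to_position()
          elif command == "activate":
              activate_position()
          client.close()

      Your voice-triggered executable would then just open a connection to that port and send the command name.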

      Another approach is to make the long-running Python process do the work of listening for virtual keypresses and then triggering actions. It sounds like that’s what most folks really want.
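
      As a rough sketch of that last approach, you could use a third-party library such as pynput to listen for keypresses (my wrapper doesn’t provide this; the key bindings here are arbitrary):

      from pynput import keyboard

      def on_press(key):
          # Hypothetical bindings: F9 moves the pointer, F10 activates.
          if key == keyboard.Key.f9:
              move_to_position()
          elif key == keyboard.Key.f10:
              activate_position()

      connect()
      with keyboard.Listener(on_press=on_press) as listener:
          listener.join()  # Keep the process (and tracker connection) alive.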

  4. Thank you, James. I think that your last suggestion is the best one. I was able to get it running very quickly, despite the fact that I’ve never worked with Python before. I also use the Warp function sometimes. Do you think it can be controlled by voice, too?

    1. Glad to hear this worked for you! The warp function is not built into the API, but it shouldn’t be too hard to do it yourself: use Python libraries to move the mouse pointer, combined with the function in my DLL that gets the last gaze position. I do this using Dragonfly, which in turn uses Python libraries to handle the mouse movements.

      Here is the full list of actions currently built into the API, for reference:

        Enumeration for all action types.
        An action is a way to interact with the EyeX Engine in addition to, or instead of, the default keybindings that normally execute gaze actions.
      
        @field TX_ACTIONTYPE_ACTIVATE:
          Activates an interactor. This corresponds to a click on the activation button.
      
        @field TX_ACTIONTYPE_ACTIVATIONMODEON:
          Turns on activation mode. This corresponds to pressing the activation button.
      
        @field TX_ACTIONTYPE_ACTIVATIONMODEOFF:
          Turns off activation mode. This corresponds to releasing the activation button.
      
        @field TX_ACTIONTYPE_PANNINGBEGIN:
          Begins a panning. This corresponds to pressing the panning button.
      
        @field TX_ACTIONTYPE_PANNINGEND:
          Ends a panning. This corresponds to releasing the panning button.
      
        @field TX_ACTIONTYPE_PANNINGSTEP:
          Performs a panning step action. This corresponds to a click on the panning button.
      
        @field TX_ACTIONTYPE_ZOOMIN:
          Not yet supported. 
      
        @field TX_ACTIONTYPE_ZOOMOUT:
          Not yet supported. 
      
        @field TX_ACTIONTYPE_PANNINGTOGGLEHANDSFREE:
          Not yet supported. 
      

      I’m looking forward to the zoom functions. I can add support for those when they become available.

  5. I tried one of the more expensive eye tracking solutions a few years ago and wasn’t very impressed. The natural flickering of my eyes made it nearly impossible to home in on my targets. However, I didn’t have control of when the cursor moved like you do, and it never occurred to me to separate the motion of the eye tracker cursor from the motion of the system cursor. I read that in your article (where you have a function for moving the cursor) and thought to myself, what a good design choice. It makes me want to give eye tracking a second try.

    1. Thanks, let me know how it goes! Of course, the one downside of separating these actions is that it is slower. But usually I use eye tracking as a last resort, so I’m just happy to have something (and it’s certainly faster than using “mouse grid”).

    1. Yes, I’m using the Tobii EyeX eye tracker, which is $140. I’ve got a short comparison of options in the second paragraph of my post.

      1. I’m surprised it’s that cheap; I’d heard eye trackers were in the $1,000 range.
        Do you have to calibrate it each time you sit down?

        1. A one-time calibration is all that’s necessary. The accuracy is so-so, though.

          Most trackers are way more expensive, even from the same company. I got a promotional email from Tobii encouraging me to step up to their next best eye tracker… at only 20x the price 🙂

  6. Just wanted to shout out: thanks for writing this up and making that DLL wrapper! It took me a while to get it working with my limited experience, but now I can voice warp too!

    Hm, this may be unrelated, but sometimes I’m able to use the Tobii EyeX “Pointer Interaction – Warp on mouse movement” feature to move scroll bars and windows when I touch the mouse, when usually it does not work. I haven’t isolated what causes it to start working… I actually find it useful, and wondered if you’ve run into it? In case it’s something the DLL accidentally causes.

    1. Thanks for the shout out 🙂

      I haven’t ever noticed mouse warping. I’d be really surprised if my DLL caused that; it shouldn’t. If you find it useful though, I suppose you can just enable it in the settings permanently.
