Mike Rooney

programming and philosophy

Eye Tracking and UI Framework / Window Manager Integration

Eye tracking is the technique of watching the user’s eyes with a camera and figuring out where on the screen he or she is looking. While some computer users with disabilities use this technology as their primary input device, it hasn’t become very popular. However, I think that with webcams being integrated into the majority of new laptops, and multi-core processors with some cycles to spare for image processing becoming ubiquitous, eye tracking deserves to become more popular.

I don’t believe the technology is accurate enough (yet) to replace your mouse, but it could still improve usability in a few ways. Imagine having the equivalent of onMouseIn and onMouseOut events on widgets when writing a user interface, but for where the user is looking instead. Applications could leverage onLookIn and onLookOut events at the widget level and open a whole new realm of functionality and usability. Videos and games could pause themselves when you look away, or bring up certain on-screen displays when you look at certain corners of the screen. If an application sees you are studying a certain element for a period of time, it may ask if you need help.
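To make the onLookIn / onLookOut idea a bit more concrete, here is a minimal Python sketch of how a toolkit might dispatch such events. Everything in it is hypothetical: the `Widget` and `GazeDispatcher` classes, the `on_look_in` / `on_look_out` callback names, and the assumption that some eye tracker already hands us gaze points in screen coordinates. It is an illustration of the hit-testing logic only, not any real framework's API.

```python
class Widget:
    """A rectangular widget with optional gaze callbacks (hypothetical API)."""

    def __init__(self, name, x, y, width, height,
                 on_look_in=None, on_look_out=None):
        self.name = name
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.on_look_in = on_look_in
        self.on_look_out = on_look_out

    def contains(self, px, py):
        """Return True if the point (px, py) lies inside this widget."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


class GazeDispatcher:
    """Turns a stream of gaze points into look-in / look-out events."""

    def __init__(self, widgets):
        # Later widgets are treated as being drawn on top of earlier ones.
        self.widgets = list(widgets)
        self.current = None  # widget the user is currently looking at

    def update(self, gaze_x, gaze_y):
        """Feed one gaze sample; fire callbacks if the looked-at widget changed."""
        hit = next((w for w in reversed(self.widgets)
                    if w.contains(gaze_x, gaze_y)), None)
        if hit is self.current:
            return hit
        if self.current is not None and self.current.on_look_out:
            self.current.on_look_out(self.current)
        if hit is not None and hit.on_look_in:
            hit.on_look_in(hit)
        self.current = hit
        return hit


if __name__ == "__main__":
    # A video widget that resumes when looked at and pauses when not.
    video = Widget("video", 0, 0, 640, 480,
                   on_look_in=lambda w: print("resume", w.name),
                   on_look_out=lambda w: print("pause", w.name))
    dispatcher = GazeDispatcher([video])
    dispatcher.update(100, 100)   # gaze enters the video
    dispatcher.update(900, 100)   # gaze leaves the video
```

A real implementation would also need to smooth noisy gaze samples before hit-testing, since raw tracker output tends to jitter around the point of regard.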

It would also be interesting to see eye tracking leveraged at the window manager level. Most people use focus follows click to focus windows, and some enjoy focus follows mouse, but imagine focus follows (eye) focus! Using multiple monitors would become much easier if your keyboard input were automatically directed to the application, or even the specific field, that you were looking at. Eye gestures, like mouse gestures, could potentially be useful as well, such as glancing off-screen to move to the virtual desktop in that direction.
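As a rough sketch of what focus follows gaze might look like, the Python below switches focus only after the gaze has rested on another window for a short dwell time, and keeps the current focus when the user looks off-screen. The `GazeFocusTracker` class, the window rectangles, and the 0.5 second dwell threshold are all assumptions made for illustration; a real window manager would hook this into its own focus mechanism.

```python
class GazeFocusTracker:
    """Decides which window should have keyboard focus from gaze samples.

    Hypothetical sketch: `windows` maps a window name to its screen
    rectangle (x, y, width, height) in a shared multi-monitor coordinate
    space.
    """

    def __init__(self, windows, dwell_seconds=0.5):
        self.windows = dict(windows)
        self.dwell_seconds = dwell_seconds
        self.focused = None          # name of the currently focused window
        self._candidate = None       # window the gaze has recently moved to
        self._candidate_since = 0.0  # timestamp when the candidate was first seen

    def _window_at(self, x, y):
        """Return the name of the window containing (x, y), or None."""
        for name, (wx, wy, ww, wh) in self.windows.items():
            if wx <= x < wx + ww and wy <= y < wy + wh:
                return name
        return None

    def update(self, x, y, timestamp):
        """Feed one gaze sample (timestamp in seconds); return the focused window."""
        hit = self._window_at(x, y)
        # Looking off-screen or at the focused window cancels any pending switch.
        if hit is None or hit == self.focused:
            self._candidate = None
            return self.focused
        # Start (or restart) the dwell timer when the gaze lands on a new window.
        if hit != self._candidate:
            self._candidate = hit
            self._candidate_since = timestamp
        # Switch focus only once the gaze has stayed there long enough.
        if timestamp - self._candidate_since >= self.dwell_seconds:
            self.focused = hit
            self._candidate = None
        return self.focused
```

The dwell time is a design choice rather than a requirement: without it, a quick glance at another monitor would steal keyboard focus mid-sentence, so some delay or explicit confirmation seems necessary.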

Apple and Linux both seem to be in a good position to implement something like this. Apple has control of both the hardware and the software, including the OS, and has been integrating cameras in laptops for a while. As a result, they are in a great position to pioneer this field and really have something unique to bring to the table in terms of a completely new user experience. However, in the open-source world, Linux is also in a decent spot to do this, as the UI frameworks and window managers are all patchable and most webcams are supported out of the box.

Eye tracking has the potential to enable us to use computers in ways that were previously impossible. What are your thoughts on eye tracking? Does it have a future in the computing world and where can it take us? And how long will it be before we will take this technology for granted? :)


Yeah, I had the exact same experience. At work I have a dual-monitor setup and run a virtual machine on one of the monitors all the time. Because of this, windows on both screens sometimes look focused at the same time.

Focus follows eyes would be nice there, even if it were just window focus (as opposed to the specific control inside a window).

I remember trying to figure out a good way to do this back in 2002, when I was dealing with my first dualmon setup and I kept having that whole “I look at a different monitor and I start typing into the window I’m looking at but it doesn’t have focus” problem.

At the time all I could think about was doing it via IR emitters on the arms of my glasses and a receptor grid to figure out which monitor was being looked at.

…But then I learned all of the defensive responses to using multimon and the desire disappeared. I really haven’t thought of this in a while, but I remember getting excited by the prospect of a mouse-free GUI world :(

At first glance, focus-follows-eyes sounds intriguing, but I’m afraid I might not always look at what I am typing. I might look at a physical sheet off the computer screen, or I might watch/read one window and type my thoughts in the other.

I am not 100% sure that I really do all of this, but it would be worth investigating before spending time on any implementation.

I think you need at least something like this: http://www.techsmith.com/morae.asp

I use Camtasia at the office (same company) and don’t know if Morae already does all you need, but you will need something that can track eye focus at the widget level and correlate it with screen actions; otherwise you will go mad trying to align the eyes in the face cam with the screen capture.

I think this would be good for anonymous webcamming. Follow eyes, cheekbones, nose, chin, and replace with scalable graphics. Send along with poser-type information for skin tone, headwear, hair color.

It would be fast, give an appearance of web camming, and give the cammer good options on how they present themselves.