29-10-2012, 11:38 AM
Invention Awards: The World as a Web Interface
INTRODUCTION
When he's wearing the SixthSense, a combination miniature projector, webcam and notebook computer, Pranav Mistry can snap photos just by making the shape of a frame with his fingers. He can conjure a phone keypad in the palm of his hand and tap the virtual numbers to place a call. The system can even recognize a book in front of the camera, retrieve its Amazon listing from the Web, and project its rating on the cover. Watching Mistry, a graduate student in the Massachusetts Institute of Technology's Media Arts and Sciences program, demonstrate the device is like witnessing a magic show. But he and his adviser, Pattie Maes, a digital-interface specialist at MIT's Media Lab, expect the SixthSense to do a lot more than evoke wonder. Within a few years, they hope, it will let people operate smartphones without touching a button, do instant research on objects around them, and generally offer the kind of enhanced-reality experience that's now confined to science fiction.
SixthSense: How It Works
A webcam captures video, including specific hand signals that the laptop reads as commands. A mini-projector then displays the relevant content — e-mail, stock charts, photos — on the nearest surface. (Illustration: Bland Designs)
Maes hit on the idea last October while discussing g-speak, a real-world version of the gesture-controlled interface in the movie Minority Report. She liked the notion of using hand signals to manipulate digital content but wanted something cheaper that you could walk around with, projecting content and interacting with it anywhere you liked. Mistry, nicknamed "Zombie" because of his aversion to sleep, turned out a prototype in just three weeks.
In the News:
The SixthSense can scan newspaper stories and retrieve related video from YouTube or other Web sites, which it projects directly onto the surface of the paper. (Photo: John B. Carnett)
Although the system has evolved considerably since then, the basic concept has stuck. A pocket projector and a webcam hang on Mistry's chest, both wired to a laptop in his backpack, and he wears four different-colored marker caps or pieces of tape on his thumbs and index fingers. When he switches on the system, the webcam starts capturing video and streaming it back to the computer. Then the computer's vision algorithms take over. This software, the real brains of the system, filters out background imagery, determines x and y coordinates for each cap or tape color in the video frame, and tracks them over time. By discerning which colors are moving which way, the computer can follow freehand gestures, which in turn trigger various functions.
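The marker-tracking step described above can be sketched in a few lines. The actual SixthSense code is not public, so this is only an illustrative toy: it assumes simple per-pixel color thresholding against a known marker color (a real system would use HSV thresholding and a library like OpenCV), locates each colored cap by the centroid of its matching pixels, and infers a gesture from how the centroid moves across frames.

```python
import numpy as np

def track_marker(frame, target_rgb, tol=30):
    """Return the (x, y) centroid of pixels close to target_rgb, or None.

    frame: H x W x 3 uint8 RGB image. This stands in for the background
    filtering + coordinate-finding step described in the article.
    """
    diff = np.abs(frame.astype(int) - np.array(target_rgb)).max(axis=-1)
    ys, xs = np.nonzero(diff <= tol)          # pixels matching the cap color
    if xs.size == 0:
        return None
    return (xs.mean(), ys.mean())

def gesture_direction(positions):
    """Classify horizontal motion of one marker across successive frames."""
    (x0, _), (x1, _) = positions[0], positions[-1]
    return "right" if x1 > x0 else "left"

# Synthetic demo: a red "marker cap" moves rightward across three frames.
frames = []
for cx in (20, 50, 80):
    f = np.zeros((100, 100, 3), dtype=np.uint8)
    f[45:55, cx - 5:cx + 5] = (255, 0, 0)     # red square standing in for a cap
    frames.append(f)

track = [track_marker(f, (255, 0, 0)) for f in frames]
print(gesture_direction(track))               # prints "right"
```

In the real device the same idea runs once per marker color per video frame, and the sequence of four (x, y) tracks is matched against a vocabulary of gestures (framing a photo, tapping a projected key) rather than a single left/right swipe.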