Inside-Out: First Person Vision for Personalized Intelligence


Jacobs Hall, Room 2512, Jacobs School of Engineering, 9500 Gilman Dr, La Jolla, San Diego, California 92093

Sponsored By:
Professor Nuno Vasconcelos

Speaker:
Jianbo Shi, Ph.D.
University of Pennsylvania


A first person camera worn on a person's head captures candid moments in our lives, providing detailed visual data of how we interact with people and objects. It reveals our attention, intention, and momentary visual sensorimotor behaviors. With first person vision, can we build a computational model for personalized intelligence that predicts what we see and how we act by "putting ourselves in her/his shoes"?

We provide three examples. (1) At the physical level, we predict the wearer's intent in the form of the forces and torques that control movement. Our model integrates visual scene semantics, 3D reconstruction, and inverse optimal control to compute the active forces (pedaling and braking while biking) and the experienced passive forces (gravity, air drag, and friction) in a first person sport video. (2) At the social scene level, we predict plausible future trajectories in reaction to a first person video. The predicted paths avoid obstacles, move between people, and even turn around a corner into space occluded by objects. (3) At the object level, we study the holistic correlation of visual attention with motor action by introducing "action-objects" associated with seeing and touching actions. Such action-objects exhibit characteristic 3D spatial distances and orientations with respect to the person. We demonstrate that, for first person videos, we can predict momentary visual attention and motor actions without gaze tracking or tactile sensing.

This is a joint work with Hyun Soo Park, Gedas Bertasius, and Stella Yu.


Speaker Bio:
Jianbo Shi studied Computer Science and Mathematics as an undergraduate at Cornell University, where he received his B.A. in 1994. He received his Ph.D. in Computer Science from the University of California, Berkeley in 1998. He joined The Robotics Institute at Carnegie Mellon University in 1999 as a member of the research faculty, where he led the Human Identification at a Distance (HumanID) project, developing vision techniques for human identification and activity inference. In 2003 he joined the University of Pennsylvania, where he is currently a Professor of Computer and Information Science. In 2007, he was awarded the Longuet-Higgins Prize for his work on Normalized Cuts. His current research focuses on first person human behavior analysis and image recognition-segmentation. His other research interests include image/video retrieval, 3D vision, and vision-based desktop computing. His long-term interests center on the broader area of machine intelligence: he wishes to develop a "visual thinking" module that allows computers not only to understand the environment around us, but also to achieve cognitive abilities such as machine memory and learning.

Julie Moritz