PI: Dr. Mohamed Elsayed
Team members: Dr. Marwan Torky
Funding Agency: Microsoft ATLc
Duration: 12 months
Project Abstract
One of the most challenging problems in computer vision research is visual recognition and its related tasks, such as object classification, localization, and activity, scene, and event classification. These problems have benefited greatly from recent advances in sensing technologies, such as cheap RGBD sensors (e.g. the Microsoft Kinect). The merit of using depth sensors is straightforward: while conventional image capture projects the 3-D world onto a 2-D image plane (which introduces ambiguities), RGBD data reduces this ambiguity by attaching easily calibrated depth measurements to the pixels of the 2-D image.
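To make the ambiguity-reduction point concrete, the sketch below shows how a depth reading lets each 2-D pixel be back-projected to a 3-D point through the standard pinhole camera model. The intrinsic parameters are generic Kinect-like placeholders for illustration, not calibrated values from this project.

```python
import numpy as np

# Hypothetical Kinect-like intrinsics (placeholders, not calibrated values).
FX, FY = 525.0, 525.0   # focal lengths in pixels
CX, CY = 319.5, 239.5   # principal point in pixels

def back_project(u, v, depth_m):
    """Map a pixel (u, v) with depth in metres to a 3-D point in the camera frame."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return np.array([x, y, depth_m])

def depth_to_point_cloud(depth_image):
    """Convert a dense H x W depth image (metres) to an N x 3 point cloud."""
    h, w = depth_image.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_image
    x = (us - CX) * z / FX
    y = (vs - CY) * z / FY
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # discard pixels with no valid depth reading
```

With calibrated intrinsics, the same pixel that is ambiguous in a plain RGB image maps to a unique 3-D position once its depth is known.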
Currently, many indoor applications, such as 3-D reconstruction of indoor scenes, robot navigation, and activity recognition, have started using Kinect-like sensory data. In this research project, we address the problem of action recognition (sign language recognition, in particular) using RGBD data. The particular aims are the following: First, we collect a dataset for isolated-word sign language using a Kinect sensor. Second, we develop and test algorithms that apply machine learning to the collected dataset to recognize sign language from the user's skeleton movement and hand and face shapes.
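As an illustration of the second aim, the following sketch outlines one plausible baseline for isolated-word recognition from Kinect skeleton streams: resample each joint-position sequence to a fixed length, normalise it relative to a reference joint, and train an off-the-shelf classifier. The joint layout, sequence length, and SVM classifier are assumptions made for this example, not the project's final method.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def resample_sequence(frames, target_len=30):
    """Linearly resample a (T x J x 3) skeleton sequence to a fixed number of frames."""
    frames = np.asarray(frames, dtype=float)
    t = np.linspace(0, len(frames) - 1, target_len)
    idx = np.floor(t).astype(int)
    nxt = np.minimum(idx + 1, len(frames) - 1)
    frac = (t - idx)[:, None, None]
    return (1 - frac) * frames[idx] + frac * frames[nxt]

def featurize(frames):
    """Express joints relative to a reference joint (e.g. torso) and flatten to a vector."""
    seq = resample_sequence(frames)
    seq = seq - seq[:, :1, :]   # assume joint 0 is the reference (torso) joint
    return seq.reshape(-1)

def train_classifier(sequences, labels):
    """Train an SVM on flattened skeleton features; sequences and labels are placeholders."""
    X = np.stack([featurize(s) for s in sequences])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
    clf.fit(X, labels)
    return clf
```

A pipeline along these lines would serve as a simple skeleton-only baseline; hand and face shape cues from the RGBD frames would be added as further feature channels in the developed algorithms.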