Head Pose Estimation by ISL
Figure 1: Capture-Projection Synchronization Strategy.
Introduction
This work describes a method for determining all six degrees of freedom (DOFs) of head pose in space using an imperceptible structured light system. The projection appears to humans as white light, and is thus less annoying, but it embeds coding patterns that facilitate 3D reconstruction. The method accurately tracks salient 3D feature points on the human face without prior training. First, through an elaborate pattern projection strategy and camera-projector synchronization, a pattern-illuminated image and the corresponding scene-texture image are captured. Next, the facial feature points in the scene-texture image, all localized by an Active Appearance Model (AAM), have their 3D positions interpolated from the point cloud generated by structured light sensing. Correspondences are then established between these 3D facial features and those of the previous or reference image frame. Finally, the head orientation and translation are determined from these 3D point pairs. Extensive experiments show that the proposed method is effective, accurate, and fast in determining the 6-DOF head pose, making it suitable for real-time applications.
Method
** Pattern Projection Strategy for Imperceptible Structured Light Sensing

Fig. 1 illustrates one capture-projection cycle of the pattern projection strategy. For the structured light projection to be imperceptible, the projection frequency must exceed the flicker fusion threshold, which is 75 Hz for most people.
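The frequency condition above can be illustrated with a small sketch. It assumes, purely for illustration, that a binary pattern is alternated with its complement so that the two frames average to uniform light; the text does not state that this is the scheme used. The names `is_imperceptible` and `perceived_frame` are hypothetical.

```python
import numpy as np

FLICKER_FUSION_HZ = 75.0  # threshold quoted in the text


def is_imperceptible(projection_hz, threshold_hz=FLICKER_FUSION_HZ):
    """Return True when the projection rate exceeds the flicker fusion threshold."""
    return projection_hz > threshold_hz


def perceived_frame(pattern):
    """Time-average of a binary pattern and its complement.

    Assumption: the eye integrates two equally long frames, one showing the
    pattern and one showing its complement.
    """
    complement = 1.0 - pattern
    return 0.5 * (pattern + complement)


# Hypothetical 4x6 binary coding pattern (1 = lit pixel, 0 = dark pixel).
rng = np.random.default_rng(0)
pattern = rng.integers(0, 2, size=(4, 6)).astype(float)
avg = perceived_frame(pattern)

print(is_imperceptible(120.0))  # a 120 Hz projector is above the threshold
print(np.unique(avg))           # every pixel averages to the same intensity
```

Under this assumption every pixel has the same time-averaged intensity, which is why the viewer sees plain uniform light while a synchronized camera can still capture the individual pattern frame.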
** Facial Feature Localization

-- Locating 2D Positions of Key Facial Feature Points in Scene-texture Image
-- Determining 3D Positions of Grid Points in Pattern-illuminated Image
-- Inferring 3D Positions of Key Facial Features
Figure 2: 3D facial feature landmarking by interpolation: (Left) Feature points in the scene-texture
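The third step, inferring the 3D position of a 2D facial feature from nearby reconstructed grid points, can be sketched as follows. The interpolation scheme is not specified in the text; this sketch assumes inverse-distance weighting over the k nearest grid points in the image plane, and the function name `interpolate_3d` and its parameters are hypothetical.

```python
import numpy as np


def interpolate_3d(feature_xy, grid_xy, grid_xyz, k=4, eps=1e-9):
    """Interpolate a 3D position for a 2D feature from nearby grid points.

    feature_xy: (2,) image position of the facial feature.
    grid_xy:    (N, 2) image positions of the reconstructed grid points.
    grid_xyz:   (N, 3) 3D positions of the same grid points.
    Assumption: inverse-distance weighting over the k nearest grid points.
    """
    feature_xy = np.asarray(feature_xy, dtype=float)
    grid_xy = np.asarray(grid_xy, dtype=float)
    grid_xyz = np.asarray(grid_xyz, dtype=float)

    dist = np.linalg.norm(grid_xy - feature_xy, axis=1)
    nearest = np.argsort(dist)[:k]

    # A feature lying exactly on a grid point takes that point's 3D position.
    if dist[nearest[0]] < eps:
        return grid_xyz[nearest[0]].copy()

    weights = 1.0 / dist[nearest]
    weights /= weights.sum()
    return weights @ grid_xyz[nearest]


# Four grid points around a feature at the centre of the cell.
grid_xy = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], dtype=float)
grid_xyz = np.array(
    [[0, 0, 100], [10, 0, 100], [0, 10, 110], [10, 10, 110]], dtype=float
)
p = interpolate_3d([5.0, 5.0], grid_xy, grid_xyz)
print(p)
```

Because the feature is equidistant from the four grid points, the result is their plain average. Other schemes, such as bilinear or barycentric interpolation, would fit the same interface.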
** 6-DOF Head Pose Estimation

Head orientation and translation are estimated by singular value decomposition (SVD) of a correlation matrix built from 3D point pairs between consecutive frames.
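A minimal sketch of this step is given below. It follows the standard least-squares formulation for aligning two 3D point sets by SVD of their cross-correlation matrix; the authors' implementation may differ in detail. The function name `estimate_rigid_transform` and the test points are hypothetical.

```python
import numpy as np


def estimate_rigid_transform(src, dst):
    """Estimate rotation R and translation t such that dst ~= R @ src + t.

    src, dst: (N, 3) arrays of corresponding 3D points (N >= 3, not collinear).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)

    # Centre both point sets on their centroids.
    c_src = src.mean(axis=0)
    c_dst = dst.mean(axis=0)

    # 3x3 correlation matrix of the centred point pairs.
    H = (src - c_src).T @ (dst - c_dst)

    U, _, Vt = np.linalg.svd(H)

    # Guard against a reflection (determinant -1) in the recovered matrix.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t


# Synthetic check: rotate 30 degrees about the z axis and translate.
angle = np.deg2rad(30.0)
R_true = np.array(
    [
        [np.cos(angle), -np.sin(angle), 0.0],
        [np.sin(angle), np.cos(angle), 0.0],
        [0.0, 0.0, 1.0],
    ]
)
t_true = np.array([5.0, -2.0, 10.0])
prev_pts = np.array(
    [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float
)
curr_pts = prev_pts @ R_true.T + t_true

R_est, t_est = estimate_rigid_transform(prev_pts, curr_pts)
print(np.round(R_est, 4))
print(np.round(t_est, 4))
```

The rotation matrix carries the three orientation DOFs and the translation vector the remaining three. Composing the frame-to-frame estimates, or aligning against a reference frame, yields the head pose over time.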
Some Results
Reference
[1] J. Dai and R. Chung. Head Pose Estimation by Imperceptible Structured Light Sensing. In Proc. of IEEE International Conference on Robotics and Automation, pages 1646-1651, 2011.