We propose a novel hierarchical model of human dynamics for view-independent tracking of a human figure in monocular video sequences. The model is trained on real data from a collection of people. The top of the hierarchy contains information about the whole body, while the lower levels contain more detailed information about the possible poses of individual subparts of the body. In this article we describe the model and present experiments showing that it can recover 3D human figures from 2D images in a view-independent manner and can also track people on whom the system has not been trained. (C) 2002 Elsevier Science B.V. All rights reserved.