Study Setup
We recorded a dataset of 16 subjects (8 female, 8 male) walking on a treadmill at three different velocities. The subjects were recorded with a 12 MP color camera and a marker-based motion capture system (Vicon), using the full-body Plug-in Gait model with 39 markers. The Vicon system sampled data at 100 Hz, while the RGB camera recorded at 30 fps. We tried different state-of-the-art models, including GAST-Net [1], MediaPipe [2], and VideoPose3D [3], to estimate 3D human joint coordinates from the 2D images. After extraction, we applied signal processing methods such as filtering and temporal and spatial alignment to the skeleton data. We evaluated different aspects, including the spatial agreement of joint locations and the agreement of gait parameters (step length, step time, step width, and stride time). The figure below shows an early result for the step length parameter, calculated using a model based on VideoPose3D and using the Vicon system. The x-axis shows the reference values from the Vicon system, and the y-axis shows the corresponding values from the pose estimator. The figure indicates that the tracking performance still leaves room for improvement.
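
Because the Vicon system runs at 100 Hz and the camera at 30 fps, the two streams must be brought to a common rate and time base before they can be compared. The following is a minimal sketch of one way this could be done, not the processing actually used in the study. It assumes a single 1D trajectory per system (for example, the forward position of a heel), linear interpolation for resampling, and an integer frame offset found by cross-correlation. The function names resample_linear and estimate_lag are hypothetical and chosen only for illustration.

```python
import math


def resample_linear(samples, src_hz, dst_hz):
    """Linearly interpolate a 1D signal from src_hz to dst_hz.

    Illustrative only: the resampling method used in the study is not stated.
    """
    if len(samples) < 2:
        return list(samples)
    duration = (len(samples) - 1) / src_hz            # seconds covered by the signal
    n_out = int(math.floor(duration * dst_hz + 1e-9)) + 1
    out = []
    for k in range(n_out):
        pos = k / dst_hz * src_hz                      # fractional index in the source
        i = min(int(pos), len(samples) - 2)            # left neighbour, clamped at the end
        frac = pos - i
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
    return out


def estimate_lag(reference, estimate, max_lag):
    """Return the lag (in frames) by which `estimate` trails `reference`.

    Tries every integer shift in [-max_lag, max_lag] and keeps the one that
    maximises the mean-removed cross-correlation over the overlapping samples.
    """
    def centered(x):
        mean = sum(x) / len(x)
        return [v - mean for v in x]

    ref, est = centered(reference), centered(estimate)
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score, count = 0.0, 0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(est):
                score += r * est[j]
                count += 1
        if count and score / count > best_score:
            best_lag, best_score = lag, score / count
    return best_lag
```

In this sketch, a Vicon trajectory would first be passed through resample_linear(trajectory, 100, 30), and the result compared with the pose-estimator trajectory via estimate_lag. A positive lag means the pose-estimator signal has to be shifted back by that many frames before per-frame joint positions or gait events are compared. Only the integer-frame offset is handled here; sub-frame offsets and clock drift are ignored.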