
Human Motion Analysis Using 3D Cameras

Chair for Digital Health - Connected Healthcare
Hasso Plattner Institute

Office: Campus III Building G2, Room G-2.1.20
Tel.: +49 331 5509-4853
Email: Justin.albert(at)hpi.de

Supervisor: Prof. Dr. Bert Arnrich

Starting Date: 01.10.2019


In my research, I focus on human motion analysis, primarily using 3D cameras but also other sensor modalities such as inertial measurement units (IMUs) and electrocardiography (ECG). The projects range from gait analysis with a low-cost 3D camera to predicting subjective exertion in strength training. The following sections give an overview of my projects from the past year and of my current research.

Current Work: Prediction of Subjective Exertion in Resistance Training


Quantifying load during physical activity has been of high interest to the research community. Athletes want to optimize their exercises so that the applied training load matches the load prescribed by the training plan as closely as possible. Too much load reduces force production ability and increases the risk of injuries. Monitoring exercise load, e.g., during rehabilitation or recreational sports, is also important for preventing injuries in the general population. Training load can be quantified using internal and external measures. External measures include, e.g., the distance traveled, the travel speed, or the lifted weight. Internal load is often measured as a rating of perceived exertion (RPE), which specifies how exhausting an exercise was for a specific person as a single value on a scale. A standard RPE scale is the so-called Borg scale, which ranges from 6 (not exhausting) to 20 (extremely exhausting) [1]. Such a rating is obtained quickly by asking subjects to mark their exertion on the scale shortly after the load ends. We aim to build a system that automatically predicts RPE values from sensor measurements. We hypothesize that such a system could warn users when they experience a significant training overload and thereby help avoid fatigue injuries. In this initial project, we use multiple 3D cameras for motion tracking and machine learning methods to predict subjective RPE values.

Study Setup

For this project, we aimed to maximize the effect on exertion. Therefore, we chose the squat exercise, as it involves large muscle groups. The exercises were performed on a so-called flywheel machine. A flywheel training machine does not use a weight that is accelerated downwards by gravity. Instead, the energy generated by the subject while standing up is transmitted by a belt and stored in a flywheel. This belt is connected to the participant via a hip harness and wrapped around a transmission shaft fixed to the flywheel. Thus, when standing up, the participant unwraps the belt from the shaft, spinning up the flywheel. Standing up is the concentric movement in a squat. At the topmost position, the belt wraps back around the transmission shaft because the flywheel continues to spin. Thus, during the downward movement, which is the eccentric movement, the participant has to decelerate the flywheel. Finally, the subject is again in a squatting position, as shown in the following figure. In total, N=21 subjects participated in our study, performing a specific protocol consisting of several sets with 12 repetitions each.

Data Analysis

We used two Microsoft Azure Kinect 3D cameras to capture the participants during the experiment. Both cameras were placed at a 45-degree angle, pointing at the subject. To obtain one final skeleton, the skeletons from both cameras must be fused into one. The skeleton fusion was achieved by an external camera calibration using calibration patterns. An example sequence of a fused skeleton is shown in the video below. Afterward, signal processing methods are applied to filter the kinematic data and to remove outliers. In the initial phase of this project, the aim is to explore and analyze the recorded kinematic data and to manually craft feature sets. These include various skeleton features, such as relative joint positions, joint angles, and joint angle velocities. After obtaining an extensive feature set, we remove uninformative features using various feature elimination methods. Subsequently, statistical features such as mean, standard deviation, and median are calculated on the previously mentioned skeleton features. To predict fatigue during squats, the focus, for now, is on conventional machine learning rather than advanced deep learning methods. We use Random Forests, Gradient Boosting Regression, k-NN regression, and a multi-layer perceptron (MLP) to predict the subjective value on the Borg scale.
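
As a rough illustration of the hand-crafted skeleton features mentioned above, the following sketch computes a knee angle per frame from three 3D joint positions, a finite-difference angle velocity, and the statistical summaries (mean, standard deviation, median). The function names, the frame rate, and the joint coordinates are hypothetical and only serve as an example; this is not the project's actual feature pipeline.

```python
import math
import statistics


def joint_angle(a, b, c):
    """Return the angle in degrees at joint b between segments b->a and b->c."""
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to [-1, 1] to guard against rounding errors before acos
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def angle_velocity(angles, fps):
    """Finite-difference angular velocity in degrees per second."""
    return [(nxt - cur) * fps for cur, nxt in zip(angles, angles[1:])]


def summarize(values):
    """Statistical features computed over one skeleton feature."""
    return {
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),
        "median": statistics.median(values),
    }


# Hypothetical (hip, knee, ankle) positions in metres for three frames
frames = [
    ((0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.0)),   # straight leg
    ((-0.3, 0.8, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.0)),  # partly flexed
    ((-0.5, 0.5, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.0)),  # thigh horizontal
]
knee_angles = [joint_angle(hip, knee, ankle) for hip, knee, ankle in frames]
knee_velocity = angle_velocity(knee_angles, fps=30)  # assumed 30 frames/s
knee_features = summarize(knee_angles)
```

Feature vectors built this way for each repetition or set could then be passed to a regression model that predicts the Borg value.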

Data Augmentation of Kinematic Time-Series from Rehabilitation Exercises using GANs

Neurological diseases such as Parkinson's disease or stroke are common, severe conditions in modern society. Usually, physicians or other experts assess the progress of these and other neurological diseases in the hospital. Hence, their decisions can suffer from subjective bias. Furthermore, in many healthcare systems, patients are discharged from the rehabilitation program early, forcing them to continue the training program at home without an expert's supervision. Exercise recognition systems that can evaluate a user's movement are currently being developed. These systems could support physicians with an objective decision-making process or automatically evaluate exercises performed alone at home. Training such a machine learning system requires large amounts of representative data to achieve good results, especially for deep learning-based approaches. In the field of Human Activity Recognition (HAR), large and diverse datasets are publicly available. However, collecting medical datasets is challenging because access to patients is restricted. In addition, the detailed knowledge of medical experts and specialized equipment are needed to collect the data and obtain ground truth labels. Especially in studies that include a healthy control group, the limited access to patients leads to unbalanced datasets in which most data points belong to the healthy subjects. A common strategy to overcome these challenges is to increase the size of a collected dataset through data augmentation or to synthesize entirely new datasets with artificial examples. To tackle this issue, we have developed a method that uses a Generative Adversarial Network (GAN) to generate long-term synthetic sequences of human motion data for a given class.

The network developed here can produce realistic-looking repetitions of a specific exercise over a longer period. Our network architecture is inspired by and builds upon the Human-Pose-GAN (HP-GAN) model [2]. The architecture consists of an encoder and a decoder network and takes 10 prior poses from an arbitrary sequence as input. From these, it aims to predict 20 new output poses of the sequence. By recursively feeding the predicted poses back into the network, long data sequences can be created. The approach's usefulness was demonstrated by balancing the KIMORE (KInematic Assessment of MOvement and Clinical Scores for Remote Monitoring of Physical REhabilitation) dataset [3]. In this dataset, patient classes are underrepresented compared to the healthy control group. We trained and focused our approach specifically on the squat exercise performed by Parkinson's disease patients, stroke patients, and healthy persons. For evaluation, we trained a classification network that aims to distinguish stroke patients, Parkinson's disease patients, and healthy subjects. Balancing the dataset with our method increased the accuracy of this three-class classification by 11 percentage points. The method and results were published at the IEEE COINS conference in September 2021. The video below shows skeleton data for a hand-raise exercise generated by our algorithm. It shows ground-truth data, generated data, and two error cases.
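
The recursive generation described above can be sketched as a simple rollout loop: the last 10 poses are passed to the predictor, its 20 output poses are appended, and the procedure repeats until the desired length is reached. In the sketch below, `generate_sequence` and `toy_predictor` are hypothetical names, poses are reduced to single numbers for brevity, and the toy predictor merely continues a linear trend as a stand-in for the trained GAN generator.

```python
def generate_sequence(predict, seed_poses, total_len, n_in=10):
    """Roll a short-horizon pose predictor forward to build a long sequence.

    predict: callable mapping the last n_in poses to a list of new poses.
    seed_poses: at least n_in real poses used to start the rollout.
    """
    if len(seed_poses) < n_in:
        raise ValueError("need at least n_in seed poses")
    sequence = list(seed_poses)
    while len(sequence) < total_len:
        # Feed the most recent poses back into the predictor
        sequence.extend(predict(sequence[-n_in:]))
    return sequence[:total_len]


def toy_predictor(context, n_out=20):
    """Stand-in for the trained generator: continues a linear trend."""
    step = context[-1] - context[-2]
    return [context[-1] + step * (i + 1) for i in range(n_out)]


seed = [float(i) for i in range(10)]  # 10 prior "poses" (scalars here)
long_sequence = generate_sequence(toy_predictor, seed, total_len=75)
```

Because each step consumes its own earlier outputs, prediction errors can accumulate over long rollouts, which is one reason to inspect generated sequences for error cases.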

Evaluation of the Pose Tracking Accuracy of the Microsoft Kinect v2 and Azure Kinect Camera

Microsoft released the first version of its Kinect camera in 2010 as a gaming controller for the Xbox gaming console. It combines an RGB camera with a 3D depth sensor and can track certain joint positions of users in 3D. Since the second camera generation, the Time-of-Flight (ToF) principle has been used for depth estimation. This method estimates the depth by emitting infrared (IR) light into the scene and measuring the time until the light is reflected and returns to the sensor. For 3D motion tracking, Kinect v2 used randomized decision forests to estimate the joint locations, as described in [4]. In 2019, a new Kinect generation, Azure Kinect, was released, with the focus shifting away from games towards industrial applications. Its skeleton tracking algorithm uses deep learning with Convolutional Neural Networks (CNNs) to estimate human poses. The research community has used the Kinect camera for medical and biomedical applications and analyses for many years.
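
The ToF principle can be illustrated with a small calculation: since the light travels to the object and back, the depth is half the round-trip distance. The sketch below is an idealized example with a hypothetical function name and an assumed round-trip time; practical ToF cameras often infer the travel time indirectly, for example from the phase shift of modulated light, so this is not a description of the Kinect's internal processing.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def tof_depth(round_trip_seconds):
    """Depth in metres for a measured round-trip time of the emitted light."""
    # The light covers the sensor-to-object distance twice
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0


# Assumed round-trip time of 10 nanoseconds
depth_m = tof_depth(10e-9)
```

A round-trip time of 10 ns corresponds to a depth of roughly 1.5 m, which shows how fine the timing resolution must be for centimetre-level depth accuracy.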

In this project, we used the latest Microsoft Azure Kinect camera for gait analysis. Gait analysis is an essential tool for the early detection of neurological diseases and for assessing the risk of falling in elderly people. More specifically, we evaluated the pose tracking performance of the Azure Kinect camera compared to its predecessor, Kinect v2, in treadmill walking. We used a Vicon multi-camera motion capture system and the 39-marker Plug-in Gait model as the gold standard. Five young and healthy subjects walked on a treadmill at three different velocities. Data were recorded simultaneously with all three camera systems. To compare the spatial agreement of joint locations, we developed an external camera calibration to spatially align the 3D skeleton data from both Kinect cameras and the Vicon system. Specific gait parameters were calculated for all three camera systems, including step length, step time, step width, and stride time. The results showed that the improved hardware and motion tracking algorithm of the Azure Kinect camera led to significantly higher accuracy of the spatial gait parameters compared to the predecessor Kinect v2. At the same time, no significant differences were found between the temporal parameters. The results of this study were published in the MDPI journal Sensors.
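
To make the temporal gait parameters concrete, the following sketch derives step times and stride times from heel-strike timestamps. It assumes the common definitions (step time: interval between a heel strike and the next heel strike of the opposite foot; stride time: interval between consecutive heel strikes of the same foot) and assumes that heel strikes have already been detected from the skeleton data. The function name and the timestamps are hypothetical and do not reproduce the study's processing.

```python
def temporal_gait_parameters(left_strikes, right_strikes):
    """Compute step and stride times from heel-strike timestamps in seconds."""
    events = sorted(
        [(t, "L") for t in left_strikes] + [(t, "R") for t in right_strikes]
    )
    # Step time: consecutive heel strikes of opposite feet
    step_times = [
        t2 - t1
        for (t1, foot1), (t2, foot2) in zip(events, events[1:])
        if foot1 != foot2
    ]
    # Stride time: consecutive heel strikes of the same foot
    stride_times = []
    for strikes in (left_strikes, right_strikes):
        ordered = sorted(strikes)
        stride_times += [b - a for a, b in zip(ordered, ordered[1:])]
    return step_times, stride_times


# Hypothetical heel-strike times for a perfectly regular gait
step_times, stride_times = temporal_gait_parameters(
    left_strikes=[0.0, 1.0, 2.0], right_strikes=[0.5, 1.5]
)
```

Spatial parameters such as step length and step width would additionally require the foot positions at these events and, on a treadmill, a correction for the belt movement.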


  1. G. Borg, “Perceived exertion as an indicator of somatic stress,” Scandinavian Journal of Rehabilitation Medicine, 1970.

  2. E. Barsoum, J. Kender, and Z. Liu, “HP-GAN: Probabilistic 3D Human Motion Prediction via GAN,” 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018.

  3. M. Capecci, M. G. Ceravolo, F. Ferracuti, S. Iarlori, A. Monteriù, L. Romeo, and F. Verdini, “The KIMORE dataset: Kinematic assessment of movement and clinical scores for remote monitoring of physical rehabilitation,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 27, no. 7, pp. 1436–1448, July 2019.

  4. J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake, “Real-time Human Pose Recognition in Parts from Single Depth Images,” 2011, pp. 1297–1304, doi:10.1109/CVPR.2011.5995316.



  • Using Machine Learning to Predict Perceived Exertion During Resistance Training With Wearable Heart Rate and Movement Sensors. Albert, Justin; Herdick, Arne; Brahms, Clemens Markus; Granacher, Urs; Arnrich, Bert (2021).
  • Data Augmentation of Kinematic Time-Series From Rehabilitation Exercises Using GANs. Albert, Justin; Glöckner, Pawel; Pfitzner, Bjarne; Arnrich, Bert (2021). 1–6.

  • Will You Be My Quarantine: A Computer Vision and Inertial Sensor Based Home Exercise System. Albert, Justin; Zhou, Lin; Gloeckner, Pawel; Trautmann, Justin; Ihde, Lisa; Eilers, Justus; Kamal, Mohammed; Arnrich, Bert (2020). (Vol. 14)
  • Evaluation of the Pose Tracking Performance of the Azure Kinect and Kinect v2 for Gait Analysis in Comparison with a Gold Standard: A Pilot Study. Albert, Justin; Owolabi, Victor; Gebel, Arnd; Brahms, Markus Clemens; Granacher, Urs; Arnrich, Bert in MDPI Sensors (2020). 20(18)

  • Geometric Algebra Computing for Heterogeneous Systems. Hildenbrand, D.; Albert, Justin; Charrier, P.; Steinmetz, C. in Advances in Applied Clifford Algebras (2017). 27, 599–620.