Comparative Analysis of Research and Consumer Wearable Devices in Everyday Activities

Fabian Georgi, Supervisor: Charlotte Brandebusemeyer

Master's Thesis

Wearables, such as wristbands equipped with sensors to measure physiological
activity, offer the advantages of minimal setup, non-invasiveness, and low
obtrusiveness. These characteristics make them appealing for use in various research
fields and everyday settings.


The Empatica E4 and Shimmer3 GSR+ are widely recognized as standard research
devices for measuring physiological signals. Meanwhile, the consumer market
is evolving, with devices such as the Google Pixel Watch 2 offering, on paper,
similar capabilities in measuring physiological data. However, no studies have
systematically compared these devices in realistic everyday activities to assess their
measurement accuracy and reliability.


To analyze the performance and reliability of these devices, we designed and
executed a study approved by the ethics committee. To ensure a standardized
procedure, we developed a web application for the study execution and an Android
application for collecting data from the Pixel Watch 2. Physiological data from 20
participants were collected and synchronized across all devices, covering cognitive
load, startle, relaxation, and physical activity conditions. Additionally, participants
provided feedback on device comfort.


The analysis shows that the Empatica E4 and Shimmer3 GSR+ produce closely
aligned physiological measurements, whereas the Pixel Watch 2 exhibits significant
deviations. The Shimmer3 GSR+ provided the most accurate data but was rated
the least comfortable due to its obtrusive design. In contrast, the Empatica E4 was
better received in terms of comfort but showed reliability issues in its EDA sensor,
including dead signals and difficulties detecting short-term responses to startle
events. The Pixel Watch 2 was rated the most comfortable, but beyond its measurement
deviations from the research devices, it is unsuitable for research applications:
it lacks a standardized method for accessing the data, its sensors provide
insufficient data output, and the majority of its data is uninterpretable.