Prof. Dr.-Ing. Bert Arnrich

Privacy-Preserving Classification of X-Ray Images in a Federated Learning Setting

Master's Thesis

Joceline Ziegler, Supervisor: Bjarne Pfitzner


Federated learning (FL) is gaining increasing attention as a way to train machine learning (ML) models on sensitive data in a privacy-preserving manner. In an FL setting, holders of sensitive data, such as hospitals, can make their data available for ML without sharing it with other parties, which would inherently increase the risk of a privacy breach. In a common setup, a central server first distributes the initial model to several clients holding the data (Fig. 1). The clients train the model locally and send the resulting model updates back to the server, where the updates are aggregated. The updated global model can then be distributed to the clients again for another round of training.

Figure 1: A commonly used federated learning setup [1]
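
The training round described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the toy one-dimensional linear model, the function names (local_update, federated_average, run_round), and the size-weighted averaging rule are chosen for illustration and are not taken from the thesis, which does not fix an aggregation method at this point.

```python
from typing import List, Sequence, Tuple

Weights = List[float]            # [w, b] of a toy model y = w * x + b
Sample = Tuple[float, float]     # (x, y)


def local_update(global_weights: Weights, data: Sequence[Sample],
                 lr: float = 0.1) -> Weights:
    """Hypothetical client step: one pass of SGD on the client's own data."""
    w, b = global_weights
    for x, y in data:
        err = (w * x + b) - y    # prediction error on one sample
        w -= lr * err * x        # gradient step for the slope
        b -= lr * err            # gradient step for the intercept
    return [w, b]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Server step: average client models, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_round(global_weights: Weights,
              client_datasets: List[Sequence[Sample]]) -> Weights:
    """One round: distribute the model, train locally, aggregate on the server."""
    updates = [local_update(list(global_weights), d) for d in client_datasets]
    sizes = [len(d) for d in client_datasets]
    return federated_average(updates, sizes)
```

Calling run_round repeatedly, each time with the weights returned by the previous call, corresponds to the repeated rounds of training; the raw samples in client_datasets are only ever touched inside local_update, which mirrors the idea that the data stays with the clients.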

This procedure is especially promising for medical use cases. Medical datasets are usually scattered across multiple medical institutions and are subject to rigorous privacy constraints of both ethical and regulatory nature [2]. With FL, such data can be leveraged for creating generalizable ML applications without the need for sharing. However, while FL provides a basic level of privacy by the principle of data minimization, it cannot by itself formally guarantee privacy [3]. It has been shown that input data can successfully be reconstructed from shared gradients or model updates [4, 5]. In addition to the threat of data reconstruction, attacks that disclose the presence of a specific data sample (membership inference attack) or of a data property (property inference attack) in the training dataset pose a serious privacy risk for individual contributors [6]. Measures to prevent such attacks from succeeding have to be taken and are the subject of ongoing research. Differential privacy (DP) is a concept that is actively explored in this area. Intuitively, applying DP guarantees, up to a certain degree, that the impact of a single data point or a subset of the data on the overall outcome is limited, so that little or no information can be derived about it [6]. However, DP is known to potentially decrease the utility of the model, e.g. its accuracy, resulting in a trade-off between utility and privacy.
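
The intuition behind DP can be made precise with the standard (ε, δ) definition. The notation below is introduced here for illustration and does not appear in the text above: M denotes a randomized mechanism (for example, a training procedure), D and D' are any two datasets that differ in a single record, and S is any set of possible outputs.

```latex
\Pr\big[\, M(D) \in S \,\big] \;\le\; e^{\varepsilon} \, \Pr\big[\, M(D') \in S \,\big] + \delta
```

A smaller ε (the privacy budget) and a smaller δ force the two output distributions closer together, so that the presence or absence of one record is harder to detect. Achieving this typically requires more randomness in M, which is one way to see where the trade-off with utility comes from.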


This thesis aims to evaluate the application of DP to the classification of chest X-ray data in a simulated FL environment. The data will consist of different open-source datasets of labeled chest X-rays, such as CheXpert [7], the NIH chest X-ray dataset [8], the MIMIC-CXR database [9], or the Mendeley dataset [10].

The first step is to implement a suitable FL baseline model. Then, appropriate measures to achieve local DP will be selected and added to the learning process. For evaluation, the DP-enabled model will be compared with the baseline model in terms of accuracy on the classification task, as well as in terms of meeting the privacy goal defined above, considering both the privacy budget and the success of simulated reconstruction attacks. Because several datasets with potentially varying characteristics will be used for the experiments, it can be evaluated to what extent the respective DP measures are suitable for each individual dataset. Furthermore, the relationship between accuracy and privacy in the DP model will be explored by adjusting parameters that increase or decrease the degree of privacy. In line with prior work, a trade-off between performance and privacy is expected. The contribution of this work is to explore this trade-off for the use case of chest X-ray classification and to derive insights into the benefits and drawbacks of using this method for privacy preservation.
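
One common way to add local DP to such a learning process is to clip each client's model update to a maximum L2 norm and then add Gaussian noise before the update leaves the client. The sketch below illustrates that idea only; the thesis does not yet commit to a specific mechanism, and the names clip_update and privatize_update as well as the parameters clip_norm and noise_multiplier are assumptions made for this example. It also does not compute the resulting privacy budget, which would need a separate accounting step.

```python
import math
import random
from typing import List


def clip_update(update: List[float], clip_norm: float) -> List[float]:
    """Scale the update down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]


def privatize_update(update: List[float], clip_norm: float,
                     noise_multiplier: float,
                     rng: random.Random) -> List[float]:
    """Clip the update, then add Gaussian noise on the client side.

    The noise standard deviation is noise_multiplier * clip_norm, so a larger
    noise_multiplier means more privacy and, typically, less utility.
    """
    clipped = clip_update(update, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [u + rng.gauss(0.0, sigma) for u in clipped]
```

In this sketch, noise_multiplier and clip_norm are examples of the kind of parameters that could be varied to explore the relationship between accuracy and privacy: setting noise_multiplier to 0 reduces the function to plain clipping, while larger values increasingly mask the individual update.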


[1] Holger R. Roth et al. “Federated Learning for Breast Density Classification: A Real-World Implementation”. In: Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning. 2020, pp. 181–191.

[2] Nicola Rieke et al. “The future of digital health with federated learning”. In: npj Digital Medicine 3 (Dec. 2020). DOI: 10.1038/s41746-020-00323-1.

[3] Peter Kairouz et al. Advances and Open Problems in Federated Learning. 2021. arXiv: 1912.04977 [cs.LG].

[4] Jonas Geiping et al. “Inverting Gradients - How easy is it to break privacy in federated learning?” In: Advances in Neural Information Processing Systems. Vol. 33. 2020, pp. 16937–16947.

[5] Ligeng Zhu, Zhijian Liu, and Song Han. “Deep Leakage from Gradients”. In: Advances in Neural Information Processing Systems. 2019.

[6] Mohammad Naseri, Jamie Hayes, and Emiliano De Cristofaro. Toward Robustness and Privacy in Federated Learning: Experimenting with Local and Central Differential Privacy. 2020. arXiv: 2009.03561 [cs.CR].

[7] Jeremy Irvin et al. “CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison”. In: 33rd AAAI Conference on Artificial Intelligence. 2019, pp. 590–597.

[8] X. Wang et al. “ChestX-Ray8: Hospital-Scale Chest X-Ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases”. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017, pp. 3462–3471. DOI: 10.1109/CVPR.2017.369.

[9] Alistair E. W. Johnson et al. “MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports”. In: Scientific Data 6 (2019). DOI: 10.1038/s41597-019-0322-0.

[10] Daniel Kermany, Kang Zhang, and Michael Goldbaum. Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images for Classification. Version 2. Mendeley Data, 2018. DOI: 10.17632/rscbjbr9sj.2.