Prof. Dr.-Ing. Bert Arnrich

Federated Learning


Machine learning algorithms, and especially deep models, benefit significantly from large datasets. While some application domains can readily combine datasets from different locations in a central data store for model training, other domains such as medicine prohibit this procedure. This is due to people's rights to their personal data, defined for EU citizens in the General Data Protection Regulation (GDPR) [1]. The federated learning paradigm [2] of leaving sensitive data where it was collected and sharing only models between multiple parties thus enables research across institutional borders without violating patients' rights.

Security Concepts of Federated Learning

Although federated learning improves data privacy for its participants, it is still possible to infer sensitive data from the transmitted model updates, an approach known as a reconstruction attack [3][4]. To prevent such attacks, a number of defence concepts are routinely employed in federated learning systems.

Differential Privacy

This method was developed in the data science and database domain and describes the introduction of noise into a system to prevent conclusions about particular single samples in a database being drawn by repeatedly querying it [5]. It has been transferred to the federated learning domain to obscure the influence of client-specific data on the updated model weights. Specifically, participants add noise, most commonly Gaussian, scaled to their local data, to their model updates. Repeated selection of the same client for training increases the so-called privacy spending, which has to remain below a certain threshold in order to guarantee data privacy.
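As a rough illustration, the noising of a single client update could look like the following sketch, loosely following the clip-and-noise recipe of [5]; the function name and parameter values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1):
    """Clip a client's model update to a maximum L2 norm, then add
    Gaussian noise scaled to that clipping bound (illustrative sketch)."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

raw_update = np.array([0.5, -2.0, 1.5])   # hypothetical local weight delta
noised = privatize_update(raw_update)     # what the client actually transmits
```

Because the noise scale is tied to the clipping bound, no single client's data can move the aggregated model by more than a bounded, noise-masked amount.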

(Additive) Homomorphic Encryption

Another layer of security can be added by additionally encrypting the model updates. Some encryption schemes, such as Paillier encryption [6], enable mathematically correct computation on encrypted values. These are called homomorphic encryption (HE) schemes. Additive HE is a subset thereof that only guarantees valid addition of ciphertexts, and in consequence also multiplication of encrypted values by unencrypted ones, so that the server can still aggregate model weights while only the clients can decrypt and use the actual model.
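The additive property can be demonstrated with a textbook Paillier implementation; the tiny primes below are purely illustrative and offer no real security:

```python
import math
import random

random.seed(42)

# Textbook Paillier with tiny primes -- illustrative only, not secure.
p, q = 293, 433
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
g = n + 1                       # standard choice that simplifies decryption
mu = pow(lam, -1, n)

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return ((pow(c, lam, n2) - 1) // n) * mu % n

# Additive homomorphism: multiplying ciphertexts adds the plaintexts, so a
# server can sum encrypted model updates without ever decrypting them.
updates = [17, 25, 42]          # e.g. quantised weight deltas from clients
aggregate = 1
for c in (encrypt(u) for u in updates):
    aggregate = (aggregate * c) % n2
```

Only a party holding the private key (here, the decryption parameters `lam` and `mu`) can recover the aggregated plaintext sum.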

Secure Multi-Party Computation

The third security concept relies on clients working together to pre-aggregate their model updates, so that neither the server nor other clients can distinguish the impact individual datasets have on the updates. Of course, it has to be ensured that the collaborating parties cannot invade each other's privacy either. To this end, secure multi-party computation schemes combine encryption, the addition of random values, and back-and-forth transmissions to build a final defence against reconstruction attacks.
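The core cancellation idea behind such secure aggregation can be sketched with pairwise random masks; the client names and values are hypothetical, and the mask exchange (in practice done via a key agreement) is simulated directly:

```python
import random

random.seed(7)
clients = {"A": 3.0, "B": 5.0, "C": 9.0}   # hypothetical local update values

# Each unordered pair of clients agrees on a shared random mask; the
# lexicographically smaller party adds it, the other subtracts it.
names = sorted(clients)
masks = {(i, j): random.uniform(-100, 100)
         for a, i in enumerate(names) for j in names[a + 1:]}

def masked_update(name):
    value = clients[name]
    for (i, j), r in masks.items():
        if name == i:
            value += r
        elif name == j:
            value -= r
    return value

# The server only ever sees masked values, yet the pairwise masks cancel
# in the sum, so the aggregate equals the true total.
total = sum(masked_update(name) for name in names)
```

Each individual `masked_update` looks like random noise to the server, while the aggregate remains exact.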

Current Research

Generating Synthetic Medical Data

A major use of federated learning is the (virtual) aggregation of distributed datasets in order to have enough data to train deep networks. The need for the original data could possibly be circumvented by generating synthetic data, which is not private, personal data in the sense that it does not correspond to any particular individual, and using this synthetic dataset as the basis for model training. Generative models such as Variational Autoencoders (VAEs) [7] and Generative Adversarial Nets (GANs) [8] have shown promise in creating high-quality image data from noise. Utilising the federated learning framework, it is possible to train these models without violating medical data privacy.

My focus so far is on GAN models, which consist of two networks working against each other. The generator takes a noise vector as input and generates an output in the data space. The discriminator receives this generated sample along with actual data and produces a probability of the presented sample coming from the real dataset. During training, both components participate in a min-max game, where the discriminator gets better at identifying real samples, and the generator improves its capability of generating data that is mistaken for real data. Formally this is formulated as $$\min_G\max_D V(G, D) = \mathbb{E}_{\mathbf{x}\sim p_{data}(\mathbf{x})}[\log D(\mathbf{x})] + \mathbb{E}_{\mathbf{z}\sim \mathcal{N}(0,1)}[\log (1-D(G(\mathbf{z})))]$$
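To make the value function concrete, here is a small Monte Carlo estimate of V(G, D) for hypothetical one-dimensional toy models; both D and G are fixed stand-ins, not trained networks:

```python
import numpy as np

rng = np.random.default_rng(0)

def D(x):
    """Toy discriminator: logistic score for 'x is real'."""
    return 1.0 / (1.0 + np.exp(-x))

def G(z):
    """Toy generator: affine map from noise into data space."""
    return 2.0 * z + 1.0

real = rng.normal(1.0, 0.5, size=10_000)    # samples from p_data
noise = rng.normal(0.0, 1.0, size=10_000)   # z ~ N(0, 1)

# Monte Carlo estimate of both expectations in V(G, D)
V = np.mean(np.log(D(real))) + np.mean(np.log(1.0 - D(G(noise))))
```

Since D outputs probabilities strictly between 0 and 1, both log terms are negative, so the estimate of V is always below zero.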

In practice, I am using an optimised GAN formulation called Wasserstein GAN with Gradient Penalty (WGAN-GP) [9], which has better training properties. Regular GANs can suffer from vanishing gradients, which halts training progress because the gradients are close to zero. Another problem of GAN training is mode collapse, meaning the generator is unable to generate samples with high variance, so that the output looks almost the same for any noise input. The WGAN-GP circumvents these issues by changing the components' loss functions and restructuring the discriminator into a so-called critic, where the only difference is that the output layer has no activation function (instead of a sigmoid). A positive output then means that the critic considers the sample real, and a negative output corresponds to a fake sample.
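The WGAN-GP loss terms can be sketched numerically. To keep the example dependency-free, the critic below is linear, so its input gradient is constant and the gradient penalty of [9] has a closed form; with a real network the gradient would be computed by automatic differentiation at the interpolated points:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear critic f(x) = x @ w: its gradient w.r.t. the input is simply w.
w = rng.normal(size=4)

def critic(x):
    return x @ w   # unbounded score: positive ~ real, negative ~ fake

def wgan_gp_losses(real, fake, lam=10.0):
    # The penalty is evaluated at random interpolates of real/fake pairs
    eps = rng.uniform(size=(real.shape[0], 1))
    interpolates = eps * real + (1.0 - eps) * fake  # where grads are taken
    grad_norm = np.linalg.norm(w)                   # analytic: linear critic
    penalty = lam * (grad_norm - 1.0) ** 2          # push gradient norm to 1
    critic_loss = critic(fake).mean() - critic(real).mean() + penalty
    generator_loss = -critic(fake).mean()
    return critic_loss, generator_loss

real = rng.normal(loc=1.0, size=(8, 4))   # hypothetical real batch
fake = rng.normal(loc=0.0, size=(8, 4))   # hypothetical generated batch
c_loss, g_loss = wgan_gp_losses(real, fake)
```

The penalty term keeps the critic's gradient norm close to one, which is what stabilises training compared with weight clipping.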

For federated training of the model, there are two approaches. Either both components are locally trained and centrally aggregated, as shown by [10]. Alternatively, one can use the fact that generator training does not require real data, but relies only on the critic. Consequently, only the critic has to be trained in a federated manner, and the generator training can be done solely by the server [11]. The two approaches have not yet been evaluated against each other, which is something I am currently working on.
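The aggregation step shared by both approaches can be sketched as a FedAvg-style weighted average of client parameters; the function and the toy parameter vectors below are hypothetical:

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Weighted average of client model parameters, weighting each
    client by the size of its local dataset (FedAvg-style)."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two clients return updated critic parameters after local training
critic_params = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
dataset_sizes = [100, 300]          # client 2 contributes 3x more data
avg_params = fed_avg(critic_params, dataset_sizes)
```

In the second approach [11], only the critic parameters pass through this step; the server then trains the generator locally against the averaged critic.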

The first use case is the generation of high-resolution chest X-ray images. After successful federated training of a WGAN model, it can be further parameterised with a data label, making it possible to generate images of healthy patients as well as of patients with different diseases. I opted for image generation since it is the main application domain of GAN models and works well as a proof of concept. Regarding usability, however, it is more beneficial to generate tabular medical data, so-called Electronic Health Records (EHRs). Large public databases of medical image data already exist, because images are comparably easy to anonymise; EHRs, on the other hand, are highly sensitive and patient-specific. An additional challenge is the interdependence of EHR columns: in order to be plausible, records need to contain values that are mutually consistent. A person cannot be 20 years old and also have been a smoker for 30 years (as a very simplified example). Thus, a specific model structure has to be used that takes these constraints into consideration, like the previously proposed TableGAN [12].
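The smoking example above corresponds to a simple inter-column constraint that generated records can be checked against; the field names and the rule itself are hypothetical simplifications:

```python
# Hypothetical plausibility check for one inter-column EHR constraint:
# a person cannot have smoked for as long as (or longer than) they have lived.
def is_plausible(record):
    return 0 <= record["smoking_years"] < record["age"]

example_ok = {"age": 45, "smoking_years": 20}
example_bad = {"age": 20, "smoking_years": 30}   # the implausible case above
```

A tabular generator like TableGAN has to learn such dependencies implicitly, rather than having them enforced by post-hoc filters like this one.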

The evaluation of the approach has multiple facets. First and foremost, there needs to be a guarantee that no exact matches between real patient data and generated data exist. This can be achieved by employing the aforementioned differential privacy method, which entails a trade-off between security and usability of the generated samples that will be investigated. On the other hand, the generator has to produce sufficient variance in its samples in order to generate a dataset large enough for machine learning without containing duplicate samples.
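The exact-match requirement can be expressed as a simple metric over the two datasets; the function name and the tiny example records are hypothetical:

```python
import numpy as np

def exact_match_rate(real, generated):
    """Fraction of generated rows that reproduce a real row verbatim;
    for a privacy-preserving generator this should be zero."""
    real_rows = {tuple(row) for row in real}
    matches = sum(tuple(row) in real_rows for row in generated)
    return matches / len(generated)

real = np.array([[45, 20], [60, 35]])        # hypothetical (age, smoking_years)
generated = np.array([[44, 19], [60, 35]])   # second row leaks a real record
rate = exact_match_rate(real, generated)
```

Complementing this, duplicate detection within the generated set itself can quantify mode collapse, linking the privacy and variance facets of the evaluation.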


  1. "General Data Protection Regulation (GDPR) – Official Legal Text", General Data Protection Regulation (GDPR), 2020. [Online]. Available: https://gdpr-info.eu/. [Accessed: 28-Sep-2020]

  2. H. Brendan McMahan, Eider Moore, Daniel Ramage, and Blaise Agüera y Arcas. 2016. Federated Learning of Deep Networks using Model Averaging. CoRR abs/1602.05629 (2016). arXiv:1602.05629 http://arxiv.org/abs/1602.05629

  3. Zhibo Wang, Mengkai Song, Zhifei Zhang, Yang Song, Qian Wang, and Hairong Qi. 2018. Beyond Inferring Class Representatives: User-Level Privacy Leakage From Federated Learning. CoRR abs/1812.00535 (2018). arXiv:1812.00535 http://arxiv.org/abs/1812.00535

  4. Briland Hitaj, Giuseppe Ateniese, and Fernando Pérez-Cruz. 2017. Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning. CoRR abs/1702.07464 (2017). arXiv:1702.07464 http://arxiv.org/abs/1702.07464

  5. Martin Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. 2016. Deep Learning with Differential Privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (Vienna, Austria) (CCS ’16). ACM, New York, NY, USA, 308–318. https://doi.org/10.1145/2976749.2978318

  6. Pascal Paillier. 1999. Public-key Cryptosystems Based on Composite Degree Residuosity Classes. In Proceedings of the 17th International Conference on Theory and Application of Cryptographic Techniques (Prague, Czech Republic) (EUROCRYPT’99). Springer-Verlag, Berlin, Heidelberg, 223–238. http://dl.acm.org/citation.cfm?id=1756123.1756146

  7. Yunchen Pu, Zhe Gan, Ricardo Henao, Xin Yuan, Chunyuan Li, Andrew Stevens, and Lawrence Carin. 2016. Variational Autoencoder for Deep Learning of Images, Labels and Captions. In Advances in Neural Information Processing Systems. 2352-2360. https://papers.nips.cc/paper/6528-variational-autoencoder-for-deep-learning-of-images-labels-and-captions.pdf

  8. Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. In Advances in Neural Information Processing Systems 27, Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger (Eds.). Curran Associates, Inc., 2672–2680. http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf

  9. Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C. Courville. 2017. Improved Training of Wasserstein GANs. In Advances in Neural Information Processing Systems. 5767-5777. arXiv:1704.00028 https://arxiv.org/abs/1704.00028

  10. Chenyou Fan and Ping Liu. 2020. Federated Generative Adversarial Learning. CoRR abs/2005.03793 (2020). arXiv:2005.03793 https://arxiv.org/abs/2005.03793

  11. Sean Augenstein, H. Brendan McMahan, Daniel Ramage, Swaroop Ramaswamy, Peter Kairouz, Mingqing Chen, and Rajiv Mathews. 2019. Generative Models for Effective ML on Private, Decentralized Datasets. CoRR abs/1911.06679 (2019). arXiv:1911.06679 https://arxiv.org/abs/1911.06679

  12. Noseong Park, Mahmoud Mohammadi, Kshitij Gorde, Sushil Jajodia, Hongkyu Park, and Youngmin Kim. 2018. Data Synthesis Based on Generative Adversarial Networks. CoRR abs/1806.03384 (2018). arXiv:1806.03384 https://arxiv.org/abs/1806.03384