Privacy Preserving Outlier Detection (Winter Semester 2023/2024)
Lecturer: Dr. Anne Kayem (Internet-Technologien und -Systeme)
General Information
- Weekly Hours: 4
- Credits: 6
- Graded: yes
- Enrolment Deadline: 01.10.2023 - 31.10.2023
- Examination time §9 (4) BAMA-O: 26.10.2023
- Teaching Form: Seminar / Exercise
- Enrolment Type: Compulsory Elective Module
- Course Language: English
- Maximum number of participants: 24
Programs, Module Groups & Modules
- DAPP: Data Applications
  - HPI-DAPP-K Konzepte und Werkzeuge
  - HPI-DAPP-T Techniken und Werkzeuge
  - HPI-DAPP-S Spezialisierung
- SCAD: Scalable Computing and Algorithms for Digital Health
  - HPI-SCAD-C Concepts and Methods
  - HPI-SCAD-T Technologies and Tools
  - HPI-SCAD-S Specialization
- HDAS: Health Data Security
  - HPI-HDAS-C Concepts and Methods
  - HPI-HDAS-T Technologies and Methods
  - HPI-HDAS-S Specialization
- Cybersecurity
  - HPI-CS-PE Data Protection & Ethics
- MODA: Models and Algorithms
  - HPI-MODA-C Concepts and Methods
  - HPI-MODA-T Technologies and Tools
  - HPI-MODA-S Specialization
Description
In an increasingly interconnected world in which almost every device both generates and collects data, composing large datasets of complex personal information is becoming ever easier. This is the case even though privacy legislation such as the GDPR provides measures that prohibit the collection and storage of personal data without explicit user consent. A further cause for alarm is the growing number of reports in popular media on de-anonymization incidents that have paved the way for related security incidents such as leaks of personal data.
In this seminar, we study several anonymized datasets in an effort to understand why and how de-anonymizations occur. Specifically, we focus on designing reverse-clustering algorithms to discover outlier data points and determine how these can be used, either individually or in combination with auxiliary data, to de-anonymize data points within the original dataset. Finally, we will discuss the properties of the outlier data points in terms of how they enabled the de-anonymizations and which counter-measures can be applied.
==========
Topics (Brief):
==========
• Outlier Detection
• Distance-Based Outlier Detection
• Clustering-Based Outlier Detection
• Ensemble Methods
• Other outlier detection approaches
• Considerations with respect to data models/types...
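To give a flavour of the distance-based topic, the following is a minimal sketch and not course material: it scores each record by the distance to its k-th nearest neighbour, so isolated records receive the highest scores. The function names, the toy records, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        # Euclidean distances from p to every other point, smallest first
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def top_outliers(points, k=2, n=1):
    """Return the indices of the n points with the highest scores."""
    scores = knn_outlier_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical toy data: a tight cluster plus one isolated record
records = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (8.0, 8.0)]
print(top_outliers(records, k=2, n=1))  # prints [4]
```

In a privacy setting, a record flagged this way is a candidate for re-identification, because few other records resemble it.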
Requirements
There are no prerequisites for this course, but some knowledge of machine learning or data science may be helpful.
Literature
Literature will be provided on a per-lecture basis.
Learning
At the end of this seminar, you should have some insight into the research field of outlier detection (sometimes also referred to as anomaly detection) algorithms for supporting the generation of privacy-preserving datasets. You will also have studied the conceptual foundations of these algorithms and, through the project work, applied what you have learned to example datasets drawn from real-life application areas (e.g. data from COVID tracing apps).
Examination
Grades will be based on a suite of five (5) assignments. To obtain a grade for the seminar, you must complete and submit ALL five (5) assignments, for a total of 100%. Submissions must be made individually. Late submissions will incur a penalty of up to 5% of the assignment grade.
Exam Type | Start Date | End Date | Grade % |
Assignment #1 | 26.10.2023 | 14.11.2023 | 10% |
Assignment #2 | 14.11.2023 | 28.11.2023 | 10% |
Assignment #3 | 28.11.2023 | 12.01.2024 | 30% |
Assignment #4 | 16.01.2024 | 01.02.2024 | 20% |
Assignment #5 | 01.02.2024 | 02.03.2024 | 30% |
Dates
For the duration of the winter semester (16.10.2023 - 09.02.2024), lectures will be held as follows:
• Tuesdays, 09.15 - 10.45 (Location: Online)
• Thursdays, 13.30 - 15.00 (Location: Online)
Zoom Credentials will be made available upon registration for the course on HPI Moodle.
To register, please do the following:
- Search for the course under either PPOD-2023 or Privacy Preserving Outlier Detection 2023
- When prompted, enter the enrolment key: PPOD-2023