Computational Statistics (Summer Semester 2020)
Lecturers:
Prof. Dr. Bernhard Renard (Data Analytics and Computational Statistics),
Jens-Uwe Ulrich (Data Analytics and Computational Statistics),
Henning Schiebenhöfer (Data Analytics and Computational Statistics)
General Information
- Weekly hours per semester: 4
- ECTS: 6
- Graded: Yes
- Enrollment period: 06.04.2020 - 22.04.2020
- Teaching format: Lecture / Exercise
- Enrollment type: Compulsory elective module
- Language of instruction: English
- Maximum number of participants: 60
Degree Programs, Module Groups & Modules
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-K Concepts and Methods
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-T Techniques and Tools
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-S Specialization
- DATA: Data Analytics
- HPI-DATA-K Concepts and Methods
- DATA: Data Analytics
- HPI-DATA-T Techniques and Tools
- DATA: Data Analytics
- HPI-DATA-S Specialization
- Data Engineering
- APAD: Acquisition, Processing and Analysis of Health Data
- HPI-APAD-C Concepts and Methods
- APAD: Acquisition, Processing and Analysis of Health Data
- HPI-APAD-T Technologies and Tools
- APAD: Acquisition, Processing and Analysis of Health Data
- HPI-APAD-S Specialization
- CYAD: Cyber Attack and Defense
- HPI-CYAD-K Concepts and Methods
- CYAD: Cyber Attack and Defense
- HPI-CYAD-T Techniques and Tools
- CYAD: Cyber Attack and Defense
- HPI-CYAD-S Specialization
Description
In almost all areas of life, large amounts of data are generated, requiring dedicated procedures for data analysis to allow predictions and inference for decision making. Computational statistical methods have evolved to cope with challenges arising from large datasets that are not tractable with traditional approaches, e.g. when the number of possible parameters of a model exceeds the number of observations. At the same time, this wealth of data allows replacing distributional assumptions with data-driven analyses.
In this course, we will cover statistical summaries of data, hypothesis testing, regression, and statistical learning approaches with a focus on clustering and classification. We will contrast traditional frequentist approaches to these tasks with non-parametric, computationally more intensive alternatives and with Bayesian approaches.
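As one illustration of the non-parametric, computationally more intensive alternatives mentioned above, the following minimal Python sketch computes a percentile bootstrap confidence interval for a median. It is not taken from the course materials; the function name `bootstrap_ci`, its parameters, and the sample values are hypothetical and chosen only for illustration.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.median, n_resamples=2000,
                 alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(sample).

    Hypothetical helper for illustration; not part of the course material.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Read the interval bounds off the empirical distribution of estimates.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up measurements (assumption: not real course data).
data = [2.1, 3.4, 1.9, 5.6, 2.8, 3.1, 4.7, 2.5, 3.9, 3.0]
low, high = bootstrap_ci(data)
print(f"median = {statistics.median(data):.2f}, "
      f"95% CI approx. [{low:.2f}, {high:.2f}]")
```

Unlike a classical interval based on a normality assumption, this approach relies only on repeated resampling of the observed data, which is why it trades distributional assumptions for computation.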
The lecture will be accompanied by regularly scheduled exercises, which focus on applying the covered methods to real-life data from different areas of life. Basic knowledge of the R programming language or Python is a prerequisite for successfully completing the exercises. For students who are not familiar with either of these two languages, an introduction to R will be provided in the first exercise session.
Learning Objectives:
- Understand concepts and methods of computational statistics
- Ability to statistically evaluate real-world data
- Ability to assess the quality and validity of a statistical method for a given analysis
- Ability to select, implement and apply appropriate statistical methods and algorithms for a given use case
Prerequisites
- Fundamentals in calculus and vector analysis (at least comparable to the Mathematik I + II lectures in the ITSE Bachelor at HPI)
- Basic programming knowledge in Python or R or profound skills in another programming language
- Knowledge of English
Literature
- Hastie, Trevor ; Tibshirani, Robert ; Friedman, Jerome: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. : Springer, 2009 (https://web.stanford.edu/~hastie/ElemStatLearn/)
- James, Gareth ; Witten, Daniela ; Hastie, Trevor ; Tibshirani, Robert: An Introduction to Statistical Learning: with Applications in R. New York : Springer, 2013 (Springer Texts in Statistics, vol. 103). - ISBN 978-1-4614-7137-0 (http://faculty.marshall.usc.edu/gareth-james/ISL/)
Teaching and Learning Formats
Exercises will be integrated into the lecture slots where suitable.
Lectures and exercises will be held via Zoom with interactive elements, including online quizzes and questions/chats. Dial-in details will be provided in due time.
Recorded lectures will be made available via tele-TASK.
Due to the COVID-19 pandemic, this course will be offered online. Thus, it is important that all participants enroll by April 22 via our Moodle page.
Assessment
Final exam covering all lecture materials (70% of the final grade)
Two graded mid-semester review exams (each counts for 15% of the final grade)
Weekly to biweekly exercises (ungraded)
Students must present their solution to an exercise at least once during the semester
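The weighting above (70% + 15% + 15% = 100%) can be sketched as a small calculation. This is only an illustration under stated assumptions: the page does not specify the scoring scale or how scores map to grades, so the 0-100 scale, the names `WEIGHTS` and `weighted_score`, and the example scores are hypothetical.

```python
# Weights as stated in the assessment list above.
WEIGHTS = {"final_exam": 0.70, "review_1": 0.15, "review_2": 0.15}


def weighted_score(scores):
    """Combine component scores into one weighted score.

    Assumption: all components are scored on the same (hypothetical)
    0-100 scale; the course page does not specify the actual scale.
    """
    # Sanity check: the graded components must cover the whole grade.
    if abs(sum(WEIGHTS.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up example scores for one student.
example = {"final_exam": 80, "review_1": 90, "review_2": 70}
print(round(weighted_score(example), 2))
```

The ungraded exercises and the exercise presentation do not enter this calculation, since the page assigns them no share of the final grade.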
Schedule
Monday 09:15 - 10:45
Wednesday 11:00 - 12:30