Hasso-Plattner-Institut: 25 Years of HPI

Social Media Mining (Winter Semester 2022/2023)

Lecturers: Prof. Dr. Christoph Meinel (Internet Technologies and Systems), M.Sc. Ali Alhosseini (Internet Technologies and Systems)

General Information

  • Weekly hours per semester: 4
  • ECTS: 6
  • Graded: Yes
  • Enrollment period: 01.10.2022 - 09.11.2022
  • Examination date §9 (4) BAMA-O: 01.03.2023
  • Teaching form: Seminar / Project
  • Enrollment type: Compulsory elective module
  • Course language: English
  • Maximum number of participants: 12

Degree Programs, Module Groups & Modules

IT-Systems Engineering MA
Data Engineering MA
Digital Health MA
  • APAD: Acquisition, Processing and Analysis of Health Data
    • HPI-APAD-C Concepts and Methods
  • APAD: Acquisition, Processing and Analysis of Health Data
    • HPI-APAD-T Technologies and Tools
  • APAD: Acquisition, Processing and Analysis of Health Data
    • HPI-APAD-S Specialization
Software Systems Engineering MA


Individuals have an intrinsic need to express themselves to others within a community by sharing their experiences, thoughts, actions, and opinions. To do so, they mostly use online social media platforms such as Twitter, Facebook, personal blogs, and Reddit. Users of these social networks interact with each other by writing status updates, publishing photos, and giving likes, leaving behind a huge amount of data that can be analyzed. Recently, researchers have started to explore shared social media data to build solutions based on different types of machine learning algorithms, with the aim of integrating them into people's lives. One of the areas we study at HPI is how to predict users' Big Five personality traits (agreeableness, conscientiousness, extraversion, neuroticism, and openness to experience) from social media streams. We also develop multiple types of neural network models to identify and classify bots online.

In this seminar, we will focus on understanding and analyzing social media streams from different platforms (such as Facebook, Twitter, Instagram, blogs, Reddit, LinkedIn, and Xing) to reveal potential relationships and to visualize the dynamics of the social platform. Several data mining technologies will be used within the selected topics. Our research group works extensively on cross-domain studies, such as psychology in big data streams and predicting an individual's personality from their posts, likes, friend network, and profile pictures. We also work on detecting and profiling bots on various social platforms using classical and advanced machine learning algorithms. We are open to any new research ideas that fit under the umbrella of social media analytics.


Practical knowledge in:

  • Good programming skills in Python/R
  • Desire to learn classical or advanced machine learning algorithms
  • Interest in data mining and social media networks
  • Internet basics and concepts
  • SQL or NoSQL databases
  • Good verbal and written communication skills


Course Materials:


The final evaluation will be based on:

  • Topic Presentation 10%
  • Idea Presentation 15%
  • Final Presentation 25%
  • Report (12-18 pages, LNCS format) 30%
  • Implementation and integration 20%
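
The weights above add up to 100%. The page does not say how the component grades are combined or rounded, so the following is only a minimal sketch that assumes a plain weighted mean over components graded on the same scale. The names `WEIGHTS` and `weighted_grade`, and the example grades, are hypothetical and used for illustration only.

```python
# Weights as listed in the evaluation scheme (percent of the final grade).
WEIGHTS = {
    "topic_presentation": 10,
    "idea_presentation": 15,
    "final_presentation": 25,
    "report": 30,
    "implementation": 20,
}


def weighted_grade(component_grades):
    """Return the weighted mean of the per-component grades.

    Assumption: all components use the same grading scale and are
    combined linearly; the course page does not confirm this.
    """
    if set(component_grades) != set(WEIGHTS):
        raise ValueError("expected exactly one grade per component")
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(
        WEIGHTS[name] * grade for name, grade in component_grades.items()
    )
    return weighted_sum / total_weight


# Hypothetical grades on a 1.0 (best) to 5.0 scale.
example = {
    "topic_presentation": 1.0,
    "idea_presentation": 2.0,
    "final_presentation": 2.0,
    "report": 3.0,
    "implementation": 1.0,
}
print(weighted_grade(example))
```

Under this assumption, the report and the final presentation together account for more than half of the result, so a weak report outweighs a strong topic presentation.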


12.10.2022 Course Introduction (First day of class)

Introduction slides


09.11.2022 Submission of Topics & Topic Assignment

16.11.2022 Topic Presentation (the seminar will be held in room H2.57)

30.11.2022 Idea Presentation

01.03.2023 Final Presentations

15.03.2023 Code Submission

22.03.2023 Documentation Submission