Hasso-Plattner-Institut - 25 Years of HPI

Social Media Mining (Winter Semester 2021/2022)

Lecturers: Prof. Dr. Christoph Meinel (Internet Technologies and Systems), M.Sc. Ali Alhosseini (Internet Technologies and Systems)

General Information

  • Weekly hours per semester: 4
  • ECTS: 6
  • Graded: Yes
  • Enrollment period: 25.10.2021 - 15.11.2021
  • Teaching format: Seminar / Project
  • Enrollment type: Compulsory elective module
  • Teaching language: English
  • Maximum number of participants: 12

Degree Programs, Module Groups & Modules

IT-Systems Engineering MA
  • ISAE: Internet, Security & Algorithm Engineering
    • HPI-ISAE-S Specialization
  • ISAE: Internet, Security & Algorithm Engineering
    • HPI-ISAE-T Techniques and Tools
  • ISAE: Internet, Security & Algorithm Engineering
    • HPI-ISAE-K Concepts and Methods
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-K Concepts and Methods
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-S Specialization
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-T Techniques and Tools
  • IT-Systems Engineering
    • HPI-ITSE-E Design
  • IT-Systems Engineering
    • HPI-ITSE-K Construction
Data Engineering MA
Digital Health MA


UPDATE: The seminar room has been changed from building A to HE.51/52.

Individuals have an intrinsic need to express themselves to others within a community by sharing their experiences, thoughts, actions, and opinions. To do so, they mostly use modern online social media platforms such as Twitter, Facebook, personal blogs, and Reddit. Users of these social networks interact with each other by writing status updates, publishing photos, and giving likes, leaving behind a huge amount of data that can be analyzed. Recently, researchers have started to explore shared social media data to build solutions based on different types of machine learning algorithms, with the aim of integrating them into people's lives. One of the areas we study here at HPI is how to predict users' Big Five personality traits (agreeableness, conscientiousness, extraversion, neuroticism, and openness to experience) from social media streams. We also develop multiple types of neural network models to identify and classify bots online.

In this seminar, we will focus on understanding and analyzing social media streams from different platforms (such as Facebook, Twitter, Instagram, blogs, Reddit, LinkedIn, and Xing) to reveal potential relationships and to visualize the dynamics of the social platform. Several data mining technologies will be used within the selected topics. Our research group works extensively on cross-domain studies, such as psychology in big data streams and predicting an individual's personality from their posts, likes, friend networks, and profile pictures. We also work on detecting and profiling bots on various social platforms using classical and advanced machine learning algorithms. We are open to any new research ideas that fit under the umbrella of social media analytics.


Prerequisites:

  • Good programming skills in Python or R
  • Desire to learn classical or advanced machine learning algorithms
  • Interest in data mining and social media networks
  • Internet basics and concepts
  • SQL or NoSQL databases
  • Good verbal and written communication skills

Topics 2021 slides: https://owncloud.hpi.de/s/y0AkCDR4nTcRvZs


Selected publications: 

  • Raad Bin Tareaf, Philipp Berger, Patrick Hennig, Christoph Meinel: Cross-platform personality exploration system for online social networks: Facebook vs. Twitter. Web Intell. 18(1): 35-51 (2020)
  • Raad Bin Tareaf, Seyed Ali Alhosseini, Christoph Meinel: Does Personality Evolve? A Ten-Years Longitudinal Study from Social Media Platforms. ISPA/BDCloud/SocialCom/SustainCom 2020: 1205-1213
  • Seyed Ali Alhosseini, Raad Bin Tareaf, Christoph Meinel: Engaging with Tweets: The Missing Dataset On Social Media. RecSys Challenge 2020: 34-37
  • Raad Bin Tareaf, Seyed Ali Alhosseini, Philipp Berger, Patrick Hennig, Christoph Meinel: Towards Automatic Personality Prediction Using Facebook Likes Metadata. ISKE 2019: 714-719
  • Full publication list in the social media domain: https://dblp.org/pid/213/0620.html


The final evaluation will be based on:

  • Initial implementation / idea presentation, 15%
  • Final presentation, 25%
  • Report (12-18 pages, LNCS template), 30%
  • Implementation, 15%
  • Integration, 15%
  • Participation in the seminar, paper review (bonus points)


All participating students will work in groups of at most 3-4 students, each group on a chosen topic. There will be weekly or bi-weekly meetings in which each group reports its latest progress on the topic. The seminar will be a mix of on-site sessions at HPI and online sessions; each group can choose the option that suits it.

The first meeting will take place on 27 October 2021, 9:15-10:45 am, in room HE.51/52.

First meetings:

27-10-2021  First week: Topic presentations (the lecturers will provide a set of varied topics to be investigated during the seminar)

03-11-2021  Second and third week: Team & topic assignment