Hasso-Plattner-Institut
  
Prof. Dr. Felix Naumann
  
 

Open PhD positions

PhD and PostDoc Scholarships

Scholarships are available at the HPI Research School. The annual application deadline is August 15.

Contact: Felix Naumann

Research Associate and PhD Student (m/f)

in the Web Science group within the Information Systems department, for a 3-year full-time position in the area of Business Communication Analysis, in collaboration with a project partner from industry. The goal is to develop new, efficient, and effective methods for extracting, analyzing, and predicting business communication based on e-mails, reports, and contracts. To this end, technologies from the areas of classification, named entity recognition and linking, relationship extraction, topic modeling, etc. are to be applied and developed.

We offer

  • A first-class, active, and inspiring research environment
  • Experience with relevant technologies and methods
  • Close cooperation with project partners and other researchers
  • Practice-oriented research, real-world use cases, and challenges in the field of Big Data analysis
  • An attractive salary, comparable to TV-L 13

We expect

  • A very good university degree in computer science or a related field
  • Experience or specialization in text mining/text analysis
  • Good programming skills
  • Enthusiasm for scientific work, scientific publishing, and carrying out practice-oriented projects
  • A high degree of commitment, independence, flexibility, and team spirit
  • The ability and willingness to familiarize yourself with new technologies
  • Very good communication skills, in particular a command of written and spoken English

For further information about our group, our research, and our teaching, please visit our website: https://hpi.de/naumann/web-science-group/wsg-info.html

We look forward to receiving your complete application documents in German or English. Please send them as a PDF by November 21, 2016 at the latest to: ralf.krestel(at)hpi.de

Please also state your earliest possible start date, as we would like to fill the position as soon as possible.

Open student assistant positions

Metadata Management System

Contact: Sebastian Kruse

What is this project about?
The Metadata Management System (MDMS) is an open source project that aims to unlock the potential of data profiling results, i.e., metadata. Data profiling algorithms reveal latent properties of datasets, e.g., functional and inclusion dependencies, that are a prerequisite to many data management tasks, such as data integration, query optimization, and schema reverse engineering. However, merely having the metadata available, often in very large amounts, is generally not sufficient; the metadata require further processing. This is where the MDMS comes into play: it stores metadata in a structured manner and offers analysis, interaction, and visualization capabilities that help users find the relevant parts of the metadata and gain actual insights from the combination of different metadata types.
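To make the two dependency types mentioned above more concrete, here is a minimal sketch in Python that checks whether a given functional dependency or inclusion dependency holds on small in-memory tables. It is only an illustration: the function names `holds_fd` and `holds_ind` and the example tables are hypothetical, the code is not part of the MDMS (which is written in Java and Scala), and real data profiling algorithms discover such dependencies far more efficiently than this naive validation.

```python
from typing import Dict, Hashable, List, Sequence

# A row is modeled as a plain dict from column name to value (an assumption
# made for this sketch; it is not the MDMS data model).
Row = Dict[str, Hashable]


def holds_fd(rows: List[Row], lhs: Sequence[str], rhs: str) -> bool:
    """Check the functional dependency lhs -> rhs: rows that agree on all
    lhs columns must also agree on the rhs column."""
    seen: Dict[tuple, Hashable] = {}
    for row in rows:
        key = tuple(row[column] for column in lhs)
        # setdefault stores the first rhs value seen for this key and
        # returns the stored value on later occurrences.
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def holds_ind(dependent: List[Row], dep_column: str,
              referenced: List[Row], ref_column: str) -> bool:
    """Check the inclusion dependency dependent.dep_column being a subset of
    referenced.ref_column: every dependent value must occur in the
    referenced column."""
    referenced_values = {row[ref_column] for row in referenced}
    return all(row[dep_column] in referenced_values for row in dependent)


# Hypothetical example tables.
customers = [
    {"name": "A", "zip": "14482", "city": "Potsdam"},
    {"name": "B", "zip": "14482", "city": "Potsdam"},
    {"name": "C", "zip": "10115", "city": "Berlin"},
]
orders = [{"customer": "A"}, {"customer": "C"}]
```

On these tables, `holds_fd(customers, ["zip"], "city")` is true because equal zip codes always come with the same city, while `holds_ind(orders, "customer", customers, "name")` is true because every ordering customer appears in the customer table, which hints at a foreign key.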

What are the tasks in this project?
The MDMS is evolving constantly, and not all of its components are fully implemented yet. We also want to employ modern technologies for the different components: Apache Cassandra as the data layer, Apache Flink and Spark as the analytics layer, and Apache Zeppelin and Jupyter as interaction and visualization tools. In this project, you will evaluate such technologies with respect to their suitability for the MDMS, (re-)implement parts of the MDMS, and demonstrate the capabilities of your solution with practical examples. You are also encouraged to contribute to and discuss the overall design and concepts of the system.

What are the prerequisites to join this project?
The MDMS is implemented in Java and Scala, and you should be reasonably proficient in one of these languages. We don't expect you to know all the different systems mentioned above, but you should be interested in learning and adopting such technologies.

What do I learn from the project?
You can take three things away from the project: (1) You get to work on open source software. (2) You learn a lot about recent technologies in the database/information systems community. (3) You can familiarize yourself with the topics of our chair, especially data profiling, which may benefit you during your studies and your Master's thesis.

Text data management

Contact: Ralf Krestel

What are the tasks of the position?
This position is offered by the Web Science group and focuses on the retrieval and management of text data. Initially, the student will be responsible for managing the chair's existing data, which are indexed in Elasticsearch, an open source search and analytics engine. The datasets consist of German and English newspapers with their respective news articles and user comments, as well as social data such as tweets and blog posts. Managing these collections includes various pre-processing and metadata extraction steps to maintain their quality and usability.
In addition, the position involves developing crawling techniques for new data collections, cleaning and pre-processing the retrieved data, indexing them, and possibly extracting useful metadata from them. Finally, the visualization tool Kibana is used to discover interesting patterns in the text corpora indexed in our Elasticsearch instance.
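As a rough illustration of the cleaning and metadata extraction steps described above, the following Python sketch strips markup from a crawled text snippet and wraps the result in a dictionary that could later be sent to an index. Everything here is an assumption made for illustration: the function names `clean_text` and `to_document`, the field names, and the choice of a token count as metadata are hypothetical and do not describe the group's actual pipeline. The indexing step itself is deliberately left out.

```python
import html
import re
from typing import Dict

# Very simple patterns; a real crawler would use a proper HTML parser.
TAG_RE = re.compile(r"<[^>]+>")
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Remove HTML tags, decode HTML entities, and normalize whitespace."""
    without_tags = TAG_RE.sub(" ", raw)
    unescaped = html.unescape(without_tags)
    return WHITESPACE_RE.sub(" ", unescaped).strip()


def to_document(raw: str, source: str) -> Dict[str, object]:
    """Build an index-ready record with the cleaned text and simple metadata.
    The field names are hypothetical."""
    text = clean_text(raw)
    return {
        "source": source,
        "text": text,
        "token_count": len(text.split()),  # whitespace tokenization only
    }
```

For example, `to_document("<p>Hello &amp;  world</p>", "blog")` yields a record whose `text` field is `Hello & world` with a `token_count` of 3.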

What are the contract details?
The position involves 7 working hours per week, and the contract can begin as soon as possible. For further information about the job description and contract, please contact Dr. Ralf Krestel.