For bachelor students, we offer German lectures on database systems, complemented by paper- or project-oriented seminars. In a one-year bachelor project, students complete their studies in cooperation with external partners. For master students, we offer courses on information integration, data profiling, search engines, and information retrieval, complemented by specialized seminars, master projects, and supervised master theses.
The Web Science group focuses on various topics related to the Web, such as Information Retrieval, Natural Language Processing, Data Mining, Knowledge Discovery, Social Network Analysis, Entity Linking, and Recommender Systems. The group is particularly interested in Text Mining to deal with the vast amount of unstructured and semi-structured information available on the Web.
Most of our research is conducted in the context of larger research projects, in collaboration among students, groups, and universities. We strive to make most of our data sets and source code available.
Analytical Tools for Semantic Company Networks (2015/2016)
From Text to Facts: Relation Extraction on German Company Websites (Tanja Bergmann, 2016)
German Organization Name Part Classification (Manuel Hegner, 2016)
Comparing Features for Ranking Relationships Between Financial Entities Based on Text
Repke, Tim; Loster, Michael; Krestel, Ralf
Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets
New York, NY, USA
Evaluating the credibility of a company is an important and complex task for financial experts. When estimating the risk associated with a potential asset, analysts rely on large amounts of data from a variety of sources, such as newspapers, stock market trends, and bank statements. Finding relevant information, such as relationships between financial entities, in mostly unstructured data is a tedious task, and examining all sources by hand quickly becomes infeasible. In this paper, we propose an approach to rank extracted relationships based on text snippets, such that important information can be displayed more prominently. Our experiments with different numerical representations of text have shown that an ensemble of methods performs best on the labelled data provided for the FEIII Challenge 2017.