Our group includes postdocs, PhD students, and student assistants, and is headed by Prof. Felix Naumann. If you are interested in joining our team, please contact Felix Naumann.
For bachelor students, we offer German-language lectures on database systems in addition to paper- or project-oriented seminars. Within a one-year bachelor project, students finalize their studies in cooperation with external partners. For master students, we offer courses on information integration, data profiling, and information retrieval, complemented by specialized seminars and master projects, and we advise master theses.
Most of our research is conducted in the context of larger research projects, in collaboration across students, groups, and universities. We strive to make most of our datasets and source code available.
ExtracTable: Extracting Tables From Plain-Text Files (Leonardo Hübscher, 2021)
Publications
Lan Jiang, Gerardo Vitagliano, Mazhar Hameed, Felix Naumann: "Aggregation Detection in CSV Files". Proceedings of the International Conference on Extending Database Technology (EDBT), 2022 (to appear)
Mazhar Hameed, Gerardo Vitagliano, Lan Jiang, Felix Naumann: "SURAGH: Syntactic Pattern Matching to Identify Ill-Formed Records". Proceedings of the International Conference on Extending Database Technology (EDBT), 2022 (under revision)
Gerardo Vitagliano, Lan Jiang, Felix Naumann: "Detecting Layout Templates in Complex Multiregion Files". PVLDB. Accepted (2021).
Lan Jiang, Gerardo Vitagliano, Felix Naumann: "Structure Detection in Verbose CSV Files". Proceedings of the International Conference on Extending Database Technology (EDBT), 193–204, 2021
Ioannis Koumarelas, Lan Jiang, Felix Naumann: "Data Preparation for Duplicate Detection". Journal of Data and Information Quality (JDIQ) 12, no. 3 (2020): 1–24.
Lan Jiang, Felix Naumann: "Holistic Primary Key and Foreign Key Detection". Journal of Intelligent Information Systems 54, no. 3 (2020): 439–461.
Lan Jiang, Gerardo Vitagliano, Felix Naumann: "A Scoring-based Approach for Data Preparator Suggestion". Lernen, Wissen, Daten, Analysen (LWDA), 2454:6–9, 2019
Falco Dürsch, Axel Stebner, Fabian Windheuser, Maxi Fischer, Tim Friedrich, Nils Strelow, Tobias Bleifuß, Hazar Harmouch, Lan Jiang, Thorsten Papenbrock, Felix Naumann: "Inclusion Dependency Discovery: An Experimental Evaluation of Thirteen Algorithms". Proceedings of the International Conference on Information and Knowledge Management (CIKM), 219–228, 2019
Lan Jiang, Hengyang Lu, Ming Xu, Chongjun Wang: "Biterm Pseudo Document Topic Model for Short Text". IEEE International Conference on Tools With Artificial Intelligence, 865–872, IEEE, 2016
Yang Jun, Lan Jiang, Chongjun Wang, Junyuan Xie: "Multi-Label Emotion Classification for Tweets in Weibo: Method and Application". IEEE International Conference on Tools With Artificial Intelligence, 424–428, IEEE, 2014