Our group includes postdocs, PhD students, and student assistants, and is headed by Prof. Felix Naumann. If you are interested in joining our team, please contact Felix Naumann.
For bachelor students, we offer German-language lectures on database systems in addition to paper- or project-oriented seminars. Within a one-year bachelor project, students finalize their studies in cooperation with external partners. For master students, we offer courses on information integration, data profiling, and information retrieval, complemented by specialized seminars and master projects, and we advise master theses.
Most of our research is conducted in the context of larger research projects, in collaboration across students, groups, and universities. We strive to make most of our datasets and source code available.
Survey - Data preparation from an industry perspective: A survey
Suragh - Detecting ill-formed Rows in CSV Files
Tasheeh - Cleaning ill-formed Rows in CSV Files
Publications
M. Hameed, G. Vitagliano, F. Panse, F. Naumann: TASHEEH: Repairing Row-Structure in Raw CSV Files. Proceedings of the International Conference on Extending Database Technology (EDBT), 2024
M. Hameed, G. Vitagliano, F. Naumann: MORPHER: Structural Transformation of ill-formed Rows. Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2023
G. Vitagliano, M. Hameed, L. Reisener, L. Jiang, E. Wu, F. Naumann: Pollock: A Data Loading Benchmark. Proceedings of the VLDB Endowment (PVLDB), 2023
G. Vitagliano, M. Hameed, F. Naumann: Structural embedding of data files with MaGRiTTE. Table Representation Learning Workshop at NeurIPS (TRL@NeurIPS), 2022
G. Vitagliano, L. Reisener, L. Jiang, M. Hameed, F. Naumann: Mondrian: Spreadsheet Layout Detection. Proceedings of the International Conference on Management of Data (SIGMOD), 2022
M. Hameed, G. Vitagliano, L. Jiang, F. Naumann: SURAGH: Syntactic Pattern Matching to Identify Ill-Formed Records. Proceedings of the International Conference on Extending Database Technology (EDBT), 2022
L. Jiang, G. Vitagliano, M. Hameed, F. Naumann: Aggregation Detection in CSV Files. Proceedings of the International Conference on Extending Database Technology (EDBT), 2022
M. Hameed, F. Naumann: Data Preparation: A Survey of Commercial Tools. SIGMOD Record 49(3), 2020