Our group includes PostDocs, PhD students, and student assistants, and is headed by Prof. Felix Naumann. If you are interested in joining our team, please contact Felix Naumann.
For bachelor students we offer German-language lectures on database systems in addition to paper- or project-oriented seminars. Within a one-year bachelor project, students finalize their studies in cooperation with external partners. For master students we offer courses on information integration, data profiling, and information retrieval, complemented by specialized seminars and master projects, and we advise master theses.
Most of our research is conducted in the context of larger research projects, in collaboration across students, groups, and universities. We strive to make most of our datasets and source code available.
Dirty XML Generator. Tool to generate inexact duplicates in XML data.
XQuery Generator. Tool to graphically support XQuery generation.
Teaching
Courses
Winter 07/08: "Schema Matching" seminar
Summer 07: "Data Cleaning" seminar
Winter 06/07: "Data fusion in three steps" seminar
Winter 04/05: Practical course "Information Integration" at Humboldt University Berlin
Summer 04: Practical course "Information Integration II" at Humboldt University Berlin
Professional Activities
Program committee member of the DataX 2008 workshop
Reviewer for IS, JDIQ, TOIT
Publications
Conferences
Structure-Based Inference of XML Similarity for Fuzzy Duplicate Detection<br>Luis Leitao, Pavel Calado, and Melanie Weis. <i>CIKM 2007</i>, Lisboa, Portugal.
Declarative XML Data Cleaning with XClean<br>Melanie Weis and Ioana Manolescu. <i>CAISE 2007</i>, Trondheim, Norway.
XML Duplicate Detection Using Sorted Neighborhoods <br> Sven Puhlmann, Melanie Weis and Felix Naumann. <i>EDBT 2006</i>, Munich, Germany.
<a href="fileadmin/user_upload/fachgebiete/naumann/publications/SIGMOD05.pdf">DogmatiX Tracks down Duplicates in XML </a><br>Melanie Weis and Felix Naumann. <i>SIGMOD 2005</i>, Baltimore, MD.
Workshops
<a href="fileadmin/user_upload/fachgebiete/naumann/publications/benchmark_iqis06.pdf">A Duplicate Detection Benchmark for XML (and Relational) Data</a><br>Melanie Weis, Felix Naumann and Franziska Brosy. <i>SIGMOD 2006 Workshop on Information Quality for Information Systems (IQIS)</i>, Chicago, IL.
<a href="fileadmin/user_upload/fachgebiete/naumann/publications/XSDM06.pdf">XStruct: Efficient Schema Extraction from Multiple and Large XML Documents</a><br>Jan Hegewald, Felix Naumann and Melanie Weis. <i>ICDE 2006 Workshop on XML Schema and Data Management (XSDM)</i>, Atlanta, Georgia.
<a href="fileadmin/user_upload/fachgebiete/naumann/publications/VLDB05Phd_xmloid.pdf">Fuzzy Duplicate Detection on XML Data</a><br> Melanie Weis. <i>VLDB 2005 PhD Workshop</i>, Trondheim, Norway.
<a href="fileadmin/user_upload/fachgebiete/naumann/publications/IQIS04.pdf">Detecting Duplicate Objects in XML Documents</a><br>Melanie Weis and Felix Naumann. <i>SIGMOD 2004 Workshop on Information Quality for Information Systems (IQIS)</i>, Paris, France.
Posters & Demos
<a href="fileadmin/user_upload/fachgebiete/naumann/publications/xclean_crv.pdf">XClean in Action (demo)</a><br>Melanie Weis and Ioana Manolescu. <i>CIDR 2007</i>, Asilomar, California. To appear.
<a href="fileadmin/user_upload/fachgebiete/naumann/publications/ICDE06.pdf">Detecting Duplicates in Complex XML Data (poster)</a><br>Melanie Weis and Felix Naumann. <i>ICDE 2006</i>, Atlanta, Georgia.
<a href="fileadmin/user_upload/fachgebiete/naumann/publications/VLDB2005.pdf">Automatic Data Fusion with HumMer (demo)</a><br>Alexander Bilke, Jens Bleiholder, Christoph Böhm, Karsten Draba, Felix Naumann, Melanie Weis. <i>VLDB 2005</i>, Trondheim, Norway.
Journals
<a href="http://www.hpi.uni-potsdam.de/fileadmin/user_upload/fachgebiete/naumann/publications/DEBull06.pdf">Data Fusion in Three Steps: Resolving Schema, Tuple, and Value Inconsistencies</a><br>Felix Naumann, Alexander Bilke, Jens Bleiholder, Melanie Weis. <i>Bulletin of the Technical Committee on Data Engineering, Vol. 29 No. 2, June 2006</i>, 21-31.
Erkennen und Bereinigen von Datenfehlern in naturwissenschaftlichen Daten (in German)<br>Heiko Müller, Melanie Weis, Jens Bleiholder and Ulf Leser. <i>Datenbank-Spektrum, Heft 15, November 2005</i>, 36-43.
Eine Übung zur Vorlesung Informationsintegration (in German)<br>Felix Naumann, Jens Bleiholder, Melanie Weis. <i>Datenbank-Spektrum, Heft 11, November 2004</i>, 50-52.