Our group includes PostDocs, PhD students, and student assistants, and is headed by Prof. Felix Naumann. If you are interested in joining our team, please contact Felix Naumann.
For bachelor students, we offer German-language lectures on database systems in addition to paper- or project-oriented seminars. Within a one-year bachelor project, students finalize their studies in cooperation with external partners. For master students, we offer courses on information integration, data profiling, and information retrieval, complemented by specialized seminars and master projects, and we advise master theses.
Most of our research is conducted in the context of larger research projects, in collaboration across students, across groups, and across universities. We strive to make available most of our datasets and source code.
2025
PRISMA: A Privacy-Preserving Schema Matcher using Functional Dependencies Jan-Eric Hellenberg, Fabian Mahling, Lukas Laskowski, Felix Naumann, Matteo Paganelli, Fabian Panse Proceedings of the 28th International Conference on Extending Database Technology (EDBT), 2025 (to appear)
2024
Shact: Disentangling and Clustering Latent Syntactic Structures from Transformer Encoders Alejandro Sierra-Múnera, Ralf Krestel Proceedings of the 29th International Conference on Natural Language & Information Systems (NLDB), 2024 [Paper][GitHub][DOI:10.1007/978-3-031-70239-6_25]
An Introduction to Machine Learning from Time Series Anthony Bagnall, Matthew Middlehurst, Germain Forestier, Ali Ismail-Fawaz, Antoine Guillaume, David Guijo-Rubio, Arik Ermshaus, Patrick Schäfer, Thorsten Papenbrock, Phillip Wenig, Sebastian Schmidl Proceedings of the European Conference on Machine Learning and Data Mining (ECML PKDD), 2024 (to appear)
Anomaly Detectors for Multivariate Time Series: The Proof of the Pudding is in the Eating Phillip Wenig, Sebastian Schmidl, Thorsten Papenbrock Proceedings of the International Conference on Data Engineering Workshops (ICDEW), 2024 [Paper][DOI:10.1109/ICDEW61823.2024.00018]
The Effects of Data Quality on Named Entity Recognition Divya Bhadauria, Alejandro Sierra-Múnera, Ralf Krestel Proceedings of the Ninth Workshop on Noisy and User-generated Text (W-NUT 2024), 2024 [Paper][GitHub]
Determining the Largest Overlap between Tables Luca Zecchini, Tobias Bleifuß, Giovanni Simonini, Sonia Bergamaschi, Felix Naumann Proceedings of the ACM on Management of Data (PACMMOD) (2024) [DOI:10.1145/3639303]
Discovering Functional Dependencies through Hitting Set Enumeration Tobias Bleifuß, Thorsten Papenbrock, Thomas Bläsius, Martin Schirneck, Felix Naumann Proceedings of the ACM on Management of Data (PACMMOD) (2024) [DOI:10.1145/3639298]
TASHEEH: Repairing Row-Structure in Raw CSV Files Mazhar Hameed, Gerardo Vitagliano, Fabian Panse, Felix Naumann Proceedings of the International Conference on Extending Database Technology (EDBT), 2024 [Paper][DOI:10.48786/edbt.2024.37]
Efficient Discovery of Temporal Inclusion Dependencies in Wikipedia Tables Leon Bornemann, Tobias Bleifuß, Dmitri V. Kalashnikov, Fatemeh Nargesian, Felix Naumann, Divesh Srivastava Proceedings of the International Conference on Extending Database Technology (EDBT), 2024 [Paper][DOI:10.48786/edbt.2024.35]
Discovering Denial Constraints in Dynamic Datasets Eduardo Pena, Fabio Porto, Felix Naumann Proceedings of the International Conference on Data Engineering (ICDE), 2024 [Paper][IEEE]
2023
MORPHER: Structural Transformation of ill-formed Rows Mazhar Hameed, Gerardo Vitagliano, Felix Naumann Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2023 [DOI:10.1145/3583780.3614747]
Efficient Ultrafine Typing of Named Entities Alejandro Sierra-Múnera, Jan Westphal, Ralf Krestel Proceedings of the Joint Conference on Digital Libraries (JCDL), 2023 [Paper][DOI:10.1109/JCDL57899.2023.00038]
Pollock: A Data Loading Benchmark Gerardo Vitagliano, Mazhar Hameed, Lan Jiang, Lucas Reisener, Eugene Wu, Felix Naumann PVLDB 16:(8), 2023 [vldb]
BCNF* - From Normalized- to Star-Schemas and Back Again (demo) Marie Fischer, Paul Roessler, Paul Sieben, Janina Adamcic, Christoph Kirchherr, Tobias Sträubig, Youri Kaminsky, Felix Naumann Proceedings of Companion of the 2023 International Conference on Management of Data (SIGMOD-Companion), 2023 [Paper][Project Page][DOI:10.1145/3555041.3589712]
Detecting Stale Data in Wikipedia Infoboxes Malte Barth, Tibor Bleidt, Martin Büßemeyer, Fabian Heseding, Niklas Köhnecke, Tobias Bleifuß, Leon Bornemann, Dmitri V. Kalashnikov, Felix Naumann, Divesh Srivastava Proceedings of the International Conference on Extending Database Technology (EDBT), 2023 [Paper][Project Page]
DPQL: The Data Profiling Query Language Marcian Seeger, Sebastian Schmidl, Alexander Vielhauer, Thorsten Papenbrock Proceedings of the Conference on Database Systems for Business, Technology, and Web (BTW), 2023 [Paper][DOI:10.18420/BTW2023-19]
HYPEX: Hyperparameter Optimization in Time Series Anomaly Detection Sebastian Schmidl, Phillip Wenig, Thorsten Papenbrock Proceedings of the Conference on Database Systems for Business, Technology, and Web (BTW), 2023 [Paper][Project Page][DOI:10.18420/BTW2023-22]
ExtracTable: Extracting Tables from Raw Data Files Leonardo Hübscher, Lan Jiang, Felix Naumann Proceedings of the Conference on Database Systems for Business, Technology, and Web (BTW), 2023 [Paper][Project Page][DOI:10.18420/BTW2023-20]
Discovering Similarity Inclusion Dependencies Youri Kaminsky, Eduardo Pena, Felix Naumann Proceedings of the ACM on Management of Data (PACMMOD) (2023) [Paper][Project Page][DOI:10.1145/3588929]
Matching Roles from Temporal Data - Why Joe Biden is not only President, but also Commander-in-Chief Leon Bornemann, Tobias Bleifuß, Dmitri V. Kalashnikov, Fatemeh Nargesian, Felix Naumann, Divesh Srivastava Proceedings of the ACM on Management of Data (PACMMOD) (2023) [DOI:10.1145/3588919]
Fast Algorithms for Denial Constraint Discovery Eduardo Pena, Fabio Porto, Felix Naumann PVLDB 16:(4), 2023 [PVLDB]
2022
The Effects of Data Quality on Machine Learning Performance Lukas Budach, Moritz Feuerpfeil, Nina Ihde, Andrea Nathansen, Nele Noack, Hendrik Patzlaff, Felix Naumann, Hazar Harmouch arXiv (2022) [arXiv]
Discovering Fine-Grained Semantics in Knowledge Graph Relations Nitisha Jain, Ralf Krestel Proceedings of the Thirty-First ACM International Conference on Information and Knowledge Management (CIKM), 2022
Structural embedding of data files with MaGRiTTE Gerardo Vitagliano, Mazhar Hameed, Felix Naumann Table Representation Learning Workshop at NeurIPS (TRL@NIPS), 2022 [paper][project]
Art Creation with Multi-Conditional StyleGANs Konstantin Dobler, Florian Hübscher, Jan Westphal, Alejandro Sierra-Múnera, Gerard de Melo, Ralf Krestel Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI), 2022 [IJCAI][Extended arXiv Version][DOI:10.24963/ijcai.2022/684]
Generation of Training Data for Named Entity Recognition of Artworks Nitisha Jain, Alejandro Sierra-Múnera, Jan Ehmueller, Ralf Krestel Semantic Web Journal (Special Issue Cultural Heritage 2021) (2022) [Preprint]
Mondrian: Spreadsheet Layout Detection Gerardo Vitagliano, Lucas Reisener, Lan Jiang, Mazhar Hameed, Felix Naumann Proceedings of the International Conference on Management of Data (SIGMOD) (demo), 2022 [Paper][ACM][DOI:10.1145/3514221.3520152]
Frost: A Platform for Benchmarking and Exploring Data Matching Results (industry paper) Martin Graf, Lukas Laskowski, Florian Papsdorf, Florian Sold, Roland Gremmelspacher, Felix Naumann, Fabian Panse PVLDB 15:(12), 2022 [Paper][Project Page]
Data Errors: Symptoms, Causes and Origins Ihab Ilyas, Felix Naumann Data Engineering Bulletin 45:(1), 2022 [pdf]
Relation Canonicalization in Open Knowledge Graphs: A Quantitative Analysis Maria Lomaeva, Nitisha Jain Proceedings of the Extended Semantic Web Conference, Posters and Demos (ESWC), 2022
Generating Domain-Specific Knowledge Graphs: Challenges with Open Information Extraction Nitisha Jain, Alejandro Sierra-Múnera, Philipp Schmidt, Julius Streit, Simon Thormeyer, Maria Lomaeva, Ralf Krestel Proceedings of the International Workshop on Knowledge Graph Generation from Text at ESWC, 2022 [Paper]
AI Compliance - Challenges of Bridging Data Science and Law Philipp Hacker, Felix Naumann, Tobias Friedrich, Stefan Grundmann, Anja Lehmann, Herbert Zech Journal of Data and Information Quality (JDIQ) (2022) [DOI (open access)]
SURAGH: Syntactic Pattern Matching to Identify Ill-Formed Records Mazhar Hameed, Gerardo Vitagliano, Lan Jiang, Felix Naumann Proceedings of the International Conference on Extending Database Technology (EDBT), 2022 [DOI]
Mining Change Rules Daniel Lindner, Franziska Schumann, Nicolas Alder, Tobias Bleifuß, Leon Bornemann, Felix Naumann Proceedings of the International Conference on Extending Database Technology (EDBT), 2022 [DOI]
DataGossip: A Data Exchange Extension for Distributed Machine Learning Algorithms Phillip Wenig, Thorsten Papenbrock Proceedings of the International Conference on Extending Database Technology (EDBT), 2022 [Paper][GitHub][DOI:10.48786/edbt.2022.24]
Aggregation Detection in CSV Files Lan Jiang, Gerardo Vitagliano, Mazhar Hameed, Felix Naumann Proceedings of the International Conference on Extending Database Technology (EDBT), 2022 [DOI]
Entity Resolution On-Demand Giovanni Simonini, Luca Zecchini, Sonia Bergamaschi, Felix Naumann PVLDB 15:(7), 2022 [Paper]
Fast Detection of Denial Constraint Violations Eduardo H. M. Pena, Eduardo C. de Almeida, Felix Naumann PVLDB 15:(4), 2022 [VLDB][DOI:10.14778/3503585.3503595]
Workload-driven, Lazy Discovery of Data Dependencies for Query Optimization Jan Kossmann, Felix Naumann, Daniel Lindner, Thorsten Papenbrock Proceedings of the Conference on Innovative Data Systems Research (CIDR), 2022 [pdf]
Data dependencies for query optimization: a survey Jan Kossmann, Thorsten Papenbrock, Felix Naumann The VLDB Journal (2022) [Paper][pdf][doi]
2021
How Inclusive are We? An Analysis of Gender Diversity in Database Venues Angela Bonifati, Michael J. Mior, Felix Naumann, Nele Sina Noack SIGMOD Record 50:(4), 2021 [Paper][ACM]
VLDB 2021: Designing a Hybrid Conference Philippe Bonnet, Xin Luna Dong, Felix Naumann, Pinar Tözün SIGMOD Record 50:(4), 2021 [Paper][ACM]
Did You Enjoy the Last Supper? An Experimental Study on Cross-Domain NER Models for the Art Domain Alejandro Sierra-Múnera, Ralf Krestel Proceedings of the Workshop on Natural Language Processing for Digital Humanities (NLP4DH@ICON), 2021 [Paper][GitHub]
Novel Views on Novels: Embedding Multiple Facets of Long Texts Lasse Kohlmeyer, Tim Repke, Ralf Krestel Proceedings of the International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), 2021 [Paper][GitHub Code][GitHub Thesis][DOI:10.1145/3486622.3494006]
Interactive Curation of Semantic Representations in Digital Libraries Tim Repke, Ralf Krestel Proceedings of the International Conference on Asia-Pacific Digital Libraries (ICADL), 2021 [Paper]
The Secret Life of Wikipedia Tables Tobias Bleifuß, Leon Bornemann, Dmitri V. Kalashnikov, Felix Naumann, Divesh Srivastava Proceedings of the Workshop on Search, Exploration, and Analysis in Heterogeneous Datastores (SEA-Data@VLDB), 2021 [Paper][CEUR-WS][Project]
Improving Knowledge Graph Embeddings with Ontological Reasoning Nitisha Jain, Trung-Kien Tran, Mohamed H. Gad-Elrab, Daria Stepanova Proceedings of the International Semantic Web Conference (ISWC), 2021 [Paper]
PatentMatch: A Dataset for Matching Patent Claims & Prior Art Julian Risch, Nicolas Alder, Christoph Hewel, Ralf Krestel Proceedings of the Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech@SIGIR), 2021 [Paper][Project Page]
Data Integration for Toxic Comment Classification: Making More Than 40 Datasets Easily Accessible in One Unified Format Julian Risch, Philipp Schmidt, Ralf Krestel Proceedings of the Workshop on Online Abuse and Harms (WOAH@ACL), 2021 [Paper][GitHub]
CrashNet: an encoder-decoder architecture to predict crash test outcomes Mohamed Karim Belaid, Maximilian Rabus, Ralf Krestel Data Mining and Knowledge Discovery (2021) [Springer][DOI:10.1007/s10618-021-00761-9]
Modeling the Evolution of Word Senses with Force-Directed Layouts of Co-occurrence Networks Robert Schwanhold, Tim Repke, Ralf Krestel Proceedings of the International Workshop on Computational Approaches to Historical Language Change (LChange@ACL), 2021 [Paper][Project][DOI:10.18653/v1/2021.lchange-1.8]
Evaluation of Duplicate Detection Algorithms: From Quality Measures to Test Data Generation (tutorial) Fabian Panse, Felix Naumann Proceedings of the International Conference on Data Engineering (ICDE), 2021 [Paper]
Multifaceted Domain-Specific Document Embeddings Julian Risch, Philipp Hager, Ralf Krestel Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)(NAACL), 2021 [Paper][Project Page]
Optimized Theta-Join Processing Julian Weise, Sebastian Schmidl, Thorsten Papenbrock Proceedings of the Conference on Database Systems for Business, Technology, and Web (BTW), 2021 [Paper][Project Page][DOI:10.18420/btw2021-03]
Do Embeddings Actually Capture Knowledge Graph Semantics? Nitisha Jain, Jan-Christoph Kalo, Wolf-Tilo Balke, Ralf Krestel Proceedings of the Extended Semantic Web Conference (ESWC), 2021 [Paper][URL][DOI:10.1007/978-3-030-77385-4_9]
Structured Object Matching across Web Page Revisions Tobias Bleifuß, Leon Bornemann, Dmitri V. Kalashnikov, Felix Naumann, Divesh Srivastava Proceedings of the International Conference on Data Engineering (ICDE), 2021 [Paper][IEEE][Project][DOI:10.1109/ICDE51399.2021.00115]
ComEx: Comment Exploration on Online News Platforms Julian Risch, Tim Repke, Lasse Kohlmeyer, Ralf Krestel Joint Proceedings of the ACM IUI Workshops co-located with the ACM Conference on Intelligent User Interfaces (IUI), 2021 [Paper][GitHub][Project][CEUR-WS]
Relational Header Discovery using Similarity Search in a Table Corpus Hazar Harmouch, Thorsten Papenbrock, Felix Naumann Proceedings of the International Conference on Data Engineering (ICDE) (2021) [DOI:10.1109/ICDE51399.2021.00045]
Structure Detection in Verbose CSV Files Lan Jiang, Gerardo Vitagliano, Felix Naumann Proceedings of the International Conference on Extending Database Technology (EDBT), 2021 [Paper][GitHub][DOI:10.5441/002/edbt.2021.18]
Discovering Relaxed Functional Dependencies based on Multi-attribute Dominance Loredana Caruccio, Vincenzo Deufemia, Felix Naumann, Giuseppe Polese Transactions on Knowledge and Data Engineering (TKDE) 33:(9), 2021 [IEEE][DOI:10.1109/TKDE.2020.2967722]
Few-Shot Knowledge Validation using Rules Michael Loster, Davide Mottin, Paolo Papotti, Felix Naumann, Jan Ehmueller, Benjamin Feldmann Proceedings of The Web Conference (WWW), 2021 [DOI:10.1145/3442381.3450040]
PatentMatch: A Dataset for Matching Patent Claims with Prior Art Julian Risch, Nicolas Alder, Christoph Hewel, Ralf Krestel Proceedings of the Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech@SIGIR), 2021 [Paper][Project Page][CEUR-WS]
Robust Visualisation of Dynamic Text Collections: Measuring and Comparing Dimensionality Reduction Algorithms Tim Repke, Ralf Krestel Proceedings of the Conference on Human Information Interaction and Retrieval (CHIIR), 2021 [Paper][DOI:10.1145/3406522.3446034]
Knowledge Transfer for Entity Resolution with Siamese Neural Networks Michael Loster, Ioannis Koumarelas, Felix Naumann Journal of Data and Information Quality (JDIQ) 13:(1), 2021 [DOI:10.1145/3410157]
2020
Semantic Analysis of Cultural Heritage Data: Aligning Paintings and Descriptions in Art-Historic Collections Nitisha Jain, Christian Bartz, Tobias Bredow, Emanuel Metzenthin, Jona Otholt, Ralf Krestel Proceedings of the International Workshop on Fine Art Pattern Extraction and Recognition (FAPER@ICPR), 2020 [Paper][Springer][DOI:10.1007/978-3-030-68796-0_37]
HyCoNN: Hybrid Cooperative Neural Networks for Personalized News Discussion Recommendation Julian Risch, Victor Künstler, Ralf Krestel Proceedings of the International Joint Conferences on Web Intelligence and Intelligent Agent Technologies (WI-IAT), 2020 [Paper][GitHub][DOI:10.1109/WIIAT50758.2020.00011]
Learning Fine-Grained Semantics for Multi-Relational Data Nitisha Jain, Ralf Krestel Proceedings of the International Semantic Web Conference, Posters and Demos (ISWC), 2020 [Paper][Poster]
Efficient Detection of Data Dependency Violations Eduardo H. M. Pena, Edson R. L. Filho, Eduardo C. de Almeida, Felix Naumann Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2020 [Paper][DOI:10.1145/3340531.3412062]
Hitting Set Enumeration with Partial Information for Unique Column Combination Discovery Johann Birnick, Thomas Bläsius, Tobias Friedrich, Felix Naumann, Thorsten Papenbrock, Martin Schirneck PVLDB 13:(11), 2020 [Paper][DOI:10.14778/3407790.3407824]
A Dataset of Journalists' Interactions with Their Readership: When Should Article Authors Reply to Reader Comments? Julian Risch, Ralf Krestel Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2020 [Paper][GitHub][DOI:10.1145/3340531.3412764]
Dynamic Channel and Layer Gating in Convolutional Neural Networks Ali Ehteshami Bejnordi, Ralf Krestel Proceedings of the German Conference on Artificial Intelligence (KI), 2020 [Paper][DOI:10.1007/978-3-030-58285-2_3]
Sense Tree: Discovery of New Word Senses with Graph-based Scoring Jan Ehmüller, Lasse Kohlmeyer, Holly McKee, Daniel Paeschke, Tim Repke, Ralf Krestel, Felix Naumann Lernen, Wissen, Daten, Analysen (LWDA), 2020 [Paper][CEUR-WS][Project]
Multimodal Knowledge Graphs for Semantic Analysis of Cultural Heritage Data Nitisha Jain Invited Talk at the Workshop on Knowledge Bases and Multiple Modalities (KBMM@AKBC), 2020 [Paper]
Efficient Discovery of Matching Dependencies Philipp Schirmer, Thorsten Papenbrock, Ioannis Koumarelas, Felix Naumann Transactions on Database Systems (TODS) 45:(3), 2020 [Paper][DOI:10.1145/3392778]
Explaining Offensive Language Detection Julian Risch, Robin Ruff, Ralf Krestel Journal for Language Technology and Computational Linguistics (JLCL) 34:(1), 2020 [Paper][GitHub][Publisher]
Discovering Biased News Articles Leveraging Multiple Human Annotations Konstantina Lazaridou, Alexander Löser, Maria Mestre, Felix Naumann Proceedings of the Conference on Language Resources and Evaluation (LREC), 2020 [Paper][Paper]
Offensive Language Detection Explained Julian Risch, Robin Ruff, Ralf Krestel Proceedings of the Workshop on Trolling, Aggression and Cyberbullying (TRAC@LREC), 2020 [Paper][GitHub][ACL]
Hierarchical Document Classification as a Sequence Generation Task Julian Risch, Samuele Garda, Ralf Krestel Proceedings of the Joint Conference on Digital Libraries (JCDL), 2020 [Paper][GitHub][DOI:10.1145/3383583.3398538]
RHEEMix in the Data Jungle: A Cost-based Optimizer for Cross-Platform Systems Sebastian Kruse, Zoi Kaoudi, Jorge-Arnulfo Quiané-Ruiz, Sanjay Chawla, Felix Naumann, Bertty Contreras-Rojas The VLDB Journal 29:(6), 2020 [URL][DOI:10.1007/s00778-020-00612-x]
Bagging BERT Models for Robust Aggression Identification Julian Risch, Ralf Krestel Proceedings of the Workshop on Trolling, Aggression and Cyberbullying (TRAC@LREC), 2020 [Paper][GitHub]
Domain-Specific Knowledge Graph Construction for Semantic Analysis Nitisha Jain Proceedings of the Extended Semantic Web Conference (ESWC), 2020 [Paper][URL][DOI:10.1007/978-3-030-62327-2_40]
Automatic Matching of Paintings and Descriptions in Art-Historic Archives using Multimodal Analysis Nitisha Jain, Christian Bartz, Ralf Krestel Proceedings of the International Workshop on Artificial Intelligence for Historical Image Enrichment and Access (AI4HI@LREC), 2020 [Paper][URL]
Top Comment or Flop Comment? Predicting and Explaining User Engagement in Online News Discussions Julian Risch, Ralf Krestel Proceedings of the International Conference on Web and Social Media (ICWSM), 2020 [Paper][GitHub]
Visualising Large Document Collections by Jointly Modeling Text and Network Structure Tim Repke, Ralf Krestel Proceedings of the Joint Conference on Digital Libraries (JCDL), 2020 [Paper][Project][DOI:10.1145/3383583.3398524]
Exploration Interface for Jointly Visualised Text and Graph Data Tim Repke, Ralf Krestel Proceedings of the International Conference on Intelligent User Interfaces Companion (IUI), 2020 [Paper][Project][DOI:10.1145/3379336.3381470]
Natural Key Discovery in Wikipedia Tables Leon Bornemann, Tobias Bleifuß, Dmitri V. Kalashnikov, Felix Naumann, Divesh Srivastava Proceedings of The Web Conference (WWW), 2020 [Paper][DOI:10.1145/3366423.3380039]
Data Preparation for Duplicate Detection Ioannis Koumarelas, Lan Jiang, Felix Naumann Journal of Data and Information Quality (JDIQ) 12:(3), 2020 [DOI:10.1145/3377878]
Explainable AI under Contract and Tort Law: Legal Incentives and Technical Challenges Philipp Hacker, Ralf Krestel, Stefan Grundmann, Felix Naumann Artificial Intelligence and Law 28:(4), 2020 [Paper][DOI:10.1007/s10506-020-09260-6]
MDedup: Duplicate Detection with Matching Dependencies Ioannis Koumarelas, Thorsten Papenbrock, Felix Naumann PVLDB 13:(5), 2020 [Paper][DOI:10.14778/3377369.3377379]
Holistic Primary Key and Foreign Key Detection Lan Jiang, Felix Naumann Journal of Intelligent Information Systems 54:(3), 2020 [Paper][DOI:10.1007/s10844-019-00562-z]
Toxic Comment Detection in Online Discussions Julian Risch, Ralf Krestel Deep Learning-Based Approaches for Sentiment Analysis. Springer, 2020 [Paper][DOI:10.1007/978-981-15-1216-2]
2019
An Actor Database System for Akka Sebastian Schmidl, Frederic Schneider, Thorsten Papenbrock Proceedings of the Conference on Database Systems for Business, Technology, and Web (BTW) - Workshopband, 2019 [Paper][DOI:10.18420/btw2019-ws-23]
Coverage of Information Extraction from Sentences and Paragraphs Simon Razniewski, Nitisha Jain, Paramita Mirza, Gerhard Weikum Proceedings of the Conference on Empirical Methods in Natural Language Processing and the International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019 [Paper][ACL Web][DOI:10.18653/v1/D19-1583]
Discovery of Approximate (and Exact) Denial Constraints Eduardo H. M. Pena, Eduardo C. de Almeida, Felix Naumann PVLDB 13:(3), 2019 [Paper][DOI:10.14778/3368289.3368293]
hpiDEDIS at GermEval 2019: Offensive Language Identification using a German BERT model Julian Risch, Anke Stoll, Marc Ziegele, Ralf Krestel Proceedings of the Conference on Natural Language Processing (KONVENS), 2019 [Paper][GitHub]
A Scoring-based Approach for Data Preparator Suggestion Lan Jiang, Gerardo Vitagliano, Felix Naumann Lernen, Wissen, Daten, Analysen (LWDA), 2019 [Paper]
Inclusion Dependency Discovery: An Experimental Evaluation of Thirteen Algorithms Falco Dürsch, Axel Stebner, Fabian Windheuser, Maxi Fischer, Tim Friedrich, Nils Strelow, Tobias Bleifuß, Hazar Harmouch, Lan Jiang, Thorsten Papenbrock, Felix Naumann Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2019 [Paper][Code][DOI:10.1145/3357384.3357916]
Transforming Pairwise Duplicates to Entity Clusters for High Quality Duplicate Detection Uwe Draisbach, Peter Christen, Felix Naumann Journal of Data and Information Quality (JDIQ) 12:(1), 2019 [Paper][DOI:10.1145/3352591]
Who is Mona L.? Identifying Mentions of Artworks in Historical Archives Nitisha Jain, Ralf Krestel International Conference on Theory and Practice of Digital Libraries (TPDL), 2019 [Paper][Springer][DOI:10.1007/978-3-030-30760-8_10]
Mining Business Relationships from Stocks and News Thomas Kellermeier, Tim Repke, Ralf Krestel Proceedings of the Workshop on Mining Data for Financial Applications (MIDAS@ECML-PKDD), 2019 [Paper][DOI:10.1007/978-3-030-37720-5_6]
DynFD: Functional Dependency Discovery in Dynamic Datasets Philipp Schirmer, Thorsten Papenbrock, Sebastian Kruse, Felix Naumann, Dennis Hempfing, Torben Mayer, Daniel Neuschäfer-Rube Proceedings of the International Conference on Extending Database Technology (EDBT), 2019 [Paper][DOI:10.5441/002/edbt.2019.23]
The relational database management systems genealogy Felix Naumann Making Databases Work. ACM / Morgan & Claypool, 2019 [Paper][DOI:10.1145/3226595.3226611]
Optimizing Cross-Platform Data Movement Sebastian Kruse, Zoi Kaoudi, Jorge-Arnulfo Quiané-Ruiz, Sanjay Chawla, Felix Naumann, Bertty Contreras-Rojas Proceedings of the International Conference on Data Engineering (ICDE), 2019 [Paper][DOI:10.1109/ICDE.2019.00162]
DBChEx: Interactive Exploration of Data and Schema Change Tobias Bleifuß, Leon Bornemann, Dmitri V. Kalashnikov, Felix Naumann, Divesh Srivastava Proceedings of the Conference on Innovative Data Systems Research (CIDR), 2019 [Paper][CIDRDB]
2018
CurEx: A System for Extracting, Curating, and Exploring Domain-Specific Knowledge Graphs from Text Michael Loster, Felix Naumann, Jan Ehmueller, Benjamin Feldmann Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2018 [Paper][DOI:10.1145/3269206.3269229]
Dissecting Company Names using Sequence Labeling Michael Loster, Manuel Hegner, Felix Naumann, Ulf Leser Lernen, Wissen, Daten, Analysen (LWDA), 2018 [Paper][Paper]
Towards Progressive Search-driven Entity Resolution Alberto Pietrangelo, Giovanni Simonini, Sonia Bergamaschi, Felix Naumann, Ioannis Koumarelas Italian Symposium on Advanced Database Systems (SEBD), 2018 [Paper][Paper]
Experience: Enhancing Address Matching with Geocoding and Similarity Measure Selection Ioannis Koumarelas, Axel Kroschk, Clifford Mosley, Felix Naumann Journal of Data and Information Quality (JDIQ) 10:(2), 2018 [Paper][DOI:10.1145/3232852]
The Challenges of Creating, Maintaining and Exploring Graphs of Financial Entities Michael Loster, Tim Repke, Ralf Krestel, Felix Naumann, Jan Ehmueller, Benjamin Feldmann, Oliver Maspfuhl Proceedings of the Fourth International Workshop on Data Science for Macro-Modeling (DSMM), 2018 [Paper][DOI:10.1145/3220547.3220553]
Data Profiling Ziawasch Abedjan, Lukasz Golab, Felix Naumann, Thorsten Papenbrock Synthesis Lectures on Data Management. Morgan & Claypool Publishers, 2018 [M&C][DOI:10.2200/S00878ED1V01Y201810DTM052]
Exploring Change - A New Dimension of Data Analytics Tobias Bleifuß, Leon Bornemann, Theodore Johnson, Dmitri V. Kalashnikov, Felix Naumann, Divesh Srivastava PVLDB 12:(2), 2018 [Paper][PVLDB][DOI:10.14778/3282495.3282496]
Book Recommendation Beyond the Usual Suspects: Embedding Book Plots Together with Place and Time Information Julian Risch, Samuele Garda, Ralf Krestel Proceedings of the International Conference On Asia-Pacific Digital Libraries (ICADL), 2018 [Paper][GitHub][DOI:10.1007/978-3-030-04257-8_24]
Fine-Grained Classification of Offensive Language Julian Risch, Eva Krebs, Alexander Löser, Alexander Riese, Ralf Krestel Proceedings of GermEval (co-located with KONVENS), 2018 [Paper]
Learning Patent Speak: Investigating Domain-Specific Word Embeddings Julian Risch, Ralf Krestel Proceedings of the International Conference on Digital Information Management (ICDIM), 2018 [Paper][Project Page][DOI:10.1109/ICDIM.2018.8846972]
Challenges for Toxic Comment Classification: An In-Depth Error Analysis Betty van Aken, Julian Risch, Ralf Krestel, Alexander Löser Proceedings of the Workshop on Abusive Language Online (ALW@EMNLP), 2018 [Paper][DOI:10.18653/v1/w18-5105]
Beacon in the Dark: A System for Interactive Exploration of Large Email Corpora Tim Repke, Ralf Krestel, Jakob Edding, Moritz Hartmann, Jonas Hering, Dennis Kipping, Hendrik Schmidt, Nico Scordialo, Alexander Zenner Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2018 [Paper v1][Paper v2][Project][DOI:10.1145/3269206.3269231]
RHEEM: Enabling Cross-Platform Data Processing - May The Big Data Be With You! - Divy Agrawal, Sanjay Chawla, Zoi Kaoudi, Sebastian Kruse, Jorge Arnulfo Quiané-Ruiz, Bertty Contreras-Rojas, Ahmed Elmagarmid, Yasser Idris, Ji Lucas, Essam Mansour, Mourad Ouzzani, Paolo Papotti, Nan Tang, Saravanan Thirumuruganathan, Anis Troudi PVLDB 11:(11), 2018 [Paper][DOI:10.14778/3236187.3236195]
Piggyback Profiling: Enhancing Query Results with Metadata Claudia Exeler, Maria Graber, Tino Junge, Stefan Ramson, Cathleen Ramson, Fabian Tschirschnitz, Felix Naumann Lernen, Wissen, Daten, Analysen (LWDA), 2018 [Paper]
Aggression Identification Using Deep Learning and Data Augmentation Julian Risch, Ralf Krestel Proceedings of the Workshop on Trolling, Aggression and Cyberbullying (TRAC@COLING), 2018 [Paper][GitHub]
Delete or not Delete? Semi-Automatic Comment Moderation for the Newsroom Julian Risch, Ralf Krestel Proceedings of the Workshop on Trolling, Aggression and Cyberbullying (TRAC@COLING), 2018 [Paper]
Data Change Exploration using Time Series Clustering Leon Bornemann, Tobias Bleifuß, Dmitri Kalashnikov, Felix Naumann, Divesh Srivastava Datenbank-Spektrum 18:(2), 2018 [Paper][DOI:10.1007/s13222-018-0285-x]
WELDA: Enhancing Topic Models by Incorporating Local Word Contexts Stefan Bunk, Ralf Krestel Proceedings of the Joint Conference on Digital Libraries (JCDL), 2018 [Paper][DOI:10.1145/3197026.3197043]
Prediction for the Newsroom: Which Articles Will Get the Most Comments? Carl Ambroselli, Julian Risch, Ralf Krestel, Andreas Loos Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2018 [Paper][GitHub][DOI:10.18653/v1/n18-3024]
Where in the World Is Carmen Sandiego? Detecting Person Locations via Social Media Discussions Konstantina Lazaridou, Toni Gruetze, Felix Naumann Proceedings of the ACM Conference on Web Science (WebSci), 2018 [Paper][URL][DOI:10.1145/3201064.3201068]
My Approach = Your Apparatus? Entropy-Based Topic Modeling on Multiple Domain-Specific Text Collections Julian Risch, Ralf Krestel Proceedings of the Joint Conference on Digital Libraries (JCDL), 2018 [Paper][GitHub][arXiv][DOI:10.1145/3197026.3197038]
Discovery of Genuine Functional Dependencies from Relational Data with Missing Values Laure Berti-Equille, Hazar Harmouch, Felix Naumann, Noel Novelli, Saravanan Thirumuruganathan PVLDB, 2018 [Paper][Paper][DOI:10.14778/3204028.3204032]
Topic-aware Network Visualisation to Explore Large Email Corpora Tim Repke, Ralf Krestel International Workshop on Big Data Visual Exploration and Analytics (BigVis), 2018 [Paper][Project]
Data Quality: The Role of Empiricism Shazia Sadiq, Tamraparni Dasu, Xin Luna Dong, Juliana Freire, Ihab F. Ilyas, Sebastian Link, Renée J. Miller, Felix Naumann, Xiaofang Zhou, Divesh Srivastava SIGMOD Record 46:(4), 2018 [Paper][DOI:10.1145/3186549.3186559]
Bringing Back Structure to Free Text Email Conversations with Recurrent Neural Networks Tim Repke, Ralf Krestel Proceedings of the European Conference on Information Retrieval (ECIR), 2018 [Paper][Project][DOI:10.1007/978-3-319-76941-7_9]
2017
Metacrate: Organize and Analyze Millions of Data Profiles Sebastian Kruse, David Hahn, Marius Walter, Felix Naumann Proceedings of the ACM on Conference on Information and Knowledge Management (CIKM), 2017 [Paper][DOI:10.1145/3132847.3133180]
Detecting Inclusion Dependencies on Very Many Tables Fabian Tschirschnitz, Thorsten Papenbrock, Felix Naumann Transactions on Database Systems (TODS) 42:(3), 2017 [Paper][DOI:10.1145/3105959]
ssHMM: Extracting Intuitive Sequence-Structure Motifs from High-Throughput RNA-Binding Protein Data David Heller, Ralf Krestel, Uwe Ohler, Martin Vingron, Annalisa Marsico Nucleic Acids Research 45:(19), 2017 [DOI:10.1093/nar/gkx756]
Effect of a Website That Presents Patients' Experiences on Self-Efficacy and Patient Competence of Colorectal Cancer Patients: Web-Based Randomized Controlled Trial M. Jürgen Giesler, Bettina Keller, Tim Repke, Rainer Leonhart, Joachim Weis, Rebecca Muckelbauer, Nina Rieckmann, Jacqueline Müller-Nordhorn, Gabriele Lucius-Hoene, Christine Holmberg Journal of Medical Internet Research (JMIR) 19:(10), 2017 [JMIR][DOI:10.2196/jmir.7639]
Identifying Media Bias by Analyzing Reported Speech Konstantina Lazaridou, Ralf Krestel, Felix Naumann Proceedings of the International Conference on Data Mining (ICDM), 2017 [IEEE][DOI:10.1109/ICDM.2017.119]
Real or Fake? Large-Scale Validation of Identity Leaks Fabian Maschler, Fabio Niephaus, Julian Risch Jahrestagung der Gesellschaft für Informatik (INFORMATIK), 2017 [Paper][DOI:10.18420/in2017_248]
Uncovering Business Relationships: Context-sensitive Relationship Extraction for Difficult Relationship Types Zhe Zuo, Michael Loster, Ralf Krestel, Felix Naumann Lernen, Wissen, Daten, Analysen (LWDA), 2017 [Paper]
How Do Search Engines Work? A Massive Open Online Course with 4000 Participants Ralf Krestel, Julian Risch Lernen, Wissen, Daten, Analysen (LWDA), 2017 [Paper]
Improving Company Recognition from Unstructured Text by using Dictionaries Michael Loster, Zhe Zuo, Felix Naumann, Oliver Maspfuhl, Dirk Thomas Proceedings of the International Conference on Extending Database Technology (EDBT), 2017 [Paper][DOI:10.5441/002/edbt.2017.82]
What Should I Cite? Cross-Collection Reference Recommendation of Patents and Papers Julian Risch, Ralf Krestel Proceedings of the International Conference on Theory and Practice of Digital Libraries (TPDL), 2017 [Paper][GitHub]
Enabling Change Exploration (Vision) Tobias Bleifuß, Theodore Johnson, Dmitri V. Kalashnikov, Felix Naumann, Vladislav Shkapenyuk, Divesh Srivastava Proceedings of the Fourth International Workshop on Exploratory Search in Databases and the Web (ExploreDB), 2017 [Paper][DOI:10.1145/3077331.3077340]
Fast Approximate Discovery of Inclusion Dependencies Sebastian Kruse, Thorsten Papenbrock, Christian Dullweber, Moritz Finke, Manuel Hegner, Martin Zabel, Christian Zöllner, Felix Naumann Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW), 2017 [Paper]
A Hybrid Approach for Efficient Unique Column Combination Discovery Thorsten Papenbrock, Felix Naumann Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW), 2017 [Paper]
Data-driven Schema Normalization Thorsten Papenbrock, Felix Naumann Proceedings of the International Conference on Extending Database Technology (EDBT), 2017 [Paper][DOI:10.5441/002/edbt.2017.31]
Das Fachgebiet Informationssysteme am Hasso-Plattner-Institut Felix Naumann, Ralf Krestel Datenbank-Spektrum 17:(1), 2017 [Paper][URL]
What was Hillary Clinton doing in Katy, Texas? Toni Gruetze, Ralf Krestel, Konstantina Lazaridou, Felix Naumann Proceedings of the International Conference on World Wide Web (WWW), 2017 [Paper]
Comparing Features for Ranking Relationships Between Financial Entities Based on Text Tim Repke, Michael Loster, Ralf Krestel Proceedings of the International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets (DSMM), 2017 [Paper][Poster][Slides][DOI:10.1145/3077240.3077252]
Data Profiling (tutorial) Ziawasch Abedjan, Lukasz Golab, Felix Naumann Proceedings of the International Conference on Management of Data (SIGMOD), 2017 [Paper]
2016
Biterm pseudo document topic model for short text Lan Jiang, Hengyang Lu, Ming Xu, Chongjun Wang Proceedings of the International Conference on Tools with Artificial Intelligence (ICTAI), 2016 [Paper][IEEE][DOI:10.1109/ICTAI.2016.0134]
Extraction Of Citation Data From Websites Based On Visual Cues Tim Repke, 2016 [Thesis]
Cluster-based Sorted Neighborhood for Efficient Duplicate Detection Ahmad Samiei, Felix Naumann International Conference on Data Mining Workshops (ICDMW), 2016 [URL]
Approximate Discovery of Functional Dependencies for Large Datasets Tobias Bleifuß, Susanne Bülow, Johannes Frohnhofen, Julian Risch, Georg Wiese, Sebastian Kruse, Thorsten Papenbrock, Felix Naumann Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2016 [Paper][DOI:10.1145/2983323.2983781]
Rheem: Enabling Multi-Platform Task Execution (demo) Divy Agrawal, Lamine Ba, Laure Berti-Equille, Sanjay Chawla, Ahmed Elmagarmid, Hossam Hammady, Yasser Idris, Zoi Kaoudi, Zuhair Khayyat, Sebastian Kruse, Mourad Ouzzani, Paolo Papotti, Jorge-Arnulfo Quiané-Ruiz, Nan Tang, Mohammed J. Zaki Proceedings of the ACM Conference on Management of Data (SIGMOD), 2016 [Paper]
Combination of Rule-based and Textual Similarity Approaches to Match Financial Entities Ahmad Samiei, Ioannis Koumarelas, Michael Loster, Felix Naumann Data Science for Macro-Modeling with Financial and Economic Datasets (DSMM), 2016 [Paper][URL]
Holistic Data Profiling: Simultaneous Discovery of Various Metadata Jens Ehrlich, Mandy Roick, Lukas Schulze, Jakob Zwiener, Thorsten Papenbrock, Felix Naumann Proceedings of the International Conference on Extending Database Technology (EDBT), 2016 [Paper][Paper]
Classification of German Newspaper Comments Christian Godde, Konstantina Lazaridou, Ralf Krestel Lernen, Wissen, Daten, Analysen (LWDA), 2016 [Paper]
Identifying Political Bias in News Articles Konstantina Lazaridou, Ralf Krestel International Conference on Theory and Practice of Digital Libraries. IEEE Technical Committee on Digital Libraries, 2016 [Paper]
RDFind: Scalable Conditional Inclusion Dependency Discovery in RDF Datasets Sebastian Kruse, Anja Jentzsch, Thorsten Papenbrock, Zoi Kaoudi, Jorge-Arnulfo Quiané-Ruiz, Felix Naumann Proceedings of the International Conference on Management of Data (SIGMOD), 2016 [Paper][DOI:10.1145/2882903.2915206]
Data Anamnesis: Admitting Raw Data into an Organization Sebastian Kruse, Thorsten Papenbrock, Hazar Harmouch, Felix Naumann Data Engineering Bulletin 39:(2), 2016 [Paper]
A Hybrid Approach to Functional Dependency Discovery Thorsten Papenbrock, Felix Naumann Proceedings of the International Conference on Management of Data (SIGMOD), 2016 [Paper][DOI:10.1145/2882903.2915203]
TextAI: Enhancing TextAE with Intelligent Annotation Support Maximilian Grundke, Johannes Jasper, Mariya Perchyk, Jan Philipp Sachse, Ralf Krestel, Mariana Neves Proceedings of the International Symposium on Semantic Mining in Biomedicine (SMBM), 2016 [Paper][DOI:10.1007/978-3-319-41754-7_18]
Analyzing NIH Funding Patterns over Time with Statistical Text Analysis Jihyun Park, Margaret Blume-Kohout, Ralf Krestel, Eric Nalisnick, Padhraic Smyth Scholarly Big Data: AI Perspectives, Challenges, and Ideas (SBD) Workshop at AAAI, 2016 [Paper]
Proceedings of the Conference "Lernen, Wissen, Daten, Analysen", Potsdam, Germany, September 12-14, 2016 Ralf Krestel, Davide Mottin, Emmanuel Müller CEUR Workshop Proceedings. CEUR-WS.org, 2016
Which Answer is Best? Predicting Accepted Answers in MOOC Forums Maximilian Jenders, Ralf Krestel, Felix Naumann Proceedings of the International Conference Companion on World Wide Web, 2016 [Paper]
Topic Shifts in StackOverflow: Ask it like Socrates Toni Gruetze, Ralf Krestel, Felix Naumann Lecture Notes in Computer Science, 2016 [Paper][DOI:10.1007/978-3-319-41754-7_18]
The Information Systems Group at HPI Felix Naumann, Ralf Krestel SIGMOD Record, 2016 [Paper]
Using others' experiences. Cancer patients' expectations and navigation of a website providing narratives on prostate, breast and colorectal cancer Jennifer Engler, Sandra Adami, Yvonne Adam, Bettina Keller, Tim Repke, Hella Fügemann, Gabriele Lucius-Hoene, Jacqueline Müller-Nordhorn, Christine Holmberg Patient Education and Counseling 99:(8), 2016 [ScienceDirect][DOI:10.1016/j.pec.2016.03.015]
CohEEL: Coherent and Efficient Named Entity Linking through Random Walks Toni Gruetze, Gjergji Kasneci, Zhe Zuo, Felix Naumann Web Semantics: Science, Services and Agents on the World Wide Web 37:(C), 2016 [Paper][DOI:10.1016/j.websem.2016.03.001]
Efficient Order Dependency Discovery Philipp Langer, Felix Naumann The VLDB Journal 25:(2), 2016 [DOI:10.1007/s00778-015-0412-3]
Data Profiling (tutorial) Lukasz Golab, Ziawasch Abedjan, Felix Naumann Proceedings of the International Conference on Data Engineering (ICDE), 2016 [Paper]
2015
Social Media Story Telling Patrick Hennig, Philipp Berger, Christian Dullweber, Moritz Finke, Fabian Maschler, Julian Risch, Christoph Meinel Proceedings of the International Conference on Social Computing and Networking (SocialCom), 2015 [Paper][DOI:10.1109/SmartCity.2015.84]
Ergonomic Interaction for Touch Floors Dominik Schmidt, Johannes Frohnhofen, Sven Knebel, Florian Meinel, Mariya Perchyk, Julian Risch, Jonathan Striebel, Julia Wachtel, Patrick Baudisch Proceedings of the Conference on Human Factors in Computing Systems (CHI), 2015 [Paper][DOI:10.1145/2702123.2702254]
Tweet-Recommender: Finding Relevant Tweets for News Articles Ralf Krestel, Thomas Werkmeister, Timur Pratama Wiradarma, Gjergji Kasneci Proceedings of the International World Wide Web Conference (WWW), 2015 [Paper]
Progressive Duplicate Detection Thorsten Papenbrock, Arvid Heise, Felix Naumann IEEE Transactions on Knowledge and Data Engineering (TKDE) 27:(5), 2015 [Paper][DOI:10.1109/TKDE.2014.2359666]
Scaling Out the Discovery of Inclusion Dependencies Sebastian Kruse, Thorsten Papenbrock, Felix Naumann Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW), 2015 [Paper]
Divide & Conquer-based Inclusion Dependency Discovery Thorsten Papenbrock, Sebastian Kruse, Jorge-Arnulfo Quiané-Ruiz, Felix Naumann PVLDB 8:(7), 2015 [Paper][DOI:10.14778/2752939.2752946]
Data Profiling with Metanome Thorsten Papenbrock, Tanja Bergmann, Moritz Finke, Jakob Zwiener, Felix Naumann PVLDB 8:(12), 2015 [Paper][DOI:10.14778/2824032.2824086]
Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms Thorsten Papenbrock, Jens Ehrlich, Jannik Marten, Tommy Neubert, Jan-Peer Rudolph, Martin Schönberg, Jakob Zwiener, Felix Naumann PVLDB 8:(10), 2015 [Paper][DOI:10.14778/2794367.2794377]
Online Temporal Summarization of News Events Tobias Schubotz, Ralf Krestel Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), 2015 [Paper]
Learning Temporal Tagging Behaviour Toni Gruetze, Gary Yao, Ralf Krestel Proceedings of the International Conference on World Wide Web Companion (WWW), 2015 [Paper][DOI:10.1145/2740908.2741701]
How to Stay Up-to-date on Twitter with General Keywords Mandy Roick, Maximilian Jenders, Ralf Krestel Proceedings of the LWA Workshops: KDML, FGWM, IR, and FGDB, 2015 [Paper]
A Serendipity Model For News Recommendation Maximilian Jenders, Thorben Lindhauer, Gjergji Kasneci, Ralf Krestel, Felix Naumann KI: Advances in Artificial Intelligence - Annual German Conference on AI, 2015 [Paper]
Uniqueness, Density, and Keyness: Exploring Class Hierarchies Anja Jentzsch, Hannes Mühleisen, Felix Naumann Proceedings of the International Workshop on Consuming Linked Data (COLD) at ISWC, 2015 [Paper]
Exploring Linked Data Graph Structures Anja Jentzsch, Christian Dullweber, Pierpaolo Troiano, Felix Naumann Proceedings of the International Semantic Web Conference, Posters and Demos (ISWC), 2015 [Paper]
SOFA: An Extensible Logical Optimizer for UDF-heavy Data Flows Astrid Rheinländer, Arvid Heise, Fabian Hueske, Ulf Leser, Felix Naumann Information Systems, 2015
Estimating Data Integration and Cleaning Effort Sebastian Kruse, Paolo Papotti, Felix Naumann Proceedings of the International Conference on Extending Database Technology (EDBT), 2015 [Paper]
2014
Multi-label emotion classification for tweets in weibo: Method and application Jun Yang, Lan Jiang, Chongjun Wang, Junyuan Xie Proceedings of the International Conference on Tools with Artificial Intelligence (ICTAI), 2014 [IEEE][DOI:10.1109/ICTAI.2014.71]
Versatile optimization of UDF-heavy data flows with SOFA Astrid Rheinländer, Martin Beckmann, Anja Kunkel, Arvid Heise, Thomas Stoltmann, Ulf Leser Proceedings of the International Conference on Management of Data (SIGMOD), 2014 [Paper][DOI:10.1145/2588555.2594517]
The Stratosphere Platform for Big Data Analytics Alexander Alexandrov, Rico Bergmann, Stephan Ewen, Johann-Christoph Freytag, Fabian Hueske, Arvid Heise, Odej Kao, Marcus Leich, Ulf Leser, Volker Markl, Felix Naumann, Mathias Peters, Astrid Rheinländer, Matthias J. Sax, Sebastian Schelter, Mareike Höger, Kostas Tzoumas, Daniel Warneke The VLDB Journal 23:(6), 2014 [Paper]
LODOP - Multi-Query Optimization for Linked Data Profiling Queries Benedikt Forchhammer, Anja Jentzsch, Felix Naumann Proceedings of the Extended Semantic Web Conference (ESWC), 2014 [Paper]
Modeling human newspaper readers: The Fuzzy Believer approach Ralf Krestel, Sabine Bergler, René Witte Natural Language Engineering 20:(2), 2014 [Paper][DOI:10.1017/S1351324912000289]
Detecting Unique Column Combinations on Dynamic Data Ziawasch Abedjan, Jorge-Arnulfo Quiané-Ruiz, Felix Naumann Proceedings of the International Conference on Data Engineering (ICDE), 2014 [Paper]
Data Perspective in Process Choreographies: Modeling and Execution Andreas Meyer, Luise Pufahl, Kimon Batoulis, Sebastian Kruse, Thorben Lindhauer, Thomas Stoff, Dirk Fahland, Mathias Weske International Conference on Advanced Information Systems Engineering, 2014
Assigning Global Relevance Scores to DBpedia Facts Philipp Langer, Patrick Schulze, Stefan George, Matthias Kohnen, Tobias Metzke, Ziawasch Abedjan, Gjergji Kasneci International Workshop on Data Engineering meets the Semantic Web (DESWeb), 2014 [Paper]
Bootstrapping Wikipedia to Answer Ambiguous Person Name Queries Toni Gruetze, Gjergji Kasneci, Zhe Zuo, Felix Naumann International Workshop on Information Integration on the Web (IIWeb), 2014 [Paper]
DFD: Efficient Discovery of Functional Dependencies Ziawasch Abedjan, Patrick Schulze, Felix Naumann Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2014 [Paper]
Profiling and Mining RDF Data with ProLOD++ Ziawasch Abedjan, Toni Gruetze, Anja Jentzsch, Felix Naumann Proceedings of the International Conference on Data Engineering (ICDE), 2014 [Paper]
Identifying and Determining SPARQL Endpoint Characteristics Johannes Lorey International Journal of Web Information Systems 10:(3), 2014
Semi-Supervised Consensus Clustering: Reducing Human Effort Tobias Vogel, Felix Naumann Proceedings of the International Workshop on Data Integration and Applications, 2014 [Paper]
DBpedia - A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick van Kleef, Sören Auer, Christian Bizer Semantic Web Journal, 2014
BEL: Bagging for Entity Linking Zhe Zuo, Gjergji Kasneci, Toni Gruetze, Felix Naumann 25th International Conference on Computational Linguistics (COLING), 2014 [Paper]
Estimating the Number and Sizes of Fuzzy-Duplicate Clusters Arvid Heise, Gjergji Kasneci, Felix Naumann Proceedings of the Conference on Information and Knowledge Management (CIKM), 2014 [Paper]
Amending RDF Entities with New Facts Ziawasch Abedjan, Felix Naumann Proceedings of the Extended Semantic Web Conference (ESWC), 2014 [Paper]
Reach for Gold: An Annealing Standard to Evaluate Duplicate Detection Results Tobias Vogel, Arvid Heise, Uwe Draisbach, Dustin Lange, Felix Naumann Journal of Data and Information Quality (JDIQ) 5:(1-2), 2014 [Paper]
2013
Storing and Provisioning Linked Data as a Service Johannes Lorey Proceedings of the Extended Semantic Web Conference (ESWC), 2013 [Paper]
Improving RDF Data through Association Rule Mining Ziawasch Abedjan, Felix Naumann Datenbank-Spektrum (Special Issue on RDF Data Management) 13:(2), 2013 [Paper]
Detecting SPARQL Query Templates for Data Prefetching Johannes Lorey, Felix Naumann Proceedings of the Extended Semantic Web Conference (ESWC), 2013 [Paper]
Caching and Prefetching Strategies for SPARQL Queries Johannes Lorey, Felix Naumann Proceedings of the Extended Semantic Web Conference (ESWC), 2013 [Paper]
Analyzing and Predicting Viral Tweets Maximilian Jenders, Gjergji Kasneci, Felix Naumann Proceedings of the International World Wide Web Conference (WWW), 2013 [Paper]
Applying Stratosphere for Big Data Analytics Marcus Leich, Jochen Adamek, Moritz Schubotz, Arvid Heise, Astrid Rheinländer, Volker Markl Database Systems for Business, Technology, and Web (BTW), 2013 [Paper]
Topic modeling for expert finding using latent dirichlet allocation Saeedeh Momtazi, Felix Naumann WIREs Data Mining and Knowledge Discovery 3:(5), 2013 [Paper]
Synonym Analysis for Predicate Expansion Ziawasch Abedjan, Felix Naumann Proceedings of the Extended Semantic Web Conference (ESWC), 2013 [Paper]
SPARQL Endpoint Metrics for Quality-Aware Linked Data Consumption Johannes Lorey Proceedings of the International Conference on Information Integration and Web-based Applications & Services (iiWAS), 2013 [Paper]
Cross-lingual Entity Matching and Infobox Alignment in Wikipedia Daniel Rinser, Dustin Lange, Felix Naumann Information Systems (IS) 38:(6), 2013 [Paper]
Ein Datenbankkurs mit 6000 Teilnehmern - Erfahrungen auf der openHPI MOOC Plattform Felix Naumann, Maximilian Jenders, Thorsten Papenbrock Informatik-Spektrum 37:(12), 2013 [Paper][DOI:10.1007/s00287-013-0750-8]
Duplicate Detection on GPUs Benedikt Forchhammer, Thorsten Papenbrock, Thomas Stening, Sven Viehmeier, Uwe Draisbach, Felix Naumann Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW), 2013 [Paper]
SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel, Zoubin Ghahramani Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2013
Scalable Discovery of Unique Column Combinations Arvid Heise, Jorge-Arnulfo Quiané-Ruiz, Ziawasch Abedjan, Anja Jentzsch, Felix Naumann PVLDB, 2013 [Paper][Slides][doi]
Caching and Prefetching Strategies for SPARQL Queries Johannes Lorey, Felix Naumann Proceedings of the International Workshop on Usage Analysis and the Web of Data (USEWOD), 2013 [Paper]
Cost-Aware Query Planning for Similarity Search Dustin Lange, Felix Naumann Information Systems (IS) 38:(4), 2013 [Paper]
Bulk Sorted Access for Efficient Top-k Retrieval Dustin Lange, Felix Naumann Proceedings of the International Conference on Scientific and Statistical Database Management (SSDBM), 2013 [Paper]
Systematic ETL Management Experiences with High-Level Operators Alexander Albrecht, Felix Naumann Proceedings of the International Conference on Information Quality (ICIQ), 2013 [Paper]
SOFA: An Extensible Logical Optimizer for UDF-heavy Dataflows Astrid Rheinländer, Arvid Heise, Fabian Hueske, Ulf Leser, Felix Naumann, 2013
On Choosing Thresholds for Duplicate Detection Uwe Draisbach, Felix Naumann Proceedings of the International Conference on Information Quality (ICIQ), 2013 [Paper]
Data Profiling Revisited Felix Naumann SIGMOD Record 42:(4), 2013 [Paper]
2012
Efficient Similarity Search in Very Large String Sets Dandy Fenz, Dustin Lange, Astrid Rheinländer, Felix Naumann, Ulf Leser Proceedings of the International Conference on Scientific and Statistical Database Management (SSDBM), 2012 [Paper]
Schema Decryption for Large Extract-Transform-Load Systems Alexander Albrecht, Felix Naumann Proceedings of the International Conference on Conceptual Modeling (ER), 2012 [Paper]
Integrating Open Government Data with Stratosphere for more Transparency Arvid Heise, Felix Naumann Web Semantics: Science, Services and Agents on the World Wide Web 14:(1), 2012 [Paper][DOI:10.1016/j.websem.2012.02.002]
The Data Analytics Group at the Qatar Computing Research Institute George Beskales, Gautam Das, Ahmed K. Elmagarmid, Ihab F. Ilyas, Felix Naumann, Mourad Ouzzani, Paolo Papotti, Jorge Quiané-Ruiz, Nan Tang SIGMOD Record 41:(4), 2012
Automatic Blocking Key Selection for Duplicate Detection based on Unigram Combinations Tobias Vogel, Felix Naumann Proceedings of the International Workshop on Quality in Databases (QDB) in conjunction with VLDB, 2012 [Paper]
Scalable Similarity Search with Dynamic Similarity Measures Martin Köppelmann, Dustin Lange, Claudia Lehmann, Marika Marszalkowski, Felix Naumann, Peter Retzlaff, Sebastian Stange, Lea Voget Proceedings of the International Workshop on Ranking in Databases (DBRank) in conjunction with VLDB, 2012 [Paper]
Scalable Iterative Graph Duplicate Detection Melanie Herschel, Felix Naumann, Sascha Szott, Maik Taubert Transactions on Knowledge and Data Engineering (TKDE) 24:(11), 2012
Latent Topics in Graph-Structured Data Christoph Böhm, Gjergji Kasneci, Felix Naumann Proceedings of the Conference on Information and Knowledge Management (CIKM), 2012 [Paper]
Discovering Conditional Inclusion Dependencies Jana Bauckmann, Ziawasch Abedjan, Heiko Müller, Ulf Leser, Felix Naumann Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2012
Understanding Cryptic Schemata in Large Extract-Transform-Load Systems Alexander Albrecht, Felix Naumann Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam, 2012
Fine-grained German Sentiment Analysis on Social Media Saeedeh Momtazi Proceedings of the International Conference on Language Resources and Evaluation (LREC), 2012
Fusion Cubes: Towards Self-Service Business Intelligence Alberto Abelló, Jérôme Darmont, Lorena Etcheverry, Matteo Golfarelli, Jose-Norberto Mazón, Felix Naumann, Torben Bach Pedersen, Stefano Rizzi, Juan Trujillo, Panos Vassiliadis, Gottfried Vossen International Journal of Data Warehousing and Mining (IJDWM) 9:(2), 2012 [DOI:10.4018/jdwm.2013040104]
Holistic and Scalable Ontology Alignment for Linked Open Data Toni Gruetze, Christoph Böhm, Felix Naumann Proceedings of the Linked Data on the Web (LDOW) Workshop at the International World Wide Web Conference (WWW), 2012 [Paper]
Bayesian online clustering of eye movement data Enkelejda Tafaj, Gjergji Kasneci, Wolfgang Rosenstiel, Martin Bogdan Proceedings of the Symposium on Eye-Tracking Research and Applications, 2012 [Paper][DOI:10.1145/2168556.2168617]
Adaptive Windows for Duplicate Detection Uwe Draisbach, Felix Naumann, Sascha Szott, Oliver Wonneberg Proceedings of the International Conference on Data Engineering (ICDE), 2012 [Paper]
GovWILD: Integrating Open Government Data for Transparency (demo) Christoph Böhm, Markus Freitag, Arvid Heise, Claudia Lehmann, Andrina Mascher, Felix Naumann, Mauricio Hernandez, Vuk Ercegovac, Peter Haase Proceedings of the International World Wide Web Conference (WWW), 2012
Reconciling Ontologies and the Web of Data Ziawasch Abedjan, Johannes Lorey, Felix Naumann Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2012
Covering or Complete? Discovering Conditional Inclusion Dependencies Jana Bauckmann, Ziawasch Abedjan, Ulf Leser, Heiko Müller, Felix Naumann Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam, 2012
LINDA: Distributed Web-of-Data-Scale Entity Matching Christoph Böhm, Gerard de Melo, Felix Naumann, Gerhard Weikum Proceedings of the International Conference on Information and Knowledge Management (CIKM), Maui, Hawaii, 2012
Partitionierung zur effizienten Duplikaterkennung in relationalen Daten Uwe Draisbach Ausgezeichnete Arbeiten zur Informationsqualität. Springer Vieweg, 2012
Scalable Peer-to-Peer-based RDF Management Christoph Böhm, Daniel Hefenbrock, Felix Naumann Proceedings of the International Conference on Semantic Systems (I-SEMANTICS), 2012 [Paper]
Meteor/Sopremo: An Extensible Query Language and Operator Model Arvid Heise, Astrid Rheinländer, Marcus Leich, Ulf Leser, Felix Naumann Proceedings of the International Workshop on End-to-end Management of Big Data (BigData) in conjunction with VLDB, 2012 [Paper]
Adaptive Windows for Duplicate Detection Uwe Draisbach, Felix Naumann Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam, 2012 [Paper]
2011
Advancing the Discovery of Unique Column Combinations Ziawasch Abedjan, Felix Naumann Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2011 [Paper]
RDF Ontology (Re-)Engineering through Large-scale Data Mining Johannes Lorey, Ziawasch Abedjan, Felix Naumann, Christoph Böhm Billion Triples Challenge (BTC) at the International Semantic Web Conference (ISWC), 2011 [Paper]
Black Swan: Augmenting Statistics with Event Data Johannes Lorey, Felix Naumann, Benedikt Forchhammer, Andrina Mascher, Peter Retzlaff, Armin ZamaniFarahani, Soeren Discher, Cindy Faehnrich, Stefan Lemme, Thorsten Papenbrock, Robert Christoph Peschel, Stephan Richter, Thomas Stening, Sven Viehmeier Proceedings of the Conference on Information and Knowledge Management (CIKM), 2011 [Paper]
Instance-based one-to-some Assignment of Similarity Measures to Attributes Tobias Vogel, Felix Naumann Proceedings of the International Conference on Cooperative Information Systems (CoopIS), 2011 [Paper]
SPRINT: ranking search results by paths Christoph Böhm, Eyk Kny, Benjamin Emde, Ziawasch Abedjan, Felix Naumann Proceedings of the International Conference on Extending Database Technology (EDBT), 2011 [URL]
Advancing the Discovery of Unique Column Combinations Ziawasch Abedjan, Felix Naumann Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam, 2011
Frequency-aware Similarity Measures Dustin Lange, Felix Naumann Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), 2011 [Paper]
Context and Target Configurations for Mining RDF Data Ziawasch Abedjan, Felix Naumann International Workshop on Search & Mining Entity-Relationship Data (SMER), 2011
A Generalization of Blocking and Windowing Algorithms for Duplicate Detection Uwe Draisbach, Felix Naumann Proceedings of the International Conference on Data and Knowledge Engineering (ICDKE), 2011 [Paper]
Efficient Similarity Search: Arbitrary Similarity Measures, Arbitrary Composition Dustin Lange, Felix Naumann Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), 2011 [Paper]
Kurz erklärt: Datenfusion Jens Bleiholder, Felix Naumann Datenbank-Spektrum 11:(1), 2011
Eliminating NULLs with Subsumption and Complementation Jens Bleiholder, Melanie Herschel, Felix Naumann Data Engineering Bulletin 34:(3), 2011
Improving Service Discovery through Enriched Service Descriptions Mohammed AbuJarour, Felix Naumann Datenbanksysteme für Business, Technologie und Web (BTW), 2011
Creating voiD Descriptions for Web-scale Data Christoph Böhm, Johannes Lorey, Felix Naumann Journal of Web Semantics: Science, Services and Agents on the World Wide Web 9:(3), 2011 [Paper][DOI:10.1016/j.websem.2011.06.001]
2010
Profiling linked open data with ProLOD Christoph Böhm, Felix Naumann, Ziawasch Abedjan, Dandy Fenz, Toni Gruetze, Daniel Hefenbrock, Matthias Pohl, David Sonnabend Proceedings of the International Conference on Data Engineering (ICDE), 2010 [Paper]
Efficient and Exact Computation of Inclusion Dependencies for Data Integration Jana Bauckmann, Ulf Leser, Felix Naumann Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam, 2010 [Paper]
Extracting structured information from Wikipedia articles to populate infoboxes Dustin Lange, Christoph Böhm, Felix Naumann Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), 2010 [Paper]
An Introduction to Duplicate Detection Felix Naumann, Melanie Herschel Synthesis Lectures on Data Management. Morgan & Claypool Publishers, 2010
Dynamic tags for dynamic data web services Mohammed AbuJarour, Felix Naumann Proceedings of the Workshop on Emerging Web Services Technology (WEWST), 2010
Proceedings of the 13th International Conference on Extending Database Technology (EDBT), Lausanne, Switzerland Xin Luna Dong, Felix Naumann ACM International Conference Proceeding Series. ACM, 2010
DuDe: The Duplicate Detection Toolkit Uwe Draisbach, Felix Naumann Proceedings of the International Workshop on Quality in Databases (QDB), 2010 [Paper]
Towards Granular Data Placement Strategies for Cloud Platforms Johannes Lorey, Felix Naumann Proceedings of the International Conference on Granular Computing (GrC), 2010 [Paper]
Towards a diamond SOA operational model Mohammed AbuJarour, Felix Naumann IEEE International Conference on Service-Oriented Computing and Applications (SOCA), 2010
13th International Workshop on the Web and Databases: WebDB 2010 (workshop report) Xin Luna Dong, Felix Naumann SIGMOD Record 39:(3), 2010
Proceedings of the 13th International Workshop on the Web and Databases (WebDB), Indianapolis, IN Xin Luna Dong, Felix Naumann ACM, 2010
Extracting structured information from Wikipedia articles to populate infoboxes Dustin Lange, Christoph Böhm, Felix Naumann Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam, 2010 [Paper]
Collecting, Annotating, and Classifying Public Web Services Mohammed AbuJarour, Felix Naumann, Mircea Craculeac On the Move to Meaningful Internet Systems: OTM - Confederated International Conferences: CoopIS, IS, DOA and ODBASE, 2010
Linking open government data: what journalists wish they had known Christoph Böhm, Felix Naumann, Markus Freitag, Stefan George, Norman Höfler, Martin Köppelmann, Claudia Lehmann, Andrina Mascher, Tobias Schmidt Proceedings of the International Conference on Semantic Systems (I-SEMANTICS), Graz, Austria, 2010 [URL]
Creating voiD Descriptions for Web-Scale Data Christoph Böhm, Johannes Lorey, Dandy Fenz, Eyk Kny, Matthias Pohl, Felix Naumann Billion Triples Challenge (BTC) at the International Semantic Web Conference (ISWC), 2010 [Paper]
Complement union for data integration Jens Bleiholder, Sascha Szott, Melanie Herschel, Felix Naumann Proceedings of the International Conference on Data Engineering (ICDE), 2010 [Paper]
Graph-based concept identification and disambiguation for enterprise search Falk Brauer, Michael Huber, Gregor Hackenbroich, Ulf Leser, Felix Naumann, Wojciech M. Barczynski Proceedings of the International Conference on World Wide Web (WWW), 2010
Self-Adaptive Data Quality Web Services Tobias Vogel Grundlagen von Datenbanken, 2010 [Paper]
Subsumption and complementation as data fusion operators Jens Bleiholder, Sascha Szott, Melanie Herschel, Frank Kaufer, Felix Naumann Proceedings of the International Conference on Extending Database Technology (EDBT), 2010
2009
Graph-Based Ontology Construction from Heterogeneous Evidences Christoph Böhm, Philip Groth, Ulf Leser Proceedings of the International Semantic Web Conference (ISWC), 2009
Data fusion - Resolving Data Conflicts for Integration (tutorial) Xin Luna Dong, Felix Naumann PVLDB 2:(2), 2009
A Machine Learning Approach to Foreign Key Discovery Alexandra Rostin, Oliver Albrecht, Jana Bauckmann, Felix Naumann, Ulf Leser Proceedings of the International Workshop on the Web and Databases (WebDB), 2009 [Paper]
A Comparison and Generalization of Blocking and Windowing Algorithms for Duplicate Detection Uwe Draisbach, Felix Naumann Proceedings of the International Workshop on Quality in Databases (QDB), 2009 [Paper]
POSR: A Comprehensive System for Aggregating and Using Web Services (demo) Mohammed AbuJarour, Mircea Craculeac, Falko Menge, Tobias Vogel, Jan-Felix Schwarz Proceedings of the IEEE Services Cup at IEEE International Conference on Web Services (ICWS), 2009 [Paper]
Encapsulating Multi-stepped Web Forms as Web Services Tobias Vogel, Frank Kaufer, Felix Naumann Proceedings of the International Conference on Service-Oriented Computing (ICSOC), 2009 [Paper]
METL: Managing and Integrating ETL Processes Alexander Albrecht, Felix Naumann Proceedings of the VLDB PhD Workshop, 2009
Guest Editorial for the Special Issue on Data Quality in Databases Felix Naumann, Louiqa Raschid Journal of Data and Information Quality (JDIQ) 1:(2), 2009
2008
Data fusion Jens Bleiholder, Felix Naumann ACM Computing Surveys 41:(1), 2008
Scaling up duplicate detection in graph data Melanie Herschel, Felix Naumann Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), 2008 [Paper]
Managing ETL Processes Alexander Albrecht, Felix Naumann Proceedings of the International Workshop on New Trends in Information Integration (NTII), Auckland, New Zealand, 2008
A research agenda for query processing in large-scale peer data management systems Katja Hose, Armin Roth, Andre Zeitz, Kai-Uwe Sattler, Felix Naumann Information Systems (IS) 33:(7-8), 2008
Automated data augmentation services using text mining, data cleansing and web crawling techniques Matthias Jacob, Alexander Kuscher, Christoph Thiele, Max Plauth IEEE Congress on Services, 2008 [IEEE]
2007
Efficiently Detecting Inclusion Dependencies Jana Bauckmann, Ulf Leser, Felix Naumann, Veronique Tietz Proceedings of the International Conference on Data Engineering (ICDE), 2007 [Paper]
Schema- und Metadatenmanagement in Peer Data Management Systemen Felix Naumann Datenbanksysteme in Business, Technologie und Web (BTW), Workshop Proceedings, 2007 [Paper]
A Classification of Schema Mappings and Analysis of Mapping Tools Frank Legler, Felix Naumann Proceedings of Datenbanksysteme in Business, Technologie und Web (BTW), 2007 [Paper]
FuSem - Exploring Different Semantics of Data Fusion (demo) Jens Bleiholder, Karsten Draba, Felix Naumann Proceedings of the International Conference on Very Large Data Bases (VLDB), 2007 [Paper]
System P: Completeness-driven Query Answering in Peer Data Management Systems (demo) Armin Roth, Felix Naumann Datenbanksysteme in Business, Technologie und Web (BTW), 2007 [Paper]
Datenqualität Felix Naumann Informatik-Spektrum 30:(1), 2007 [Paper]
Emergent Data Quality Annotation And Visualization Paul Führing, Felix Naumann Proceedings of the International Conference on Information Quality (ICIQ), 2007 [Paper]
Rule-Based Measurement Of Data Quality In Nominal Data Jochen Hipp, Markus Müller, Johannes Hohendorff, Felix Naumann Proceedings of the International Conference on Information Quality (ICIQ), 2007 [Paper]
Answering Top K Queries Efficiently with Overlap of Answers in Sources or Source Paths Louiqa Raschid, Maria Esther Vidal, Yao Wu, Felix Naumann, Jens Bleiholder Proceedings of the International Workshop on Information Integration on the Web (IIWeb), 2007 [Paper]
Peer-Daten-Management-Systeme - PDMS Felix Naumann, Armin Roth Datenbank-Spektrum (2007) [Paper]
Proceedings of the 5th International Workshop on Quality in Databases (QDB) Ganti Venkatesh, Felix Naumann, 2007
Networked PIM using PDMS Alexander Albrecht, Felix Naumann Proceedings of the International Workshop Networking Meets Databases (NetDB), 2007 [Paper]
2006
Conflict Handling Strategies in an Integrated Information System Jens Bleiholder, Felix Naumann Proceedings of the International Workshop on Information Integration on the Web (IIWeb), 2006 [Paper]
Query Planning in the Presence of Overlapping Sources Jens Bleiholder, Samir Khuller, Felix Naumann, Louiqa Raschid, Yao Wu Proceedings of the International Conference on Extending Database Technology (EDBT), 2006 [Paper]
XML Duplicate Detection Using Sorted Neighborhoods Sven Puhlmann, Melanie Weis, Felix Naumann Proceedings of the International Conference on Extending Database Technology (EDBT), 2006 [Paper]
Assessing the Completeness of Sensor Data Jit Biswas, Felix Naumann, Qiang Qiu Proceedings of the International Conference on Database Systems for Advanced Applications (DASFAA), 2006 [Paper]
Data Fusion in Three Steps: Resolving Schema, Tuple, and Value Inconsistencies Felix Naumann, Alexander Bilke, Jens Bleiholder, Melanie Weis Data Engineering Bulletin 29:(2), 2006 [Paper]
XStruct: Efficient Schema Extraction from Multiple and Large XML Documents Jan Hegewald, Felix Naumann, Melanie Weis Proceedings of the International Conference on Data Engineering (ICDE), 2006 [Paper]
Efficiently Computing Inclusion Dependencies for Schema Discovery Jana Bauckmann, Ulf Leser, Felix Naumann Proceedings of the International Conference on Data Engineering (ICDE), 2006 [Paper]
Proceedings of the Data Integration in the Life Sciences Workshop (DILS) Ulf Leser, Felix Naumann, Barbara Eckman Lecture Notes in Computer Science. Springer, 2006
System P: Query Answering in PDMS under Limited Resources Armin Roth, Felix Naumann, Tobias Hübner, Martin Schweigert Proceedings of the International Workshop on Information Integration on the Web (IIWeb), 2006 [Paper]
Informationsintegration: Architekturen und Methoden zur Integration verteilter und heterogener Datenquellen Ulf Leser, Felix Naumann dpunkt, 2006 [Paper]
Detecting Duplicates in Complex XML Data Melanie Weis, Felix Naumann Proceedings of the International Conference on Data Engineering (ICDE), 2006 [Paper]
Information Quality: How Good are Off-the-Shelf DBMS? Felix Naumann, Mary Roth Information Quality Management: Theory and Applications. Idea Group Inc., 2006
2005
(Almost) Hands-Off Information Integration for the Life Sciences Ulf Leser, Felix Naumann Proceedings of the Conference on Innovative Data Systems Research (CIDR), 2005 [Paper]
Self-Extending Peer Data Management Ralf Heese, Sven Herschel, Felix Naumann, Armin Roth Datenbanksysteme in Business, Technologie und Web (BTW), Karlsruhe, Germany, 2005 [Paper]
Enhancing the Semantics of Links and Paths in Life Science Sources Stephan Heymann, Felix Naumann, Peter Rieger, Louiqa Raschid ICDT Workshop on Database Issues in Biological Databases (DBiBD), 2005 [Paper]
Declarative Data Fusion - Syntax, Semantics, and Implementation Jens Bleiholder, Felix Naumann Proceedings of the International Conference on Advances in Databases and Information Systems (ADBIS), 2005 [Paper]
Proceedings of the 2005 International Conference on Information Quality (MIT IQ Conference), Sponsored by Lockheed Martin, MIT, Cambridge, MA, USA, November 10-12, 2006 MIT, 2005
Ein Data-Quality-Wettbewerb Michael Mielke, Heiko Müller, Felix Naumann Datenbank-Spektrum (2005) [Paper]
A Data Model and Query Language to Explore Enhanced Links and Paths in Life Science Sources George A. Mihaila, Felix Naumann, Louiqa Raschid, Maria-Esther Vidal Proceedings of the International Workshop on the Web & Databases (WebDB), 2005 [Paper]
DogmatiX Tracks down Duplicates in XML Melanie Weis, Felix Naumann Proceedings of the ACM International Conference on Management of Data (SIGMOD), 2005 [Paper]
Benefit and Cost of Query Answering in PDMS Armin Roth, Felix Naumann Proceedings of the Databases, Information Systems, and Peer-to-Peer Computing Workshop (DBISP2P), Seoul, Korea, 2005 [Paper]
Fuzzy Duplicate Detection on XML Data Melanie Weis Proceedings of the VLDB PhD Workshop, 2005 [Paper]
Schema Matching using Duplicates Alexander Bilke, Felix Naumann Proceedings of the International Conference on Data Engineering (ICDE), 2005 [Paper]
Automatic Data Fusion with HumMer (demo) Alexander Bilke, Jens Bleiholder, Christoph Böhm, Karsten Draba, Felix Naumann, Melanie Weis Proceedings of the International Conference on Very Large Data Bases (VLDB), 2005 [Paper]
A Duplicate Detection Benchmark for XML (and Relational) Data Melanie Weis, Felix Naumann, Franziska Brosy Proceedings of the SIGMOD International Workshop on Information Quality for Information Systems (IQIS), 2005 [Paper]
Beitragsband zum Studierenden-Programm bei der 11. Fachtagung "Datenbanken für Business, Technologie und Web", GI Fachbereich Datenbanken und Informationssysteme, Karlsruhe Hagen Höpfner, Gunter Saake, Felix Naumann, Andreas Heuer Universität Magdeburg, Fakultät für Informatik, 2005
Clio: A Schema Mapping Tool for Information Integration Mauricio A. Hernández, Lucian Popa, Howard Ho, Felix Naumann Proceedings of the International Symposium on Parallel Architectures, Algorithms, and Networks (ISPAN), 2005
2004
Information Quality: How Good Are Off-The-Shelf DBMS? Felix Naumann, Mary Roth Proceedings of the International Conference on Information Quality (ICIQ), Cambridge, MA, 2004 [Paper]
Proceedings of the International Workshop on Information Quality in Information Systems (SIGMOD Workshop) Felix Naumann, Monica Scannapieco ACM, 2004
Labeling and Enhancing Life Sciences Links Stephan Heymann, Felix Naumann, Louiqa Raschid, Peter Rieger Proceedings of the International IEEE Computer Society Computational Systems Bioinformatics Conference (CSB), 2004 [Paper]
Eine Übung zur Vorlesung Informationsintegration Felix Naumann, Jens Bleiholder, Melanie Weis Datenbank-Spektrum (2004) [Paper]
Informationsintegration Felix Naumann Öffentliche Vorlesung an der Humboldt-Universität zu Berlin, 2004
Qualitäts- und Semantik-gesteuerte Anfragebearbeitung für Peer-basierte Datenmanagementsysteme (PDMS) Armin Roth, Felix Naumann INFORMATIK - Band 1, Beiträge der 34. Jahrestagung der Gesellschaft für Informatik e.V. (GI), Ulm, Germany, 2004 [Paper]
Querying Web-Accessible Life Science Sources: Which paths to choose? Jens Bleiholder, Felix Naumann, Louiqa Raschid, Maria Esther Vidal Proceedings of the International Workshop on Information Integration on the Web (IIWeb), 2004 [Paper]
Links and Paths through Life Sciences Data Sources Zoé Lacroix, Hyma Murthy, Felix Naumann, Louiqa Raschid Humboldt-Universität zu Berlin, Institut für Informatik, 2004 [Paper]
BioFast: Challenges in Exploring Linked Life Science Sources Jens Bleiholder, Zoé Lacroix, Hyma Murthy, Felix Naumann, Louiqa Raschid, Maria-Esther Vidal SIGMOD Record 33:(2), 2004 [Paper]
Links and Paths through Life Sciences Data Sources Zoé Lacroix, Hyma Murthy, Felix Naumann, Louiqa Raschid Proceedings of the International Workshop on Data Integration in the Life Sciences (DILS), 2004 [Paper]
FUSE BY: Syntax und Semantik zur Informationsfusion in SQL Jens Bleiholder, Felix Naumann INFORMATIK, Band 1, Beiträge der 34. Jahrestagung der Gesellschaft für Informatik e.V. (GI), 2004 [Paper]
Detecting Duplicate Objects in XML Documents Melanie Weis, Felix Naumann International Workshop on Information Quality in Information Systems (IQIS), 2004 [Paper]
Completeness of integrated information sources Felix Naumann, Johann Christoph Freytag, Ulf Leser Information Systems (IS) 29:(7), 2004
2003
Exploring Life Sciences Data Sources Zoé Lacroix, Felix Naumann, Louiqa Raschid, Maria-Esther Vidal Proceedings of the Workshop on Information Integration on the Web (IIWeb), 2003 [Paper]
Information Quality Assessment and Measurement Felix Naumann, Cinzia Cappiello, Vipul Kashyap, Gunter Saake Data Quality on the Web, 2003
Super-Fast XML Wrapper Generation in DB2: A Demonstration Vanja Josifovski, Sabine Massmann, Felix Naumann Proceedings of the International Conference on Data Engineering (ICDE), 2003 [Paper]
Object Identification Quality Mattis Neiling, Steffen Jurk, Hans-J. Lenz, Felix Naumann Proceedings of the International Workshop on Data Quality in Cooperative Information Systems (DQCIS), 2003 [Paper]
Semantic Overlay Clusters within Super-Peer Networks Alexander Löser, Felix Naumann, Wolf Siberski, Wolfgang Nejdl, Uwe Thaden First International Workshop on Databases, Information Systems, and Peer-to-Peer Computing (DBISP2P), 2003 [Paper]
Data Quality in Genome Databases Heiko Müller, Felix Naumann Proceedings of the International Conference on Information Quality (ICIQ), 2003 [Paper]
Qualitätsgesteuerte Anfragebearbeitung für Integrierte Informationssysteme Felix Naumann it - Information Technology 45:(1), 2003 [Paper]
Completeness of Information Sources Felix Naumann, Johann-Christoph Freytag, Ulf Leser Proceedings of the International Workshop on Data Quality in Cooperative Information Systems (DQCIS), 2003 [Paper]
2002
Schema Management Periklis Andritsos, Ronald Fagin, Ariel Fuxman, Laura M. Haas, Mauricio A. Hernández, C. T. Howard Ho, Anastasios Kementsietsidis, Renée J. Miller, Felix Naumann, Lucian Popa, Yannis Velegrakis, Charlotte Vilarem, Ling-Ling Yan Data Engineering Bulletin 25:(3), 2002 [Paper]
Declarative Data Merging with Conflict Resolution Felix Naumann, Matthias Häussler Proceedings of the International Conference on Information Quality (ICIQ), 2002 [Paper]
Quality-Driven Query Answering for Integrated Information Systems Felix Naumann Lecture Notes in Computer Science. Springer, 2002
Mapping XML and Relational Schemas with Clio (demo) Mauricio A. Hernández, Lucian Popa, Yannis Velegrakis, Renée J. Miller, Felix Naumann, Ching-Tien Ho Proceedings of the International Conference on Data Engineering (ICDE), 2002 [Paper]
Schema Mapping and Data Integration with Clio (demo) Barbara Eckman, Mauricio Hernandez, Howard Ho, Felix Naumann, Lucian Popa Intelligent Systems for Molecular Biology (ISMB), 2002 [Paper]
Attribute Classification Using Feature Analysis Felix Naumann, Ching-Tien Ho, Xuqing Tian, Laura M. Haas, Nimrod Megiddo Proceedings of the International Conference on Data Engineering (ICDE), 2002 [Paper]
Attribute Classification Using Feature Analysis Felix Naumann, Ching-Tien Ho, Xuqing Tian, Laura Haas, Nimrod Megiddo IBM Almaden Research Center, 2002 [Paper]
2001
From Databases to Information Systems - Information Quality Makes the Difference Felix Naumann Proceedings of the International Conference on Information Quality (ICIQ), 2001 [Paper]
2000
Approximate Tree Embedding for Querying XML Data Torsten Schlieder, Felix Naumann Proceedings of the ACM SIGIR Workshop on XML and Information Retrieval, 2000 [Paper]
Assessment Methods for Information Quality Criteria Felix Naumann, Claudia Rolker Proceedings of the International Conference on Information Quality (ICIQ), 2000 [Paper]
Completeness of Information Sources Felix Naumann, Johann-Christoph Freytag Humboldt-Universität zu Berlin, Institut für Informatik, 2000 [Paper]
Assessment Methods for Information Quality Criteria Felix Naumann, Claudia Rolker Humboldt-Universität zu Berlin, Institut für Informatik, 2000 [Paper]
Maximizing Coverage of Mediated Web Queries Ramana Yerneni, Felix Naumann, Hector Garcia-Molina Stanford University, CA, 2000 [Paper]
Quality-driven Query Planning Felix Naumann Proceedings of the EDBT PhD Workshop, 2000
Cooperative Query Answering with Density Scores Felix Naumann, Ulf Leser Proceedings of the International Conference on Management of Data (COMAD), 2000 [Paper]
Query Planning with Information Quality Bounds Ulf Leser, Felix Naumann Proceedings of the International Conference on Flexible Query Answering Systems (FQAS), 2000 [Paper]
1999
Quality-driven Integration of Heterogeneous Information Systems Felix Naumann, Ulf Leser, Johann Christoph Freytag Proceedings of the International Conference on Very Large Data Bases (VLDB), 1999 [Paper]
Quality-driven Integration of Heterogeneous Information Systems Felix Naumann, Ulf Leser, Johann-Christoph Freytag Humboldt-Universität zu Berlin, Institut für Informatik, 1999 [Paper]
Density Scores for Cooperative Query Answering Felix Naumann, Ulf Leser Workshop on Föderierte Datenbanken (FDBMS), 1999 [Paper]
Do Metadata Models meet IQ Requirements? Felix Naumann, Claudia Rolker Proceedings of the International Conference on Information Quality (ICIQ), 1999 [Paper]
1998
Quality Driven Source Selection Using Data Envelopment Analysis Felix Naumann, Johann Christoph Freytag, Myra Spiliopoulou Proceedings of the International Conference on Information Quality (ICIQ), 1998 [Paper]
Data Fusion and Data Quality Felix Naumann Proceedings of the New Techniques & Technologies for Statistics Seminar (NTTS), 1998 [Paper]