Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
  
 

Publications (sorted in reverse chronological order)

2020

  • Bornemann, L., Bleifuß, T., Kalashnikov, D.V., Naumann, F., Srivastava, D.: Natural Key Discovery in Wikipedia Tables.Proceedings of The World Wide Web Conference (WWW) (2020).
     
  • Risch, J., Krestel, R.: Toxic Comment Detection in Online Discussions. In: Agarwal, B., Nayak, R., Mittal, N., and Patnaik, S. (eds.) Deep Learning-Based Approaches for Sentiment Analysis. pp. 85-109. Springer (2020).
     
  • Repke, T., Krestel, R.: Exploration Interface for Jointly Visualised Text and Graph Data.25th International Conference on Intelligent User Interfaces Companion (IUI '20 Companion).2 (2020).
     
  • Koumarelas, I., Jiang, L., Naumann, F.: Data Preparation for Duplicate Detection.Journal of Data and Information Quality (JDIQ).1,1-24 (2020).
     
  • Hacker, P., Krestel, R., Grundmann, S., Naumann, F.: Explainable AI under Contract and Tort Law: Legal Incentives and Technical Challenges.Artificial Intelligence and Law. (2020).
     
  • Koumarelas, I., Papenbrock, T., Naumann, F.: MDedup: Duplicate Detection with Matching Dependencies.Proceedings of the VLDB Endowment (PVLDB).13, (2020).
     

2019

  • Bleifuß, T., Bornemann, L., Kalashnikov, D.V., Naumann, F., Srivastava, D.: DBChEx: Interactive Exploration of Data and Schema Change.Proceedings of the Conference on Innovative Data Systems Research (CIDR) (2019).
     
  • Kruse, S., Kaoudi, Z., Quiané-Ruiz, J.-A., Chawla, S., Naumann, F., Contreras-Rojas, B.: Optimizing Cross-Platform Data Movement.Proceedings of the International Conference on Data Engineering (ICDE). pp. 1642-1645 (2019).
     
  • Schirmer, P., Papenbrock, T., Kruse, S., Naumann, F., Hempfing, D., Mayer, T., Neuschäfer-Rube, D.: DynFD: Functional Dependency Discovery in Dynamic Datasets.Proceedings of the International Conference on Extending Database Technology (EDBT). pp. 253-264 (2019).
     
  • Dürsch, F., Stebner, A., Windheuser, F., Fischer, M., Friedrich, T., Strelow, N., Bleifuß, T., Harmouch, H., Jiang, L., Papenbrock, T., Naumann, F.: Inclusion Dependency Discovery: An Experimental Evaluation of Thirteen Algorithms.Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 219–228 (2019).
     
  • Jiang, L., Vitagliano, G., Naumann, F.: A Scoring-based Approach for Data Preparator Suggestion.Lernen, Wissen, Daten, Analysen (LWDA) (2019).
     
  • Risch, J., Stoll, A., Ziegele, M., Krestel, R.: hpiDEDIS at GermEval 2019: Offensive Language Identification using a German BERT model.Proceedings of the 15th Conference on Natural Language Processing (KONVENS). pp. 403-408. German Society for Computational Linguistics & Language Technology, Erlangen, Germany (2019).
     
  • Naumann, F.: The relational database management systems genealogy. In: Brodie, M.L. (ed.) Making Databases Work. pp. 173-179. ACM / Morgan & Claypool (2019).
     
  • Jiang, L., Naumann, F.: Holistic Primary Key and Foreign Key Detection.Journal of Intelligent Information Systems. (2019).
     
  • Risch, J., Krestel, R.: Measuring and Facilitating Data Repeatability in Web Science.Datenbank-Spektrum.19,117-126 (2019).
     
  • Risch, J., Krestel, R.: Domain-specific word embeddings for patent classification.Data Technologies and Applications.53,108-122 (2019).
     
  • Pena, E.H.M., de Almeida, E.C., Naumann, F.: Discovery of Approximate (and Exact) Denial Constraints.PVLDB.13, (2019).
     
  • Jain, N., Krestel, R.: Who is Mona L.? Identifying Mentions of Artworks in Historical Archives.Springer.115-122 (2019).
     
  • Draisbach, U., Christen, P., Naumann, F.: Transforming Pairwise Duplicates to Entity Clusters for High Quality Duplicate Detection.ACM Journal on Data and Information Quality (JDIQ).12, (2019).
     
  • Kellermeier, T., Repke, T., Krestel, R.: Mining Business Relationships from Stocks and News.MIDAS@ECML-PKDD. (2019).
     

2018

  • Risch, J., Krestel, R.: Delete or not Delete? Semi-Automatic Comment Moderation for the Newsroom.Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (co-located with COLING). pp. 166-176 (2018).
     
  • Berti-Equille, L., Harmouch, H., Naumann, F., Novelli, N., Thirumuruganathan, S.: Discovery of Genuine Functional Dependencies from Relational Data with Missing Values.Proceedings of the VLDB Endowment (PVLDB). pp. 880-892 (2018).
     
  • van Aken, B., Risch, J., Krestel, R., Löser, A.: Challenges for Toxic Comment Classification: An In-Depth Error Analysis.Proceedings of the 2nd Workshop on Abusive Language Online (co-located with EMNLP). pp. 33-42 (2018).
     
  • Risch, J., Krestel, R.: Learning Patent Speak: Investigating Domain-Specific Word Embeddings.Proceedings of the Thirteenth International Conference on Digital Information Management (ICDIM). pp. 63-68 (2018).
     
  • Risch, J., Krebs, E., Löser, A., Riese, A., Krestel, R.: Fine-Grained Classification of Offensive Language.Proceedings of GermEval (co-located with KONVENS). pp. 38-44 (2018).
     
  • Risch, J., Garda, S., Krestel, R.: Book Recommendation Beyond the Usual Suspects: Embedding Book Plots Together with Place and Time Information.Proceedings of the 20th International Conference On Asia-Pacific Digital Libraries (ICADL). pp. 227-239 (2018).
     
  • Pietrangelo, A., Simonini, G., Bergamaschi, S., Naumann, F., Koumarelas, I.: Towards Progressive Search-driven Entity Resolution.Italian Symposium on Advanced Database Systems (SEBD) (2018).
     
  • Loster, M., Repke, T., Krestel, R., Naumann, F., Ehmueller, J., Feldmann, B., Maspfuhl, O.: The Challenges of Creating, Maintaining and Exploring Graphs of Financial Entities.Proceedings of the Fourth International Workshop on Data Science for Macro-Modeling (DSMM 2018). ACM (2018).
     
  • Loster, M., Naumann, F., Ehmueller, J., Feldmann, B.: CurEx: A System for Extracting, Curating, and Exploring Domain-Specific Knowledge Graphs from Text.Proceedings of the ACM International Conference on Information and Knowledge Management. pp. 1883-1886. ACM (2018).
     
  • Loster, M., Hegner, M., Naumann, F., Leser, U.: Dissecting Company Names using Sequence Labeling.Proceedings of the Conference "Lernen, Wissen, Daten, Analysen". pp. 227-238 (2018).
     
  • Repke, T., Krestel, R., Edding, J., Hartmann, M., Hering, J., Kipping, D., Schmidt, H., Scordialo, N., Zenner, A.: Beacon in the Dark: A System for Interactive Exploration of Large Email Corpora.Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 1-4. ACM (2018).
     
  • Risch, J., Krestel, R.: Aggression Identification Using Deep Learning and Data Augmentation.Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (co-located with COLING). pp. 150-158 (2018).
     
  • Exeler, C., Graber, M., Junge, T., Ramson, S., Ramson, C., Tschirschnitz, F., Naumann, F.: Piggyback Profiling: Enhancing Query Results with Metadata.Lernen. Wissen. Daten. Analysen. (LWDA) (2018).
     
  • Repke, T., Krestel, R.: Bringing Back Structure to Free Text Email Conversations with Recurrent Neural Networks.40th European Conference on Information Retrieval (ECIR 2018). Springer, Grenoble, France (2018).
     
  • Risch, J., Krestel, R.: My Approach = Your Apparatus? Entropy-Based Topic Modeling on Multiple Domain-Specific Text Collections.Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries (JCDL). pp. 283-292 (2018).
     
  • Repke, T., Krestel, R.: Topic-aware Network Visualisation to Explore Large Email Corpora.International Workshop on Big Data Visual Exploration and Analytics (BigVis). CEUR-WS.org (2018).
     
  • Lazaridou, K., Gruetze, T., Naumann, F.: Where in the World Is Carmen Sandiego? Detecting Person Locations via Social Media Discussions.Proceedings of the ACM Conference on Web Science. ACM (2018).
     
  • Ambroselli, C., Risch, J., Krestel, R., Loos, A.: Prediction for the Newsroom: Which Articles Will Get the Most Comments?Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). pp. 193-199. ACL, New Orleans, Louisiana, USA (2018).
     
  • Bunk, S., Krestel, R.: WELDA: Enhancing Topic Models by Incorporating Local Word Contexts.Joint Conference on Digital Libraries (JCDL 2018). ACM, Fort Worth, Texas, USA (2018).
     
  • Abedjan, Z., Golab, L., Naumann, F., Papenbrock, T.: Data Profiling.Morgan & Claypool Publishers (2018).
     
  • Bornemann, L., Bleifuß, T., Kalashnikov, D., Naumann, F., Srivastava, D.: Data Change Exploration using Time Series Clustering.Datenbank-Spektrum.18,1-9 (2018).
     
  • Koumarelas, I., Kroschk, A., Mosley, C., Naumann, F.: Experience: Enhancing Address Matching with Geocoding and Similarity Measure Selection.Journal of Data and Information Quality (JDIQ).10,8:1-8:16 (2018).
     
  • Sadiq, S., Dasu, T., Dong, X.L., Freire, J., Ilyas, I.F., Link, S., Miller, R.J., Naumann, F., Zhou, X., Srivastava, D.: Data Quality – The Role of Empiricism.SIGMOD Record.46,35-43 (2018).
     
  • Kruse, S., Naumann, F.: Efficient Discovery of Approximate Dependencies.Proceedings of the VLDB Endowment.11,759-772 (2018).
    See abstract for errata
     
  • Bleifuß, T., Bornemann, L., Johnson, T., Kalashnikov, D.V., Naumann, F., Srivastava, D.: Exploring Change - A New Dimension of Data Analytics.Proceedings of the VLDB Endowment (PVLDB).12,85-98 (2018).
     
  • Agrawal, D., Chawla, S., Kaoudi, Z., Kruse, S., Quiané-Ruiz, J.A., Contreras-Rojas, B., Elmagarmid, A., Idris, Y., Lucas, J., Mansour, E., Ouzzani, M., Papotti, P., Tang, N., Thirumuruganathan, S., Troudi, A.: RHEEM: Enabling Cross-Platform Data Processing - May The Big Data Be With You! -.Proceedings of the VLDB Endowment (PVLDB).11, (2018).
     

2017

  • Kruse, S., Hahn, D., Walter, M., Naumann, F.: Metacrate: Organize and Analyze Millions of Data Profiles.Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 2483-2486. ACM (2017).
     
  • Papenbrock, T., Naumann, F.: A Hybrid Approach for Efficient Unique Column Combination Discovery.Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW). pp. 195-204 (2017).
     
  • Lazaridou, K., Krestel, R., Naumann, F.: Identifying Media Bias by Analyzing Reported Speech.International Conference on Data Mining. IEEE (2017).
     
  • Repke, T., Loster, M., Krestel, R.: Comparing Features for Ranking Relationships Between Financial Entities Based on Text.Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets. pp. 12:1-12:2. ACM, New York, NY, USA (2017).
     
  • Zuo, Z., Loster, M., Krestel, R., Naumann, F.: Uncovering Business Relationships: Context-sensitive Relationship Extraction for Difficult Relationship Types.Proceedings of the Conference "Lernen, Wissen, Daten, Analysen" (LWDA) (2017).
     
  • Harmouch, H., Naumann, F.: Cardinality Estimation: An Experimental Survey.Proceedings of the VLDB Endowment (PVLDB). pp. 499-512 (2017).
     
  • Abedjan, Z., Golab, L., Naumann, F.: Data Profiling (tutorial).Proceedings of the International Conference on Management of Data (SIGMOD) (2017).
     
  • Krestel, R., Risch, J.: How Do Search Engines Work? A Massive Open Online Course with 4000 Participants.Proceedings of the Conference Lernen, Wissen, Daten, Analysen. pp. 259-271 (2017).
     
  • Papenbrock, T., Naumann, F.: Data-driven Schema Normalization.Proceedings of the International Conference on Extending Database Technology (EDBT). pp. 342-353 (2017).
     
  • Loster, M., Zuo, Z., Naumann, F., Maspfuhl, O., Thomas, D.: Improving Company Recognition from Unstructured Text by using Dictionaries.Proceedings of the International Conference on Extending Database Technology. pp. 610-619 (2017).
     
  • Kruse, S., Papenbrock, T., Dullweber, C., Finke, M., Hegner, M., Zabel, M., Zöllner, C., Naumann, F.: Fast Approximate Discovery of Inclusion Dependencies.Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW). pp. 207-226 (2017).
     
  • Gruetze, T., Krestel, R., Lazaridou, K., Naumann, F.: What was Hillary Clinton doing in Katy, Texas?Proceedings of the 26th International Conference on World Wide Web, WWW 2017, Perth, Australia, 3-7 April, 2017. ACM (2017).
     
  • Risch, J., Krestel, R.: What Should I Cite? Cross-Collection Reference Recommendation of Patents and Papers.Proceedings of the International Conference on Theory and Practice of Digital Libraries (TPDL). pp. 40-46 (2017).
     
  • Bleifuß, T., Johnson, T., Kalashnikov, D.V., Naumann, F., Shkapenyuk, V., Srivastava, D.: Enabling Change Exploration (Vision).Proceedings of the Fourth International Workshop on Exploratory Search in Databases and the Web (ExploreDB). pp. 1-3 (2017).
     
  • Maschler, F., Niephaus, F., Risch, J.: Real or Fake? Large-Scale Validation of Identity Leaks.47. Jahrestagung der Gesellschaft für Informatik (INFORMATIK). pp. 2437-2448 (2017).
     
  • Giesler, M.J., Keller, B., Repke, T., Leonhart, R., Weis, J., Muckelbauer, R., Rieckmann, N., Müller-Nordhorn, J., Lucius-Hoene, G., Holmberg, C.: Effect of a Website That Presents Patients' Experiences on Self-Efficacy and Patient Competence of Colorectal Cancer Patients: Web-Based Randomized Controlled Trial.J Med Internet Res.19,e334 (2017).
     
  • Tschirschnitz, F., Papenbrock, T., Naumann, F.: Detecting Inclusion Dependencies on Very Many Tables.ACM Transactions on Database Systems (TODS).42,18:1-18:29 (2017).
     
  • Bleifuß, T., Kruse, S., Naumann, F.: Efficient Denial Constraint Discovery with Hydra.Proceedings of the VLDB Endowment (PVLDB).11,311-323 (2017).
     
  • Heller, D., Krestel, R., Ohler, U., Vingron, M., Marsico, A.: ssHMM: Extracting Intuitive Sequence-Structure Motifs from High-Throughput RNA-Binding Protein Data.Nucleic Acids Research.45,11004-11018 (2017).
     
  • Naumann, F., Krestel, R.: Das Fachgebiet „Informationssysteme“ am Hasso-Plattner-Institut.Datenbank-Spektrum.17,69-76 (2017).
     

2016

  • Krestel, R., Mottin, D., Müller, E. (eds.): Proceedings of the Conference "Lernen, Wissen, Daten, Analysen", Potsdam, Germany, September 12-14, 2016.CEUR-WS.org (2016).
     
  • Samiei, A., Koumarelas, I., Loster, M., Naumann, F.: Combination of Rule-based and Textual Similarity Approaches to Match Financial Entities.Data Science for Macro-Modeling with Financial and Economic Datasets (DSMM). ACM (2016).
     
  • Agrawal, D., Ba, L., Berti-Equille, L., Chawla, S., Elmagarmid, A., Hammady, H., Idris, Y., Kaoudi, Z., Khayyat, Z., Kruse, S., Ouzzani, M., Papotti, P., Quiané-Ruiz, J.-A., Tang, N., Zaki, M.J.: Rheem: Enabling Multi-Platform Task Execution (demo).Proceedings of the ACM SIGMOD conference (SIGMOD) (2016).
     
  • Gruetze, T., Krestel, R., Naumann, F.: Topic Shifts in StackOverflow: Ask it like Socrates.Lecture Notes in Computer Science. pp. 213-221. Springer (2016).
     
  • Samiei, A., Naumann, F.: Cluster-based Sorted Neighborhood for Efficient Duplicate Detection.International Conference on Data Mining Workshops (ICDMW) (2016).
     
  • Papenbrock, T., Naumann, F.: A Hybrid Approach to Functional Dependency Discovery.Proceedings of the International Conference on Management of Data (SIGMOD). pp. 821-833. ACM, New York, NY, USA (2016).
     
  • Bleifuß, T., Bülow, S., Frohnhofen, J., Risch, J., Wiese, G., Kruse, S., Papenbrock, T., Naumann, F.: Approximate Discovery of Functional Dependencies for Large Datasets.Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 1803-1812. ACM, New York, NY, USA (2016).
     
  • Godde, C., Lazaridou, K., Krestel, R.: Classification of German Newspaper Comments.Proceedings of the Conference Lernen, Wissen, Daten, Analysen. pp. 299-310. CEUR-WS.org (2016).
     
  • Ehrlich, J., Roick, M., Schulze, L., Zwiener, J., Papenbrock, T., Naumann, F.: Holistic Data Profiling: Simultaneous Discovery of Various Metadata.Proceedings of the International Conference on Extending Database Technology (EDBT). pp. 305-316. OpenProceedings.org (2016).
     
  • Kruse, S., Jentzsch, A., Papenbrock, T., Kaoudi, Z., Quiané-Ruiz, J.-A., Naumann, F.: RDFind: Scalable Conditional Inclusion Dependency Discovery in RDF Datasets.Proceedings of the International Conference on Management of Data (SIGMOD). pp. 953-967. ACM, New York, NY, USA (2016).
     
  • Jenders, M., Krestel, R., Naumann, F.: Which Answer is Best? Predicting Accepted Answers in MOOC Forums.Proceedings of the 25th International Conference Companion on World Wide Web. pp. 679-684. International World Wide Web Conferences Steering Committee (2016).
     
  • Park, J., Blume-Kohout, M., Krestel, R., Nalisnick, E., Smyth, P.: Analyzing NIH Funding Patterns over Time with Statistical Text Analysis.Scholarly Big Data: AI Perspectives, Challenges, and Ideas (SBD 2016) Workshop at AAAI 2016. AAAI (2016).
     
  • Grundke, M., Jasper, J., Perchyk, M., Sachse, J.P., Krestel, R., Neves, M.: TextAI: Enhancing TextAE with Intelligent Annotation Support.Proceedings of the 7th International Symposium on Semantic Mining in Biomedicine (SMBM 2016). pp. 80-84. CEUR-WS.org (2016).
     
  • Abedjan, Z., Golab, L., Naumann, F.: Data Profiling (tutorial).International Conference on Data Engineering (ICDE) (2016).
     
  • Lazaridou, K., Krestel, R.: Identifying Political Bias in News Articles.International Conference on Theory and Practice of Digital Libraries. IEEE Technical Committee on Digital Libraries (2016).
    TPDL Doctoral Consortium
     
  • Kruse, S., Papenbrock, T., Harmouch, H., Naumann, F.: Data Anamnesis: Admitting Raw Data into an Organization.IEEE Data Engineering Bulletin.39,8-20 (2016).
     
  • Gruetze, T., Kasneci, G., Zuo, Z., Naumann, F.: CohEEL: Coherent and Efficient Named Entity Linking through Random Walks.Web Semantics: Science, Services and Agents on the World Wide Web.37,75-89 (2016).
     
  • Naumann, F., Krestel, R.: The Information Systems Group at HPI.SIGMOD Record. (2016).
     
  • Langer, P., Naumann, F.: Efficient Order Dependency Discovery.VLDB Journal.25,223-241 (2016).
     

2015

  • Krestel, R., Werkmeister, T., Wiradarma, T.P., Kasneci, G.: Tweet-Recommender: Finding Relevant Tweets for News Articles.Proceedings of the 24th International World Wide Web Conference (WWW). ACM (2015).
     
  • Gruetze, T., Yao, G., Krestel, R.: Learning Temporal Tagging Behaviour.Proceedings of the 24th International Conference on World Wide Web Companion (WWW). pp. 1333-1338. ACM (2015).
     
  • Hennig, P., Berger, P., Dullweber, C., Finke, M., Maschler, F., Risch, J., Meinel, C.: Social Media Story Telling.Proceedings of the 8th IEEE International Conference on Social Computing and Networking (SocialCom2015). pp. 279-284. Chengdu, China (2015).
     
  • Schmidt, D., Frohnhofen, J., Knebel, S., Meinel, F., Perchyk, M., Risch, J., Striebel, J., Wachtel, J., Baudisch, P.: Ergonomic Interaction for Touch Floors.Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. pp. 3879-3888. ACM, Seoul, Republic of Korea (2015).
     
  • Roick, M., Jenders, M., Krestel, R.: How to Stay Up-to-date on Twitter with General Keywords.Proceedings of the LWA 2015 Workshops: KDML, FGWM, IR, and FGDB. CEUR-WS.org (2015).
     
  • Kruse, S., Papotti, P., Naumann, F.: Estimating Data Integration and Cleaning Effort.Proceedings of the International Conference on Extending Database Technology (EDBT) (2015).
     
  • Kruse, S., Papenbrock, T., Naumann, F.: Scaling Out the Discovery of Inclusion Dependencies.Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW). pp. 445-454 (2015).
     
  • Jenders, M., Lindhauer, T., Kasneci, G., Krestel, R., Naumann, F.: A Serendipity Model For News Recommendation.KI 2015: Advances in Artificial Intelligence - 38th Annual German Conference on AI, Dresden, Germany, September 21-25, 2015, Proceedings. pp. 111-123. Springer (2015).
     
  • Jentzsch, A., Dullweber, C., Troiano, P., Naumann, F.: Exploring Linked Data Graph Structures.In Proceedings of Posters and Demos Session, ISWC2015. Bethlehem, PA, USA (2015).
     
  • Schubotz, T., Krestel, R.: Online Temporal Summarization of News Events.Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT). pp. 679-684. IEEE Computer Society (2015).
     
  • Jentzsch, A., Mühleisen, H., Naumann, F.: Uniqueness, Density, and Keyness: Exploring Class Hierarchies.In Proceedings of 6th International Workshop on Consuming Linked Data (COLD 2015), ISWC 2015. Bethlehem, PA, USA (2015).
     
  • Papenbrock, T., Heise, A., Naumann, F.: Progressive Duplicate Detection.IEEE Transactions on Knowledge and Data Engineering (TKDE).27,1316-1329 (2015).
     
  • Abedjan, Z., Golab, L., Naumann, F.: Profiling relational data: a survey.VLDB Journal.24,557-581 (2015).
     
  • Papenbrock, T., Bergmann, T., Finke, M., Zwiener, J., Naumann, F.: Data Profiling with Metanome.Proceedings of the VLDB Endowment.8,1860-1871 (2015).
     
  • Krestel, R., Dokoohaki, N.: Diversifying Customer Review Rankings.Neural Networks.66,36-45 (2015).
     
  • Papenbrock, T., Kruse, S., Quiané-Ruiz, J.-A., Naumann, F.: Divide & Conquer-based Inclusion Dependency Discovery.Proceedings of the VLDB Endowment.8,774-785 (2015).
     
  • Rheinländer, A., Heise, A., Hueske, F., Leser, U., Naumann, F.: SOFA: An Extensible Logical Optimizer for UDF-heavy Data Flows.Information Systems.52,96-125 (2015).
     
  • Papenbrock, T., Ehrlich, J., Marten, J., Neubert, T., Rudolph, J.-P., Schönberg, M., Zwiener, J., Naumann, F.: Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms.Proceedings of the VLDB Endowment.8,1082-1093 (2015).
     

2014

  • Forchhammer, B., Jentzsch, A., Naumann, F.: LODOP - Multi-Query Optimization for Linked Data Profiling Queries.In Proceedings of the International Workshop on Dataset PROFIling & fEderated Search for Linked Data (PROFILES) in conjunction with ESWC. Heraklion, Greece (2014).
    Selected for Best Workshop Paper Award.
     
  • Gruetze, T., Kasneci, G., Zuo, Z., Naumann, F.: Bootstrapping Wikipedia to Answer Ambiguous Person Name Queries.10th International Workshop on Information Integration on the Web (IIWeb). Chicago, IL (2014).
     
  • Heise, A., Kasneci, G., Naumann, F.: Estimating the Number and Sizes of Fuzzy-Duplicate Clusters.Proceedings of the Conference on Information and Knowledge Management (CIKM). pp. 959-968 (2014).
     
  • Abedjan, Z., Gruetze, T., Jentzsch, A., Naumann, F.: Profiling and Mining RDF Data with ProLOD++.Proceedings of the IEEE International Conference on Data Engineering (ICDE), Demo. Chicago, IL (2014).
     
  • Vogel, T., Naumann, F.: Semi-Supervised Consensus Clustering: Reducing Human Effort.Proceedings of the International Workshop on Data Integration and Applications (2014).
     
  • Meyer, A., Pufahl, L., Batoulis, K., Kruse, S., Lindhauer, T., Stoff, T., Fahland, D., Weske, M.: Data Perspective in Process Choreographies: Modeling and Execution.26th International Conference on Advanced Information Systems Engineering. Thessaloniki, Greece (2014).
     
  • Abedjan, Z., Naumann, F.: Amending RDF Entities with New Facts.Know@LOD Workshop in conjunction with ESWC. Crete, Greece (2014).
    Selected for Best Workshop Paper Award.
     
  • Abedjan, Z., Schulze, P., Naumann, F.: DFD: Efficient Discovery of Functional Dependencies.In Proceedings of the International Conference on Information and Knowledge Management (CIKM), Shanghai, China. pp. 949-958 (2014).
     
  • Abedjan, Z., Quiané-Ruiz, J.-A., Naumann, F.: Detecting Unique Column Combinations on Dynamic Data.Proceedings of the IEEE International Conference on Data Engineering (ICDE). Chicago, IL (2014).
     
  • Rheinländer, A., Beckmann, M., Kunkel, A., Heise, A., Stoltmann, T., Leser, U.: Versatile optimization of UDF-heavy data flows with SOFA (demo).Proceedings of the SIGMOD conference. pp. 685-688 (2014).
     
  • Langer, P., Schulze, P., George, S., Kohnen, M., Metzke, T., Abedjan, Z., Kasneci, G.: Assigning Global Relevance Scores to DBpedia Facts.International Workshop on Data Engineering meets the Semantic Web (DESWeb). Chicago, IL (2014).
     
  • Zuo, Z., Kasneci, G., Gruetze, T., Naumann, F.: BEL: Bagging for Entity Linking.25th International Conference on Computational Linguistics (COLING). Dublin, Ireland (2014).
     
  • Krestel, R., Bergler, S., Witte, R.: Modeling human newspaper readers: The Fuzzy Believer approach.Natural Language Engineering.20,261-288 (2014).
     
  • Vogel, T., Heise, A., Draisbach, U., Lange, D., Naumann, F.: Reach for Gold: An Annealing Standard to Evaluate Duplicate Detection Results.JDIQ.5, (2014).
     
  • Lorey, J.: Identifying and Determining SPARQL Endpoint Characteristics.International Journal of Web Information Systems.10, (2014).
     
  • Alexandrov, A., Bergmann, R., Ewen, S., Freytag, J.-C., Hueske, F., Heise, A., Kao, O., Leich, M., Leser, U., Markl, V., Naumann, F., Peters, M., Rheinländer, A., Sax, M.J., Schelter, S., Höger, M., Tzoumas, K., Warneke, D.: The Stratosphere Platform for Big Data Analytics.The VLDB Journal.23,939-964 (2014).
     
  • Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P.N., Hellmann, S., Morsey, M., van Kleef, P., Auer, S., Bizer, C.: DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia.Semantic Web Journal. (2014).
    Selected for 2014 Semantic Web journal outstanding paper award.
     

2013

  • Rheinländer, A., Heise, A., Hueske, F., Leser, U., Naumann, F.: SOFA: An Extensible Logical Optimizer for UDF-heavy Dataflows. (2013).
     
  • Lorey, J., Naumann, F.: Caching and Prefetching Strategies for SPARQL Queries.Proceedings of the 3rd International Workshop on Usage Analysis and the Web of Data (USEWOD). Montpellier, France (2013).
    Selected as Best Workshop Paper for publication in ESWC post-proceedings
     
  • Lacoste-Julien, S., Palla, K., Davies, A., Kasneci, G., Graepel, T., Ghahramani, Z.: SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases.Proceedings of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) (2013).
     
  • Lorey, J., Naumann, F.: Detecting SPARQL Query Templates for Data Prefetching.Proceedings of the 10th Extended Semantic Web Conference (ESWC). Montpellier, France (2013).
     
  • Forchhammer, B., Papenbrock, T., Stening, T., Viehmeier, S., Draisbach, U., Naumann, F.: Duplicate Detection on GPUs.Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW). pp. 165-184 (2013).
    Runner Up for Best Paper Award
     
  • Heise, A., Quiané-Ruiz, J.-A., Abedjan, Z., Jentzsch, A., Naumann, F.: Scalable Discovery of Unique Column Combinations.Proceedings of the VLDB Endowment (PVLDB). Hangzhou, China (2013).
    Jorge's presentation at VLDB 2014 was awarded the "Excellent Presentation Award".
     
  • Lorey, J.: SPARQL Endpoint Metrics for Quality-Aware Linked Data Consumption.Proceedings of the 15th International Conference on Information Integration and Web-based Applications & Services (iiWAS '13). Vienna, Austria (2013).
     
  • Lorey, J.: Storing and Provisioning Linked Data as a Service.Proceedings of the 10th Extended Semantic Web Conference (ESWC). Montpellier, France (2013).
     
  • Lange, D., Naumann, F.: Bulk Sorted Access for Efficient Top-k Retrieval.Proceedings of the International Conference on Scientific and Statistical Database Management (SSDBM). Baltimore, Maryland (2013).
     
  • Jenders, M., Kasneci, G., Naumann, F.: Analyzing and Predicting Viral Tweets.Proceedings of the WWW '13 Companion: 22nd International World Wide Web Conference. Rio de Janeiro, Brazil (2013).
     
  • Leich, M., Adamek, J., Schubotz, M., Heise, A., Rheinländer, A., Markl, V.: Applying Stratosphere for Big Data Analytics.Database Systems for Business, Technology, and Web (BTW) (2013).
     
  • Lorey, J., Naumann, F.: Caching and Prefetching Strategies for SPARQL Queries.ESWC 2013 Satellite Events - Revised Selected Papers. Montpellier, France (2013).
     
  • Draisbach, U., Naumann, F.: On Choosing Thresholds for Duplicate Detection.Proceedings of the 18th International Conference on Information Quality (ICIQ). Little Rock, USA (2013).
     
  • Albrecht, A., Naumann, F.: Systematic ETL Management – Experiences with High-Level Operators.Proceedings of the 18th International Conference on Information Quality (ICIQ). Little Rock, AR (2013).
     
  • Abedjan, Z., Naumann, F.: Synonym Analysis for Predicate Expansion.Proceedings of the Extended Semantic Web Conference (ESWC), Montpellier, France (2013).
     
  • Lange, D., Naumann, F.: Cost-Aware Query Planning for Similarity Search.Information Systems (IS).38,455-469 (2013).
     
  • Rinser, D., Lange, D., Naumann, F.: Cross-lingual Entity Matching and Infobox Alignment in Wikipedia.Information Systems (IS).38,887–907 (2013).
     
  • Naumann, F., Jenders, M., Papenbrock, T.: Ein Datenbankkurs mit 6000 Teilnehmern - Erfahrungen auf der openHPI MOOC Plattform.Informatik-Spektrum.37,333-340 (2013).
     
  • Naumann, F.: Data Profiling Revisited.SIGMOD Record.42,40-49 (2013).
     
  • Abedjan, Z., Naumann, F.: Improving RDF Data through Association Rule Mining.Datenbank-Spektrum (Special Issue on RDF Data Management).13,111-120 (2013).
     
  • Momtazi, S., Naumann, F.: Topic modeling for expert finding using latent dirichlet allocation.WIREs Data Mining and Knowledge Discovery.3,346–353 (2013).
     

2012

  • Draisbach, U., Naumann, F.: Adaptive Windows for Duplicate Detection.Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam (2012).
    ISBN 978-3-86956-143-1, ISSN 1613-5652
     
  • Bauckmann, J., Abedjan, Z., Leser, U., Müller, H., Naumann, F.: Covering or complete? : discovering conditional inclusion dependencies.Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam (2012).
    ISBN 978-3-86956-212-4, ISSN 1613-5652
     
  • Albrecht, A., Naumann, F.: Understanding Cryptic Schemata in Large Extract-Transform-Load Systems.Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam (2012).
    ISBN 978-3-86956-201-8, ISSN 1613-5652
     
  • Böhm, C., Freitag, M., Heise, A., Lehmann, C., Mascher, A., Naumann, F., Hernandez, M., Ercegovac, V., Haase, P.: GovWILD: Integrating Open Government Data for Transparency (demo).Proceedings of the International World Wide Web Conference (WWW). Lyon, France (2012).
     
  • Böhm, C., de Melo, G., Naumann, F., Weikum, G.: LINDA: Distributed Web-of-Data-Scale Entity Matching.Proceedings of the International Conference on Information and Knowledge Management (CIKM), Maui, Hawaii (2012).
     
  • Vogel, T., Naumann, F.: Automatic Blocking Key Selection for Duplicate Detection based on Unigram Combinations.Proceedings of the 10th International Workshop on Quality in Databases (QDB) in conjunction with VLDB (2012).
     
  • Böhm, C., Kasneci, G., Naumann, F.: Latent Topics in Graph-Structured Data.Proceedings of the Conference on Information and Knowledge Management (CIKM) (2012).
     
  • Momtazi, S.: Fine-grained German Sentiment Analysis on Social Media.Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC). Istanbul, Turkey (2012).
     
  • Draisbach, U., Naumann, F., Szott, S., Wonneberg, O.: Adaptive Windows for Duplicate Detection.Proceedings of the 28th International Conference on Data Engineering (ICDE). Washington, D.C., USA (2012).
     
  • Heise, A., Rheinländer, A., Leich, M., Leser, U., Naumann, F.: Meteor/Sopremo: An Extensible Query Language and Operator Model.Proceedings of the International Workshop on End-to-end Management of Big Data (BigData) in conjunction with VLDB 2012. Istanbul, Turkey (2012).
     
  • Tafaj, E., Kasneci, G., Rosenstiel, W., Bogdan, M.: Bayesian online clustering of eye movement data.Proceedings of the 2012 Symposium on Eye-Tracking Research and Applications. pp. 285-288. ACM (2012).
     
  • Abedjan, Z., Lorey, J., Naumann, F.: Reconciling Ontologies and the Web of Data.Proceedings of the 21st International Conference on Information and Knowledge Management (CIKM). pp. 1532-1536. , Maui, Hawaii, USA (2012).
     
  • Albrecht, A., Naumann, F.: Schema Decryption for Large Extract-Transform-Load Systems.Proceedings of the 31st International Conference on Conceptual Modeling (ER 2012). , Florence, Italy (2012).
     
  • Bauckmann, J., Abedjan, Z., Müller, H., Leser, U., Naumann, F.: Discovering Conditional Inclusion Dependencies.Proceedings of the International Conference on Information and Knowledge Management (CIKM), Maui, Hawaii. pp. 2094-2098 (2012).
     
  • Fenz, D., Lange, D., Rheinländer, A., Naumann, F., Leser, U.: Efficient Similarity Search in Very Large String Sets.Proceedings of the International Conference on Scientific and Statistical Database Management (SSDBM). , Chania, Crete, Greece (2012).
     
  • Kasneci, G.: Reasoning about Knowledge from the Web - (Extended Abstract).ICWE Workshops. pp. 186-188. Springer (2012).
     
  • Gruetze, T., Böhm, C., Naumann, F.: Holistic and Scalable Ontology Alignment for Linked Open Data.Proceedings of the 5th Linked Data on the Web (LDOW) Workshop at the 21st International World Wide Web Conference (WWW). , Lyon, France (2012).
     
  • Böhm, C., Hefenbrock, D., Naumann, F.: Scalable Peer-to-Peer-based RDF Management.Proceedings of the 8th Int. Conference on Semantic Systems. , Graz, Austria (2012).
     
  • Köppelmann, M., Lange, D., Lehmann, C., Marszalkowski, M., Naumann, F., Retzlaff, P., Stange, S., Voget, L.: Scalable Similarity Search with Dynamic Similarity Measures.Proceedings of the 6th International Workshop on Ranking in Databases (DBRank) in conjunction with VLDB. , Istanbul, Turkey (2012).
     
  • Draisbach, U.: Partitionierung zur effizienten Duplikaterkennung in relationalen Daten.Springer Vieweg (2012).
     
  • Herschel, M., Naumann, F., Szott, S., Taubert, M.: Scalable Iterative Graph Duplicate Detection.Transactions on Knowledge and Data Engineering (TKDE).24,2094-2108 (2012).
     
  • Beskales, G., Das, G., Elmagarmid, A.K., Ilyas, I.F., Naumann, F., Ouzzani, M., Papotti, P., Quiane-Ruiz, J., Tang, N.: The Data Analytics Group at the Qatar Computing Research Institute.SIGMOD Record.41, (2012).
     
  • Heise, A., Naumann, F.: Integrating Open Government Data with Stratosphere for more Transparency.Web Semantics: Science, Services and Agents on the World Wide Web.14,45-56 (2012).
     
  • Abelló, A., Darmont, J., Etcheverry, L., Golfarelli, M., Mazón, J.-N., Naumann, F., Pedersen, T.B., Rizzi, S., Trujillo, J., Vassiliadis, P., Vossen, G.: Fusion Cubes: Towards Self-Service Business Intelligence.International Journal of Data Warehousing and Mining (IJDWM).9,66-88 (2012).
     

2011

  • Abedjan, Z., Naumann, F.: Advancing the Discovery of Unique Column Combinations.Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam (2011).
    ISBN 978-3-86956-148-6, ISSN 1613-5652
     
  • Lange, D., Naumann, F.: Efficient Similarity Search: Arbitrary Similarity Measures, Arbitrary Composition.Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM). pp. 1679-1688. , Glasgow, Scotland, UK (2011).
     
  • Lange, D., Naumann, F.: Frequency-aware Similarity Measures.Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM). pp. 243-248. , Glasgow, Scotland, UK (2011).
     
  • Böhm, C., Kny, E., Emde, B., Abedjan, Z., Naumann, F.: SPRINT: ranking search results by paths.International Conference on Extending Database Technology (EDBT), Uppsala, Sweden. pp. 546-549 (2011).
     
  • Lorey, J., Naumann, F., Forchhammer, B., Mascher, A., Retzlaff, P., ZamaniFarahani, A., Discher, S., Faehnrich, C., Lemme, S., Papenbrock, T., Peschel, R.C., Richter, S., Stening, T., Viehmeier, S.: Black Swan: Augmenting Statistics with Event Data.Proceedings of the 20th Conference on Information and Knowledge Management (CIKM). pp. 2517-2520. , Glasgow, UK (2011).
     
  • Abedjan, Z., Naumann, F.: Context and Target Configurations for Mining RDF Data.International Workshop on Search & Mining Entity-Relationship Data (SMER), Glasgow, UK (2011).
     
  • Lorey, J., Abedjan, Z., Naumann, F., Böhm, C.: RDF Ontology (Re-)Engineering through Large-scale Data Mining.Billion Triples Challenge (BTC) at the 10th International Semantic Web Conference (ISWC). , Koblenz, Germany (2011).
    Finalist
     
  • Vogel, T., Naumann, F.: Instance-based "one-to-some" Assignment of Similarity Measures to Attributes.Proceedings of the 19th International Conference on Cooperative Information Systems (CoopIS) (2011).
     
  • Draisbach, U., Naumann, F.: A Generalization of Blocking and Windowing Algorithms for Duplicate Detection.Proceedings of the International Conference on Data and Knowledge Engineering (ICDKE). , Milan, Italy (2011).
     
  • AbuJarour, M., Naumann, F.: Improving Service Discovery through Enriched Service Descriptions.Datenbanksysteme für Business, Technologie und Web (BTW), 14. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2.-4.3.2011 in Kaiserslautern, Germany. pp. 706-709 (2011).
     
  • Abedjan, Z., Naumann, F.: Advancing the Discovery of Unique Column Combinations.Proceedings of the International Conference on Information and Knowledge Management (CIKM), Glasgow, UK (2011).
     
  • Bleiholder, J., Naumann, F.: Kurz erklärt: Datenfusion.Datenbank-Spektrum.11,59-61 (2011).
     
  • Bleiholder, J., Herschel, M., Naumann, F.: Eliminating NULLs with Subsumption and Complementation.IEEE Data Engineering Bulletin.34,18-25 (2011).
     
  • Lange, D., Vogel, T., Draisbach, U., Naumann, F.: Projektseminar "Similarity Search Algorithms".Datenbank-Spektrum.11,51-57 (2011).
     
  • Böhm, C., Lorey, J., Naumann, F.: Creating voiD Descriptions for Web-scale Data.Journal of Web Semantics: Science, Services and Agents on the World Wide Web.9,339-345 (2011).
     

2010

  • Bauckmann, J., Leser, U., Naumann, F.: Efficient and Exact Computation of Inclusion Dependencies for Data Integration.Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam (2010).
    ISBN 978-3-86956-048-9, ISSN 1613-5652
     
  • Lange, D., Böhm, C., Naumann, F.: Extracting structured information from Wikipedia articles to populate infoboxes.Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam (2010).
    ISBN 978-3-86956-081-6, ISSN 1613-5652
     
  • Dong, X., Naumann, F. eds: Proceedings of the 13th International Workshop on the Web and Databases (WebDB), Indianapolis, IN. (2010).
     
  • Manolescu, I., Spaccapietra, S., Teubner, J., Kitsuregawa, M., Léger, A., Naumann, F., Ailamaki, A., Özcan, F. eds: Proceedings of the 13th International Conference on Extending Database Technology, Lausanne, Switzerland.ACM (2010).
     
  • Bleiholder, J., Szott, S., Herschel, M., Kaufer, F., Naumann, F.: Subsumption and complementation as data fusion operators.Proceedings of the International Conference on Extending Database Technology (EDBT). pp. 513-524. , Lausanne, Switzerland (2010).
     
  • Vogel, T.: Self-Adaptive Data Quality Web Services.Grundlagen von Datenbanken. , Bad Helmstedt (2010).
     
  • Lange, D., Böhm, C., Naumann, F.: Extracting structured information from Wikipedia articles to populate infoboxes.Proceedings of the 19th ACM Conference on Information and Knowledge Management (CIKM). pp. 1661-1664. , Toronto, Canada (2010).
     
  • Böhm, C., Naumann, F., Abedjan, Z., Fenz, D., Gruetze, T., Hefenbrock, D., Pohl, M., Sonnabend, D.: Profiling linked open data with ProLOD.Workshops Proceedings of the 26th International Conference on Data Engineering (ICDE), Long Beach, CA. pp. 175-178 (2010).
     
  • Bleiholder, J., Szott, S., Herschel, M., Naumann, F.: Complement union for data integration.Proceedings of the International Conference on Data Engineering Workshops, (ICDE NTII workshop). pp. 183-186. , Long Beach, CA (2010).
     
  • Draisbach, U., Naumann, F.: DuDe: The Duplicate Detection Toolkit.Proceedings of the International Workshop on Quality in Databases (QDB). , Singapore (2010).
     
  • Brauer, F., Huber, M., Hackenbroich, G., Leser, U., Naumann, F., Barczynski, W.M.: Graph-based concept identification and disambiguation for enterprise search.Proceedings of the 19th International Conference on World Wide Web (WWW). pp. 171-180. , Raleigh, NC (2010).
     
  • AbuJarour, M., Naumann, F.: Towards a diamond SOA operational model.IEEE International Conference on Service-Oriented Computing and Applications, SOCA 2010, 13-15 December 2010, Perth, Australia. pp. 1-4 (2010).
     
  • Böhm, C., Naumann, F., Freitag, M., George, S., Höfler, N., Köppelmann, M., Lehmann, C., Mascher, A., Schmidt, T.: Linking open government data: what journalists wish they had known.Proceedings of the 6th International Conference on Semantic Systems (I-SEMANTICS), Graz, Austria (2010).
     
  • Böhm, C., Lorey, J., Fenz, D., Kny, E., Pohl, M., Naumann, F.: Creating voiD Descriptions for Web-Scale Data.Billion Triples Challenge (BTC) at the 9th International Semantic Web Conference (ISWC). , Shanghai, China (2010).
    Winner
     
  • AbuJarour, M., Naumann, F.: Dynamic tags for dynamic data web services.Proceedings of the 5th Workshop on Emerging Web Services Technology (WEWST), Ayia Napa, Cyprus. pp. 3-9 (2010).
     
  • AbuJarour, M., Naumann, F., Craculeac, M.: Collecting, Annotating, and Classifying Public Web Services.On the Move to Meaningful Internet Systems: OTM - Confederated International Conferences: CoopIS, IS, DOA and ODBASE, Hersonissos, Crete, Greece, Proceedings, Part I. pp. 256-272 (2010).
     
  • Lorey, J., Naumann, F.: Towards Granular Data Placement Strategies for Cloud Platforms.Proceedings of the 6th International Conference on Granular Computing (GrC). pp. 346-351. , San Jose, California, USA (2010).
     
  • Naumann, F., Herschel, M.: An Introduction to Duplicate Detection.Morgan & Claypool Publishers (2010).
     
  • Dong, X.L., Naumann, F.: 13th international workshop on the web and databases: WebDB 2010.SIGMOD Record.39,37-39 (2010).
     

2009

  • Albrecht, A., Naumann, F.: METL: Managing and Integrating ETL Processes.Proceedings of the VLDB PhD Workshop. Co-located with the 35th International Conference on Very Large Data Bases (VLDB), Lyon, France (2009).
     
  • Böhm, C., Groth, P., Leser, U.: Graph-Based Ontology Construction from Heterogenous Evidences.Proceedings of the International Semantic Web Conference (ISWC). pp. 91-96 (2009).
     
  • Vogel, T., Kaufer, F., Naumann, F.: Encapsulating Multi-stepped Web Forms as Web Services.Proceedings of the 7th International Conference on Service-Oriented Computing (ICSOC). pp. 488-497 (2009).
     
  • Rostin, A., Albrecht, O., Bauckmann, J., Naumann, F., Leser, U.: A Machine Learning Approach to Foreign Key Discovery.Proceedings of the International Workshop on the Web and Databases (WebDB). , Providence, RI (2009).
     
  • AbuJarour, M., Craculeac, M., Menge, F., Vogel, T., Schwarz, J.-F.: POSR: A Comprehensive System for Aggregating and Using Web Services (demo).Proceedings of the IEEE Services Cup 2009 at IEEE International Conference on Web Services (ICWS) (2009).
     
  • Draisbach, U., Naumann, F.: A Comparison and Generalization of Blocking and Windowing Algorithms for Duplicate Detection.Proceedings of the International Workshop on Quality in Databases (QDB). , Lyon, France (2009).
     
  • Naumann, F., Raschid, L.: Guest Editorial for the Special Issue on Data Quality in Databases.Journal on Data and Information Quality (JDIQ).1, (2009).
     
  • Dong, X.L., Naumann, F.: Data fusion - Resolving Data Conflicts for Integration (tutorial).Proceedings of the VLDB.2,1654-1655 (2009).
     

2008

  • Jacob, M., Kuscher, A., Thiele, C., Plauth, M.: Automated data augmentation services using text mining, data cleansing and web crawling techniques.IEEE Congress on Services (2008).
     
  • Albrecht, A., Naumann, F.: Managing ETL Processes.Proceedings of the International Workshop on New Trends in Information Integration, (NTII), Auckland, New Zealand. pp. 12-15 (2008).
     
  • Herschel, M., Naumann, F.: Scaling up duplicate detection in graph data.Proceedings of the ACM Conference on Information and Knowledge Management (CIKM). pp. 1325-1326. , Napa Valley, CA (2008).
     
  • Hose, K., Roth, A., Zeitz, A., Sattler, K.-U., Naumann, F.: A research agenda for query processing in large-scale peer data management systems.Information Systems (IS).33,597-610 (2008).
     
  • Weis, M., Naumann, F., Jehle, U., Lufter, J., Schuster, H.: Industry-scale duplicate detection.Proceedings of the VLDB.1,1253-1264 (2008).
     
  • Bleiholder, J., Naumann, F.: Data fusion.ACM Computing Surveys.41, (2008).
     

2007

  • Ganti, V., Naumann, F. eds: Proceedings of the Fifth International Workshop on Quality in Databases (QDB), Vienna, Austria (2007).
     
  • Roth, A., Naumann, F.: System P: Completeness-driven Query Answering in Peer Data Management Systems (demo).Datenbanksysteme in Business, Technologie und Web (BTW), Aachen, Germany. pp. 625-628 (2007).
     
  • Führing, P., Naumann, F.: Emergent Data Quality Annotation And Visualization.Proceedings of the International Conference on Information Quality (ICIQ). pp. 424-430. , Cambridge, MA (2007).
     
  • Bauckmann, J., Leser, U., Naumann, F., Tietz, V.: Efficiently Detecting Inclusion Dependencies.Proceedings of the International Conference on Data Engineering (ICDE). pp. 1448-1450. , Istanbul, Turkey (2007).
     
  • Raschid, L., Vidal, M.E., Wu, Y., Naumann, F., Bleiholder, J.: Answering Top K Queries Efficiently with Overlap of Answers in Sources or Source Paths.Proceedings of the International Workshop on Information Integration on the Web (IIWeb) (2007).
     
  • Hipp, J., Müller, M., Hohendorff, J., Naumann, F.: Rule-Based Measurement Of Data Quality In Nominal Data.Proceedings of the International Conference on Information Quality (ICIQ). pp. 364-378. , Cambridge, MA (2007).
     
  • Albrecht, A., Naumann, F.: Networked PIM using PDMS.Proceedings of the International Workshop Networking Meets Databases (NetDB) (2007).
     
  • Naumann, F.: Schema- und Metadatenmanagement in Peer Data Management Systemen.Datenbanksysteme in Business, Technologie und Web (BTW), Workshop Proceedings. p. 3. , Aachen, Germany (2007).
     
  • Legler, F., Naumann, F.: A Classification of Schema Mappings and Analysis of Mapping Tools.Proceedings of Datenbanksysteme in Business, Technologie und Web (BTW). pp. 449-464. , Aachen, Germany (2007).
     
  • Bleiholder, J., Draba, K., Naumann, F.: FuSem - Exploring Different Semantics of Data Fusion (demo).Proceedings of the International Conference on Very Large Data Bases (VLDB). pp. 1350-1353. , Vienna, Austria (2007).
     
  • Naumann, F., Roth, A.: Peer-Daten-Management-Systeme - PDMS.Datenbank-Spektrum.23, (2007).
     
  • Naumann, F.: Datenqualität.Informatik Spektrum.30,27-31 (2007).
     

2006

  • Leser, U., Naumann, F., Eckman, B.A. eds: Proceedings of the Data Integration in the Life Sciences Workshop (DILS).Springer (2006).
     
  • Bauckmann, J., Leser, U., Naumann, F.: Efficiently Computing Inclusion Dependencies for Schema Discovery.Proceedings of the International Conference on Data Engineering Workshops (ICDE workshops). , Atlanta, GA (2006).
     
  • Biswas, J., Naumann, F., Qiu, Q.: Assessing the Completeness of Sensor Data.Proceedings of the International Conference on Database Systems for Advanced Applications (DASFAA). pp. 717-732. , Singapore (2006).
     
  • Puhlmann, S., Weis, M., Naumann, F.: XML Duplicate Detection Using Sorted Neighborhoods.Proceedings of the International Conference on Extending Database Technology (EDBT), Munich, Germany. pp. 773-791 (2006).
     
  • Weis, M., Naumann, F.: Detecting Duplicates in Complex XML Data.Proceedings of the International Conference on Data Engineering (ICDE), Atlanta, GA. p. 109 (2006).
     
  • Bleiholder, J., Naumann, F.: Conflict Handling Strategies in an Integrated Information System.Proceedings of the International Workshop on Information Integration on the Web (IIWeb). , Edinburgh, UK (2006).
     
  • Hegewald, J., Naumann, F., Weis, M.: XStruct: Efficient Schema Extraction from Multiple and Large XML Documents.Proceedings of the International Conference on Data Engineering Workshops (ICDE), InterDB workshop. p. 81. , Atlanta, GA (2006).
     
  • Roth, A., Naumann, F., Hübner, T., Schweigert, M.: System P: Query Answering in PDMS under Limited Resources.Proceedings of the International Workshop on Information Integration on the Web (IIWeb). , Edinburgh, Scotland (2006).
     
  • Bleiholder, J., Khuller, S., Naumann, F., Raschid, L., Wu, Y.: Query Planning in the Presence of Overlapping Sources.Proceedings of the International Conference on Extending Database Technology (EDBT). pp. 811-828. , Munich, Germany (2006).
     
  • Naumann, F., Roth, M.: Information Quality: How Good are Off-the-Shelf DBMS? In: Al-Hakim, L. (ed.) Information Quality Management: Theory and Applications. Idea Group Inc. (2006).
     
  • Leser, U., Naumann, F.: Informationsintegration: Architekturen und Methoden zur Integration verteilter und heterogener Datenquellen.dpunkt (2006).
     
  • Naumann, F., Bilke, A., Bleiholder, J., Weis, M.: Data Fusion in Three Steps: Resolving Schema, Tuple, and Value Inconsistencies.IEEE Data Engineering Bulletin.29,21-31 (2006).
     

2005

  • Saake, G., Sattler, K.-U., Naumann, F. eds: Datenbankspektrum - Daten- und Informationsqualität.dpunkt.verlag, Heidelberg (2005).
     
  • Höpfner, H., Saake, G., Naumann, F., Heuer, A. eds: Beitragsband zum Studierenden-Programm bei der 11. Fachtagung "Datenbanken für Business, Technologie und Web", GI Fachbereich Datenbanken und Informationssysteme, Karlsruhe.Universität Magdeburg, Fakultät für Informatik (2005).
     
  • Naumann, F., Gertz, M., Madnick, S.E. eds: Proceedings of the 2005 International Conference on Information Quality (MIT IQ Conference), Sponsored by Lockheed Martin, MIT, Cambridge, MA, USA, November 10-12, 2006.MIT (2005).
     
  • Heymann, S., Naumann, F., Rieger, P., Raschid, L.: Enhancing the Semantics of Links and Paths in Life Science Sources.ICDT Workshop on Database Issues in Biological Databases (DBiBD). , Edinburgh, Scotland (2005).
     
  • Weis, M., Naumann, F.: DogmatiX Tracks down Duplicates in XML.Proceedings of the ACM International Conference on Management of Data (SIGMOD), Baltimore, MD. pp. 431-442 (2005).
     
  • Weis, M., Naumann, F., Brosy, F.: A Duplicate Detection Benchmark for XML (and Relational) Data.Proceedings of the SIGMOD International Workshop on Information Quality for Information Systems (IQIS) (2005).
     
  • Bleiholder, J., Naumann, F.: Declarative Data Fusion - Syntax, Semantics, and Implementation.Proceedings of the International Conference on Advances in Databases and Information Systems (ADBIS). pp. 58-73. , Tallinn, Estonia (2005).
     
  • Leser, U., Naumann, F.: (Almost) Hands-Off Information Integration for the Life Sciences.Proceedings of the International Conference on Innovative Database Research (CIDR). pp. 131-143. , Asilomar, CA (2005).
     
  • Weis, M.: Fuzzy Duplicate Detection on XML Data.Proceedings of the VLDB PhD workshop (2005).
     
  • Mihaila, G.A., Naumann, F., Raschid, L., Vidal, M.-E.: A Data Model and Query Language to Explore Enhanced Links and Paths in Life Science Sources.Proceedings of the International Workshop on the Web & Databases (WebDB). pp. 133-138. , Baltimore, MD (2005).
     
  • Bilke, A., Bleiholder, J., Böhm, C., Draba, K., Naumann, F., Weis, M.: Automatic Data Fusion with HumMer (demo).Proceedings of the International Conference on Very Large Data Bases (VLDB). pp. 1251-1254. , Trondheim, Norway (2005).
     
  • Roth, A., Naumann, F.: Benefit and Cost of Query Answering in PDMS.Proceedings of the Databases, Information Systems, and Peer-to-Peer Computing Workshop (DBISP2P), Seoul, Korea. pp. 50-61 (2005).
     
  • Hernández, M.A., Popa, L., Ho, H., Naumann, F.: Clio: A Schema Mapping Tool for Information Integration.Proceedings of the International Symposium on Parallel Architectures, Algorithms, and Networks (ISPAN). , Las Vegas, Nevada (2005).
     
  • Heese, R., Herschel, S., Naumann, F., Roth, A.: Self-Extending Peer Data Management.Datenbanksysteme in Business, Technologie und Web (BTW), Karlsruhe, Germany. pp. 165-174 (2005).
     
  • Bilke, A., Naumann, F.: Schema Matching using Duplicates.Proceedings of the International Conference on Data Engineering (ICDE). pp. 69-80. , Tokyo, Japan (2005).
     
  • Mielke, M., Müller, H., Naumann, F.: Ein Data-Quality-Wettbewerb.Datenbank-Spektrum.14,34-37 (2005).
     

2004

  • Lacroix, Z., Murthy, H., Naumann, F., Raschid, L.: Links and Paths through Life Sciences Data Sources.Humboldt-Universität zu Berlin, Institut für Informatik (2004).
     
  • Naumann, F.: Informationsintegration.Öffentliche Vorlesung an der Humboldt-Universität zu Berlin (2004).
     
  • Naumann, F., Scannapieco, M. eds: Proceedings of the International Workshop on Information Quality in Information Systems (SIGMOD Workshop).ACM, Paris, France (2004).
     
  • Lacroix, Z., Murthy, H., Naumann, F., Raschid, L.: Links and Paths through Life Sciences Data Sources.Proceedings of the International Workshop on Data Integration in the Life Sciences (DILS). pp. 203-211. , Leipzig, Germany (2004).
     
  • Weis, M., Naumann, F.: Detecting Duplicate Objects in XML Documents.International Workshop on Information Quality in Information Systems (IQIS). pp. 10-19 (2004).
     
  • Heymann, S., Naumann, F., Raschid, L., Rieger, P.: Labeling and Enhancing Life Sciences Links.Proceedings of the International IEEE Computer Society Computational Systems Bioinformatics Conference (CSB). pp. 598-599. , Stanford, CA (2004).
     
  • Bleiholder, J., Naumann, F.: FUSE BY: Syntax und Semantik zur Informationsfusion in SQL.INFORMATIK, Band 1, Beiträge der 34. Jahrestagung der Gesellschaft für Informatik e.V. (GI). pp. 331-335. , Ulm, Germany (2004).
     
  • Naumann, F., Roth, M.: Information Quality: How Good Are Off-The-Shelf DBMS?Proceedings of the International Conference on Information Quality (ICIQ), Cambridge, MA. pp. 260-274 (2004).
     
  • Roth, A., Naumann, F.: Qualitäts- und Semantik-gesteuerte Anfragebearbeitung für Peer-basierte Datenmanagementsysteme (PDMS).INFORMATIK 2004 - Band 1, Beiträge der 34. Jahrestagung der Gesellschaft für Informatik e.V. (GI), Ulm, Germany. pp. 341-345 (2004).
     
  • Bleiholder, J., Naumann, F., Raschid, L., Vidal, M.E.: Querying Web-Accessible Life Science Sources: Which paths to choose?Proceedings of the International Workshop on Information Integration on the Web (IIWeb) (2004).
     
  • Naumann, F., Freytag, J.C., Leser, U.: Completeness of integrated information sources.Information Systems (IS).29,583-615 (2004).
     
  • Bleiholder, J., Lacroix, Z., Murthy, H., Naumann, F., Raschid, L., Vidal, M.-E.: BioFast: Challenges in Exploring Linked Life Science Sources.SIGMOD Record.33,72-77 (2004).
     
  • Naumann, F., Bleiholder, J., Weis, M.: Eine Übung zur Vorlesung Informationsintegration.Datenbank-Spektrum.11,50-52 (2004).
     

2003

  • Neiling, M., Jurk, S., Lenz, H.-J., Naumann, F.: Object Identification Quality.Proceedings of the International Workshop on Data Quality in Cooperative Information Systems (DQCIS). , Siena, Italy (2003).
     
  • Müller, H., Naumann, F.: Data Quality in Genome Databases.Proceedings of the International Conference on Information Quality (ICIQ). pp. 269-284. , Cambridge, MA (2003).
     
  • Naumann, F., Freytag, J.-C., Leser, U.: Completeness of Information Sources.Proceedings of the International Workshop on Data Quality in Cooperative Information Systems (DQCIS). , Siena, Italy (2003).
     
  • Löser, A., Naumann, F., Siberski, W., Nejdl, W., Thaden, U.: Semantic Overlay Clusters within Super-Peer Networks.First International Workshop on Databases, Information Systems, and Peer-to-Peer Computing (DBISP2P). pp. 33-47. , Berlin, Germany (2003).
     
  • Lacroix, Z., Naumann, F., Raschid, L., Vidal, M.-E.: Exploring Life Sciences Data Sources.Proceedings of Workshop on Information Integration on the Web (IIWeb). pp. 203-208. , Acapulco, Mexico (2003).
     
  • Josifovski, V., Massmann, S., Naumann, F.: Super-Fast XML Wrapper Generation in DB2: A Demonstration.Proceedings of the International Conference on Data Engineering (ICDE). pp. 756-758. , Bangalore, India (2003).
     
  • Naumann, F., Capiello, C., Kashyap, V., Saake, G.: Information Quality Assessment and Measurement.Data Quality on the Web (2003).
     
  • Naumann, F.: Qualitätsgesteuerte Anfragebearbeitung für Integrierte Informationssysteme.it - Information Technology.45,55-58 (2003).
     

2002

  • Naumann, F., Ho, C.-T., Tian, X., Haas, L., Megiddo, N.: Attribute Classification Using Feature Analysis.IBM Almaden Research Center (2002).
     
  • Eckman, B., Hernandez, M., Ho, H., Naumann, F., Popa, L.: Schema Mapping and Data Integration with Clio (demo).Intelligent Systems for Molecular Biology (ISMB). , Edmonton, Canada (2002).
     
  • Hernández, M.A., Popa, L., Velegrakis, Y., Miller, R.J., Naumann, F., Ho, C.-T.: Mapping XML and Relational Schemas with Clio (demo).Proceedings of the International Conference on Data Engineering (ICDE). pp. 498-499. , San Jose, CA (2002).
     
  • Naumann, F., Ho, C.-T., Tian, X., Haas, L.M., Megiddo, N.: Attribute Classification Using Feature Analysis.Proceedings of the International Conference on Data Engineering (ICDE). p. 271. , San Jose, CA (2002).
     
  • Naumann, F., Häussler, M.: Declarative Data Merging with Conflict Resolution.Proceedings of the International Conference on Information Quality (ICIQ). pp. 212-224. , Cambridge, MA (2002).
     
  • Naumann, F.: Quality-Driven Query Answering for Integrated Information Systems.Springer (2002).
     
  • Andritsos, P., Fagin, R., Fuxman, A., Haas, L.M., Hernández, M.A., Ho, C.T.H., Kementsietsidis, A., Miller, R.J., Naumann, F., Popa, L., Velegrakis, Y., Vilarem, C., Yan, L.-L.: Schema Management.IEEE Data Engineering Bulletin.25,32-38 (2002).
     

2001

  • Naumann, F.: From Databases to Information Systems - Information Quality Makes the Difference.Proceedings of the International Conference on Information Quality (ICIQ). pp. 244-260. , Cambridge, MA (2001).
     

2000

  • Yerneni, R., Naumann, F., Garcia-Molina, H.: Maximizing Coverage of Mediated Web Queries.Stanford University, CA (2000).
     
  • Naumann, F., Rolker, C.: Assessment Methods for Information Quality Criteria.Humboldt-Universität zu Berlin, Institut für Informatik (2000).
     
  • Naumann, F., Freytag, J.-C.: Completeness of Information Sources.Humboldt-Universität zu Berlin, Institut für Informatik (2000).
     
  • Naumann, F.: Quality-driven Query Planning.EDBT PhD Workshop (2000).
     
  • Naumann, F., Rolker, C.: Assessment Methods for Information Quality Criteria.Proceedings of the International Conference on Information Quality (ICIQ). pp. 148-162. , Cambridge, MA (2000).
     
  • Naumann, F., Leser, U.: Cooperative Query Answering with Density Scores.Proceedings of the International Conference on Management of Data (COMAD). , Pune, India (2000).
     
  • Leser, U., Naumann, F.: Query Planning with Information Quality Bounds.Proceedings of the International Conference on Flexible Query Answering Systems (FQAS). pp. 85-94. , Warsaw, Poland (2000).