Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
  
 

Publications (sorted in reverse chronological order)

2019

  • Bleifuß, T., Bornemann, L., Kalashnikov, D.V., Naumann, F., Srivastava, D.: DBChEx: Interactive Exploration of Data and Schema Change. Proceedings of the Conference on Innovative Data Systems Research (CIDR) (2019).
     
  • Kruse, S., Kaoudi, Z., Quiané-Ruiz, J.-A., Chawla, S., Naumann, F., Contreras-Rojas, B.: Optimizing Cross-Platform Data Movement. Proceedings of the International Conference on Data Engineering (ICDE) (2019).
     
  • Schirmer, P., Papenbrock, T., Kruse, S., Naumann, F., Hempfing, D., Mayer, T., Neuschäfer-Rube, D.: DynFD: Functional Dependency Discovery in Dynamic Datasets. Proceedings of the International Conference on Extending Database Technology (EDBT). pp. 253-264 (2019).
     
  • Naumann, F.: The relational database management systems genealogy. In: Brodie, M.L. (ed.) Making Databases Work. pp. 173-179. ACM / Morgan & Claypool (2019).
     
  • Risch, J., Krestel, R.: Toxic Comment Detection in Online Discussions. In: Agarwal, B. (ed.) Deep learning based approaches for sentiment analysis. Springer (2019).
     
  • Jain, N., Krestel, R.: Who is Mona L.? Identifying Mentions of Artworks in Historical Archives. TPDL2019. (2019).
     
  • Risch, J., Krestel, R.: Domain-specific word embeddings for patent classification. Data Technologies and Applications. 53, 108-122 (2019).
     
  • Kellermeier, T., Repke, T., Krestel, R.: Mining Business Relationships from Stocks and News. MIDAS@ECML-PKDD. (2019).
     
  • Risch, J., Krestel, R.: Measuring and Facilitating Data Repeatability in Web Science. Datenbank-Spektrum. 19, 117-126 (2019).
     
  • Jiang, L., Naumann, F.: Holistic Primary Key and Foreign Key Detection. Journal of Intelligent Information Systems. (2019).
     

2018

  • Risch, J., Krestel, R.: Aggression Identification Using Deep Learning and Data Augmentation. Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (co-located with COLING). pp. 150-158 (2018).
     
  • Repke, T., Krestel, R.: Bringing Back Structure to Free Text Email Conversations with Recurrent Neural Networks. 40th European Conference on Information Retrieval (ECIR 2018). Springer, Grenoble, France (2018).
     
  • Risch, J., Krestel, R.: Learning Patent Speak: Investigating Domain-Specific Word Embeddings. Proceedings of the Thirteenth International Conference on Digital Information Management (ICDIM) (2018).
     
  • van Aken, B., Risch, J., Krestel, R., Löser, A.: Challenges for Toxic Comment Classification: An In-Depth Error Analysis. Proceedings of the 2nd Workshop on Abusive Language Online (co-located with EMNLP). pp. 33-42 (2018).
     
  • Exeler, C., Graber, M., Junge, T., Ramson, S., Ramson, C., Tschirschnitz, F., Naumann, F.: Piggyback Profiling: Enhancing Query Results with Metadata. Lernen. Wissen. Daten. Analysen. (LWDA) (2018).
     
  • Risch, J., Garda, S., Krestel, R.: Book Recommendation Beyond the Usual Suspects: Embedding Book Plots Together with Place and Time Information. Proceedings of the 20th International Conference On Asia-Pacific Digital Libraries (ICADL). pp. 227-239 (2018).
     
  • Repke, T., Krestel, R., Edding, J., Hartmann, M., Hering, J., Kipping, D., Schmidt, H., Scordialo, N., Zenner, A.: Beacon in the Dark: A System for Interactive Exploration of Large Email Corpora. Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 1-4. ACM (2018).
     
  • Loster, M., Hegner, M., Naumann, F., Leser, U.: Dissecting Company Names using Sequence Labeling. Proceedings of the Conference "Lernen, Wissen, Daten, Analysen". pp. 227-238 (2018).
     
  • Loster, M., Repke, T., Krestel, R., Naumann, F., Ehmueller, J., Feldmann, B., Maspfuhl, O.: The Challenges of Creating, Maintaining and Exploring Graphs of Financial Entities. Proceedings of the Fourth International Workshop on Data Science for Macro-Modeling (DSMM 2018). ACM (2018).
     
  • Pietrangelo, A., Simonini, G., Bergamaschi, S., Naumann, F., Koumarelas, I.: Towards Progressive Search-driven Entity Resolution. SEBD (2018).
     
  • Loster, M., Naumann, F., Ehmueller, J., Feldmann, B.: CurEx: A System for Extracting, Curating, and Exploring Domain-Specific Knowledge Graphs from Text. Proceedings of the ACM International Conference on Information and Knowledge Management. pp. 1883-1886. ACM (2018).
     
  • Risch, J., Krebs, E., Löser, A., Riese, A., Krestel, R.: Fine-Grained Classification of Offensive Language. Proceedings of GermEval (co-located with KONVENS). pp. 38-44 (2018).
     
  • Risch, J., Krestel, R.: Delete or not Delete? Semi-Automatic Comment Moderation for the Newsroom. Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (co-located with COLING). pp. 166-176 (2018).
     
  • Bunk, S., Krestel, R.: WELDA: Enhancing Topic Models by Incorporating Local Word Contexts. Joint Conference on Digital Libraries (JCDL 2018). ACM, Fort Worth, Texas, USA (2018).
     
  • Ambroselli, C., Risch, J., Krestel, R., Loos, A.: Prediction for the Newsroom: Which Articles Will Get the Most Comments? Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). pp. 193-199. ACL, New Orleans, Louisiana, USA (2018).
     
  • Lazaridou, K., Gruetze, T., Naumann, F.: Where in the World Is Carmen Sandiego? Detecting Person Locations via Social Media Discussions. Proceedings of the ACM Conference on Web Science. ACM (2018).
     
  • Risch, J., Krestel, R.: My Approach = Your Apparatus? Entropy-Based Topic Modeling on Multiple Domain-Specific Text Collections. Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries (JCDL). pp. 283-292 (2018).
     
  • Berti-Equille, L., Harmouch, H., Naumann, F., Novelli, N., Thirumuruganathan, S.: Discovery of Genuine Functional Dependencies from Relational Data with Missing Values. Proceedings of the VLDB Endowment (PVLDB). pp. 880-892 (2018).
     
  • Repke, T., Krestel, R.: Topic-aware Network Visualisation to Explore Large Email Corpora. International Workshop on Big Data Visual Exploration and Analytics (BigVis). CEUR-WS.org (2018).
     
  • Abedjan, Z., Golab, L., Naumann, F., Papenbrock, T.: Data Profiling. Morgan & Claypool Publishers (2018).
     
  • Agrawal, D., Chawla, S., Kaoudi, Z., Kruse, S., Quiané-Ruiz, J.A., Contreras-Rojas, B., Elmagarmid, A., Idris, Y., Lucas, J., Mansour, E., Ouzzani, M., Papotti, P., Tang, N., Thirumuruganathan, S., Troudi, A.: RHEEM: Enabling Cross-Platform Data Processing - May The Big Data Be With You! -. Proceedings of the VLDB Endowment (PVLDB). 11, (2018).
     
  • Bornemann, L., Bleifuß, T., Kalashnikov, D., Naumann, F., Srivastava, D.: Data Change Exploration using Time Series Clustering. Datenbank-Spektrum. 18, 1-9 (2018).
     
  • Koumarelas, I., Kroschk, A., Mosley, C., Naumann, F.: Experience: Enhancing Address Matching with Geocoding and Similarity Measure Selection. J. Data and Information Quality. 10, 8:1-8:16 (2018).
     
  • Bleifuß, T., Bornemann, L., Johnson, T., Kalashnikov, D.V., Naumann, F., Srivastava, D.: Exploring Change - A New Dimension of Data Analytics. Proceedings of the VLDB Endowment (PVLDB). 12, 85-98 (2018).
     
  • Kruse, S., Naumann, F.: Efficient Discovery of Approximate Dependencies. Proceedings of the VLDB Endowment. 11, 759-772 (2018).
    See abstract for errata
     
  • Sadiq, S., Dasu, T., Dong, X.L., Freire, J., Ilyas, I.F., Link, S., Miller, R.J., Naumann, F., Zhou, X., Srivastava, D.: Data Quality – The Role of Empiricism. SIGMOD Record. 46, 35-43 (2018).
     

2017

  • Kruse, S., Hahn, D., Walter, M., Naumann, F.: Metacrate: Organize and Analyze Millions of Data Profiles. Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 2483-2486. ACM (2017).
     
  • Repke, T., Loster, M., Krestel, R.: Comparing Features for Ranking Relationships Between Financial Entities Based on Text. Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets. pp. 12:1-12:2. ACM, New York, NY, USA (2017).
     
  • Zuo, Z., Loster, M., Krestel, R., Naumann, F.: Uncovering Business Relationships: Context-sensitive Relationship Extraction for Difficult Relationship Types. Proceedings of the Conference "Lernen, Wissen, Daten, Analysen" (LWDA) (2017).
     
  • Maschler, F., Niephaus, F., Risch, J.: Real or Fake? Large-Scale Validation of Identity Leaks. 47. Jahrestagung der Gesellschaft für Informatik (INFORMATIK). pp. 2437-2448 (2017).
     
  • Harmouch, H., Naumann, F.: Cardinality Estimation: An Experimental Survey. Proceedings of the VLDB Endowment (PVLDB). pp. 499-512 (2017).
     
  • Krestel, R., Risch, J.: How Do Search Engines Work? A Massive Open Online Course with 4000 Participants. Proceedings of the Conference Lernen, Wissen, Daten, Analysen. pp. 259-271 (2017).
     
  • Papenbrock, T., Naumann, F.: Data-driven Schema Normalization. Proceedings of the International Conference on Extending Database Technology (EDBT). pp. 342-353 (2017).
     
  • Papenbrock, T., Naumann, F.: A Hybrid Approach for Efficient Unique Column Combination Discovery. Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW). pp. 195-204 (2017).
     
  • Abedjan, Z., Golab, L., Naumann, F.: Data Profiling (tutorial). Proceedings of the International Conference on Management of Data (SIGMOD) (2017).
     
  • Bleifuß, T., Johnson, T., Kalashnikov, D.V., Naumann, F., Shkapenyuk, V., Srivastava, D.: Enabling Change Exploration (Vision). Proceedings of the Fourth International Workshop on Exploratory Search in Databases and the Web (ExploreDB). pp. 1-3 (2017).
     
  • Gruetze, T., Krestel, R., Lazaridou, K., Naumann, F.: What was Hillary Clinton doing in Katy, Texas? Proceedings of the 26th International Conference on World Wide Web, WWW 2017, Perth, Australia, 3-7 April, 2017. ACM (2017).
     
  • Risch, J., Krestel, R.: What Should I Cite? Cross-Collection Reference Recommendation of Patents and Papers. Proceedings of the International Conference on Theory and Practice of Digital Libraries (TPDL). pp. 40-46 (2017).
     
  • Kruse, S., Papenbrock, T., Dullweber, C., Finke, M., Hegner, M., Zabel, M., Zöllner, C., Naumann, F.: Fast Approximate Discovery of Inclusion Dependencies. Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW). pp. 207-226 (2017).
     
  • Loster, M., Zuo, Z., Naumann, F., Maspfuhl, O., Thomas, D.: Improving Company Recognition from Unstructured Text by using Dictionaries. Proceedings of the International Conference on Extending Database Technology. pp. 610-619 (2017).
     
  • Lazaridou, K., Krestel, R., Naumann, F.: Identifying Media Bias by Analyzing Reported Speech. International Conference on Data Mining. IEEE (2017).
     
  • Naumann, F., Krestel, R.: Das Fachgebiet „Informationssysteme“ am Hasso-Plattner-Institut. Datenbank-Spektrum. 17, 69-76 (2017).
     
  • Giesler, M.J., Keller, B., Repke, T., Leonhart, R., Weis, J., Muckelbauer, R., Rieckmann, N., Müller-Nordhorn, J., Lucius-Hoene, G., Holmberg, C.: Effect of a Website That Presents Patients' Experiences on Self-Efficacy and Patient Competence of Colorectal Cancer Patients: Web-Based Randomized Controlled Trial. J Med Internet Res. 19, e334 (2017).
     
  • Bleifuß, T., Kruse, S., Naumann, F.: Efficient Denial Constraint Discovery with Hydra. Proceedings of the VLDB Endowment (PVLDB). 11, 311-323 (2017).
     
  • Heller, D., Krestel, R., Ohler, U., Vingron, M., Marsico, A.: ssHMM: Extracting Intuitive Sequence-Structure Motifs from High-Throughput RNA-Binding Protein Data. Nucleic Acids Research. 45, 11004-11018 (2017).
     
  • Tschirschnitz, F., Papenbrock, T., Naumann, F.: Detecting Inclusion Dependencies on Very Many Tables. ACM Transactions on Database Systems (TODS). 42, 18:1-18:29 (2017).
     

2016

  • Krestel, R., Mottin, D., Müller, E. (eds.): Proceedings of the Conference "Lernen, Wissen, Daten, Analysen", Potsdam, Germany, September 12-14, 2016. CEUR-WS.org (2016).
     
  • Papenbrock, T., Naumann, F.: A Hybrid Approach to Functional Dependency Discovery. Proceedings of the International Conference on Management of Data (SIGMOD). pp. 821-833. ACM, New York, NY, USA (2016).
     
  • Godde, C., Lazaridou, K., Krestel, R.: Classification of German Newspaper Comments. Proceedings of the Conference Lernen, Wissen, Daten, Analysen. pp. 299-310. CEUR-WS.org (2016).
     
  • Gruetze, T., Krestel, R., Naumann, F.: Topic Shifts in StackOverflow: Ask it like Socrates. Lecture Notes in Computer Science. pp. 213-221. Springer (2016).
     
  • Jenders, M., Krestel, R., Naumann, F.: Which Answer is Best? Predicting Accepted Answers in MOOC Forums. Proceedings of the 25th International Conference Companion on World Wide Web. pp. 679-684. International World Wide Web Conferences Steering Committee (2016).
     
  • Samiei, A., Naumann, F.: Cluster-based Sorted Neighborhood for Efficient Duplicate Detection. International Conference on Data Mining Workshops (ICDMW) (2016).
     
  • Samiei, A., Koumarelas, I., Loster, M., Naumann, F.: Combination of Rule-based and Textual Similarity Approaches to Match Financial Entities. DSMM. ACM (2016).
     
  • Grundke, M., Jasper, J., Perchyk, M., Sachse, J.P., Krestel, R., Neves, M.: TextAI: Enhancing TextAE with Intelligent Annotation Support. Proceedings of the 7th International Symposium on Semantic Mining in Biomedicine (SMBM 2016). pp. 80-84. CEUR-WS.org (2016).
     
  • Ehrlich, J., Roick, M., Schulze, L., Zwiener, J., Papenbrock, T., Naumann, F.: Holistic Data Profiling: Simultaneous Discovery of Various Metadata. Proceedings of the International Conference on Extending Database Technology (EDBT). pp. 305-316. OpenProceedings.org (2016).
     
  • Bleifuß, T., Bülow, S., Frohnhofen, J., Risch, J., Wiese, G., Kruse, S., Papenbrock, T., Naumann, F.: Approximate Discovery of Functional Dependencies for Large Datasets. Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 1803-1812. ACM, New York, NY, USA (2016).
     
  • Park, J., Blume-Kohout, M., Krestel, R., Nalisnick, E., Smyth, P.: Analyzing NIH Funding Patterns over Time with Statistical Text Analysis. Scholarly Big Data: AI Perspectives, Challenges, and Ideas (SBD 2016) Workshop at AAAI 2016. AAAI (2016).
     
  • Abedjan, Z., Golab, L., Naumann, F.: Data Profiling (tutorial). International Conference on Data Engineering (ICDE) (2016).
     
  • Agrawal, D., Ba, L., Berti-Equille, L., Chawla, S., Elmagarmid, A., Hammady, H., Idris, Y., Kaoudi, Z., Khayyat, Z., Kruse, S., Ouzzani, M., Papotti, P., Quiané-Ruiz, J.-A., Tang, N., Zaki, M.J.: Rheem: Enabling Multi-Platform Task Execution (demo). Proceedings of the ACM SIGMOD conference (SIGMOD) (2016).
     
  • Kruse, S., Jentzsch, A., Papenbrock, T., Kaoudi, Z., Quiane-Ruiz, J.-A., Naumann, F.: RDFind: Scalable Conditional Inclusion Dependency Discovery in RDF Datasets. Proceedings of the International Conference on Management of Data (SIGMOD). pp. 953-967. ACM, New York, NY, USA (2016).
     
  • Lazaridou, K., Krestel, R.: Identifying Political Bias in News Articles. International Conference on Theory and Practice of Digital Libraries. IEEE Technical Committee on Digital Libraries (2016).
    TPDL Doctoral Consortium
     
  • Kruse, S., Papenbrock, T., Harmouch, H., Naumann, F.: Data Anamnesis: Admitting Raw Data into an Organization. IEEE Data Engineering Bulletin. 39, 8-20 (2016).
     
  • Gruetze, T., Kasneci, G., Zuo, Z., Naumann, F.: CohEEL: Coherent and Efficient Named Entity Linking through Random Walks. Web Semantics: Science, Services and Agents on the World Wide Web. 37, 75-89 (2016).
     
  • Naumann, F., Krestel, R.: The Information Systems Group at HPI. SIGMOD Record. (2016).
     
  • Langer, P., Naumann, F.: Efficient Order Dependency Discovery. VLDB Journal. 25, 223-241 (2016).
     

2015

  • Hennig, P., Berger, P., Dullweber, C., Finke, M., Maschler, F., Risch, J., Meinel, C.: Social Media Story Telling. Proceedings of the 8th IEEE International Conference on Social Computing and Networking (SocialCom2015). pp. 279-284. Chengdu, China (2015).
     
  • Jenders, M., Lindhauer, T., Kasneci, G., Krestel, R., Naumann, F.: A Serendipity Model For News Recommendation. KI 2015: Advances in Artificial Intelligence - 38th Annual German Conference on AI, Dresden, Germany, September 21-25, 2015, Proceedings. pp. 111-123. Springer (2015).
     
  • Schubotz, T., Krestel, R.: Online Temporal Summarization of News Events. Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT). pp. 679-684. IEEE Computer Society (2015).
     
  • Gruetze, T., Yao, G., Krestel, R.: Learning Temporal Tagging Behaviour. Proceedings of the 24th International Conference on World Wide Web Companion (WWW). pp. 1333-1338. ACM (2015).
     
  • Kruse, S., Papenbrock, T., Naumann, F.: Scaling Out the Discovery of Inclusion Dependencies. Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW). pp. 445-454 (2015).
     
  • Jentzsch, A., Mühleisen, H., Naumann, F.: Uniqueness, Density, and Keyness: Exploring Class Hierarchies. In Proceedings of 6th International Workshop on Consuming Linked Data (COLD 2015), ISWC 2015. Bethlehem, PA, USA (2015).
     
  • Jentzsch, A., Dullweber, C., Troiano, P., Naumann, F.: Exploring Linked Data Graph Structures. In Proceedings of Posters and Demos Session, ISWC2015. Bethlehem, PA, USA (2015).
     
  • Schmidt, D., Frohnhofen, J., Knebel, S., Meinel, F., Perchyk, M., Risch, J., Striebel, J., Wachtel, J., Baudisch, P.: Ergonomic Interaction for Touch Floors. Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. pp. 3879-3888. ACM, Seoul, Republic of Korea (2015).
     
  • Kruse, S., Papotti, P., Naumann, F.: Estimating Data Integration and Cleaning Effort. Proceedings of the International Conference on Extending Database Technology (EDBT) (2015).
     
  • Roick, M., Jenders, M., Krestel, R.: How to Stay Up-to-date on Twitter with General Keywords. Proceedings of the LWA 2015 Workshops: KDML, FGWM, IR, and FGDB. CEUR-WS.org (2015).
     
  • Krestel, R., Werkmeister, T., Wiradarma, T.P., Kasneci, G.: Tweet-Recommender: Finding Relevant Tweets for News Articles. Proceedings of the 24th International World Wide Web Conference (WWW). ACM (2015).
     
  • Papenbrock, T., Ehrlich, J., Marten, J., Neubert, T., Rudolph, J.-P., Schönberg, M., Zwiener, J., Naumann, F.: Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms. Proceedings of the VLDB Endowment. 8, 1082-1093 (2015).
     
  • Papenbrock, T., Bergmann, T., Finke, M., Zwiener, J., Naumann, F.: Data Profiling with Metanome. Proceedings of the VLDB Endowment. 8, 1860-1871 (2015).
     
  • Papenbrock, T., Kruse, S., Quiane-Ruiz, J.-A., Naumann, F.: Divide & Conquer-based Inclusion Dependency Discovery. Proceedings of the VLDB Endowment. 8, 774-785 (2015).
     
  • Papenbrock, T., Heise, A., Naumann, F.: Progressive Duplicate Detection. IEEE Transactions on Knowledge and Data Engineering (TKDE). 27, 1316-1329 (2015).
     
  • Rheinländer, A., Heise, A., Hueske, F., Leser, U., Naumann, F.: SOFA: An Extensible Logical Optimizer for UDF-heavy Data Flows. Information Systems. 52, 96-125 (2015).
     
  • Abedjan, Z., Golab, L., Naumann, F.: Profiling relational data: a survey. VLDB Journal. 24, 557-581 (2015).
     
  • Krestel, R., Dokoohaki, N.: Diversifying Customer Review Rankings. Neural Networks. 66, 36-45 (2015).
     

2014

  • Abedjan, Z., Gruetze, T., Jentzsch, A., Naumann, F.: Profiling and Mining RDF Data with ProLOD++. Proceedings of the IEEE International Conference on Data Engineering (ICDE), Demo. Chicago, IL (2014).
     
  • Meyer, A., Pufahl, L., Batoulis, K., Kruse, S., Lindhauer, T., Stoff, T., Fahland, D., Weske, M.: Data Perspective in Process Choreographies: Modeling and Execution. 26th International Conference on Advanced Information Systems Engineering. Thessaloniki, Greece (2014).
     
  • Zuo, Z., Kasneci, G., Gruetze, T., Naumann, F.: BEL: Bagging for Entity Linking. 25th International Conference on Computational Linguistics (COLING). Dublin, Ireland (2014).
     
  • Gruetze, T., Kasneci, G., Zuo, Z., Naumann, F.: Bootstrapping Wikipedia to Answer Ambiguous Person Name Queries. 10th International Workshop on Information Integration on the Web (IIWeb). Chicago, IL (2014).
     
  • Vogel, T., Naumann, F.: Semi-Supervised Consensus Clustering: Reducing Human Effort. Proceedings of the International Workshop on Data Integration and Applications (2014).
     
  • Heise, A., Kasneci, G., Naumann, F.: Estimating the Number and Sizes of Fuzzy-Duplicate Clusters. Proceedings of the Conference on Information and Knowledge Management (CIKM). pp. 959-968 (2014).
     
  • Rheinländer, A., Beckmann, M., Kunkel, A., Heise, A., Stoltmann, T., Leser, U.: Versatile optimization of UDF-heavy data flows with SOFA (demo). Proceedings of the SIGMOD conference. pp. 685-688 (2014).
     
  • Abedjan, Z., Naumann, F.: Amending RDF Entities with New Facts. Know@LOD Workshop in conjunction with ESWC. Crete, Greece (2014).
    Selected for Best Workshop Paper Award.
     
  • Forchhammer, B., Jentzsch, A., Naumann, F.: LODOP - Multi-Query Optimization for Linked Data Profiling Queries. In Proceedings of the International Workshop on Dataset PROFIling & fEderated Search for Linked Data (PROFILES) in conjunction with ESWC. Heraklion, Greece (2014).
    Selected for Best Workshop Paper Award.
     
  • Abedjan, Z., Schulze, P., Naumann, F.: DFD: Efficient Discovery of Functional Dependencies. In Proceedings of the International Conference on Information and Knowledge Management (CIKM), Shanghai, China. pp. 949-958 (2014).
     
  • Abedjan, Z., Quiané-Ruiz, J.-A., Naumann, F.: Detecting Unique Column Combinations on Dynamic Data. Proceedings of the IEEE International Conference on Data Engineering (ICDE). Chicago, IL (2014).
     
  • Langer, P., Schulze, P., George, S., Kohnen, M., Metzke, T., Abedjan, Z., Kasneci, G.: Assigning Global Relevance Scores to DBpedia Facts. International Workshop on Data Engineering meets the Semantic Web (DESWeb). Chicago, IL (2014).
     
  • Vogel, T., Heise, A., Draisbach, U., Lange, D., Naumann, F.: Reach for Gold: An Annealing Standard to Evaluate Duplicate Detection Results. JDIQ. 5, (2014).
     
  • Lorey, J.: Identifying and Determining SPARQL Endpoint Characteristics. International Journal of Web Information Systems. 10, (2014).
     
  • Krestel, R., Bergler, S., Witte, R.: Modeling human newspaper readers: The Fuzzy Believer approach. Natural Language Engineering. 20, 261-288 (2014).
     
  • Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P.N., Hellmann, S., Morsey, M., van Kleef, P., Auer, S., Bizer, C.: DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. Semantic Web Journal. (2014).
    Selected for 2014 Semantic Web journal outstanding paper award.
     
  • Alexandrov, A., Bergmann, R., Ewen, S., Freytag, J.-C., Hueske, F., Heise, A., Kao, O., Leich, M., Leser, U., Markl, V., Naumann, F., Peters, M., Rheinländer, A., Sax, M.J., Schelter, S., Höger, M., Tzoumas, K., Warneke, D.: The Stratosphere Platform for Big Data Analytics. The VLDB Journal. 23, 939-964 (2014).
     

2013

  • Rheinländer, A., Heise, A., Hueske, F., Leser, U., Naumann, F.: SOFA: An Extensible Logical Optimizer for UDF-heavy Dataflows. (2013).
     
  • Leich, M., Adamek, J., Schubotz, M., Heise, A., Rheinländer, A., Markl, V.: Applying Stratosphere for Big Data Analytics. Database Systems for Business, Technology, and Web (BTW) (2013).
     
  • Lorey, J.: SPARQL Endpoint Metrics for Quality-Aware Linked Data Consumption. Proceedings of the 15th International Conference on Information Integration and Web-based Applications & Services (iiWAS '13). Vienna, Austria (2013).
     
  • Albrecht, A., Naumann, F.: Systematic ETL Management – Experiences with High-Level Operators. Proceedings of the 18th International Conference on Information Quality (ICIQ). Little Rock, AR (2013).
     
  • Lacoste-Julien, S., Palla, K., Davies, A., Kasneci, G., Graepel, T., Ghahramani, Z.: SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases. Proceedings of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) (2013).
     
  • Lorey, J.: Storing and Provisioning Linked Data as a Service. Proceedings of the 10th Extended Semantic Web Conference (ESWC). Montpellier, France (2013).
     
  • Lorey, J., Naumann, F.: Caching and Prefetching Strategies for SPARQL Queries. Proceedings of the 3rd International Workshop on Usage Analysis and the Web of Data (USEWOD). Montpellier, France (2013).
    Selected as Best Workshop Paper for publication in ESWC post-proceedings
     
  • Lorey, J., Naumann, F.: Detecting SPARQL Query Templates for Data Prefetching. Proceedings of the 10th Extended Semantic Web Conference (ESWC). Montpellier, France (2013).
     
  • Draisbach, U., Naumann, F.: On Choosing Thresholds for Duplicate Detection. Proceedings of the 18th International Conference on Information Quality (ICIQ). Little Rock, USA (2013).
     
  • Jenders, M., Kasneci, G., Naumann, F.: Analyzing and Predicting Viral Tweets. Proceedings of the WWW '13 Companion: 22nd International World Wide Web Conference. Rio de Janeiro, Brazil (2013).
     
  • Forchhammer, B., Papenbrock, T., Stening, T., Viehmeier, S., Draisbach, U., Naumann, F.: Duplicate Detection on GPUs. Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW). pp. 165-184 (2013).
    Runner Up for Best Paper Award
     
  • Heise, A., Quiane-Ruiz, J.-A., Abedjan, Z., Jentzsch, A., Naumann, F.: Scalable Discovery of Unique Column Combinations. Proceedings of the VLDB Endowment (PVLDB). Hangzhou, China (2013).
    Jorge's presentation at VLDB 2014 was awarded the "Excellent Presentation Award".
     
  • Lange, D., Naumann, F.: Bulk Sorted Access for Efficient Top-k Retrieval. Proceedings of the International Conference on Scientific and Statistical Database Management (SSDBM). Baltimore, Maryland (2013).
     
  • Lorey, J., Naumann, F.: Caching and Prefetching Strategies for SPARQL Queries. ESWC 2013 Satellite Events - Revised Selected Papers. Montpellier, France (2013).
     
  • Abedjan, Z., Naumann, F.: Synonym Analysis for Predicate Expansion. Proceedings of the Extended Semantic Web Conference (ESWC), Montpellier, France (2013).
     
  • Momtazi, S., Naumann, F.: Topic modeling for expert finding using latent dirichlet allocation. WIREs Data Mining and Knowledge Discovery. 3, 346–353 (2013).
     
  • Naumann, F.: Data Profiling Revisited. SIGMOD Record. 42, 40-49 (2013).
     
  • Rinser, D., Lange, D., Naumann, F.: Cross-lingual Entity Matching and Infobox Alignment in Wikipedia. Information Systems (IS). 38, 887–907 (2013).
     
  • Abedjan, Z., Naumann, F.: Improving RDF Data through Association Rule Mining. Datenbank-Spektrum (Special Issue on RDF Data Management). 13, 111-120 (2013).
     
  • Naumann, F., Jenders, M., Papenbrock, T.: Ein Datenbankkurs mit 6000 Teilnehmern - Erfahrungen auf der openHPI MOOC Plattform. Informatik-Spektrum. 37, 333-340 (2013).
     
  • Lange, D., Naumann, F.: Cost-Aware Query Planning for Similarity Search. Information Systems (IS). 38, 455-469 (2013).
     

2012

  • Draisbach, U., Naumann, F.: Adaptive Windows for Duplicate Detection. Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam (2012).
    ISBN 978-3-86956-143-1, ISSN 1613-5652
     
  • Albrecht, A., Naumann, F.: Understanding Cryptic Schemata in Large Extract-Transform-Load Systems. Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam (2012).
    ISBN 978-3-86956-201-8, ISSN 1613-5652
     
  • Bauckmann, J., Abedjan, Z., Leser, U., Müller, H., Naumann, F.: Covering or Complete? Discovering Conditional Inclusion Dependencies. Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam (2012).
    ISBN 978-3-86956-212-4, ISSN 1613-5652
     
  • Draisbach, U., Naumann, F., Szott, S., Wonneberg, O.: Adaptive Windows for Duplicate Detection. Proceedings of the 28th International Conference on Data Engineering (ICDE). Washington, D.C., USA (2012).
     
  • Tafaj, E., Kasneci, G., Rosenstiel, W., Bogdan, M.: Bayesian online clustering of eye movement data. Proceedings of the 2012 Symposium on Eye-Tracking Research and Applications. pp. 285-288. ACM (2012).
     
  • Böhm, C., de Melo, G., Naumann, F., Weikum, G.: LINDA: Distributed Web-of-Data-Scale Entity Matching. Proceedings of the International Conference on Information and Knowledge Management (CIKM), Maui, Hawaii (2012).
     
  • Fenz, D., Lange, D., Rheinländer, A., Naumann, F., Leser, U.: Efficient Similarity Search in Very Large String Sets. Proceedings of the International Conference on Scientific and Statistical Database Management (SSDBM). Chania, Crete, Greece (2012).
     
  • Böhm, C., Freitag, M., Heise, A., Lehmann, C., Mascher, A., Naumann, F., Hernandez, M., Ercegovac, V., Haase, P.: GovWILD: Integrating Open Government Data for Transparency (demo). Proceedings of the International World Wide Web Conference (WWW). Lyon, France (2012).
     
  • Heise, A., Rheinländer, A., Leich, M., Leser, U., Naumann, F.: Meteor/Sopremo: An Extensible Query Language and Operator Model. Proceedings of the International Workshop on End-to-end Management of Big Data (BigData) in conjunction with VLDB 2012. Istanbul, Turkey (2012).
     
  • Abedjan, Z., Lorey, J., Naumann, F.: Reconciling Ontologies and the Web of Data. Proceedings of the 21st International Conference on Information and Knowledge Management (CIKM). pp. 1532-1536. Maui, Hawaii, USA (2012).
     
  • Vogel, T., Naumann, F.: Automatic Blocking Key Selection for Duplicate Detection based on Unigram Combinations. Proceedings of the 10th International Workshop on Quality in Databases (QDB) in conjunction with VLDB (2012).
     
  • Böhm, C., Kasneci, G., Naumann, F.: Latent Topics in Graph-Structured Data. Proceedings of the Conference on Information and Knowledge Management (CIKM) (2012).
     
  • Albrecht, A., Naumann, F.: Schema Decryption for Large Extract-Transform-Load Systems. Proceedings of the 31st International Conference on Conceptual Modeling (ER 2012). Florence, Italy (2012).
     
  • Böhm, C., Hefenbrock, D., Naumann, F.: Scalable Peer-to-Peer-based RDF Management. Proceedings of the 8th Int. Conference on Semantic Systems. Graz, Austria (2012).
     
  • Köppelmann, M., Lange, D., Lehmann, C., Marszalkowski, M., Naumann, F., Retzlaff, P., Stange, S., Voget, L.: Scalable Similarity Search with Dynamic Similarity Measures. Proceedings of the 6th International Workshop on Ranking in Databases (DBRank) in conjunction with VLDB. Istanbul, Turkey (2012).
     
  • Bauckmann, J., Abedjan, Z., Müller, H., Leser, U., Naumann, F.: Discovering Conditional Inclusion Dependencies. Proceedings of the International Conference on Information and Knowledge Management (CIKM), Maui, Hawaii. pp. 2094-2098 (2012).
     
  • Gruetze, T., Böhm, C., Naumann, F.: Holistic and Scalable Ontology Alignment for Linked Open Data. Proceedings of the 5th Linked Data on the Web (LDOW) Workshop at the 21st International World Wide Web Conference (WWW). Lyon, France (2012).
     
  • Kasneci, G.: Reasoning about Knowledge from the Web - (Extended Abstract). ICWE Workshops. pp. 186-188. Springer (2012).
     
  • Momtazi, S.: Fine-grained German Sentiment Analysis on Social Media. Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC). Istanbul, Turkey (2012).
     
  • Draisbach, U.: Partitionierung zur effizienten Duplikaterkennung in relationalen Daten. Springer Vieweg (2012).
     
  • Beskales, G., Das, G., Elmagarmid, A.K., Ilyas, I.F., Naumann, F., Ouzzani, M., Papotti, P., Quiane-Ruiz, J., Tang, N.: The Data Analytics Group at the Qatar Computing Research Institute. SIGMOD Record. 41, (2012).
     
  • Heise, A., Naumann, F.: Integrating Open Government Data with Stratosphere for more Transparency. Web Semantics: Science, Services and Agents on the World Wide Web. 14, 45-56 (2012).
     
  • Abelló, A., Darmont, J., Etcheverry, L., Golfarelli, M., Mazón, J.-N., Naumann, F., Pedersen, T.B., Rizzi, S., Trujillo, J., Vassiliadis, P., Vossen, G.: Fusion Cubes: Towards Self-Service Business Intelligence. International Journal of Data Warehousing and Mining (IJDWM). 9, 66-88 (2012).
     
  • Herschel, M., Naumann, F., Szott, S., Taubert, M.: Scalable Iterative Graph Duplicate Detection. Transactions on Knowledge and Data Engineering (TKDE). 24, 2094-2108 (2012).
     

2011

  • Abedjan, Z., Naumann, F.: Advancing the Discovery of Unique Column Combinations. Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam (2011).
    ISBN 978-3-86956-148-6, ISSN 1613-5652
     
  • Lange, D., Naumann, F.: Frequency-aware Similarity Measures. Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM). pp. 243-248. Glasgow, Scotland, UK (2011).
     
  • Abedjan, Z., Naumann, F.: Context and Target Configurations for Mining RDF Data. International Workshop on Search & Mining Entity-Relationship Data (SMER), Glasgow, UK (2011).
     
  • AbuJarour, M., Naumann, F.: Improving Service Discovery through Enriched Service Descriptions. Datenbanksysteme für Business, Technologie und Web (BTW), 14. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2.-4.3.2011 in Kaiserslautern, Germany. pp. 706-709 (2011).
     
  • Lorey, J., Naumann, F., Forchhammer, B., Mascher, A., Retzlaff, P., ZamaniFarahani, A., Discher, S., Faehnrich, C., Lemme, S., Papenbrock, T., Peschel, R.C., Richter, S., Stening, T., Viehmeier, S.: Black Swan: Augmenting Statistics with Event Data. Proceedings of the 20th Conference on Information and Knowledge Management (CIKM). pp. 2517-2520. Glasgow, UK (2011).
     
  • Abedjan, Z., Naumann, F.: Advancing the Discovery of Unique Column Combinations. Proceedings of the International Conference on Information and Knowledge Management (CIKM), Glasgow, UK (2011).
     
  • Böhm, C., Kny, E., Emde, B., Abedjan, Z., Naumann, F.: SPRINT: ranking search results by paths. International Conference on Extending Database Technology (EDBT), Uppsala, Sweden. pp. 546-549 (2011).
     
  • Lorey, J., Abedjan, Z., Naumann, F., Böhm, C.: RDF Ontology (Re-)Engineering through Large-scale Data Mining. Billion Triples Challenge (BTC) at the 10th International Semantic Web Conference (ISWC). Koblenz, Germany (2011).
    Finalist
     
  • Lange, D., Naumann, F.: Efficient Similarity Search: Arbitrary Similarity Measures, Arbitrary Composition. Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM). pp. 1679-1688. Glasgow, Scotland, UK (2011).
     
  • Draisbach, U., Naumann, F.: A Generalization of Blocking and Windowing Algorithms for Duplicate Detection. Proceedings of the International Conference on Data and Knowledge Engineering (ICDKE). Milan, Italy (2011).
     
  • Vogel, T., Naumann, F.: Instance-based "one-to-some" Assignment of Similarity Measures to Attributes. Proceedings of the 19th International Conference on Cooperative Information Systems (CoopIS) (2011).
     
  • Lange, D., Vogel, T., Draisbach, U., Naumann, F.: Projektseminar "Similarity Search Algorithms". Datenbank-Spektrum. 11, 51-57 (2011).