Prof. Dr. Felix Naumann

Dr. Arvid Heise

Former PhD student

Research Activities

  • Cloud Computing
  • Parallel and Declarative Data Cleansing
  • MapReduce with Hadoop

Publications

  • A Hybrid Approach for Efficient Unique Column Combination Discovery. Papenbrock, Thorsten; Naumann, Felix (2017). 195–204.
  • Progressive Duplicate Detection. Papenbrock, Thorsten; Heise, Arvid; Naumann, Felix in IEEE Transactions on Knowledge and Data Engineering (TKDE) (2015). 27(5) 1316–1329.
  • SOFA: An Extensible Logical Optimizer for UDF-heavy Data Flows. Rheinländer, Astrid; Heise, Arvid; Hueske, Fabian; Leser, Ulf; Naumann, Felix in Information Systems (2015). 52 96–125.
  • Estimating the Number and Sizes of Fuzzy-Duplicate Clusters. Heise, Arvid; Kasneci, Gjergji; Naumann, Felix (2014). 959–968.
  • The Stratosphere Platform for Big Data Analytics. Alexandrov, Alexander; Bergmann, Rico; Ewen, Stephan; Freytag, Johann-Christoph; Hueske, Fabian; Heise, Arvid; Kao, Odej; Leich, Marcus; Leser, Ulf; Markl, Volker; Naumann, Felix; Peters, Mathias; Rheinländer, Astrid; Sax, Matthias J.; Schelter, Sebastian; Höger, Mareike; Tzoumas, Kostas; Warneke, Daniel in The VLDB Journal (2014). 23(6) 939–964.
  • Versatile optimization of UDF-heavy data flows with SOFA (demo). Rheinländer, Astrid; Beckmann, Martin; Kunkel, Anja; Heise, Arvid; Stoltmann, Thomas; Leser, Ulf (2014). 685–688.
  • Reach for Gold: An Annealing Standard to Evaluate Duplicate Detection Results. Vogel, Tobias; Heise, Arvid; Draisbach, Uwe; Lange, Dustin; Naumann, Felix in Journal of Data and Information Quality (JDIQ) (2014). 5(1–2).
  • Applying Stratosphere for Big Data Analytics. Leich, Marcus; Adamek, Jochen; Schubotz, Moritz; Heise, Arvid; Rheinländer, Astrid; Markl, Volker (2013).
  • Scalable Discovery of Unique Column Combinations. Heise, Arvid; Quiané-Ruiz, Jorge-Arnulfo; Abedjan, Ziawasch; Jentzsch, Anja; Naumann, Felix (2013).
  • SOFA: An Extensible Logical Optimizer for UDF-heavy Dataflows. Rheinländer, Astrid; Heise, Arvid; Hueske, Fabian; Leser, Ulf; Naumann, Felix (2013). (Vol. abs/1311.6335)
  • Meteor/Sopremo: An Extensible Query Language and Operator Model. Heise, Arvid; Rheinländer, Astrid; Leich, Marcus; Leser, Ulf; Naumann, Felix (2012).
  • GovWILD: Integrating Open Government Data for Transparency (demo). Böhm, Christoph; Freitag, Markus; Heise, Arvid; Lehmann, Claudia; Mascher, Andrina; Naumann, Felix; Hernandez, Mauricio; Ercegovac, Vuk; Haase, Peter (2012).
  • Integrating Open Government Data with Stratosphere for more Transparency. Heise, Arvid; Naumann, Felix in Web Semantics: Science, Services and Agents on the World Wide Web (2012). 14(1) 45–56.