Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

Sebastian Kruse

former member

Research Interests

  • Data profiling
  • Distributed systems
  • Map/Reduce frameworks
  • Query optimization
  • Cross-platform/polyglot data processing

Projects

Teaching

Master's Theses

  • Estimating Metadata of Query Results using Histograms (Cathleen Ramson, 2014)
  • Quicker Ways of Doing Fewer Things: Improved Index Structures and Algorithms for Data Profiling (Jakob Zwiener, 2015)
  • Methods of Denial Constraint Discovery (Tobias Bleifuß, 2016)
  • Optimizing Cross-Platform Iterations on the Rheem Platform (Jonas Kemper, ongoing)

Seminars

Master Projects

Bachelor Projects

Guest Lectures

Professional Activities

Talks

Publications

2020

  • 1.
    Kruse, S., Kaoudi, Z., Quiané-Ruiz, J.-A., Chawla, S., Naumann, F., Contreras-Rojas, B.: RHEEMix in the Data Jungle: A Cost-based Optimizer for Cross-Platform Systems. VLDB Journal. 29, 1287–1310 (2020).
     

2019

  • 1.
    Schirmer, P., Papenbrock, T., Kruse, S., Naumann, F., Hempfing, D., Mayer, T., Neuschäfer-Rube, D.: DynFD: Functional Dependency Discovery in Dynamic Datasets. Proceedings of the International Conference on Extending Database Technology (EDBT). pp. 253–264 (2019).
     
  • 2.
    Kruse, S., Kaoudi, Z., Quiané-Ruiz, J.-A., Chawla, S., Naumann, F., Contreras-Rojas, B.: Optimizing Cross-Platform Data Movement. Proceedings of the International Conference on Data Engineering (ICDE). pp. 1642–1645 (2019).
     

2018

  • 1.
    Kruse, S., Naumann, F.: Efficient Discovery of Approximate Dependencies. Proceedings of the VLDB Endowment. 11, 759–772 (2018).
     
  • 2.
    Agrawal, D., Chawla, S., Kaoudi, Z., Kruse, S., Quiané-Ruiz, J.-A., Contreras-Rojas, B., Elmagarmid, A., Idris, Y., Lucas, J., Mansour, E., Ouzzani, M., Papotti, P., Tang, N., Thirumuruganathan, S., Troudi, A.: RHEEM: Enabling Cross-Platform Data Processing - May The Big Data Be With You! -. Proceedings of the VLDB Endowment (PVLDB). 11, (2018).
     

2017

  • 1.
    Bleifuß, T., Kruse, S., Naumann, F.: Efficient Denial Constraint Discovery with Hydra. Proceedings of the VLDB Endowment (PVLDB). 11, 311–323 (2017).
     
  • 2.
    Kruse, S., Papenbrock, T., Dullweber, C., Finke, M., Hegner, M., Zabel, M., Zöllner, C., Naumann, F.: Fast Approximate Discovery of Inclusion Dependencies. Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW). pp. 207–226 (2017).
     
  • 3.
    Kruse, S., Hahn, D., Walter, M., Naumann, F.: Metacrate: Organize and Analyze Millions of Data Profiles. Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 2483–2486. ACM (2017).
     

2016

  • 1.
    Bleifuß, T., Bülow, S., Frohnhofen, J., Risch, J., Wiese, G., Kruse, S., Papenbrock, T., Naumann, F.: Approximate Discovery of Functional Dependencies for Large Datasets. Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 1803–1812. ACM, New York, NY, USA (2016).
     
  • 2.
    Kruse, S., Papenbrock, T., Harmouch, H., Naumann, F.: Data Anamnesis: Admitting Raw Data into an Organization. IEEE Data Engineering Bulletin. 39, 8–20 (2016).
     
  • 3.
    Kruse, S., Jentzsch, A., Papenbrock, T., Kaoudi, Z., Quiané-Ruiz, J.-A., Naumann, F.: RDFind: Scalable Conditional Inclusion Dependency Discovery in RDF Datasets. Proceedings of the International Conference on Management of Data (SIGMOD). pp. 953–967. ACM, New York, NY, USA (2016).
     
  • 4.
    Agrawal, D., Ba, L., Berti-Equille, L., Chawla, S., Elmagarmid, A., Hammady, H., Idris, Y., Kaoudi, Z., Khayyat, Z., Kruse, S., Ouzzani, M., Papotti, P., Quiané-Ruiz, J.-A., Tang, N., Zaki, M.J.: Rheem: Enabling Multi-Platform Task Execution (demo). Proceedings of the ACM SIGMOD conference (SIGMOD) (2016).
     

2015

  • 1.
    Papenbrock, T., Kruse, S., Quiané-Ruiz, J.-A., Naumann, F.: Divide & Conquer-based Inclusion Dependency Discovery. Proceedings of the VLDB Endowment. 8, 774–785 (2015).
     
  • 2.
    Kruse, S., Papotti, P., Naumann, F.: Estimating Data Integration and Cleaning Effort. Proceedings of the International Conference on Extending Database Technology (EDBT) (2015).
     
  • 3.
    Kruse, S., Papenbrock, T., Naumann, F.: Scaling Out the Discovery of Inclusion Dependencies. Proceedings of the conference on Database Systems for Business, Technology, and Web (BTW). pp. 445–454 (2015).
     

2014

  • 1.
    Meyer, A., Pufahl, L., Batoulis, K., Kruse, S., Lindhauer, T., Stoff, T., Fahland, D., Weske, M.: Data Perspective in Process Choreographies: Modeling and Execution. 26th International Conference on Advanced Information Systems Engineering, Thessaloniki, Greece (2014).