Hasso-Plattner-Institut

Machine Learning for Software Engineering (Winter Semester 2023/2024)

Lecturers: Prof. Dr. Holger Giese (Systemanalyse und Modellierung), Christian Medeiros Adriano (Systemanalyse und Modellierung), Iqra Zafar (Systemanalyse und Modellierung), Matthias Barkowsky (Systemanalyse und Modellierung)

General Information

  • Weekly Hours: 4
  • Credits: 6
  • Graded: yes
  • Enrolment Deadline: 01.10.2023 - 31.10.2023
  • Examination time §9 (4) BAMA-O: 08.02.2024
  • Teaching Form: Project seminar
  • Enrolment Type: Compulsory Elective Module
  • Course Language: English

Programs, Module Groups & Modules

IT-Systems Engineering MA
Data Engineering MA
Digital Health MA
Software Systems Engineering MA
  • MALA: Machine Learning and Analytics
    • HPI-MALA-C Concepts and Methods
  • MALA: Machine Learning and Analytics
    • HPI-MALA-T Technologies and Tools
  • MODA: Models and Algorithms
    • HPI-MODA-C Concepts and Methods
  • MODA: Models and Algorithms
    • HPI-MODA-T Technologies and Tools
  • MODA: Models and Algorithms
    • HPI-MODA-S Specialization



Context - Motivation

Software development is known to be difficult to scale: the number of programmers needed grows non-linearly with the size of the system. To make matters worse, system complexity tends to increase exponentially, because each new component creates multiple new connections and orders of magnitude more paths for failure conditions.

To cope with this lack of scalability compounded by increasing complexity, various automated tools have been developed, focused both on the source code (e.g., code completion, bug finding, test generation) and on its execution (automated testing, debugging, and fixing). Many of these tools saw quick and extensive adoption among developers, as they help reduce repetitive work and the time spent searching for problems, while also serving as guardrails when integrated into continuous integration pipelines.

Nonetheless, these automated tools have not yet delivered the orders-of-magnitude improvement in quality and efficiency needed to offset the poor scalability and growing complexity of evolving software systems. A new "silver bullet" has been in the making: deep generative models, which hold the promise of removing developers from the loop of certain software engineering tasks.

Goals of the Project Seminar:

  1. Provide an overview of the state of the art in machine learning techniques for software engineering tasks.
  2. Discuss how the current trends might change the various software engineering roles - requirements engineer, systems architect, software developer, software tester, integration engineer, etc.
  3. Develop a prototype (proof of concept) focused on a particular software engineering task.

Project ideas (non-exhaustive):

  • Fine-tune an LLM for a particular task within a specific context.
  • Train a decision-making model to prioritize and recommend tasks to be performed by a developer and/or a conversational agent or generative model.
  • Train a deep or graph neural network to categorize artifacts (code, tests, specifications) for later inspection or processing.
  • Train a reinforcement learning agent to follow natural-language instructions to perform a particular task on a software system.
  • ?


Software Engineering 

  • Brooks, F. P. The Mythical Man-Month: Essays on Software Engineering. Addison-Wesley, Reading, Mass. (1995).
  • Fowler, Martin, and Jim Highsmith. "The agile manifesto." Software development 9.8 (2001): 28-35.
  • Eppinger, Steven D., and Tyson R. Browning. Design structure matrix methods and applications. MIT Press (2012).
  • Dybå, Tore, and Torgeir Dingsøyr. "Empirical studies of agile software development: A systematic review." Information and software technology 50.9-10 (2008): 833-859.

Generative Models for Software Engineering Artifacts

  • Hou, Xinyi, et al. "Large Language Models for Software Engineering: A Systematic Literature Review." arXiv preprint arXiv:2308.10620 (2023).
  • Fan, Angela, et al. "Large Language Models for Software Engineering: Survey and Open Problems." arXiv preprint arXiv:2310.03533 (2023).
  • White, Jules, et al. "ChatGPT prompt patterns for improving code quality, refactoring, requirements elicitation, and software design." arXiv preprint arXiv:2303.07839 (2023).
  • Wu, Yonghao, et al. "Large Language Models in Fault Localisation." arXiv preprint arXiv:2308.15276 (2023).
  • Lu, Junyi, et al. "LLaMA-Reviewer: Advancing Code Review Automation with Large Language Models through Parameter-Efficient Fine-Tuning (Practical Experience Report)." arXiv preprint arXiv:2308.11148 (2023).
  • Guo, Qi, et al. "Exploring the Potential of ChatGPT in Automated Code Refinement: An Empirical Study." arXiv preprint arXiv:2309.08221 (2023).
  • Yu, Shengcheng, et al. "LLM for Test Script Generation and Migration: Challenges, Capabilities, and Opportunities." arXiv preprint arXiv:2309.13574 (2023).
  • Cámara, Javier, et al. "On the assessment of generative AI in modeling tasks: an experience report with ChatGPT and UML." Software and Systems Modeling (2023): 1-13.

Generative Models for Planning and Decision-Making Tasks

  • Qian, Chen, et al. "Communicative agents for software development." arXiv preprint arXiv:2307.07924 (2023).
  • Dagan, Gautier, Frank Keller, and Alex Lascarides. "Dynamic Planning with a LLM." arXiv preprint arXiv:2308.06391 (2023).
  • Hao, Shibo, et al. "Reasoning with language model is planning with world model." arXiv preprint arXiv:2305.14992 (2023).
  • Feng, Xidong, et al. "Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training." arXiv preprint arXiv:2309.17179 (2023).
  • Liu, Zhihan, et al. "Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency." arXiv preprint arXiv:2309.17382 (2023).
  • Jojic, Ana, Zhen Wang, and Nebojsa Jojic. "GPT is becoming a Turing machine: Here are some ways to program it." arXiv preprint arXiv:2303.14310 (2023).
  • Ye, Yining, et al. "Large Language Model as Autonomous Decision Maker." arXiv preprint arXiv:2308.12519 (2023).

Causality for Software Engineering

  • Siebert, Julien. "Applications of statistical causal inference in software engineering." Information and Software Technology (2023): 107198.
  • Yuan, Yuyu, Chenlong Li, and Jincui Yang. "An Improved Confounding Effect Model for Software Defect Prediction." Applied Sciences 13.6 (2023): 3459.
  • Bagherzadeh, Mojtaba, Nafiseh Kahani, and Lionel Briand. "Reinforcement learning for test case prioritization." IEEE Transactions on Software Engineering 48.8 (2021): 2836-2856.
  • Konrad, Michael, et al. "Causal Models for Software Cost Prediction and Control (SCOPE)." Software Engineering Institute Research Review (2019).

Software Engineering for Machine Learning

  • Amershi, Saleema, et al. "Software engineering for machine learning: A case study." 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP). IEEE, 2019.
  • Serban, Alex, et al. "Adoption and effects of software engineering best practices in machine learning." Proceedings of the 14th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). 2020.
  • Martínez-Fernández, Silverio, et al. "Software engineering for AI-based systems: a survey." ACM Transactions on Software Engineering and Methodology (TOSEM) 31.2 (2022): 1-59.
  • Chouliaras, Georgios Christos, et al. "Best Practices for Machine Learning Systems: An Industrial Framework for Analysis and Optimization." arXiv preprint arXiv:2306.13662 (2023).


The course is a project seminar, which has an initial phase comprising a set of introductory lectures. After that, the students will work in groups on jointly identified experiments applying specific solutions to given problems and finally prepare a presentation and write a report about their findings concerning the experiments.

There will be an introductory phase to present basic concepts for the theme, including the necessary foundations.

Lectures will be streamed via Zoom from our seminar room. Interested students can also attend in person in the seminar room.


We will grade the groups' reports (80%) and presentations (20%). The report must document the experiments and the results obtained, so the report grade also covers the experiments. During the project phase, we require participation in meetings and in other groups' presentations, in the form of questions and feedback to peers.


The first lecture will take place on Tuesday, October 17, 2023. The lectures will take place in room A-2.2 and remotely via Zoom link.*

We will follow this recurring schedule:

  • Tuesday 17:00 - 18:30, Room A-2.2
  • Thursday 13:30 - 15:00, Room A-2.2

* Please email christian.adriano [at] hpi.de to request the Zoom link.