Prof. Dr. Jürgen Döllner

Software DNA

In today's software development, day-to-day work on software systems is digitally captured by project management, bug tracking, version control, and build and test pipelines. This data is complex in structure, massive in volume, and subject to constant change. It can be extracted and processed using process mining and software data mining techniques, and analytical or visual results for monitoring or managing software development processes can then be derived from it.

The publicly funded project "Software DNA", a joint project between HPI and Seerene, investigates the use of machine learning and visual analytics techniques on software data. Based on a continuous evaluation of the characteristics of complex software systems and their associated IT development processes, a statistical model is derived that allows various questions to be examined formally. Subsequent predictive and prescriptive analyses then apply the resulting knowledge base to software development processes in industry. Novel visualization methods and associated rendering techniques are used to gain insights into the results. In particular, the similarity of software modules based on their "Software DNA" is captured and forms the basis for effective layouts. This can be used, for example, to address the following use cases:

  • Uncover which code frequently contains errors and reduces developer productivity.
  • Recognize high-performing teams and transfer their best-practice processes to the entire workforce.
  • Uncover risks from complex code known only to a single developer (knowledge monopoly).
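
As an illustration of the last use case, the following minimal sketch flags files whose commit history is dominated by a single author. It is not the project's actual method: the function name `knowledge_monopolies`, the thresholds, and the sample commit records are assumptions made for this example, and a real analysis would draw on richer version-control data.

```python
from collections import Counter, defaultdict


def knowledge_monopolies(commits, min_commits=3, threshold=0.9):
    """Return {path: (author, share)} for files dominated by one author.

    commits     -- iterable of (author, path) pairs, one per file change
    min_commits -- ignore files with fewer recorded changes than this
    threshold   -- minimum share of changes by the top author to flag a file
    All parameters and their defaults are illustrative assumptions.
    """
    changes_per_file = defaultdict(Counter)
    for author, path in commits:
        changes_per_file[path][author] += 1

    flagged = {}
    for path, authors in changes_per_file.items():
        total = sum(authors.values())
        if total < min_commits:
            continue  # too little history to judge ownership
        top_author, top_count = authors.most_common(1)[0]
        share = top_count / total
        if share >= threshold:
            flagged[path] = (top_author, share)
    return flagged


# Hypothetical commit records: (author, changed file)
commits = [
    ("alice", "core/parser.py"),
    ("alice", "core/parser.py"),
    ("alice", "core/parser.py"),
    ("alice", "core/parser.py"),
    ("alice", "ui/view.py"),
    ("bob", "ui/view.py"),
    ("carol", "ui/view.py"),
    ("bob", "README.md"),
]

result = knowledge_monopolies(commits)
for path, (author, share) in result.items():
    print(f"{path}: {author} authored {share:.0%} of changes")
```

In this sample, only `core/parser.py` is flagged: `ui/view.py` is shared among three authors, and `README.md` has too few changes to be judged.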

This project is funded by the European Regional Development Fund (ERDF – or EFRE in German) and the State of Brandenburg (ILB).