Prof. Dr. h.c. mult. Hasso Plattner

Bachelor's Project: Online Marketplace Simulation: A Testbed for Self-Learning Agents


Modern online marketplaces require a considerable amount of automation. Because these marketplaces are so accessible, every trader has to keep track of possible competitors, substitute products, price changes, and special offers. Mismanagement with regard to product choice or pricing can significantly diminish a trader’s commercial success. As many of these decisions have to be made for many different products at the same time, they cannot be handled manually.

Today, decisions are mostly automated with rule-based algorithms, which leave much room for performance improvements because they have to be tuned manually. Reinforcement Learning (RL) and its subfield Deep Reinforcement Learning cover many algorithms and methods that provide promising alternatives. Deep RL is especially appealing due to its generality and scalability, as it is designed to tackle general decision problems. The most notable disadvantage of such methods is their enormous demand for training data, which is often not available in practical applications. To overcome this limitation, simulations are key for pre-training agents before they are used in practice. In addition, the performance of a trained agent’s pricing strategy can be tested in simulated environments.


The goal of the project is to develop a universal simulation platform for markets with varying numbers of merchants. Being able to run various market simulations is highly relevant for many firms, such as SAP and its customers. As the platform is designed as a tool to support evaluation and research, aspects like configurability and ease of use are crucial. While the technology stack is left open for now, high compatibility with common simulation APIs (such as Gym and TF-Agents) is required. For more complex setups, communication protocols between different agents might have to be implemented as well.
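To illustrate what compatibility with a Gym-style API could look like, the following is a minimal sketch, not a design decision of the project. It mirrors the classic Gym convention of reset() returning an observation and step(action) returning (observation, reward, done, info), but does not import Gym itself. The class name PricingEnv, its parameters, the placeholder customer rule, and the undercutting competitor are all assumptions made for illustration.

```python
import random


class PricingEnv:
    """Hypothetical duopoly: one learning merchant vs. one rule-based competitor."""

    def __init__(self, max_price=10, unit_cost=2, horizon=50, seed=0):
        self.max_price = max_price    # highest admissible price (assumed)
        self.unit_cost = unit_cost    # cost per sold item (assumed)
        self.horizon = horizon        # number of steps per episode (assumed)
        self.rng = random.Random(seed)
        self.t = 0
        self.competitor_price = max_price

    def reset(self):
        """Start a new episode; the observation is the competitor's current price."""
        self.t = 0
        self.competitor_price = self.rng.randint(self.unit_cost + 1, self.max_price)
        return self.competitor_price

    def step(self, action):
        """Apply the agent's price and return (observation, reward, done, info)."""
        price = min(max(int(action), 1), self.max_price)
        # Placeholder customer rule: one customer per step buys from the
        # strictly cheaper merchant; ties go to the competitor.
        sold = 1 if price < self.competitor_price else 0
        reward = sold * (price - self.unit_cost)
        # Rule-based competitor: undercut the agent by one unit, and jump back
        # to the maximum price once undercutting would not exceed unit cost.
        next_price = price - 1
        if next_price <= self.unit_cost:
            next_price = self.max_price
        self.competitor_price = next_price
        self.t += 1
        done = self.t >= self.horizon
        return self.competitor_price, reward, done, {"sold": sold}


# Usage: run one episode with a simple undercutting baseline as the "agent".
env = PricingEnv()
observation = env.reset()
total_profit = 0
done = False
while not done:
    action = max(observation - 1, env.unit_cost + 1)
    observation, reward, done, info = env.step(action)
    total_profit += reward
```

Keeping the reset/step shape means that a learning agent written against such an interface could later be pointed at a richer market model without changing its training loop.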

The simulation should cover the interaction between customers and merchants, in particular competing merchants, including self-learning agents and their rule-based opponents. While the focus can be put on several different aspects, an adjustable customer behavior model (which determines each participant’s sales) has to be developed. The platform should generate sales and interaction data for each of the merchants, which can then in turn be fed to the self-learning agents. Monitoring tools are required to analyze each agent’s policy and its effects on the overall market. With the help of such simulations, we seek to study the competitiveness of self-adapting pricing tools and their long-term impact on market competitors and customers.
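As one possible shape for an adjustable customer behavior model, the sketch below uses a multinomial logit choice rule with a no-purchase option. This is an assumption chosen for illustration; the project description does not prescribe a particular model. The function names and the parameters price_sensitivity and no_purchase_utility are hypothetical knobs that stand in for the "adjustable" part.

```python
import math
import random


def purchase_probabilities(prices, price_sensitivity=1.0, no_purchase_utility=-5.0):
    """Return choice probabilities per merchant, plus a final no-purchase entry."""
    # Assumed utility: cheaper offers are more attractive, linearly in price.
    utilities = [-price_sensitivity * p for p in prices] + [no_purchase_utility]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(utilities)
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_sales(prices, n_customers, rng, **model_params):
    """Sample each customer's choice and count the sales per merchant."""
    probabilities = purchase_probabilities(prices, **model_params)
    choices = rng.choices(range(len(probabilities)), weights=probabilities, k=n_customers)
    sales = [0] * len(prices)
    for choice in choices:
        if choice < len(prices):  # the last index means "no purchase"
            sales[choice] += 1
    return sales


# Usage: three merchants, 1000 customers, fixed seed for reproducibility.
rng = random.Random(42)
sales = simulate_sales([4.0, 5.0, 6.0], n_customers=1000, rng=rng, price_sensitivity=0.8)
```

Raising price_sensitivity in this sketch shifts demand toward the cheapest merchant, while raising no_purchase_utility shrinks the overall market; the resulting per-merchant sales counts are the kind of data that could be fed back to the agents.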


This Bachelor’s project will be a joint effort of HPI and the SAP Innovation Center Network (ICN). We expect a close collaboration between both parties. Project participants will regularly visit the ICN offices to facilitate proper information exchange in both directions. The team will have access to state-of-the-art IT resources provided by HPI.


We require that the team members be highly motivated to

  • work as a team
  • design a scalable simulation environment for research purposes
  • generate and propagate knowledge relevant to the project and its adjacent fields
  • understand business correlations and market dynamics

Prior experience in data analytics, machine learning, or web front-end development is beneficial. A keen interest in expanding one’s knowledge of the e-commerce sector is recommended.


For questions and details visit us at the Villa, 2nd floor on Campus II, or send us an email:

Project Kick-off: Beginning of the semester (End of October)