Modern online marketplaces require a considerable amount of automation. Because these marketplaces are easily accessible, every trader has to keep track of potential competitors, substitute products, price changes, and special offers. Mismanaging product choice or pricing can severely diminish a trader’s commercial success. As many of these decisions have to be made for many different products at the same time, they cannot be handled manually.
Currently, such decisions are mostly automated with rule-based algorithms, which leave much room for performance improvements because they have to be tuned manually. Reinforcement Learning (RL) and its subfield Deep Reinforcement Learning (Deep RL) cover many algorithms and methods that provide promising alternatives. Deep RL is especially appealing due to its generality and scalability, as it is designed to tackle general decision problems. The most notable disadvantage of such methods is their enormous demand for training data, which is often not available in practical applications. To overcome this limitation, simulations are key for pre-training agents before they are used in practice. In addition, the performance of a trained agent’s pricing strategy can be tested in simulated environments.
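To make the idea of training and testing in simulation concrete, the following minimal sketch compares a hand-tuned pricing rule with a learned one in a toy market. Everything in it is an illustrative assumption rather than a setup described here: the price grid, the unit cost, the cyclic competitor, the one-customer demand model, and all function names are hypothetical. Tabular Q-learning also stands in for Deep RL to keep the example small.

```python
import random

# All constants below are hypothetical choices for this toy market.
PRICES = list(range(1, 11))     # discrete price grid available to our trader
UNIT_COST = 2                   # cost per unit sold
COMPETITOR_CYCLE = [6, 8, 10]   # toy competitor that repeats these prices


def profit(own_price, competitor_price):
    """One customer per step buys from the strictly cheaper seller."""
    return own_price - UNIT_COST if own_price < competitor_price else 0


def rule_based_price(competitor_price, fixed_price=5):
    """Hand-tuned baseline rule: always charge the same fixed price."""
    return fixed_price


def train_q_agent(steps=20_000, alpha=0.5, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning; the state is the competitor's current price."""
    rng = random.Random(seed)
    q = {(c, p): 0.0 for c in COMPETITOR_CYCLE for p in PRICES}
    n = len(COMPETITOR_CYCLE)
    for t in range(steps):
        state = COMPETITOR_CYCLE[t % n]
        next_state = COMPETITOR_CYCLE[(t + 1) % n]
        # Epsilon-greedy exploration over the price grid.
        if rng.random() < epsilon:
            action = rng.choice(PRICES)
        else:
            action = max(PRICES, key=lambda p: q[(state, p)])
        reward = profit(action, state)
        best_next = max(q[(next_state, p)] for p in PRICES)
        # Standard Q-learning update toward the bootstrapped target.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    # Greedy policy: best price for each competitor price.
    return {c: max(PRICES, key=lambda p: q[(c, p)]) for c in COMPETITOR_CYCLE}


def evaluate(policy_fn):
    """Total profit over one full competitor cycle in the simulation."""
    return sum(profit(policy_fn(c), c) for c in COMPETITOR_CYCLE)


learned_policy = train_q_agent()
print("learned prices:", learned_policy)
print("rule-based profit:", evaluate(rule_based_price))
print("learned profit:", evaluate(lambda c: learned_policy[c]))
```

In this toy market the most profitable response is to undercut the competitor by one price step, which the fixed-price rule cannot do without manual retuning. The same simulated market serves both purposes named above: it supplies the training interactions and it evaluates the resulting pricing strategy against the baseline.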