Prof. Dr. Tilmann Rabl

Applications of AI in the Credit Information Business

Gjergji Kasneci, Uni Tübingen


Digitalisation poses major challenges for many financial information service providers. Changing customer demands and the need for high-quality, scalable real-time processing combined with growing data security and privacy requirements are just some of the challenges facing the industry. This presentation shows how AI, combined with human knowledge and expertise, can help meet many of these challenges.


After completing his PhD in Graph-based Mining and Retrieval at the Max Planck Institute for Informatics in Saarbrücken in 2009, Gjergji joined Microsoft Research in Cambridge, UK, as a postdoctoral researcher, where he worked on probabilistic inference in knowledge bases. In 2011, he joined the Hasso Plattner Institute in Potsdam, where he led the Web Mining and Analytics Research Group. In mid-2014, he joined SCHUFA Holding AG, where he currently holds the position of CTO. Since April 2018, he has also been leading the Data Science and Analytics Research Group at the University of Tübingen.


by Angelika Wieck, Antonius Naumann, Marcin Zielinski, Nick Podratz, and Philip Weidenfeller

This article discusses possible applications of artificial intelligence (AI) in the credit information business. It focuses on examples of how AI-based technologies can prevent fraud and help secure digital identities. The content is based on a talk given by Prof. Dr. Gjergji Kasneci within the lecture series on Practical Data Engineering at the Hasso Plattner Institute (HPI) in Potsdam, Germany.


From Manual Checks to Neural Networks in Truth Discovery

One practical application of AI is determining the trustworthiness of provided data, especially when the data is inconsistent. In the context of Schufa's business, this is a substantial challenge: the company processes millions of identity requests, and any error can have serious negative consequences for the affected consumers, which must be prevented by all means. At the same time, Schufa's contractual partners may deliver ambiguous information, for example about a customer's address. So how do you decide which data (if any) is correct, especially when there is no ground truth?

A rather simple approach is a majority vote, best explained by an example. Imagine you ask five people what the capital of Germany is. Four say Berlin, one says Munich. You could deduce that Berlin is the correct answer with 80% certainty. Now you want to infer the capital of Malawi from the available answers: two of the five have no answer, and the others name Kasungu, Malawisa, and Lilongwe (Fig. 1). At this point you only know that you do not know the answer. One remedy is to look into each person's history and see how reliably they answered similar questions in the past, keeping in mind that we do not know the correct answers to those earlier questions either. We therefore compare each candidate's past answers with the majority votes on those questions, which gives an estimate of the candidate's trustworthiness for this kind of question. We can then weight each respondent's answer by how likely that respondent is to give correct information, and finally choose the answer with the highest confidence computed this way.

Fig. 1 Naive Algorithm for Truth Discovery
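The weighted variant of this idea can be sketched in a few lines of Python. The sketch below is illustrative, not Schufa's implementation: it estimates each source's reliability from its agreement with past majority votes and then weights new answers accordingly.

```python
from collections import Counter, defaultdict

def source_reliabilities(history):
    """Estimate each source's reliability as its agreement rate with
    the per-question majority vote over past answers.
    history: {question: {source: answer}}"""
    majority = {q: Counter(answers.values()).most_common(1)[0][0]
                for q, answers in history.items()}
    hits, total = defaultdict(int), defaultdict(int)
    for q, answers in history.items():
        for src, ans in answers.items():
            total[src] += 1
            hits[src] += ans == majority[q]
    return {src: hits[src] / total[src] for src in total}

def weighted_vote(answers, reliability):
    """Pick the answer whose supporting sources carry the most
    combined reliability weight; unknown sources get neutral weight."""
    scores = defaultdict(float)
    for src, ans in answers.items():
        scores[ans] += reliability.get(src, 0.5)
    return max(scores, key=scores.get)
```

With such weights, a single historically reliable source naming Lilongwe can outvote several unreliable sources that disagree.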

The weighted majority vote is one of the basic ideas behind the latent truth discovery with restricted Boltzmann machines that Prof. Kasneci and his colleagues implemented at Schufa. Their Boltzmann-machine-based approach recovers the reliabilities of the sources and the trustworthiness of the information from those sources in an unsupervised fashion. Deep neural networks, in contrast, are not suitable for these analyses, since they need to be trained on large quantities of data, which is normally not available for the specific long-tail queries Schufa answers. The restricted Boltzmann machine is much better at learning from a few data points. An even more powerful approach combines this model with a neural network that regresses the parameters otherwise estimated by contrastive divergence. This improves on the previous model but is not adaptable in real time.
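Restricted Boltzmann machines are typically trained with contrastive divergence. The following is a generic CD-1 update for a Bernoulli RBM in NumPy, shown only to illustrate the training principle; Schufa's latent-truth-discovery model adds source-reliability variables on top of this basic machinery.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """One contrastive-divergence (CD-1) update for a Bernoulli RBM.
    v0: batch of visible vectors (n, n_vis); W: (n_vis, n_hid) weights;
    b, c: visible and hidden biases."""
    # positive phase: hidden probabilities given the data
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # negative phase: one Gibbs step down to the visibles and up again
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)
    # gradient approximation: data statistics minus model statistics
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```

Because the update only needs a small batch of observations, such models can be fitted on far fewer data points than a deep network would require.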

Identification of the Right User Profile

The challenge consists in making high-quality identification as efficient and scalable as possible, since Schufa has to answer up to one million queries per day. These queries, which come from different customers, are often ambiguous. For instance, Schufa might have to identify a Michael Müller who has a very common German name and lives on a very long street. Retrieving the right data for this person is hard, since several Michael Müllers might live on that same street. To deal with this issue, Schufa used a retrieval engine that retrieved the most promising candidates and presented them to experienced Schufa experts for manual review. The problem with manual decision making was that it was costly, hardly scalable, and introduced high latencies, as the experts needed time to reach their decisions. The latency in particular is an issue, since in e-commerce making a customer wait a few minutes is not acceptable. The process therefore had to be automated.

And here comes the next challenge: if we automate the process, we need to do it in such a way that the decisions are at least as reliable as those made by humans. The best way to teach human behavior to a machine is through deep learning, and since Schufa had archived all the manual decisions made by hundreds of experts over the previous years, there was enough data to train a deep neural network.

Despite the strong results of the trained model, Schufa keeps human experts in place, who decide manually on those queries for which the uncertainty of the deep learning model is high. It is important to maintain a continuous quality assurance process in which aggregated decisions by human experts help reduce the uncertainty of machine decisions.
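This human-in-the-loop setup can be illustrated with a simple routing rule: predictions whose confidence falls below a threshold are deferred to experts, whose decisions then also feed back into training. The threshold and labels below are illustrative, not Schufa's actual values.

```python
def route_decision(probs, threshold=0.9):
    """Route a model prediction: accept it automatically when the
    top-class probability is confident enough, otherwise defer to a
    human expert. probs: {label: probability}."""
    label = max(probs, key=probs.get)
    if probs[label] >= threshold:
        return ("auto", label)
    # expert reviews the case; the outcome can become new training data
    return ("human", label)
```

For example, `route_decision({"match": 0.97, "no_match": 0.03})` is handled automatically, while a 60/40 split goes to an expert.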

Fraud Detection: Revealing Theft of Digital Identities

Even if you identify a person very reliably, you cannot prevent fraud entirely. Identity-based fraud causes losses in German e-commerce of up to 2.4 billion euros per year (as of 2014; recent numbers may be even higher).

When trying to detect fraud, you face the challenge that fraud patterns are not only complex but also change over time. Detecting fraud therefore requires an approach that is as flexible as the fraudsters are in coming up with new fraud mechanisms.

Schufa detects fraud patterns by grouping large numbers of queries from different sectors by similarity. When analyzing these groups, possible fraud patterns become apparent through signs of manipulation in the data, such as changed birth dates or unknown identities. Since the information needed to detect fraudulent transactions comes from the stream of Schufa's online queries, it is crucial to group and analyze this information in real time.

To efficiently index large amounts of queries, Schufa uses locality-sensitive hashing on an n-month window of applications, allowing them to keep tens of millions of queries in main memory and to obtain the subgroup of similar queries in real time. These groups are described by over one hundred variables, such as the past ordering frequency at an address, the number of queries in the group, or the degree of manipulation of a single attribute like name or postal address. Since the system is supervised and trained on data with real fraud labels, it can detect manipulation by calculating a manipulation score.
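A minimal sketch of the indexing idea uses MinHash signatures over character n-grams and the usual banding trick; the shingle size, number of hashes, and band layout below are illustrative choices, not Schufa's parameters.

```python
import hashlib
from collections import defaultdict

def shingles(text, n=3):
    """Character n-grams of a normalized query string."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def minhash(shingle_set, num_hashes=20):
    """MinHash signature: the minimum of a seeded hash over all shingles,
    once per hash function."""
    return tuple(
        min(int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
            for s in shingle_set)
        for seed in range(num_hashes)
    )

def lsh_buckets(queries, bands=5, rows=4):
    """Group queries whose signatures collide in at least one band.
    Similar queries land in a shared bucket with high probability."""
    buckets = defaultdict(list)
    for qid, text in queries.items():
        sig = minhash(shingles(text), bands * rows)
        for band in range(bands):
            key = (band, sig[band * rows:(band + 1) * rows])
            buckets[key].append(qid)
    return buckets
```

Each incoming query only has to be hashed once, after which its candidate group is a constant-time bucket lookup rather than a scan over tens of millions of stored queries.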

An AI System in Production

Fig. 2 Scaling of monolithic or microservice-based application

Moving a monolithic AI system from the lab to production requires additional steps. In production the workload is higher, so the AI system needs to scale well. A monolithic architecture consists of modules that are hardwired together; to scale up, you have to replicate the whole system, and when one hardwired component fails, you have to shut down the whole instance, adjust it, and start it again. To avoid this, Schufa uses microservice-based architectures, in which every component is defined as a microservice and communicates with the others. Thanks to this approach, components can be scaled or adjusted independently of the rest of the system. For example, if you need more bin-hashing components in your AI system, you scale up only the microservice responsible for bin-hashing, without touching any of the other components. This allows for a lot of flexibility and is a must when moving AI systems to production.

Furthermore, Schufa, as a credit bureau, needs to do a lot of monitoring, quality assurance, and logging just to satisfy German and European regulations. Hence, their system not only needs to work but also has to perform all the mandatory tasks of a credit information provider in Germany. The conclusion is that an AI system has a long way to go from a lab environment to a production environment.

Schufa developed a specialized prototyping platform, called ScoreStudio, to help choose the best machine learning algorithm for a particular process. It works similarly to AutoML, which runs several machine learning algorithms and tries to find the best predictive model for the data provided. ScoreStudio is lightweight (it can run on a laptop), is implemented in C++, and is specifically designed to work with Schufa's processes (e.g., fraud prediction). You only need to point it to your data and it does the rest, for example indexing the data or preprocessing the features. It can process 15 million data points in around 16 minutes and come up with several strong predictive models (e.g., random forests). ScoreStudio provides all kinds of statistical analyses, such as accuracy, precision, recall, and F1 score. All of this gives an idea of which kinds of models will work best for a particular process and dataset. It is not 100% accurate in predicting the best model, but it gives a good hint about which models to consider for production.
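The model-selection loop behind such a platform can be sketched generically: fit each candidate model, score it on held-out data, and rank the results. The two toy models below merely stand in for a real candidate set (random forests, gradient boosting, etc.).

```python
from collections import Counter

def majority_model(train):
    """Baseline: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label

def nearest_neighbor_model(train):
    """1-nearest-neighbor by squared Euclidean distance."""
    def predict(x):
        _, y = min(train,
                   key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
        return y
    return predict

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

def select_model(train, test, candidates):
    """Fit each candidate factory on train, rank by test accuracy."""
    scored = [(accuracy(factory(train), test), name)
              for name, factory in candidates]
    scored.sort(reverse=True)
    return scored
```

An AutoML-style tool additionally searches hyperparameters and preprocessing pipelines, but the ranking principle is the same.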

Predictive Models (That We Can Understand)

Even though self-learning systems already yield impressive results and often surpass humans in classification tasks, Schufa decided to keep humans involved in most decision-making processes. The company depends heavily on the trust of its customers, which is based on qualities that computers have yet to learn. Being transparent about one's reasoning is one of them: trained decision-making systems, unlike humans, often cannot explain why they arrived at a particular conclusion.

The complexity of deep learning methods often makes it infeasible to factorize the end result. Political or religious biases that a machine may extract from its training data are usually not intended by the designers of the system, but they are hard to detect. Especially when training on live data, excluding biased or faulty data is a major challenge and leaves the system vulnerable to manipulation. Even data that is fully objective might infringe on cultural and moral conventions that are simply unknown to the computer.

Another issue arises from prescriptive inquiries: if the computer has trouble reasoning about its output, how should it help a human understand what he or she can do better? In the case of Schufa, customers tend to request clarification on what they can do to improve their situation, or what they should avoid so as not to harm their current profile. Some countries even require credit institutions to explain themselves to the customer when denying a loan.

The classical approach in the credit business is to use decision trees combined with logistic regression in the trees' leaves. These two methods separate the data points into classes by predicates. Such a predicate makes it feasible to reason about a classification and allows humans to review a decision later on. Despite this benefit for reasoning, separating data points into well-defined classes ends up being a limitation in many applications.
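The classical combination can be sketched as a depth-1 tree whose leaves each fit their own logistic regression: the split predicate stays human-readable, while the leaves provide graded scores. This is a toy 1-D sketch, not a production scorecard.

```python
import math

def fit_logreg(data, epochs=500, lr=0.5):
    """Tiny 1-D logistic regression fitted by stochastic gradient descent.
    data: list of (x, y) pairs with y in {0, 1}."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            p = 1 / (1 + math.exp(-(w * x + b)))
            w += lr * (y - p) * x
            b += lr * (y - p)
    return lambda x: 1 / (1 + math.exp(-(w * x + b)))

def tree_with_logreg_leaves(data, split):
    """Depth-1 decision tree: route points by a readable predicate,
    then score each side with its own logistic regression leaf."""
    left = [(x, y) for x, y in data if split(x)]
    right = [(x, y) for x, y in data if not split(x)]
    left_model, right_model = fit_logreg(left), fit_logreg(right)
    return lambda x: left_model(x) if split(x) else right_model(x)
```

A reviewer can always answer "which leaf handled this case, and which features drove its score?", which is exactly the auditability the sector requires.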

To overcome this limitation, Kasneci and his team recently published an approach that employs a different split criterion for the node models. Instead of separating the classes immediately, the approach makes them separable. The model yields results at a tree depth of 1 that are comparable in quality to the separation results of the old approach at a tree depth of 3.

Another new technique that allows for instance-based explainability relies on deep neural networks. Kasneci's approach computes a gradient at each node of the neural network and aggregates those gradients in a forward manner to form a linear model in the input variables for a given point. This method, termed LICON, makes it possible to find the features with strong influence, as well as the direction of their influence. Applied to the MNIST dataset of handwritten digits, the outcome allowed the inspector to reconstruct why samples were classified the way they were.

Fig. 3 Kasneci's instance-based character recognition model as an example of instance-based explainability. Blue areas strengthen recognition, red areas strengthen rejection
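The core of such gradient-based attribution can be illustrated on a tiny network: the gradient of the output with respect to the inputs yields a local linear model whose coefficients indicate each feature's influence and its sign. This generic sketch illustrates the idea only; it is not the LICON aggregation itself.

```python
import math

def forward(x, W1, b1, w2, b2):
    """Two-layer network: sigmoid hidden layer, linear output.
    Returns the output and the hidden activations."""
    h = [1 / (1 + math.exp(-(sum(wi * xi for wi, xi in zip(row, x)) + bi)))
         for row, bi in zip(W1, b1)]
    out = sum(w * hi for w, hi in zip(w2, h)) + b2
    return out, h

def input_gradient(x, W1, b1, w2, b2):
    """Analytic gradient of the output w.r.t. each input feature:
    a local linear model around x, read as feature influence and sign."""
    _, h = forward(x, W1, b1, w2, b2)
    return [
        sum(w2[k] * h[k] * (1 - h[k]) * W1[k][j] for k in range(len(h)))
        for j in range(len(x))
    ]
```

For an image classifier, plotting these per-input coefficients over the pixels gives exactly the kind of blue/red influence map shown in Fig. 3.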

Finally, Kasneci briefly describes a method for counterfactual explainability that was published just this year. The method finds model-agnostic counterfactual explanations, generates counterfactual explanations consistent with the input data distribution, and derives similarity information from the data as well.

Kasneci closes with a summary in which he notes a variety of applications for artificial intelligence within the credit information business, underlines the importance of collaborating with domain experts for continuous quality assurance, and points to the necessity of improving the explainability of AI to broaden its acceptance in society.