BNP Paribas Personal Finance has deployed the Neo4j graph database to improve fraud detection in consumer loans.
BNP Paribas Personal Finance, a subsidiary of the BNP Paribas group specializing in consumer loans, offers split payment services, especially on e-commerce sites. To improve the detection of fraudulent application files in these services, the company experimented with Neo4j, a graph database, and put the technology into production once it had delivered convincing results. A first assessment of the project was presented at the Big Data & IA 2022 trade show.
Split payment services, which allow a payment to be spread over three or four installments, are frequently targeted by fraud networks. "They do not reuse information (names, phone or credit card numbers, etc.) from one file to another. They change it, which means traditional blacklist approaches no longer work," explains Mehdi Barchouchi, head of innovation, data and tools in the French risk department at BNP Paribas Personal Finance. To improve the detection of fraudulent files, it is therefore sometimes necessary to link files that have no information in common. This came with a major constraint: scoring had to be done in real time in order to respond immediately to the customer submitting the file.
Excellent use case
The performance of traditional relational databases is inadequate for such operations. Detecting relationships between files requires multiplying joins, which is a particularly expensive operation. "The problem lies in the depth of the networks," says Mehdi Barchouchi. BNP Paribas Personal Finance therefore decided to test a graph database, a technology well suited to structures in which many pieces of data are connected to one another. "This is a perfect use case," according to Édouard Tabary, head of the innovation and data science team at the BNP Paribas Personal Finance scoring centre.
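To make the cost of depth concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table file_attr, its columns and the sample rows are hypothetical and are not taken from the BNP Paribas Personal Finance system. The sketch only illustrates that, in a relational model, each extra hop between files adds two more self-joins to the query.

```python
import sqlite3

# Hypothetical toy data: one row per (file, attribute value) pair.
ROWS = [
    ("F1", "phone:0601"),
    ("F2", "phone:0601"),
    ("F2", "card:4242"),
    ("F3", "card:4242"),
    ("F3", "email:x@example.com"),
    ("F4", "email:x@example.com"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_attr (file_id TEXT, attr TEXT)")
conn.executemany("INSERT INTO file_attr VALUES (?, ?)", ROWS)


def linked_files(conn, start, hops):
    """Return (files reached by a walk of exactly `hops` shared-attribute
    steps from `start`, number of JOINs the generated query needed)."""
    sql = f"SELECT DISTINCT b{hops}.file_id FROM file_attr a0"
    for i in range(1, hops + 1):
        if i > 1:
            # Fetch every attribute of the file reached at the previous hop.
            sql += (f" JOIN file_attr a{i-1}"
                    f" ON a{i-1}.file_id = b{i-1}.file_id")
        # Move to another file carrying the same attribute value.
        sql += (f" JOIN file_attr b{i}"
                f" ON b{i}.attr = a{i-1}.attr"
                f" AND b{i}.file_id <> a{i-1}.file_id")
    sql += " WHERE a0.file_id = ?"
    rows = conn.execute(sql, (start,)).fetchall()
    return sorted(r[0] for r in rows), sql.count("JOIN")


for depth in (1, 2, 3):
    files, joins = linked_files(conn, "F1", depth)
    print(depth, files, joins)
```

In this toy data, F1 and F4 have no attribute in common, and F4 only shows up at depth 3, by which point the query already contains five joins. A graph database avoids regenerating ever-longer join chains by following relationships directly.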
Neo4j's solution was then selected, and a pilot was carried out in 2020 with a reduced dataset on an on-premises server. The team first built the graph data model from tabular data, then gradually refined it to reach the target model, notably using machine learning algorithms. Finally, it created indicators based on the value of the predictions. "We obtained a very effective model: by applying it to a small population, we covered almost all fraud networks," says Édouard Tabary.
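The first step described above, going from tabular data to a graph, can be sketched as follows. This is an illustration only: the column names, the sample values and the choice of a bipartite layout (file nodes on one side, attribute-value nodes on the other) are assumptions, not the model actually built by the team.

```python
from collections import defaultdict

# Hypothetical tabular records; columns and values are illustrative only.
APPLICATIONS = [
    {"file_id": "F1", "phone": "0601", "email": "a@example.com", "device": "D1"},
    {"file_id": "F2", "phone": "0601", "email": "b@example.com", "device": "D2"},
    {"file_id": "F3", "phone": "0777", "email": "c@example.com", "device": "D2"},
]


def build_graph(records, key="file_id"):
    """Turn rows into an undirected bipartite graph:
    ("file", id) nodes linked to (column, value) nodes."""
    graph = defaultdict(set)
    for row in records:
        file_node = ("file", row[key])
        graph[file_node]  # ensure files without usable attributes still appear
        for column, value in row.items():
            if column == key or value in (None, ""):
                continue
            attr_node = (column, value)
            graph[file_node].add(attr_node)
            graph[attr_node].add(file_node)
    return graph


graph = build_graph(APPLICATIONS)

# Attribute values attached to more than one file are the links between files.
shared = {
    node: files
    for node, files in graph.items()
    if node[0] != "file" and len(files) > 1
}
```

Here F1 and F3 have no value in common, yet both are tied to F2, through a phone number on one side and a device on the other. Once the rows are stored as nodes and relationships, that indirect link becomes a path that can be followed.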
Édouard Tabary, head of the innovation and data science team at the BNP Paribas Personal Finance scoring centre: "We can track files that share no information but are connected by a path."
A complex project due to its real-time dimension
The next step, industrializing the model, began in early 2021 and ended with the go-live in early 2022. During this phase, the algorithm continued to be optimized, especially for real-time use, but most of the time was devoted to designing a suitable architecture, which is also hosted in-house. "Along the way, we built a system to call the Neo4j infrastructure in real time, but this system still includes a transactional part in order to preserve our ability to examine the data and improve the algorithm," notes Édouard Tabary.
Data now reaches the graph database directly and can be compared instantly with all past applications, with a response within a few milliseconds. "This is how we can track two files that have no information in common but have a path connecting them," explains Édouard Tabary. Once the groupings are identified, the team can look for signs of potential fraud, notably by leveraging Neo4j's similarity links. The goal is to have as few false positives as possible, but it is also necessary to understand the path that leads a file to receive a high fraud risk score, in order to respond to the customer whose file has been rejected. "We have an obligation to make the model explainable, to understand the signs of risk," underlines Mehdi Barchouchi. "We know how to trace and capture the context, graph by graph. Fingerprinting the data provides specific context, and we obtain increasingly precise models. We can predict fraud by looking at the neighborhood and find out what caused the prediction, while taking legal data retention periods into account." The team also makes sure that the model is fair, in order to avoid discriminatory bias.
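The idea of a path that both connects two files and explains the link can be illustrated with a plain breadth-first search. This is a stand-in written in standard Python, not Neo4j's implementation or the team's scoring algorithm. The graph, node labels and function name are hypothetical.

```python
from collections import deque

# Hypothetical adjacency list: files linked to the attribute values they carry.
GRAPH = {
    "file:F1": {"phone:0601"},
    "file:F2": {"phone:0601", "device:D2"},
    "file:F3": {"device:D2"},
    "file:F4": {"phone:0999"},
    "phone:0601": {"file:F1", "file:F2"},
    "device:D2": {"file:F2", "file:F3"},
    "phone:0999": {"file:F4"},
}


def connecting_path(graph, source, target):
    """Breadth-first search: return a shortest path as a list of nodes,
    or None when the two files are not connected."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parents to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


path = connecting_path(GRAPH, "file:F1", "file:F3")
print(" -> ".join(path))
```

The returned path (F1, a shared phone number, F2, a shared device, F3) is what makes the result readable by a person: it shows which intermediate elements tie a new file to earlier ones, which is the kind of information needed to justify a high risk score. For F1 and F4, which are not connected, the function returns None.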
A model ready to evolve
Other entities of the BNP Paribas group use Neo4j for incident resolution, primarily in IT. However, the technology is a first for the BNP Paribas Personal Finance team, which consists of two business experts, two data scientists and a small group of developers and managers. For this project, which closely links IT and risk management, this cross-functional team benefited from the vendor's support to write the model and feed it with data. "We are used to tabular formats and had to learn," says Mehdi Barchouchi. The challenge today is to expand this knowledge base in order to explore other use cases and reach other populations.
The team has already drawn some lessons from this experience. According to Mehdi Barchouchi, adopting such an approach requires knowing your data well and having good examples of network fraud cases that can be found in the graph. He also recommends dedicating time to it, both during the discovery phase and once in production. "This is just the beginning of the project's life. We need to be able to improve the model in order to react to the activities of fraudsters," insists the head of innovation, data and tools in the French risk department. He adds that performance should also be measured with indicators.
Article written by:
Aurélie Chandeze, CIO Deputy Editor-in-Chief