Architecture for Fraud Detection with GraphDB
Graph databases provide new and exciting opportunities for advanced fraud detection and prevention systems. By design, a graph database gives you a single data store on which you can build both transactional and analytical systems, capable of uncovering fraud rings and preventing fraud in real time.
Sophisticated fraud perpetrated using synthetic identities is nearly impossible to detect with linear pattern matching on individual data points. It is the relationships between these data points that offer the opportunity to squash fraud before a transaction takes place, rather than detecting and contesting it after the fact. Unlike other solutions, graph databases provide this capability.
On every engagement, our customers' first question to us is: "How can I tie this into my existing application infrastructure?" Here I will present two popular production architectures for graph-powered anti-fraud that we have deployed at XN Logic.
Before we can make use of the graph, we must load some data. To support real-time fraud prevention, we need to load data continuously either by having it pushed into a graph API, or by reaching out and pulling it in from some other system.
One of the simplest forms of data loading is writing agents that listen to events from existing applications and push new data into the graph.
e.g. PUT http://server.com/model/customer {"name": "Joe Bloggs", "SSN": "050-12-3456"}
In this scenario, you need to be able to not just push individual nodes and relationships one-by-one, but have the ability to load entire trees of information in one request, such that you can ingest both large and small amounts of data as quickly as possible.
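To make the "entire tree in one request" idea concrete, here is a minimal Python sketch that builds a single PUT request carrying a customer together with nested related records. The base URL is borrowed from the example above; the nested keys (`addresses`, `accounts`, `cards`) and their values are hypothetical relationship names chosen for illustration, not a documented schema. The request is only constructed here, not sent.

```python
import json
from urllib.request import Request

# Assumed base URL, borrowed from the PUT example above.
BASE_URL = "http://server.com/model"


def build_put(model: str, document: dict) -> Request:
    """Build (but do not send) a PUT request carrying one JSON document."""
    body = json.dumps(document).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/{model}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# One customer plus related records, expressed as a single nested tree.
# The relationship names below are illustrative assumptions.
customer_tree = {
    "name": "Joe Bloggs",
    "SSN": "050-12-3456",
    "addresses": [{"street": "1 Main St", "city": "Springfield"}],
    "accounts": [
        {"number": "ACC-1", "cards": [{"last4": "4242"}]},
    ],
}

request = build_put("customer", customer_tree)
# To actually send it: urllib.request.urlopen(request)
```

Sending the whole tree at once means the agent makes one round trip per event instead of one per node or relationship.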
The second form of data import, pulling, requires you to define an action in your data model that opens a file and performs an ETL operation to bring the data into your graph.
Large files on a SAN are one source we have seen, but others include direct database access (via JDBC in our stack), where tables are queried directly with SQL; proprietary APIs, most of which have JVM drivers; and, thankfully, web services.
Once defined, these import actions can be triggered as needed via an API call or scheduled to run periodically.
e.g. POST /model/customer_data_source/id/1/action/import_data
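As a rough illustration of what such an import action does internally, the sketch below extracts rows from a CSV source and transforms them into nodes and relationships ready for loading. The column names, the `HAS_PHONE` relationship, and the `import_customers` function are all assumptions made for this example; a real action would read from whatever file or table your data source exposes.

```python
import csv
import io


def import_customers(source):
    """Extract rows from a CSV file object and transform them into
    customer and phone nodes plus HAS_PHONE relationships."""
    nodes, edges = {}, []
    for row in csv.DictReader(source):
        customer_id = f"customer:{row['ssn']}"
        phone_id = f"phone:{row['phone']}"
        nodes[customer_id] = {"type": "customer", "name": row["name"], "SSN": row["ssn"]}
        # A phone number shared by several customers becomes a single node.
        nodes.setdefault(phone_id, {"type": "phone", "number": row["phone"]})
        edges.append((customer_id, "HAS_PHONE", phone_id))
    return nodes, edges


# In-memory stand-in for a file on disk; the rows are made-up sample data.
sample = io.StringIO(
    "name,ssn,phone\n"
    "Joe Bloggs,050-12-3456,555-0100\n"
    "Jane Doe,050-65-4321,555-0100\n"
)
nodes, edges = import_customers(sample)
```

Note that the two sample customers end up connected through one shared phone node, which is exactly the kind of relationship the fraud traversals later rely on.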
With your data loaded, we can now reap oodles of value from your anti-fraud graph - be it programmatically, visually, or my favourite: both!
The most popular architecture is to treat the graph as another decision point in your existing application. To support this, we define the graph traversals required to detect fraud and expose these via one or more RESTful actions.
e.g. GET /is/loan_application/id/123/report/fraud_score
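To give a feel for the kind of traversal that could sit behind a fraud score report, here is a toy Python sketch over an in-memory graph. It walks from one application to its identifiers and back out to every other application sharing them. The sample links, the `fraud_score` function, and the scoring rule (one point per shared identifier) are illustrative assumptions, not the scoring used in any deployed system.

```python
from collections import defaultdict

# Made-up sample data: (application, identifier) links.
LINKS = [
    ("app:123", "ssn:050-12-3456"),
    ("app:123", "phone:555-0100"),
    ("app:456", "phone:555-0100"),
    ("app:789", "ssn:050-12-3456"),
    ("app:789", "phone:555-0100"),
    ("app:999", "phone:555-0199"),
]


def build_index(links):
    """Index the links in both directions for two-hop traversal."""
    by_app, by_identifier = defaultdict(set), defaultdict(set)
    for app, ident in links:
        by_app[app].add(ident)
        by_identifier[ident].add(app)
    return by_app, by_identifier


def fraud_score(app, links):
    """Count, per other application, how many identifiers it shares with `app`."""
    by_app, by_identifier = build_index(links)
    shared = defaultdict(int)
    for ident in by_app.get(app, ()):
        for other in by_identifier[ident]:
            if other != app:
                shared[other] += 1
    return {
        "application": app,
        "score": sum(shared.values()),
        "linked": dict(sorted(shared.items())),
    }


report = fraud_score("app:123", LINKS)
```

The calling application only sees the resulting report, so the traversal can grow more sophisticated without changing the integration.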
2) Decisions and Visualization, together.
While detecting and preventing fraud in the backend is powerful, some like to take that fraud flag and present the current transaction to a fraud analyst for review. JavaScript graph visualization libraries like KeyLines allow us to rapidly provide a valuable view to the analyst so they can make the final call.
To enable this view to be embedded in their application, we provide an overview of the transaction and all pertinent relationships via a JSON document that can be immediately consumed by KeyLines.
e.g. GET /is/loan_application/id/123/report/network_view
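A network view report of this kind could be assembled roughly as follows. This Python sketch collects everything within two hops of one application and serializes it as a JSON document of nodes and links. The field names (`focus`, `nodes`, `links`, `kind`) are generic placeholders and are not the KeyLines schema; check the library's documentation for the exact item format it expects. The sample links are made-up data.

```python
import json

# Made-up sample data: (application, identifier) links.
LINKS = [
    ("app:123", "phone:555-0100"),
    ("app:456", "phone:555-0100"),
    ("app:999", "phone:555-0199"),
]


def network_view(app, links):
    """Return the nodes and links within two hops of `app` as a JSON string."""
    # Hop 1: identifiers attached to the application under review.
    identifiers = {ident for a, ident in links if a == app}
    # Hop 2: every link that touches one of those identifiers.
    kept = [(a, ident) for a, ident in links if ident in identifiers]
    node_ids = sorted({app} | {n for pair in kept for n in pair})
    return json.dumps(
        {
            "focus": app,
            "nodes": [{"id": n, "kind": n.split(":", 1)[0]} for n in node_ids],
            "links": [{"from": a, "to": ident} for a, ident in kept],
        },
        indent=2,
    )


document = network_view("app:123", LINKS)
```

Keeping the view limited to the neighbourhood of the flagged transaction keeps the payload small and the analyst's picture uncluttered.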
The key point here is that it is possible to build and deploy real-time network analytics into your anti-fraud arsenal. To learn more about developing and deploying applications on the graph, drop me an email.