Author: Brandon Smith
November 2020 – February 2021
FNA, a leader in supervisory and regulatory technology (Suptech and Regtech), recently took part in the Financial Conduct Authority’s Digital Sandbox Pilot innovation project. FNA entered this competition under the Fraud and Scams use cases and sought to bring a new approach to augmenting fraud and money laundering monitoring, supported by network science enhanced with machine learning.
Most compliance programs must first and foremost remain compliant with a considerable amount of regulation, which generates a vast number of unproductive or low-risk cases. FNA’s approach therefore considered how best to augment current monitoring at financial corporations in a way that offers a new view of how risk permeates through networks and how anomalies are identified. These outcomes can be described statistically and can therefore feed back into existing centralized monitoring systems in a scientific manner.
Above: 3,225 entities of interest emanating from a single point of payments in the FCA’s sandbox dataset. FNA’s approach reduces this complex network to 86 entities of interest based on enhanced network-based risk scores and neural-network-based anomaly detection.
- Orange Entities: Entities that declare themselves to be in one industry segment but behave as another
- Entity Size: The larger the entity, the more risk-laden it is, based on its relationships to others in the extended financial payment network (pictured: top 75% of the enhanced risk score results)
- Entity Shape: The projected organizational segment determined by behavior found in the neural network results
Solution Part 1. Relationship Based Risk Scoring
Using network science, FNA determines that pre-existing risk scores for a given focal entity could be augmented and improved by determining how risk scores transfer to other members of a given payment network and therefore increase or decrease the risk of a focal entity.
PageRank was initially used by Google Search to rank web pages. It is defined recursively: a page linked to by highly ranked pages receives a high rank itself. Applying the same algorithm to the network of businesses, and biasing the rank of each node by its pre-existing risk score, yields a way of prioritizing businesses that also accounts for the connections between them. If a business has a low risk score but is connected to businesses with high risk scores, its priority under the personalized PageRank algorithm increases, and it is placed ahead of other businesses with a similar risk score. In this way, the investigation of individual businesses can be prioritized.
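The idea can be illustrated with a minimal sketch of personalized PageRank by power iteration. This is not FNA's implementation: the edge schema, the choice to treat payment links as undirected, the damping factor of 0.85, and the toy entities A, B, D and E are all assumptions made for illustration.

```python
from collections import defaultdict


def personalized_pagerank(edges, prior_risk, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank whose teleport vector is biased by prior risk.

    edges: iterable of (entity_a, entity_b, weight) payment links (hypothetical schema).
    prior_risk: dict mapping every entity to a non-negative pre-existing risk score.
    """
    nodes = sorted(prior_risk)
    total_prior = sum(prior_risk.values())
    teleport = {n: prior_risk[n] / total_prior for n in nodes}

    # Assumption: links are treated as undirected so risk can flow both ways.
    neighbours = defaultdict(dict)
    for a, b, w in edges:
        neighbours[a][b] = neighbours[a].get(b, 0.0) + w
        neighbours[b][a] = neighbours[b].get(a, 0.0) + w
    strength = {n: sum(neighbours[n].values()) for n in nodes}

    rank = dict(teleport)
    for _ in range(max_iter):
        # Rank held by isolated entities is redistributed via the teleport vector.
        dangling = sum(rank[n] for n in nodes if strength[n] == 0)
        new_rank = {n: (1 - damping + damping * dangling) * teleport[n] for n in nodes}
        for n in nodes:
            for m, w in neighbours[n].items():
                new_rank[m] += damping * rank[n] * w / strength[n]
        change = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Toy data: B and D share the same low prior risk, but only B pays a high-risk entity.
prior = {"A": 0.9, "B": 0.1, "D": 0.1, "E": 0.1}
links = [("A", "B", 1.0), ("D", "E", 1.0)]
rank = personalized_pagerank(links, prior)

for entity in sorted(rank, key=rank.get, reverse=True):
    print(entity, round(rank[entity], 3))
```

In this toy network B ends up ranked above D even though their prior scores are identical, because B's only counterparty carries a high prior risk. That is the prioritization effect described above.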
Solution Part 2. Neural network based anomaly detection for segmentation non-conformity
Neural network analytics applied to identity management and signature identification can spot entities that purport to be one thing, but behave as another.
Individuals and organizations produce immense quantities of ‘exhaust’, either carelessly or inadvertently provided through daily activity. In image recognition we identify a subject starting from a photo. In this solution, the “image” of a company, individual, or group is defined by its behavioral fingerprint!
We are able to derive significant features from the fingerprint of each organization. We can use this fingerprint to identify entities whose declared segment does not match their expected behavior, a mismatch that may result either from business adaptation or from activity that betrays an attempt at signature reduction.
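The mismatch test can be sketched in a few lines. The source does not describe the neural network's architecture or its input features, so the example below deliberately swaps in a much simpler nearest-centroid classifier as a stand-in; the two-dimensional feature vectors, segment names, and entity IDs are hypothetical.

```python
import math
from collections import defaultdict


def flag_non_conforming(entities):
    """Return (entity_id, declared_segment, predicted_segment) for each mismatch.

    entities: list of (entity_id, declared_segment, feature_vector).
    A nearest-centroid classifier stands in for the neural network here.
    """
    # Mean behavioral fingerprint of each declared segment.
    sums, counts = {}, defaultdict(int)
    for _, segment, features in entities:
        acc = sums.setdefault(segment, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[segment] += 1
    centroids = {s: [v / counts[s] for v in acc] for s, acc in sums.items()}

    flagged = []
    for entity_id, declared, features in entities:
        # Predicted segment = segment whose centroid is closest to this fingerprint.
        predicted = min(centroids, key=lambda s: math.dist(features, centroids[s]))
        if predicted != declared:
            flagged.append((entity_id, declared, predicted))
    return flagged


# Hypothetical fingerprints: (share of high-value payments, share of small payments).
entities = [
    ("shop_1", "retail", [0.10, 0.90]),
    ("shop_2", "retail", [0.20, 0.80]),
    ("shop_3", "retail", [0.15, 0.85]),
    ("shop_4", "retail", [0.90, 0.10]),  # declares retail, behaves like wholesale
    ("depot_1", "wholesale", [0.90, 0.10]),
    ("depot_2", "wholesale", [0.80, 0.20]),
    ("depot_3", "wholesale", [0.85, 0.15]),
]
flagged = flag_non_conforming(entities)
print(flagged)
```

Only the entity whose fingerprint sits closer to another segment's typical behavior than to its own declared segment is flagged. A production model would learn a far richer fingerprint, but the decision rule (predicted segment differs from declared segment) has the same shape.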
FCA Sandbox Results: Neural network based anomaly detection for segmentation non-conformity
Relationship Based Business Failure Risk Scoring with Non-Conforming Segment Anomaly Detection
Left: 4-degree network emanating from single entity. This entity is the “focal” ID of a compliance case.
- Distinct Entities: 3225
- Number of Network Degrees: 4
- Individual Transactions: 411,776
- Total Value: $37,018,941.45
- Business Segments: 17
With existing / traditional business risk scoring…
The top 10% of entities by risk score in the FCA entity data (568 entities)
- The size of each entity is indicative of their traditional risk score.
- Difficult to discern differences between 90% and 100%
- 13% of the entities remain to investigate; all are traditionally high risk
Results with FNA’s Approach
Of the 3,225 entities originally in the extended payments network, FNA reduced the number of concerning entities by up to 99%, leaving 28 to 86 entities for further investigation. Compared with the 568 entities remaining in just the top 10% of entities of concern under traditional risk scoring, this is a reduction of up to 95%.
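The reduction figures follow directly from the entity counts reported here. As a quick check, the short calculation below uses only numbers stated in this article; the helper name is arbitrary.

```python
def reduction_pct(before, after):
    """Percentage reduction when a list shrinks from `before` to `after` entities."""
    return 100.0 * (before - after) / before


network_total = 3225     # entities in the 4-degree payments network
traditional_top = 568    # entities left by traditional top-10% risk scoring

for remaining in (28, 86):
    print(
        f"{remaining} entities: "
        f"{reduction_pct(network_total, remaining):.1f}% fewer than the full network, "
        f"{reduction_pct(traditional_top, remaining):.1f}% fewer than traditional scoring"
    )
```

The 99% and 95% figures correspond to the 28 entities with the highest enhanced-risk delta; for the wider set of 86 entities the same calculation gives somewhat smaller reductions.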
Left Panel: All KYC or risk score variable pieces of information listed for each selected entity
Center Panel: Top 28 nodes with the highest delta between the originally provided entity risk score and the relationship-based enhanced risk score
Network Visualization: top 86 entities connected to the original focal entity based on the top 90% of the new or enhanced risk scoring. 28 nodes are of high enhanced risk score concern and there are 9 entities of concern based on segmentation non-conformity.
- Network of interest with “focalID” of the case selected. All visual elements are configurable to any data point.
- Line Width = Average Transaction Amount
- Entity Color: Blue = Conforms to Segment; Orange = does not conform (anomaly)
- Shape of Entity = declared industry segment
The FNA Compliance and National Security Team would like to highlight the achievements of Riccardo Marcaccioli, Matteo Neri, and Anthony Hernandez (Data Scientists at FNA) in the completion of this project.