ML For Regulatory Reporting Is Looking Like An Ace In The Hole: To Gain Data Quality And Reduce Effort, It Could Be The Next Smart Play

By: Abhishek Awasthi, Global Product Manager AI/ML and Head of EMEA Client Engagement, AxiomSL, Robert Lee, Executive Director, North America Client Engagement, AxiomSL, Alan Minkoff, Lead Data Scientist, AxiomSL, and Mordecai Weisel, Machine Learning Engineer, AxiomSL

Machine learning (ML), along with a panoply of artificial intelligence (AI) related technologies, is finally earning a permanent seat at the table in financial institutions, with beneficial uses emerging across functions: fraud detection, client onboarding, KYC, personal banking assistance, and more. More organizations are anteing up all the time.

Ante Up

Even in the more traditional risk and regulatory data and reporting arena, these technologies are finding favor. At a recent AxiomSL North America user conference, ML for regulatory reporting, along with AI and robotic process automation (RPA), accounted for 19.5 percent of responses when attendees representing G-SIBs, regional banks, and FBOs were asked to indicate which technologies their organizations would prioritize in the next two years.

Regulators themselves are also showing a keen interest, as seen in the October 2019 Bank of England and Financial Conduct Authority joint study, Machine Learning in UK Financial Services, and in their early 2020 launch of a public-private forum on AI.

But What Game To Join?

It is less clear, however, where ML for regulatory reporting should best play and, more importantly, whether those uses will constitute winning hands in regulators’ eyes.

When considering possible uses, data quality quickly rises to the top. With regulators demanding evidence of detailed progress against BCBS 239 data quality and governance principles, CDOs are looking for ways to bolster the trustworthiness of their data overall, and are laser-focused on presenting accurate data in their required financial and regulatory reporting.

A challenge before the industry, then, is to identify opportunities to leverage ML for regulatory reporting to benefit data quality and operating efficiency in a transparent, controlled manner. That means incorporating ML as early in the process as possible, where it can do the most good: at the data ingestion stage, before data enrichment for a particular regulatory process kicks off.

Holding A Bad Hand

Like perpetually drawing a weak hand of cards, organizations may often feel trapped on a proverbial hamster wheel as they struggle to keep up with daily demands to perform data quality checks on myriad datasets that feed their financial and regulatory processes. For example, on a dataset of only 250,000 loan records – smaller by far than a typical large dataset – it can easily take a bank’s 12-person team an hour each morning just to run basic data quality checks to flag erroneous data. The team then must adjust these records or push them back to source systems to be rectified. When repaired, of course, they have to be re-entered into the process.

And what about the errors not flagged? They may be caught by a validation process in the last leg, but they might also propagate into a regulatory submission. The cumulative time and effort spent on data quality, and the possible ramifications of regulatory scrutiny, don’t bear thinking about, but CDOs have to do exactly that.

ML for Regulatory Reporting Enters The Game

AxiomSL’s technology innovation team has many R&D projects under way. One of its recent experiments clearly demonstrates an attractive application of ML for regulatory reporting. Imagining how to optimize the daily data management process, the team sought to determine how ML could be used to identify anomalies in a dataset. Once anomalies were detected, the team wanted to take the test further by proposing a value to correct each one, and to do so in a manner that could be automated.

Dealing Anomaly Detection Cards

The team leveraged the ControllerView data management platform to run the ML scenarios. Tapping AxiomSL’s experience of working closely with clients on implementations, the team prepared a sample diversified loan dataset representing a cross section of retail and commercial loan vehicle types that are prevalent in most large financial institutions. Each test record encompassed a range of common loan data attributes contained in 52 columns.

They then collected a sample set of specific loan data errors that frequently surface in a bank’s normal data remediation process and introduced additional records into the dataset that contained the test anomalies as follows:

[Table: Typical anomalies AxiomSL injected into the loan dataset for the test of ML for regulatory reporting]

The composition of the test dataset is summarized in the following table. (The records containing injected errors are highlighted.)

[Table: Composition of the loan test dataset, with the injected error records highlighted]
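The injection step described above can be illustrated with a minimal Python sketch. It is not AxiomSL's test harness: the function `inject_anomalies`, the field names (`record_id`, `loan_type`, `rate_pct`, `term_months`), and all values are hypothetical, chosen only to show how known-bad records might be added to a clean dataset while keeping a ledger of what was changed.

```python
import random

def inject_anomalies(records, mutations, seed=0):
    """Return (dataset, injected).

    dataset  = copies of the clean records plus one mutated copy per mutation.
    injected = ledger of what was changed, kept so results can be validated later.
    All field names are hypothetical, for illustration only.
    """
    rng = random.Random(seed)                  # seeded so the test is repeatable
    dataset = [dict(r) for r in records]       # never modify the clean input
    injected = []
    for field_name, bad_value in mutations:
        base = rng.choice(records)             # pick a clean record to copy
        mutated = dict(base)
        mutated["record_id"] = f"TEST-{len(injected) + 1}"
        mutated[field_name] = bad_value
        dataset.append(mutated)                # added as an additional record
        injected.append({
            "record_id": mutated["record_id"],
            "field": field_name,
            "original": base[field_name],
            "injected": bad_value,
        })
    return dataset, injected

# Illustrative values only, not taken from the AxiomSL experiment.
clean = [
    {"record_id": "L-001", "loan_type": "mortgage", "rate_pct": 3.9, "term_months": 360},
    {"record_id": "L-002", "loan_type": "auto", "rate_pct": 5.2, "term_months": 60},
]
dataset, injected = inject_anomalies(
    clean, [("rate_pct", 39.0), ("term_months", -12)]
)
```

Keeping the ledger separate from the dataset is the point of the design: it lets a later step check whether each flagged field and proposed value matches what was actually altered.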

To provide a credible context and a means of validating the test, the team structured it to run within the context of one of the most important and complex U.S. Federal Reserve regulatory reports.

The FR Y-9C is a primary analytical tool used to monitor financial institutions between on-site inspections. The form contains more schedules than any other report in the FR Y-9 series and is the most widely requested and reviewed report at the holding company level. It covers BHCs, SLHCs, IHCs, and SHCs and collects basic financial data on a consolidated basis in the form of a balance sheet, an income statement, and detailed supporting schedules, including a schedule of off-balance-sheet items.

Using the reporting context of the FR Y-9C on the ControllerView platform, the team leveraged the data tracing capabilities of LineageView, AxiomSL’s platform-native data lineage module, to trace exactly where the erroneous records landed in FR Y-9C line items. This enabled the team to validate that the test records, including the error records, landed where expected, as they would in an actual reporting cycle.

Result: A Winning Hand

The ML anomaly detection/correction prediction algorithm produced excellent results, surpassing the team’s hoped-for outcomes. The test correctly:

  1. Identified all the records that had been modified with incorrect values.
  2. Highlighted the field most likely to have been a problem in each of those records. (Each highlighted field exactly matched the field that had been modified.)
  3. Returned a value that should replace the existing value to make the record not-anomalous, correctly predicting the value that had existed in the record before the mutation was introduced.
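The article does not describe the algorithm behind these three outcomes, so the following Python sketch is only a stand-in that shows the same three steps on toy data. It assumes a simple robust-statistics approach: per loan type, each numeric field is profiled by its median and median absolute deviation (MAD); a record is flagged when one field sits far from its group's median, that field is highlighted, and the group median is offered as a typical replacement value. The function names, field names, threshold, and data are all hypothetical, and a group median is merely a plausible value, not a prediction of the original one.

```python
from collections import defaultdict
from statistics import median

def fit_profiles(records, group_key, fields):
    """For each loan type, store every field's median and MAD."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec[group_key]].append(rec)
    profiles = {}
    for group, rows in grouped.items():
        profiles[group] = {}
        for name in fields:
            values = [row[name] for row in rows]
            med = median(values)
            mad = median([abs(v - med) for v in values])
            profiles[group][name] = (med, mad)
    return profiles

def check_record(record, profiles, group_key, threshold=5.0):
    """Return None for a normal record, else (field, proposed_value, score)."""
    worst = None
    for name, (med, mad) in profiles[record[group_key]].items():
        scale = mad if mad > 0 else 1e-9       # guard against zero spread
        score = abs(record[name] - med) / scale
        if worst is None or score > worst[2]:
            worst = (name, med, score)         # keep the most deviant field
    if worst is not None and worst[2] > threshold:
        return worst
    return None

# Illustrative data only: five mortgages, three auto loans, one injected error.
loans = [
    {"record_id": "M-1", "loan_type": "mortgage", "rate_pct": 3.8, "ltv_pct": 70},
    {"record_id": "M-2", "loan_type": "mortgage", "rate_pct": 3.9, "ltv_pct": 75},
    {"record_id": "M-3", "loan_type": "mortgage", "rate_pct": 4.0, "ltv_pct": 80},
    {"record_id": "M-4", "loan_type": "mortgage", "rate_pct": 4.1, "ltv_pct": 72},
    {"record_id": "M-5", "loan_type": "mortgage", "rate_pct": 4.2, "ltv_pct": 78},
    {"record_id": "TEST-1", "loan_type": "mortgage", "rate_pct": 39.0, "ltv_pct": 76},
    {"record_id": "A-1", "loan_type": "auto", "rate_pct": 5.0, "ltv_pct": 85},
    {"record_id": "A-2", "loan_type": "auto", "rate_pct": 5.2, "ltv_pct": 90},
    {"record_id": "A-3", "loan_type": "auto", "rate_pct": 5.4, "ltv_pct": 95},
]

profiles = fit_profiles(loans, "loan_type", ["rate_pct", "ltv_pct"])
findings = {}
for rec in loans:
    result = check_record(rec, profiles, "loan_type")
    if result is not None:
        findings[rec["record_id"]] = result

for record_id, (name, proposed, score) in findings.items():
    print(f"{record_id}: suspect field {name}, proposed value {proposed:.2f}")
```

Profiling each loan type separately mirrors the idea of comparing a record against common values for its own record type: a 5.4 percent rate is unremarkable for the toy auto loans even though it would stand out among the toy mortgages. Median and MAD are used here because they are not dragged far off by the injected error itself.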


In summary, based on the information in a typical loan dataset, the ML test made an intelligent comparison against common values for a specific loan record type and, most importantly, offered a value most likely to be correct.

In real life, this outcome would significantly shorten the lengthy data remediation cycle.

Cashing In

Embedding ML in the regulatory system in this manner opens powerful possibilities. Being able to streamline the entire data management process both engenders a virtuous data-quality circle and enhances operational efficiency – the stuff of most CDOs’ dreams.

This particular ML idea has the potential for wide beneficial application because it tackles data remediation at the point of ingestion into the financial and regulatory data management system. In this way, ML anomaly detection/correction prediction would enable firms to enhance the quality of data needed for multiple purposes while reducing the effort required to achieve it.

Lastly, this kind of outlier detection could also enable unforeseen analytical insights that inform planning and drive business growth.

A Smart Play

From a real-life perspective, today’s regulatory- and bank-driven controls would probably require people to ratify the ML-recommended correction value demonstrated by this experiment. Over time, the algorithm would get smarter and its accuracy more measurable. Firms would gain trust in the proposed values, and ultimately, perhaps, allow the ML-proposed correction to be made automatically.

In terms of working confidently in concert with ML for regulatory reporting, firms using AxiomSL’s ControllerView transparent platform would be ahead of the game. The platform systematically records and tracks such change activities and enables users to monitor them and, under proper access controls, easily inject human judgment when necessary.
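The ratify-then-automate progression described above can be sketched as a small review queue. This is a hypothetical illustration in Python and does not use or represent the ControllerView API: the `Proposal` and `ReviewQueue` classes, the `confidence` score, and the `auto_apply_threshold` setting are all assumptions. By default every ML-proposed correction waits for a named reviewer; automatic application happens only if a firm explicitly sets a threshold, and every decision is written to an audit log.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional

@dataclass
class Proposal:
    record_id: str
    field_name: str
    proposed_value: Any
    confidence: float  # hypothetical model score in [0, 1]

@dataclass
class ReviewQueue:
    # None means no change is ever applied without a named reviewer.
    auto_apply_threshold: Optional[float] = None
    audit_log: list = field(default_factory=list)

    def process(self, proposal, records, reviewer=None, approved=False):
        """Apply the proposal if ratified (or auto-approved); log the decision."""
        auto = (
            self.auto_apply_threshold is not None
            and proposal.confidence >= self.auto_apply_threshold
        )
        if reviewer is not None:
            applied = approved        # a human decision always takes precedence
            decided_by = reviewer
        else:
            applied = auto
            decided_by = "auto" if auto else "pending"
        old_value = records[proposal.record_id][proposal.field_name]
        if applied:
            records[proposal.record_id][proposal.field_name] = proposal.proposed_value
        self.audit_log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "record_id": proposal.record_id,
            "field": proposal.field_name,
            "old_value": old_value,
            "proposed_value": proposal.proposed_value,
            "confidence": proposal.confidence,
            "decided_by": decided_by,
            "applied": applied,
        })
        return applied

# Illustrative usage with made-up values.
records = {"TEST-1": {"loan_type": "mortgage", "rate_pct": 39.0}}
proposal = Proposal("TEST-1", "rate_pct", 4.05, confidence=0.97)
queue = ReviewQueue()
queue.process(proposal, records)                                       # stays pending
queue.process(proposal, records, reviewer="analyst_1", approved=True)  # ratified
```

Letting a reviewer's decision override the automatic path keeps human judgment in control even after a firm has gained enough trust to switch on auto-apply.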

Of course, financial institutions would expect to defend any ML-based process they embed into their regulatory systems; likewise, regulators would want to inspect such algorithms and process logic. However, when encompassed within the transparent data integrity and control environment that ControllerView provides, it seems likely that firms could reasonably expect such an ML capability to withstand scrutiny and gain regulatory approval.

Meanwhile, as innovation continues, the industry is carefully watching what regulators are saying about ML and looking forward to more specific guidance.

An Ace-In-The-Hole

Exciting as it is, the anomaly detection/correction prediction use-case presented here is only one example of how ML for regulatory reporting and data management might play very well. AxiomSL’s R&D team is working on many other remarkably interesting possibilities to use AI-related technologies to improve the regulatory reporting process for financial institutions.

But, from AxiomSL’s point of view, ML certainly looks like an ace-in-the-hole – an application with the potential to improve data quality and operating efficiency at the very front end of the process in a transparent, controlled environment. And the flexible, technologically forward ControllerView platform is poised to natively incorporate such innovations seamlessly.

It seems that bets on the usefulness of ML and other AI-related technologies in this space will pay off, perhaps sooner than anticipated.

We look forward to exploring the possibilities around ML for regulatory reporting with you.
