Semantic Shortcuts to Digital Regulatory Reporting

If you’ve been following the work of the FCA’s RegTech team over the last few years, you’ll be aware of the cutting-edge research they (and a cross-industry project team) have been doing to explore the art of the possible for Digital Regulatory Reporting (DRR). In collaboration with the FCA, we convened an industry roundtable in December last year to explore the benefits and challenges of DRR for regulated firms, and banks in particular.

During our initial event, it became obvious that we had only scratched the surface of the complexities, challenges and promise posed by DRR, and that there was appetite to continue the industry conversation and explore several topics in more depth. We therefore decided to hold a follow-on event in June 2019 to focus on the actions that banks can take to realise some of the benefits of DRR even sooner.

The key premise for this session was that by using granular data models in combination with a semantic data layer, banks will not have to wait until the physical and operational infrastructure for DRR is in place before they can streamline aspects of regulatory reporting, making it more efficient, less risky, less costly and more flexible.

Separating the DRR layers

We can understand the future DRR process as taking place across three layers:

  • Physical – the technological infrastructure and software
  • Operational – the operating model / roles and responsibilities / governance
  • Abstract – the conceptual data model

The DRR pilot has focused on all three layers – and it is likely that the timeline for the development of the different layers will differ, not least because of some of the legal and policy issues identified as a result of the FCA’s DRR pilot project.

For the physical layer, FCA DRR has explored the feasibility of producing machine-executable code from a software perspective, including this in smart contracts disseminated via distributed ledger technology. Firms would then use their own node of the blockchain to demonstrate compliance with the reporting requirements.

With regard to the operational layer, the DRR team looked to industry initiatives such as AuRep (Austrian Reporting Services), a unique regulatory reporting utility co-owned by the largest banks in Austria and developed in conjunction with the Austrian Central Bank. Whilst steps have been taken in both of these areas, it is clear that this is a complex topic and that there are many issues still to wrestle with, both operationally and in regard to how DRR will be physically implemented.

One of the key findings from the DRR pilot was the need for further investigation into creating granular data models for regulatory reporting for the end-to-end digital process. This would mean that data from source systems could be seamlessly processed all the way through to the regulator.

Given that semantic technology and data models operate at the abstract layer, creating a conceptual / abstract data model could realise some of the benefits of DRR today – consistency, efficiency, de-duplication, simplification –  rather than depending on the full end-to-end physical and / or operational layers being in place.

Creating a granular, semantically driven data model that sits on top of the many fragmented and siloed data sources will not only enable seamless regulatory reporting once the physical and operational layers are in place. It can also help to solve a number of challenges that large financial firms face today: fulfilling their regulatory reporting obligations, responding to regulatory change, managing their data and achieving congruence across their systems.

What do we really mean by granular when it comes to data models?

The granularity of a data model can mean different things to different people and to illustrate this point, we asked our roundtable participants to guess how many individual data items would need to be associated with a vanilla retail mortgage to fulfil all the regulatory reporting requirements for this product. Guesses ranged from 70 to 380 – and an illustrative data model such as that used by BearingPoint has between 80 and 180 such data attributes. Granular therefore means the lowest possible ‘atomic’ level that a product (or counterparty or transaction) can be broken down to.

Findings from the DRR project suggest that regulatory reporting may not always require the lowest level of data to be reported – but that some sort of intermediate aggregation may also be required. Nevertheless, having modelled, identified and stored data at the lowest atomic level will significantly streamline today’s regulatory reporting processes and ready firms for tomorrow’s fully digital process.
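
As a minimal sketch of why atomic-level storage helps, the Python below holds a handful of attributes per mortgage and derives an intermediate aggregate from them on demand. The attribute names, the loan-to-value threshold and the figures are illustrative assumptions only; they are not taken from BearingPoint’s model or from any regulatory return.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MortgageRecord:
    """One loan held at atomic level (field names are hypothetical)."""
    loan_id: str
    outstanding_balance: float  # GBP
    property_value: float       # GBP
    interest_rate_type: str     # e.g. "fixed" or "variable"

    @property
    def loan_to_value(self) -> float:
        # Derived from atomic attributes rather than stored separately.
        return self.outstanding_balance / self.property_value


def balance_by_ltv_band(records, threshold=0.75):
    """Aggregate atomic records into total outstanding balance per LTV band."""
    totals = defaultdict(float)
    for record in records:
        band = "high_ltv" if record.loan_to_value > threshold else "low_ltv"
        totals[band] += record.outstanding_balance
    return dict(totals)


# Made-up sample data for illustration.
loans = [
    MortgageRecord("M-001", 150_000.0, 250_000.0, "fixed"),
    MortgageRecord("M-002", 180_000.0, 200_000.0, "variable"),
    MortgageRecord("M-003", 90_000.0, 300_000.0, "fixed"),
]

print(balance_by_ltv_band(loans))
```

Because the aggregate is computed from the atomic records, a change in the reporting threshold or banding only changes the function call, not the stored data.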

What do we mean by semantics?

The second building block to future-proof regulatory reporting processes is the use of semantic technology. Leona O’Brien from the Governance Risk and Compliance Competence Centre at University College Cork gave us a masterclass on semantics and explained that whilst the world of semantics can seem a little daunting, at the most basic level semantics is simply about meaning, and how we construct the meaning of the world around us.

We share an understanding of these meanings, which is another way of saying that semantics uses metadata to describe contextually relevant or domain-specific information about content, based on a shared metadata model. These models are also known as ontologies: sets of concepts and categories in a subject area or domain that show their properties and the relations between them.

“The Semantic Web isn’t inherently complex. The Semantic Web language, at its heart, is very, very simple. It’s just about the relationships between things”

Tim Berners-Lee

Using semantic techniques, we can model the domain of regulation and regulatory reporting and create an ontology that can better standardise data across a firm and even across the industry. Critically, an ontology will provide the ability to remove ambiguity between different data entities, because it relies on mapping both concepts and the nature of the relationships between them. These ontologies then lay the foundations for the use of sophisticated data analytics, natural language processing, smart contracts and other artificial intelligence techniques.
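
To make the idea of “concepts plus relationships” concrete, here is a small sketch that represents an ontology fragment as subject-relationship-object triples in plain Python. The concept and relationship names are hypothetical and chosen purely for illustration; a production ontology would normally be expressed in a standard such as RDF or OWL rather than in Python tuples.

```python
# Hypothetical ontology fragment: (subject, relationship, object).
TRIPLES = {
    ("RetailMortgage", "is_a", "Loan"),
    ("Loan", "is_a", "FinancialProduct"),
    ("RetailMortgage", "secured_by", "ResidentialProperty"),
    ("Loan", "has_party", "Counterparty"),
    ("Counterparty", "has_attribute", "LegalEntityIdentifier"),
}


def ancestors(concept, triples=TRIPLES):
    """Follow 'is_a' links transitively and return every broader concept."""
    found = []
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, relationship, obj in triples:
            if subject == current and relationship == "is_a" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


def related(concept, triples=TRIPLES):
    """Return the non-'is_a' relationships a concept has, including inherited ones."""
    concepts = [concept] + ancestors(concept, triples)
    return sorted(
        (relationship, obj)
        for subject, relationship, obj in triples
        if subject in concepts and relationship != "is_a"
    )


print(ancestors("RetailMortgage"))
print(related("RetailMortgage"))
```

Because the relationship type is explicit, a query can tell that a retail mortgage inherits a counterparty from the broader loan concept, which is the kind of disambiguation described above.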

So what ‘shortcuts’ can we take today?

Mark Shead from BearingPoint explains:

“Whilst the promise and benefits of DRR for the Financial Services industry are extremely significant, so are some of the practicalities and complications posed by implementation”

This is particularly so when we consider the aforementioned ‘Operational’ and ‘Physical’ layers that need to be in place, along with the need for the creation of a common granular data model. In fact, when you begin to think of the scale of change this will require, it may seem that we are all some way off from realising any of these benefits.

However, what we sought to explain in the roundtable is that this needn’t be the case. BearingPoint’s Abacus solution has been proven to operate at the most granular level of detail, so we already have a target against which firms can begin mapping their systems to the future-state DRR. Coupling this model with the semantic concepts and technology we have in place gives firms two benefits. They lay the foundations for the future, and they also gain a semantic model of their own systems. That model sheds new light on their data, helps them begin untangling some of the complexity that exists today, and provides a level of congruence across their estate that data warehousing has traditionally sought to achieve.

BearingPoint and Model Drivers have not only demonstrated the art of the possible with this approach but are now able to provide firms with practical steps they can take today to realise the benefits.

This is further explained in the diagram below:

  1. By creating a semantically enriched model (e.g. BearingPoint Abacus) we provide firms with a target granular input data model to connect their systems to reports
  2. Simultaneously we create a semantic data model of the firm’s systems, providing them with immediate benefits
  3. Creating domain-specific languages creates a vocabulary for a firm’s systems and domain
  4. Linking the models at a ‘meta-meta’ level creates an end-to-end model and congruence not only across the entire model but, by nature, across their entire modelled estate
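
The linking idea in these steps can be sketched in a few lines of Python. This is a loose illustration under stated assumptions, not a description of how Abacus or the Model Drivers tooling works: the target attributes, system names, field names and `concept:` labels are all hypothetical, and the link is made simply by matching on a shared concept label.

```python
# Step 1 (assumed): attributes of a target granular model, each tagged with a concept.
TARGET_MODEL = {
    "outstanding_balance": "concept:OutstandingBalance",
    "origination_date": "concept:OriginationDate",
    "borrower_lei": "concept:LegalEntityIdentifier",
}

# Step 2 (assumed): a semantic model of the firm's systems, tagging each field
# with the concept it represents.
SOURCE_SYSTEMS = {
    "core_banking": {
        "BAL_AMT": "concept:OutstandingBalance",
        "OPEN_DT": "concept:OriginationDate",
    },
    "crm": {
        "cust_lei": "concept:LegalEntityIdentifier",
        "cust_segment": "concept:CustomerSegment",
    },
}


def link_models(target, sources):
    """Step 4: link target attributes to source fields that share a concept."""
    links = {attribute: [] for attribute in target}
    for system, fields in sources.items():
        for field, concept in fields.items():
            for attribute, target_concept in target.items():
                if concept == target_concept:
                    links[attribute].append((system, field))
    return links


def unmapped_fields(target, sources):
    """List source fields whose concept is not used by the target model."""
    used = set(target.values())
    return [
        (system, field)
        for system, fields in sources.items()
        for field, concept in fields.items()
        if concept not in used
    ]


print(link_models(TARGET_MODEL, SOURCE_SYSTEMS))
print(unmapped_fields(TARGET_MODEL, SOURCE_SYSTEMS))
```

Even in this toy form, the same concept tags give both a route from source fields to reportable attributes and a view of which fields sit outside the reporting model.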

Next Steps

The tools, technology and model to provide this abstraction and connection functionality were originally demonstrated at the first TechSprint ‘Unlocking Regulatory Reporting’ and have been further elaborated on and proven both as part of the FCA’s second TechSprint and the subsequent work by the DRR team. The next step will be to further assess the viability of creating a semantically enabled granular data model on a wider scale, ideally through the same type of cross-industry collaboration that has been so successful in the FCA’s DRR project. 


With its RegTech product line, BearingPoint is a leading international provider of innovative regulatory and risk technology (RegTech and RiskTech) and services across the regulatory value chain for Financial Services. BearingPoint solutions has over 700 regulatory and technical enthusiasts.

Model Drivers technology enables banks to integrate their data and the regulations that govern them. Model Drivers integrates the semantics of the data and regulations before integrating the actual data and regulations. Model Drivers works with industry regulators, consultants and banks.