B.I.S.S Research White Papers

Beware: Data Reporting Potholes Ahead in MiFIR and MiFID Reporting

Richard Robinson

A senior business executive with more than 25 years of experience in the financial industry, with a rare perspective that spans operations and technology positions at global custodial banks, international brokers, and investment managers, as well as core industry utilities.
As the financial services industry and regulators embrace standardization globally, we need to be more deliberate and conscious about which standards we adopt and for what purpose. To be clear, standards, as a general rule, are a good thing. However, a standard, simply by virtue of being a standard, will not solve all your problems, and it can become a source of bigger problems when improperly applied.

Standards can suffer from two major negatives. First, they can be applied generically, without thought to context, nuance, interpretation, community, and language. Mandating single standards as broad-based solutions can stifle innovation and growth, and foster bad data. (ISITC is looking to address the language issues through the Reference Data Working Group, so get involved.)

Second, false security and costly data-related errors can proliferate where the implementation of a standard lacks rigor and data quality, and is controlled unilaterally with little to no real oversight, or with oversight that is simply window dressing with no enforcement ability.

There are concerning and growing problems that fall into this latter area of implementation, relating to CFI and ISIN assignment.

Last year, I published a piece looking at data issues specific to the standards mandated for use in MiFID and MiFIR reporting: specifically, the mandated use of ISIN, CFI, and FISN, which are provided exclusively, and at ever-increasing cost, by the Association of National Numbering Agencies via its two special purpose vehicles, the ANNA Service Bureau and the ANNA Derivatives Service Bureau.

In “Cracks in the Data Road for MiFID II”, I presented a sampling of data in which the CFI classification by the various Numbering Agencies was clearly wrong, two months after the MiFID II mandates to use ANNA-created data went live. While some of those individual examples have since been fixed, and only after continuous inquiry by industry data experts, the underlying lack of data quality persists.

As self-appointed “golden sources of truth,” the National Numbering Agencies should be held to a standard of data quality significantly higher than what is being provided. Instead, firms that are mandated to pay unchecked and non-transparent fees for Numbering Agency services are acting as free quality assurance for the very data they are forced to pay for in the first place.

And when the data is not fixed when requested, firms must spend more resources to create workarounds that do not improve transparency, but instead reinforce the poor data being dumped into the industry’s core datasets.

In MiFIR, we are seeing the impact of the bad CFI data provided through the ANNA mechanisms cascade across reporting, resulting in reports being rejected that should not be rejected, and in the reinforcement and proliferation of incorrect data. The re-use and spread of this bad data propagates downstream and will further muddy the data waters throughout the financial system. Worse, downstream consumers will assume the data is clean when it is not, either because it has passed through one or more systems, obscuring the poor data source itself, or because of the false assumption that the source should be trusted.

In MiFIR, there is a conversion table from CFI to the MiFIR identification code, which drives the data fields required for validation in reporting. When the CFI is incorrect, a reporting firm will be required to provide data that does not exist, provide data that is obviously wrong, or omit data that is relevant. If the firm tries to report using the proper classification, which does not match the CFI validation, the report will be rejected.

So in either case, a firm is forced to choose not between good and bad data, but between which bad data it will provide in order to satisfy the regulatory requirement.

For example:

USY4211T1145 (Global class FIGI BBG001S7GXS1) is a Global Depositary Receipt (MiFIR Identifier = DPRS), but it has been assigned a CFI code of ESXUFA, which specifies it as a Common/Ordinary Share and hence requires that it be reported as SHRS. (Note that this instrument trades on 6 different MTFs and therefore has different exchange-level FIGIs that roll up and aggregate to FIGI BBG001S7GXS1.)

Firms that classify this correctly, as a GDR, will have their reports rejected. But because ANNA has provided bad data, they are forced to report transactions in this issue as SHRS.
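To make the mechanics concrete, here is a minimal sketch of that validation trap. The two-character lookup is a deliberately simplified, partial approximation of the ESMA CFI-to-MiFIR conversion table, and the table and function names are my own illustrative assumptions, not ESMA's.

```python
from typing import Optional

# Partial, illustrative approximation of the CFI-to-MiFIR conversion table,
# keyed on the first two CFI characters (category + group). The real ESMA
# table is far larger and more granular.
CFI_TO_MIFIR = {
    "ES": "SHRS",  # Equities / common or ordinary shares
    "ED": "DPRS",  # Equities / depositary receipts
    "EM": "OTHR",  # Equities / miscellaneous
    "CE": "ETFS",  # Collective investment vehicles / exchange-traded funds
}

def expected_mifir_id(cfi: str) -> Optional[str]:
    """Derive the MiFIR identifier a validator would expect from the CFI."""
    return CFI_TO_MIFIR.get(cfi[:2].upper())

def validate_report(isin: str, cfi_on_record: str, reported_id: str) -> str:
    """Mimic a validator that trusts the NNA-assigned CFI over the filer."""
    expected = expected_mifir_id(cfi_on_record)
    if expected is not None and reported_id != expected:
        return (f"REJECT {isin}: reported {reported_id}, "
                f"but CFI {cfi_on_record} implies {expected}")
    return f"ACCEPT {isin}: {reported_id}"

# USY4211T1145 really is a GDR, so DPRS is the truthful classification,
# but the NNA-assigned CFI (ESXUFA) says common share, so truth is rejected:
print(validate_report("USY4211T1145", "ESXUFA", "DPRS"))  # rejected
print(validate_report("USY4211T1145", "ESXUFA", "SHRS"))  # accepted, yet wrong
```

The only report such a validator will accept is the one that echoes the bad CFI, which is exactly how the error gets reinforced and spread downstream.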

It is clear that a GDR is not a common ordinary share. But if a downstream user were to rely on the Common Stock classification provided by ANNA, given a level of legitimacy by the claim of being a ‘standard’, the implications for trading, portfolio allocation decisions, corporate actions, asset servicing, and more could be significant. GDRs have very particular properties compared to Common Stock, and this mis-classification can have costly consequences in automated systems.

Further, for regulators looking at transparency, liquidity, concentration risks, currency exposures, and other issues, this mis-classification will feed them bad information.

These are not isolated issues. 

NL0011683594 (Global class FIGI BBG00CWTMW63) is an ETF (MiFIR Identifier = ETFS), but it has been assigned a CFI code of ESVUFN, which specifies it as a Common/Ordinary Share and hence requires that it be reported as SHRS. Again, an ETF has specific qualities different from those of a common stock.

(Also note that this instrument trades on 12 different venues, including OTC markets, and in 3 different currencies, not counting quotes in GBP versus GBp. You can see this detail at OpenFIGI.com.)

IE0006447985 (Global class FIGI BBG001S62V48) is a London-listed common stock (MiFIR Identifier = SHRS), but it carries a CFI code of EMXXXR, which specifies it as a Miscellaneous Equity, requiring that it be reported as OTHR.

Some cursory research suggests this was probably issued initially as a unit trust. But when it was delisted in Ireland and listed in London, that seemingly ceased to be the case. This is the record on the LSE: Norish PLC.

And it differs from the record of another Irish ISIN that actually does trade as a unit trust: Grafton UT (FIGI BBG001S87Z22).

So it would appear that the data from the NNA is stale: it was probably correct at one time but has not been properly maintained.

Data concerns exist regardless of asset class.

DK0060096030 (Global class FIGI BBG002ZS0XJ2) should be an Open-End Fund, but it has been assigned a CFI code of CEOIES, which specifies it as an Exchange-Traded Fund.

Similarly, LU1397788552 (Global class FIGI BBG00CSJVZJ9) is also an Open-End Fund, but it has a CFI code of CECGMX, which likewise specifies that it be reported as ETFS.

The problem for both of these, then, is that they should have a CFI code starting with CI, the proper classification for a mutual-fund-type vehicle, which would indicate that the instrument is out of scope for transparency and should not be reported at all. But this bad data point will force reporting.
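A short sketch of the scope problem follows. The CI-versus-CE prefix rule reflects the CFI category and group codes discussed above, but the function name and the simplified logic are illustrative assumptions of mine, not the actual ESMA transparency rules.

```python
def transparency_in_scope(cfi: str) -> bool:
    """Simplified scope test: open-end (mutual-type) funds, CFI category C
    with group I, fall outside transparency reporting; ETFs (CE...) fall
    inside it and must be reported as ETFS."""
    return not cfi.upper().startswith("CI")

# With correct CI... codes these funds would never be reported at all;
# the erroneous CE... codes pull them into scope and force reporting:
for isin, cfi in [("DK0060096030", "CEOIES"), ("LU1397788552", "CECGMX")]:
    status = "forced into reporting" if transparency_in_scope(cfi) else "out of scope"
    print(isin, cfi, "->", status)
```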

There is a cost for firms to report and then have that report rejected. The reject needs to be researched, the data corrected, and the report resubmitted. Manual intervention has a high cost. When the intervention amounts to simply matching bad data, the eventual result is that firms just automate the use of the bad data instead, because there is no other feasible recourse and the rejects keep occurring for the same reason: bad data.

If these were anomalies, it would not be much of an issue. A handful of data errors among thousands would not be cause for concern.

However, this set of examples is representative of thousands of occurrences across different asset types. Thousands of errors among thousands of traded and reportable instruments is a far more worrisome issue.

This will have a crowd effect. Much as vaccinations create herd immunity, a crowd of good data can help protect the industry against bad data. But when less and less of that community has good data, bad data can cause outbreaks and significantly impact the overall system. And when users are forced to use bad data, with no alternatives to switch to, there is no way to protect the global system.

ANNA itself, in its own Annual Reports over the years, has called out the poor data quality and poor data consistency across its members. Yet the narrative provided publicly to the industry and regulators rests wholly on the claim that CFI, ISIN, and FISN are ‘standards’, with the inference that because something is a standard it is “good,” regardless of the implementation. A standard without any rigor in its implementation is not much use, and that is where we continue to find problems with the standards ANNA provides globally.

For example, there is nothing wrong with XML or HTML as standards, yet there are plenty of examples of bad implementations and poorly constructed websites that do not work. Every reader has likely experienced being forced to use a poorly constructed web interface and been relieved when better alternatives became available.

Specifically, CFI implementation continues to be a significant issue, especially when regulators depend on it as a core data point and assume its quality will be high.

More critically, this is a contagion proliferating through a trusted system. Beyond regulatory reporting, firms that trust these mis-classifications to drive automation and processing will end up making potentially costly and systemically impactful errors. From tax reporting to ERISA restrictions to corporate actions and more, this threatens investor funds and confidence.

Consumer-driven, consensus-based data standards that are able to compete fairly on merit and openness would instead drive higher quality. Standards like the LEI, where competition reduces cost and drives up quality, or the FIGI, whose open model allows for an LOU-like Certified Provider network, offer more modern, open, and industry-driven implementation solutions.

Focusing on enabling interoperability, instead of appointing any one entity (or group, or standard) as a de facto single provider, would have significant benefits over the long term and provide a community-based governance system. It is harder, yes, but most things done the right way are harder.

True interoperability that recognizes context and enables translation between communities that use the same terms with different meanings does not cause chaos or the proliferation of bad data. It is trying to standardize away complexity that creates a facade of reliability while encouraging chaos. The goal should be to organize complexity as best as possible through interoperable standards that are community-patrolled and accepted, not imposed.

I look forward to working with the industry (through groups like ISITC), and with regulators globally, to further this conversation and enable better data practices globally that make our systems stronger and more reliable.

(All data examples are as of April 15, 2019.)
