Why Fund Data Reconciliation Breaks at Scale — and How to Fix It

The three failure modes we see in every mid-market fund data stack

Philipp Starovoytov

CTO & Co-Founder

7 min read

After 20+ fund data platform deployments, we've seen reconciliation break in many different ways. Underneath, it is almost always the same three things: different symptoms, same root causes.

Failure Mode 1: Schema Inconsistency Across Sources

Your prime broker calls it "long position." Your custodian calls it "held." Your OMS calls it "open." They're the same thing. Your reconciliation engine doesn't know that.

Most reconciliation tools assume the data sources agree on a canonical schema. Financial data sources never do. Bloomberg has its own field names. Goldman's prime brokerage file format differs from Morgan Stanley's. Citco's NAV extract isn't structured like SS&C's.

The fix: A canonical ontology layer that maps every source field to a single internal field name before any reconciliation logic runs. This sounds obvious. It's almost never done correctly.

The canonical layer needs to be version-controlled. When Bloomberg changes a field name in a file format update (and they will), your mapping breaks — not your downstream systems. You fix one file, not twelve.
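As a rough illustration of the idea, here is a minimal Python sketch of a mapping layer. The source names, field names, status values, and the `to_canonical` helper are all hypothetical placeholders, not real vendor formats or PLEXI code. The mapping tables are plain data, so they can live in a version-controlled file and be reviewed like any other change.

```python
# Sketch of a canonical mapping layer. All source names, field names, and
# status values below are hypothetical placeholders, not real vendor formats.

MAPPING_VERSION = "v1"  # bump whenever a source changes its file format

# Source field name -> canonical field name, one table per source.
FIELD_MAP = {
    "prime_broker": {"Ticker": "instrument_id", "Qty": "quantity", "Status": "position_status"},
    "custodian": {"security": "instrument_id", "units": "quantity", "state": "position_status"},
    "oms": {"symbol": "instrument_id", "size": "quantity", "status": "position_status"},
}

# Source vocabulary -> canonical vocabulary for the status field.
STATUS_MAP = {
    "prime_broker": {"long position": "OPEN_LONG"},
    "custodian": {"held": "OPEN_LONG"},
    "oms": {"open": "OPEN_LONG"},
}


def to_canonical(source: str, record: dict) -> dict:
    """Translate one raw record into canonical field names and values."""
    field_map = FIELD_MAP[source]
    unmapped = sorted(set(record) - set(field_map))
    if unmapped:
        # Fail loudly at ingestion so a renamed field never reaches matching logic.
        raise KeyError(f"{source}: unmapped fields {unmapped}")
    canonical = {field_map[name]: value for name, value in record.items()}
    if "position_status" in canonical:
        raw_status = str(canonical["position_status"]).strip().lower()
        canonical["position_status"] = STATUS_MAP[source][raw_status]
    canonical["mapping_version"] = MAPPING_VERSION
    return canonical


pb = to_canonical("prime_broker", {"Ticker": "ABC", "Qty": 100, "Status": "long position"})
cu = to_canonical("custodian", {"security": "ABC", "units": 100, "state": "held"})
oms = to_canonical("oms", {"symbol": "ABC", "size": 100, "status": "open"})
print(pb == cu == oms)
```

In this sketch an unmapped field raises an error at ingestion instead of flowing through as a silent mismatch, which is one way to keep a format change confined to the mapping file.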

Failure Mode 2: Timing Mismatch Without Temporal Awareness

Fund admin provides NAV at T+2. Prime broker provides positions at T+0. Custodian confirms at T+1. Your reconciliation runs at T+0.

If your reconciliation engine doesn't model time (which record is the authoritative source for which point in time), you'll generate breaks that aren't breaks. Your team spends Tuesday reconciling a "break" that resolves itself on Wednesday when the fund admin report arrives.

The fix: Temporal tagging on every record at ingestion. Every data point carries its effective date and its source's publication timestamp. Reconciliation rules apply only within matching temporal windows.
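A minimal sketch of what temporal tagging could look like, assuming each source has a known delivery lag. The `TaggedRecord` type, the `EXPECTED_LAG_DAYS` table, and the `window_status` helper are illustrative names, not part of any real product. Lags are counted in calendar days to keep the example short; a real implementation would need a business-day calendar.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta, timezone

# Hypothetical delivery lags in days after the effective date, mirroring the
# T+0 / T+1 / T+2 example. Calendar days are used for brevity.
EXPECTED_LAG_DAYS = {"prime_broker": 0, "custodian": 1, "fund_admin": 2}


@dataclass(frozen=True)
class TaggedRecord:
    source: str
    effective_date: date      # the date the data describes
    published_at: datetime    # when the source released it (UTC)
    value: float


def expected_by(source: str, effective_date: date) -> date:
    """Date by which a source is expected to have published."""
    return effective_date + timedelta(days=EXPECTED_LAG_DAYS[source])


def window_status(records, effective_date: date, as_of: date) -> dict:
    """Label each source as present, pending, or late for one effective date."""
    present = {
        r.source
        for r in records
        if r.effective_date == effective_date and r.published_at.date() <= as_of
    }
    status = {}
    for source in EXPECTED_LAG_DAYS:
        if source in present:
            status[source] = "present"
        elif as_of <= expected_by(source, effective_date):
            status[source] = "pending"   # not due yet, so not a break
        else:
            status[source] = "late"      # overdue, worth raising
    return status


trade_date = date(2024, 3, 4)
records = [
    TaggedRecord("prime_broker", trade_date,
                 datetime(2024, 3, 4, 22, 0, tzinfo=timezone.utc), 1_000_000.0),
]
print(window_status(records, trade_date, as_of=trade_date))
```

Run on the trade date itself, the custodian and fund admin come back as pending, not as breaks, because their data is not due yet.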

Failure Mode 3: Break Routing That Hits Human Queues Too Early

When a break is detected, most systems fire a notification and wait for a human to investigate. This works fine when breaks are rare. It doesn't work when you're running a multi-strategy fund with 15 custodian accounts and 3 prime brokers.

At scale, most "breaks" are explainable by rule: wrong settlement date, currency conversion rounding, known timing mismatch between counterparties. These breaks should never reach a human queue. They should be auto-explained, logged, and cleared.

The fix: A break classification engine that runs before human escalation. Breaks are categorized: auto-resolved (rule match), pending (waiting for T+N data), or genuine (needs investigation). Only genuine breaks hit the human queue. Everything else is tracked and logged automatically.
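The three categories can be sketched as a small rule chain that runs before anything is escalated. This is an illustration under assumptions: the `Break` fields, the `ROUNDING_TOLERANCE` value, and the rule order are hypothetical, and a production engine would likely carry many more rules.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from enum import Enum
from typing import Optional, Tuple


class BreakClass(Enum):
    AUTO_RESOLVED = "auto_resolved"
    PENDING = "pending"
    GENUINE = "genuine"


@dataclass(frozen=True)
class Break:
    instrument_id: str
    difference: Decimal                      # in base currency
    awaiting_source: Optional[str] = None    # source whose data has not arrived
    expected_by: Optional[date] = None       # when that data is due
    settlement_date_mismatch: bool = False   # set by an upstream check (assumed)


ROUNDING_TOLERANCE = Decimal("0.01")  # hypothetical threshold


def classify(brk: Break, as_of: date) -> Tuple[BreakClass, str]:
    """Return the break class and a loggable explanation."""
    if brk.awaiting_source and brk.expected_by and as_of <= brk.expected_by:
        return (BreakClass.PENDING,
                f"waiting for {brk.awaiting_source} data due {brk.expected_by}")
    if abs(brk.difference) <= ROUNDING_TOLERANCE:
        return BreakClass.AUTO_RESOLVED, "currency conversion rounding within tolerance"
    if brk.settlement_date_mismatch:
        return BreakClass.AUTO_RESOLVED, "known settlement date mismatch"
    return BreakClass.GENUINE, "no rule matched"


def human_queue(breaks, as_of: date):
    """Only genuine breaks are escalated; the rest are logged elsewhere."""
    return [b for b in breaks if classify(b, as_of)[0] is BreakClass.GENUINE]


today = date(2024, 3, 4)
breaks = [
    Break("ABC", Decimal("250000"), awaiting_source="fund_admin", expected_by=date(2024, 3, 6)),
    Break("DEF", Decimal("0.004")),
    Break("GHI", Decimal("1200"), settlement_date_mismatch=True),
    Break("JKL", Decimal("1200")),
]
print([b.instrument_id for b in human_queue(breaks, today)])
```

Every classification returns an explanation string alongside the category, so auto-resolved and pending items still leave an audit trail even though no one is paged.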

What Properly Reconciled Data Looks Like

When the three failure modes are resolved, reconciliation becomes boring — which is exactly what you want. The daily reconciliation run produces a report with:

  • Auto-resolved breaks: explained by rule, logged, cleared
  • Pending breaks: waiting for T+N counterparty data, expected resolution date
  • Genuine breaks: investigation required, assigned to a named owner

The genuine break queue should be empty most days. When it's not, your team investigates one or two items, not fifteen.

That's the standard we set in every PLEXI deployment. It's achievable. It just requires doing the boring infrastructure work correctly.