XBRL Formula Validation — Why Your COREP Numbers Must Add Up Across Templates

XBRL validation with Arelle: two failure classes, seven critical EBA consistency rules, the golden summation rule, and the Python xbrl_valid.py module for COREP pipelines.

Bhanu

May 7, 2026

📅 Day 12 of 18 · COREP Governance Pipeline Series · XBRL Validation

Day 11 produced an XBRL instance document that passes the ET.parse() check, which proves only that the XML is well-formed. Well-formed XML and valid XBRL are entirely different things.

The EBA’s validation engine applies over 8,000 validation rules to a COREP submission. These rules check that your numbers are internally consistent — not just within a single template, but across all four templates simultaneously. The CET1 ratio in C 03.00 must equal CET1 capital from C 01.00 divided by RWA total from C 02.00. The total RWA row in C 02.00 must equal the sum of all exposure-class rows. The HQLA buffer in C 47.00 must equal Level 1 + Level 2A + Level 2B with the Level 2 cap applied.

These are not optional suggestions. They are EBA Validation Rules. A single failure means the NCA’s filing portal rejects the entire submission with a cryptic error code like v4789 and no further explanation. Your job is to ensure every one of these rules passes before the file ever leaves your pipeline.

This post implements xbrl_valid.py using Arelle’s validation API, explains the two classes of validation failure, maps the critical cross-template consistency rules to their source in the EBA taxonomy, and shows you how to read and fix every error code you will encounter.

1. The Two Classes of XBRL Validation Failure

Understanding the two classes is essential because they require completely different fixes.

Class	What it checks	Source in taxonomy	Fix lives in	Severity
Class 1 — Structural	XBRL syntax: required attributes, valid concept names, unit types, context references, namespace correctness	XSD schema files (`.xsd`)	`xbrl_gen.py` — the generator produced bad XML	Fatal — filing rejected immediately
Class 2 — Business Rule	Arithmetic consistency: summations, cross-template ratios, regulatory minimum floors, conditional rules	Calculation linkbase (`-cal.xml`) + EBA formula linkbase (`_val.xml`)	`dbt` mart models or source data — the numbers themselves are wrong	Blocking — submission rejected by NCA

⚠ Class 2 Failures Are Data Problems, Not Code Problems

When Arelle reports a Class 2 validation failure such as Calculation inconsistency: c0060 reported 4800 expected 4750, the fix is not in your XBRL generator. The generator faithfully wrote what was in your mart table. The problem is that your mart table’s total_rwa does not equal the sum of exposure-class RWAs in corep_c0200. The fix is in a dbt model — likely a rounding issue, a missing row, or a join that double-counts.
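
The split between the two classes can be made mechanical. Here is a minimal sketch, assuming the code prefixes used later in this post (`xbrl:` for structural findings, `xbrlCalcs:` and `EBA.` for business rules). `classify_failure` is a hypothetical helper for illustration, not part of Arelle or of the pipeline modules.

```python
def classify_failure(code: str) -> tuple[str, str]:
    """Map a validation message code to (failure class, where the fix lives).

    The prefixes are assumptions taken from the codes shown in this post.
    """
    if code.startswith(("xbrlCalcs:", "EBA.")):
        # Class 2: the numbers are wrong, so look at the dbt marts or source data.
        return ("business_rule", "dbt mart models or source data")
    if code.startswith("xbrl:"):
        # Class 1: the generator wrote structurally invalid XBRL.
        return ("structural", "xbrl_gen.py")
    return ("unknown", "inspect the raw Arelle message")


if __name__ == "__main__":
    for code in ("xbrlCalcs:inconsistency", "EBA.v4789", "xbrl:schemaImportMissing"):
        print(code, "->", classify_failure(code))
```

Routing on the class first keeps a data problem from being "fixed" in the generator, which would only hide the inconsistency.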

2. Validation Architecture

  Generated XBRL instance
  output/xbrl/COREP_2026-03-31_*.xbrl
            │
            ▼
  ┌─────────────────────────────────────────────────────────┐
  │              Arelle Validation Engine                   │
  │                                                         │
  │  Step 1: Load taxonomy (entry-point XSD)                │
  │  Step 2: Load instance document                         │
  │  Step 3: CLASS 1 — Structural validation                │
  │    • Concept names valid in namespace?                   │
  │    • Required attributes present (contextRef, unitRef)? │
  │    • Unit types match concept types?                     │
  │    • Context entity/period well-formed?                  │
  │  Step 4: CLASS 2 — Calculation validation               │
  │    • c0010 = c0020 + c0030 + c0040?  (C 01.00)         │
  │    • c0060_total = SUM(c0060_rows)?  (C 02.00)         │
  │    • c0050 = c0020_c0100 / c0060?    (cross-template)  │
  │    • c0090 = hqla_buffer / outflows? (C 47.00)         │
  │  Step 5: CLASS 2 — Formula linkbase (EBA business rules)│
  │    • v0001: cet1_ratio ≥ 0.045?                         │
  │    • v4789: tier1_ratio ≥ cet1_ratio?                   │
  │    • v5678: total_rwa > 0?                              │
  │    • v8823: lcr_ratio ≥ 1.0?                           │
  └─────────────────────────────────────────────────────────┘
            │
            ├─── ALL PASS → pipeline continues → Airflow marks success
            │
            └─── ANY FAIL → XbrlValidationError raised
                          → BranchPythonOperator routes to quarantine
                          → Arelle error report uploaded to MinIO
                          → audit.pipeline_run_log entry: status=FAIL

📌 Critical Cross-Template Consistency Rules

3. The Cross-Template Consistency Rules That Catch Most Failures

The most important validation rules are the ones that span multiple templates. These are impossible to catch in a single-table GX quality gate on Day 8, because they require comparing values across two separate mart tables. The XBRL validator is the first and only place where these are checked end-to-end.

3.1 The Capital Ratio Identity (C 01.00 ↔ C 02.00 ↔ C 03.00)

The Basel III capital ratio formula is:

$$\text{CET1 Ratio} = \frac{\text{CET1 Capital (C 01.00, r020)}}{\text{Total RWA (C 02.00, total row)}}$$

In your XBRL instance, Arelle checks that the value of concept c0050 (CET1 ratio, from C 03.00) equals this fraction computed from c0020 (C 01.00) and c0060 (C 02.00 total RWA). A discrepancy of even one unit in the reported thousands can trigger this rule.

EBA Validation Rule v0001 — CET1 Ratio Consistency

c0050 = ROUND(c0020 / c0060, 4) [where c0050 is from C 03.00, c0020 from C 01.00, c0060 from C 02.00]

Arelle error text: “Calculation inconsistency: reported c0050=0.112583 but computed c0020/c0060=0.112541”
Root cause: rounding in dbt corep_c0300.sql applied before the denominator total_rwa was finalized in corep_c0200.sql.
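
The identity can be checked in a few lines before the file reaches Arelle. This is a sketch under stated assumptions: the function name is hypothetical, the figures are illustrative, and ROUND_HALF_UP is assumed because it is the rounding mode this post settles on later.

```python
from decimal import Decimal, ROUND_HALF_UP


def cet1_ratio_consistent(c0050: Decimal, c0020: Decimal, c0060: Decimal,
                          places: int = 4) -> bool:
    """Return True when the reported ratio equals ROUND(c0020 / c0060, places)."""
    if c0060 == 0:
        return False  # no RWA total, so the ratio cannot be recomputed
    quantum = Decimal(1).scaleb(-places)  # 0.0001 for four decimal places
    computed = (c0020 / c0060).quantize(quantum, rounding=ROUND_HALF_UP)
    return c0050 == computed


if __name__ == "__main__":
    cet1 = Decimal("536000")    # illustrative CET1 capital (C 01.00)
    rwa = Decimal("4766666")    # illustrative total RWA (C 02.00)
    print(cet1_ratio_consistent(Decimal("0.1124"), cet1, rwa))
    print(cet1_ratio_consistent(Decimal("0.1125"), cet1, rwa))
```

Using `Decimal` rather than `float` keeps the comparison exact, which matters when the rule tolerates no difference at the last reported digit.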

EBA Validation Rule v4789 — Tier 1 ≥ CET1 (cross-concept, same template)

c0040 (Tier 1 ratio) ≥ c0050 (CET1 ratio) [both in C 03.00]

Mathematical identity: CET1 ⊂ Tier 1. If this fails, there is a sign error or aggregation mistake in your int_capital_by_tier.sql intermediate model.

EBA Validation Rule v2314 — RWA Summation (within C 02.00)

c0060_TOTAL = SUM(c0060) over all non-TOTAL exposure_class rows

The TOTAL row in corep_c0200 must equal the arithmetic sum of all individual exposure-class rows. This is enforced by the calculation linkbase, not a formula rule. A rounding inconsistency of 1 (one thousand euros) triggers it.

EBA Validation Rule v5501 — Own Funds Summation (within C 01.00)

c0010 (Own Funds) = c0020 (CET1) + c0030 (AT1) + c0040 (T2)

Direct summation check from the calculation linkbase. This is the first rule that fails when your dbt mart aggregation has a gap or double-count.
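
Both summation rules (v2314 and v5501) reduce to the same arithmetic: parent minus the sum of its children must be zero. A minimal sketch with a hypothetical helper name; the own funds components are illustrative, and the RWA rows reuse the rounded figures from the worked example in section 4.

```python
from decimal import Decimal


def summation_gap(parent: Decimal, children: list[Decimal]) -> Decimal:
    """Return parent minus the sum of its children; zero means the rule holds."""
    return parent - sum(children, Decimal(0))


# v5501-style check: own funds against CET1 + AT1 + T2 (illustrative figures).
own_funds_gap = summation_gap(
    Decimal(902000), [Decimal(700000), Decimal(120000), Decimal(80000)]
)

# v2314-style check: TOTAL row against the exposure-class rows.
rwa_gap = summation_gap(
    Decimal(4567900), [Decimal(1234567), Decimal(2345679), Decimal(987654)]
)

if __name__ == "__main__":
    print("own funds gap:", own_funds_gap)
    print("RWA gap:", rwa_gap)
```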

EBA Validation Rule v8823 — LCR Ratio (C 47.00)

c0090 (LCR ratio) = ROUND(hqla_buffer / net_outflows, 4) AND c0090 ≥ 1.0

Two checks in one: arithmetic consistency of the ratio AND the regulatory minimum floor of 100% (LCR ≥ 1.0 in decimal). Delegated Regulation 2015/61 Article 4.
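
Because the rule bundles two conditions, it helps to report them separately: an arithmetic mismatch is a pipeline bug, while a floor breach may be an accurate report of a real shortfall. A sketch assuming four-decimal rounding with ROUND_HALF_UP; `check_lcr` is a hypothetical name and the amounts are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP


def check_lcr(c0090: Decimal, hqla_buffer: Decimal, net_outflows: Decimal,
              places: int = 4) -> list[str]:
    """Return a list of findings; an empty list means both checks pass."""
    if net_outflows <= 0:
        return ["net_outflows must be positive to recompute the ratio"]
    findings = []
    quantum = Decimal(1).scaleb(-places)
    computed = (hqla_buffer / net_outflows).quantize(quantum, rounding=ROUND_HALF_UP)
    if c0090 != computed:
        findings.append(f"arithmetic: reported {c0090} but computed {computed}")
    if c0090 < Decimal("1.0"):
        findings.append(f"floor: reported {c0090} is below 1.0")
    return findings


if __name__ == "__main__":
    print(check_lcr(Decimal("1.2"), Decimal("1200000"), Decimal("1000000")))
    print(check_lcr(Decimal("0.95"), Decimal("950000"), Decimal("1000000")))
```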

EBA Validation Rule v3301 — Level 2 HQLA Cap (C 47.00)

hqla_level2a + hqla_level2b ≤ 0.4 × hqla_buffer

Del. Reg. 2015/61 Article 17: Level 2 assets cannot exceed 40% of the total HQLA buffer after haircuts. Your int_lcr_hqla.sql must enforce this cap. If the synthetic data generates a case where Level 2 exceeds 40%, the rule triggers.
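
The cap has a convenient closed form. If the buffer is Level 1 plus Level 2, then L2 ≤ 0.4 × (L1 + L2) rearranges to L2 ≤ (2/3) × L1. The sketch below checks the rule and computes that ceiling. It is a simplification under stated assumptions: amounts are already post-haircut, the buffer is the plain sum of the three levels, and the separate Level 2B sub-limit is ignored. Both function names are hypothetical.

```python
from decimal import Decimal

LEVEL2_CAP = Decimal("0.4")  # Level 2 share of the total buffer, per rule v3301


def level2_cap_breached(level1: Decimal, level2a: Decimal, level2b: Decimal) -> bool:
    """True when Level 2A + 2B exceeds 40% of the total buffer."""
    level2 = level2a + level2b
    return level2 > LEVEL2_CAP * (level1 + level2)


def max_level2_for(level1: Decimal) -> Decimal:
    """Largest Level 2 amount with L2 <= 0.4 * (L1 + L2), i.e. L2 <= (2/3) * L1."""
    return level1 * LEVEL2_CAP / (1 - LEVEL2_CAP)


if __name__ == "__main__":
    print(level2_cap_breached(Decimal(600), Decimal(300), Decimal(100)))
    print(level2_cap_breached(Decimal(600), Decimal(350), Decimal(100)))
    print(max_level2_for(Decimal(600)))
```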

EBA Validation Rule v6102 — Leverage Ratio Consistency (C 03.00)

leverage_ratio = ROUND(tier1_capital / total_exposure_measure, 4) AND leverage_ratio ≥ 0.03

CRR2 Article 429. The leverage ratio denominator (total exposure measure) is not the same as RWA — it includes off-balance-sheet items. In your simplified pipeline the denominator uses total assets as a proxy. Ensure corep_c0300.sql uses a consistent source for both numerator and denominator.

4. The Rounding Problem — How One Unit of Error Breaks Seven Rules

The most common source of cross-template validation failures is rounding inconsistency between dbt models. Here is the exact failure chain:

  In mart.corep_c0200 (RWA by exposure class):
  exposure_class='corporates'   rwa = 1,234,567.4  →  rounded to  1,234,567
  exposure_class='retail'       rwa = 2,345,678.6  →  rounded to  2,345,679
  exposure_class='real_estate'  rwa = 987,654.3    →  rounded to    987,654
  SUM of rows                              =  4,567,900  (correct sum of rounded rows)
  
  But in mart.corep_c0300 (Capital Ratios — total_rwa column):
  total_rwa comes from int_capital_by_tier which sums BEFORE rounding:
  total_rwa = ROUND(1,234,567.4 + 2,345,678.6 + 987,654.3, 0)
            = ROUND(4,567,900.3, 0) = 4,567,900

  # Looks identical! But what about a fourth exposure class?
  exposure_class='institutions'  rwa = 198,765.5  →  rounded to  198,766
  
  SUM of rows now = 4,567,900 + 198,766 = 4,766,666
  corep_c0300.total_rwa = ROUND(4,567,900.3 + 198,765.5, 0)
                         = ROUND(4,766,665.8, 0) = 4,766,666   (same!)

  # Still matches. 198,765.5 is an exact .5 tie, but ROUND_HALF_UP and
  # ROUND_HALF_EVEN both give 198,766 because the even neighbour is also the upper one.
  # Now add one more row where the tie resolves differently, rwa=543,210.5:

  exposure_class='equity'  rwa = 543,210.5  →  ROUND_HALF_UP → 543,211
                                             →  ROUND_HALF_EVEN → 543,210

  # If the value was rounded upstream with banker's rounding (ROUND_HALF_EVEN,
  # which is what Python's built-in round() does) but _format_value() in
  # xbrl_gen.py uses ROUND_HALF_UP, the equity fact is reported as 543,211
  # while the total it feeds into was built from 543,210.
  # Difference: 1 (one thousand euros).
  # v2314 FAILS. v0001 MAY FAIL (ratio denominator is off by 1).

⚠ The Fix: Consistent Rounding Strategy Across All Models

Use one rounding strategy everywhere and apply it at the latest possible point — in the XBRL generator, not in the dbt mart models. Your dbt mart models should store full precision (NUMERIC(20,6)). The XBRL generator’s _format_value() method applies the terminal rounding once, using ROUND_HALF_UP consistently, immediately before writing the XBRL element. The summation then uses the already-rounded values — exactly what the calculation linkbase verification expects.
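
Both effects are easy to reproduce with Python's `decimal` module, using the five RWA values from the example above. A small sketch; `round_units` is a hypothetical helper, not the pipeline's `_format_value()`.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

UNIT = Decimal("1")  # round to whole reporting units


def round_units(value: Decimal, mode: str) -> Decimal:
    """Round to whole units with an explicit rounding mode."""
    return value.quantize(UNIT, rounding=mode)


# 1. The two modes disagree on this tie.
equity = Decimal("543210.5")
half_up = round_units(equity, ROUND_HALF_UP)      # 543211
half_even = round_units(equity, ROUND_HALF_EVEN)  # 543210

# 2. Even with a single mode, rounding rows then summing differs from
#    summing then rounding, so the total must be built from the rounded rows.
rows = [
    Decimal("1234567.4"), Decimal("2345678.6"), Decimal("987654.3"),
    Decimal("198765.5"), equity,
]
sum_of_rounded = sum((round_units(r, ROUND_HALF_UP) for r in rows), Decimal(0))  # 5309877
rounded_sum = round_units(sum(rows, Decimal(0)), ROUND_HALF_UP)                  # 5309876

if __name__ == "__main__":
    print(half_up, half_even)
    print(sum_of_rounded, rounded_sum, sum_of_rounded - rounded_sum)
```

The second result is why the reported total has to be the sum of the already-rounded rows: a total rounded from the full-precision sum can sit one unit away from what the calculation check recomputes.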

🔍 Using Arelle’s Python API for Validation

5. Arelle Validation API Deep Dive

Arelle exposes three validation modes. You need all three in sequence.

Validation mode	What it runs	Arelle API call (as used in the listing below)
XBRL validation	Structural checks: schema conformance, required attributes, unit type correctness	`ValidateXbrl.ValidateXbrl(...).validate(instance)`
Calculation validation	Summation checks from the calculation linkbase: every parent concept = sum of children	`ValidateXbrlCalcs(instance).validate()`
Formula validation	EBA business rules from the formula linkbase: cross-template ratios, regulatory minimums, conditional rules	Formula processor, run through the formula plugin when it is installed

# xbrl/validate_instance.py — standalone validation function
# Used by both xbrl_valid.py module and manual testing

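import logging
import os
import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import List

# Import paths differ between Arelle releases; check them against your installed version.
# ValidateXbrl is imported as a module; the validator class is ValidateXbrl.ValidateXbrl.
from arelle import Cntlr, ModelManager, ValidateXbrl
from arelle.ValidateXbrlCalcs import ValidateXbrlCalcs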

log = logging.getLogger(__name__)

TAXONOMY_ENTRY = os.environ.get(
    "EBA_TAXONOMY_ENTRY",
    "data/taxonomy/eba_3.3/www.eba.europa.eu/eu/fr/xbrl/crr/fws/corep/cor/2024-01-31/mod/corep-full-entry-point.xsd",
)

# Arelle message severity levels
SEVERITY_ERROR   = "ERROR"
SEVERITY_WARNING = "WARNING"
SEVERITY_INFO    = "INFO"


@dataclass
class ValidationMessage:
    severity:   str
    code:       str    # EBA rule code e.g. "xbrlCalcs:inconsistency"
    message:    str
    template:   str    # e.g. "C 01.00" — extracted from message if possible
    concept:    str    # e.g. "ei:c0010"
    value:      str    # reported value
    expected:   str    # computed expected value (for calc errors)


@dataclass
class ValidationResult:
    passed:       bool
    error_count:  int
    warn_count:   int
    messages:     List[ValidationMessage] = field(default_factory=list)

    @property
    def errors(self):
        return [m for m in self.messages if m.severity == SEVERITY_ERROR]

    @property
    def warnings(self):
        return [m for m in self.messages if m.severity == SEVERITY_WARNING]


def validate_xbrl_instance(instance_path: Path) -> ValidationResult:
    """
    Run full XBRL validation on an instance document:
      1. Structural (schema conformance)
      2. Calculation linkbase (summation consistency)
      3. Formula linkbase (EBA business rules — requires formula plugin)

    Returns ValidationResult with all messages categorised.
    """
    messages: List[ValidationMessage] = []

    # ── Custom log handler to capture Arelle messages ─────────────
    class _ArelleLogHandler(logging.Handler):
        def emit(self, record):
            msg_text  = self.format(record)
            severity  = SEVERITY_ERROR   if record.levelno >= 40 else \
                        SEVERITY_WARNING if record.levelno >= 30 else \
                        SEVERITY_INFO
            code = getattr(record, "messageCode", "")
            messages.append(ValidationMessage(
                severity=severity,
                code=code,
                message=msg_text,
                template=_extract_template(msg_text),
                concept=_extract_concept(msg_text),
                value=_extract_value(msg_text, "reported"),
                expected=_extract_value(msg_text, "expected"),
            ))

    log_handler = _ArelleLogHandler()
    arelle_logger = logging.getLogger("arelle")
    arelle_logger.addHandler(log_handler)

    # ── Run Arelle ────────────────────────────────────────────────
    cntlr    = Cntlr.Cntlr()
    modelMgr = ModelManager.ModelManager(cntlr)

    try:
        log.info("[xbrl_valid] Loading taxonomy...")
        modelXbrl = modelMgr.load(TAXONOMY_ENTRY)

        log.info("[xbrl_valid] Loading instance: %s", instance_path)
        instance  = modelMgr.load(str(instance_path), modelXbrl=modelXbrl)

        # Step 1: Structural XBRL validation
        # The validator is constructed from the model under test, not from the ModelManager.
        log.info("[xbrl_valid] Running structural validation...")
        ValidateXbrl.ValidateXbrl(instance).validate(instance)

        # Step 2: Calculation linkbase validation
        log.info("[xbrl_valid] Running calculation validation...")
        ValidateXbrlCalcs(instance).validate()

        # Step 3: Formula linkbase (EBA business rules)
        # Requires Arelle formula plugin — loaded via cntlr if installed
        try:
            from arelle.plugin import formulaXbrl
            formulaXbrl.run(instance)
            log.info("[xbrl_valid] Formula validation complete.")
        except ImportError:
            log.warning("[xbrl_valid] Formula plugin not available — skipping EBA business rules.")

    finally:
        modelMgr.close()
        cntlr.close()
        arelle_logger.removeHandler(log_handler)

    error_count = sum(1 for m in messages if m.severity == SEVERITY_ERROR)
    warn_count  = sum(1 for m in messages if m.severity == SEVERITY_WARNING)

    return ValidationResult(
        passed=error_count == 0,
        error_count=error_count,
        warn_count=warn_count,
        messages=messages,
    )


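def _extract_template(msg: str) -> str:
    """Return the first template code found in the message, e.g. 'C 01.00'."""
    import re
    m = re.search(r"C \d{2}\.\d{2}", msg)
    return m.group(0) if m else ""

def _extract_concept(msg: str) -> str:
    """Return the first concept id found, with or without the 'ei:' prefix."""
    import re
    m = re.search(r"(?:ei:)?c\d{4}", msg)
    return m.group(0) if m else ""

def _extract_value(msg: str, label: str) -> str:
    """Return the number that directly follows `label`, e.g. 'reported 4800' gives '4800'."""
    import re
    m = re.search(rf"{re.escape(label)}[:\s=]+([0-9.\-]+)", msg, re.IGNORECASE)
    return m.group(1) if m else ""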

6. The `xbrl_valid.py` Module

"""
modules/xbrl_valid.py — Validate the generated XBRL instance against the EBA taxonomy.

Runs three validation passes:
  1. Structural (schema conformance)
  2. Calculation linkbase (summation consistency)
  3. Formula linkbase (EBA business rules — v-codes)

On any ERROR-level finding, raises XbrlValidationError.
Warnings are logged but do not halt the pipeline.
Validation report is uploaded to MinIO as audit evidence.
"""

import json, logging, os
from datetime import datetime, timezone
from pathlib import Path

from modules.base import BaseModule
from xbrl.validate_instance import validate_xbrl_instance, ValidationResult

log = logging.getLogger(__name__)

XBRL_OUTPUT_DIR = Path(os.environ.get("XBRL_OUTPUT_DIR", "output/xbrl"))
REPORT_DIR      = Path(os.environ.get("XBRL_REPORT_DIR", "output/validation_reports"))


class XbrlValidationError(RuntimeError):
    """Raised when the XBRL instance fails EBA validation rules."""
    pass


class XbrlValidModule(BaseModule):
    MODULE_NAME = "xbrl_valid"

    def input_check(self) -> None:
        """Verify at least one XBRL file exists in the output directory."""
        xbrl_files = list(XBRL_OUTPUT_DIR.glob("*.xbrl"))
        if not xbrl_files:
            raise RuntimeError(
                f"[xbrl_valid] No .xbrl files found in {XBRL_OUTPUT_DIR}. "
                "Run xbrl_gen first: python pipeline.py --module xbrl_gen"
            )
        # Validate the most recently generated file
        self._instance_path = max(xbrl_files, key=lambda p: p.stat().st_mtime)
        log.info("[xbrl_valid] Target instance: %s", self._instance_path)
        REPORT_DIR.mkdir(parents=True, exist_ok=True)

    def _execute(self) -> None:
        """Run full Arelle validation and raise on any ERROR-level finding."""
        log.info("[xbrl_valid] Starting validation: %s", self._instance_path)
        result: ValidationResult = validate_xbrl_instance(self._instance_path)

        # ── Log summary ──────────────────────────────────────────────
        log.info(
            "[xbrl_valid] Validation complete: %s | errors=%d | warnings=%d",
            "PASS" if result.passed else "FAIL",
            result.error_count,
            result.warn_count,
        )

        # ── Log individual errors ─────────────────────────────────────
        for msg in result.errors:
            log.error(
                "[xbrl_valid] ERROR [%s] template=%s concept=%s | %s",
                msg.code, msg.template, msg.concept, msg.message
            )

        # ── Log warnings (informational — do not halt) ────────────────
        for msg in result.warnings:
            log.warning(
                "[xbrl_valid] WARN  [%s] %s", msg.code, msg.message
            )

        # ── Write JSON validation report ──────────────────────────────
        ts          = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
        report_path = REPORT_DIR / f"validation_report_{ts}.json"
        report_data = {
            "instance":     str(self._instance_path),
            "validated_at": ts,
            "passed":       result.passed,
            "error_count":  result.error_count,
            "warn_count":   result.warn_count,
            "errors": [
                {"code": m.code, "template": m.template,
                 "concept": m.concept, "reported": m.value,
                 "expected": m.expected, "message": m.message}
                for m in result.errors
            ],
            "warnings": [
                {"code": m.code, "message": m.message}
                for m in result.warnings
            ],
        }
        report_path.write_text(json.dumps(report_data, indent=2))
        log.info("[xbrl_valid] Validation report written: %s", report_path)
        self._report_path = report_path

        self._upload_report_to_minio(report_path)
        self._write_audit(report_data)

        # ── Raise on errors ───────────────────────────────────────────
        if not result.passed:
            error_summary = "; ".join(
                f"{m.code}({m.template})" for m in result.errors[:5]
            )
            raise XbrlValidationError(
                f"[xbrl_valid] {result.error_count} EBA validation error(s): {error_summary}. "
                f"See report: {report_path}"
            )

    def _upload_report_to_minio(self, report_path: Path) -> None:
        try:
            from minio import Minio
            client = Minio(
                os.environ.get("MINIO_ENDPOINT", "minio:9000"),
                access_key=os.environ.get("MINIO_ROOT_USER",     "minioadmin"),
                secret_key=os.environ.get("MINIO_ROOT_PASSWORD", "minioadmin"),
                secure=False,
            )
            bucket = "corep-xbrl-output"
            if not client.bucket_exists(bucket):
                client.make_bucket(bucket)
            client.fput_object(
                bucket,
                f"validation_reports/{report_path.name}",
                str(report_path),
                content_type="application/json",
            )
            log.info("[xbrl_valid] Report uploaded → minio://%s/validation_reports/%s", bucket, report_path.name)
        except Exception as exc:
            log.warning("[xbrl_valid] MinIO upload failed (non-fatal): %s", exc)

    def _write_audit(self, report_data: dict) -> None:
        import json
        from modules.base import _pg_conn
        conn = _pg_conn()
        try:
            cur = conn.cursor()
            cur.execute(
                """
                INSERT INTO audit.pipeline_run_log
                    (run_id, module_name, status, metadata, ran_at)
                VALUES (%s, 'xbrl_valid', %s, %s, now())
                """,
                (
                    self._run_id,
                    "PASS" if report_data["passed"] else "FAIL",
                    json.dumps({
                        "error_count": report_data["error_count"],
                        "warn_count":  report_data["warn_count"],
                        "instance":    report_data["instance"],
                        "report":      str(self._report_path),
                    }),
                ),
            )
            conn.commit()
        finally:
            conn.close()

    def emit_lineage(self) -> None:
        # Validation is read-only — input is the XBRL file, no new data written
        log.info("[xbrl_valid] No lineage event — validation is read-only.")

    def output_check(self) -> None:
        """Verify the validation report JSON was written successfully."""
        report_path = getattr(self, "_report_path", None)
        if not report_path or not Path(report_path).exists():
            raise RuntimeError("[xbrl_valid] Validation report file was not written.")
        data = json.loads(Path(report_path).read_text())
        if not data.get("passed"):
            raise RuntimeError(
                f"[xbrl_valid] output_check: validation report shows FAIL with {data.get('error_count')} error(s)."
            )
        log.info("[xbrl_valid] output_check: PASS — %s", report_path)

📊 Reading and Interpreting Arelle Validation Output

7. Reading Arelle Error Messages

Arelle’s raw error messages are dense. Here is how to decode the most common ones:

# Run only the validation module, by hand, through the pipeline CLI
python pipeline.py --module xbrl_valid

# ── PASS output ──────────────────────────────────────────────────────
INFO  [xbrl_valid] Loading taxonomy...
INFO  [xbrl_valid] Loading instance: output/xbrl/COREP_2026-03-31_20260507T081433Z.xbrl
INFO  [xbrl_valid] Running structural validation...
INFO  [xbrl_valid] Running calculation validation...
INFO  [xbrl_valid] Formula validation complete.
INFO  [xbrl_valid] Validation complete: PASS | errors=0 | warnings=2
WARN  [xbrl_valid] WARN  [xbrlCalcs:insignificantRounding] C 02.00 ei:c0060 ...
INFO  [xbrl_valid] Validation report written: output/validation_reports/validation_report_20260507T081445Z.json
INFO  [xbrl_valid] output_check: PASS

# ── FAIL output — calculation inconsistency ──────────────────────────
ERROR [xbrl_valid] ERROR [xbrlCalcs:inconsistency]
      template=C 01.00 concept=ei:c0010
      Calculation inconsistency in {http://www.eba.europa.eu/xbrl/crr/dict/con}c0010
      reported sum 902000 computed sum 900000
      Difference: 2000 (0.22%)
      Context: C_2026-03-31_instant
# Fix: c0010 in your instance = 902000 but c0020+c0030+c0040 = 900000.
# Check corep_c0100.sql — own_funds column must be the arithmetic sum
# of cet1_capital + at1_capital + t2_capital, not independently computed.

# ── FAIL output — EBA formula rule ───────────────────────────────────
ERROR [xbrl_valid] ERROR [EBA.v4789]
      template=C 03.00
      Formula assertion failed: {ei}c0040 >= {ei}c0050
      tier1_ratio (c0040) = 0.089000 but cet1_ratio (c0050) = 0.112583
# Fix: tier1_ratio < cet1_ratio — mathematically impossible.
# Root cause: corep_c0300.sql reads tier1_ratio from int_capital_by_tier
# which uses a different denominator (total_assets) than cet1_ratio
# which uses total_rwa. Align both ratios to use total_rwa as denominator.
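
The difference and percentage in the calculation message can be recomputed from the two figures, which is useful when ranking many inconsistencies by size. A sketch that assumes the exact "reported sum X computed sum Y" wording from the sample log above; real Arelle messages may be phrased differently, and `calc_discrepancy` is a hypothetical helper.

```python
import re
from decimal import Decimal

# Pattern for the "reported sum X computed sum Y" wording in the sample log.
CALC_RE = re.compile(r"reported sum (-?[\d.]+)\s+computed sum (-?[\d.]+)")


def calc_discrepancy(message: str):
    """Return (difference, percent of computed sum), or None if the wording is absent."""
    m = CALC_RE.search(message)
    if not m:
        return None
    reported, computed = Decimal(m.group(1)), Decimal(m.group(2))
    diff = reported - computed
    pct = diff / computed * 100 if computed != 0 else None
    return diff, pct


if __name__ == "__main__":
    msg = "Calculation inconsistency in c0010 reported sum 902000 computed sum 900000"
    print(calc_discrepancy(msg))
```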

8. EBA Validation Rule Code Reference

Arelle code	EBA rule	Template	What it checks	Typical root cause
`xbrlCalcs:inconsistency`	Calculation linkbase	Any	Parent concept ≠ sum of children	Rounding inconsistency between mart model and individual rows
`xbrlCalcs:insignificantRounding`	Calculation linkbase	Any	Difference ≤ 0.5 (within rounding tolerance)	Warning only — acceptable
`EBA.v0001`	v0001	C 03.00	CET1 ratio = CET1 / RWA	Different RWA denominator in C 03.00 vs C 02.00 total row
`EBA.v4789`	v4789	C 03.00	Tier 1 ratio ≥ CET1 ratio	Incorrect tier assignment in staging — AT1 counted in CET1
`EBA.v5501`	v5501	C 01.00	Own Funds = CET1 + AT1 + T2	Gap in capital_instruments source data — missing tier rows
`EBA.v2314`	v2314	C 02.00	Total RWA = sum of exposure class RWAs	TOTAL row computed separately from exposure-class rows
`EBA.v3301`	v3301	C 47.00	Level 2 HQLA ≤ 40% of total buffer	Synthetic data has too many Level 2 assets — cap not applied in dbt
`EBA.v8823`	v8823	C 47.00	LCR ratio ≥ 1.0 AND ratio = HQLA / outflows	Stressed outflow calculation wrong in `int_lcr_outflows.sql`
`EBA.v6102`	v6102	C 03.00	Leverage ratio = Tier 1 / exposure measure	Exposure measure (the denominator) sourced inconsistently with the Tier 1 capital numerator
`xbrl:schemaImportMissing`	Structural	Any	Taxonomy entry-point schema not reachable	Taxonomy path wrong or taxonomy files not downloaded
`xbrl:elementNotInSubstitutionGroup`	Structural	Any	Concept not valid in the XBRL substitution group	Wrong EBA taxonomy version — concept IDs changed between versions
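
The reference table can double as a lookup during triage. The sketch below maps each rule code to the dbt model this post names as the first place to look, then deduplicates the list for a batch of errors. The mapping is transcribed from the root causes described in this post and is specific to this pipeline; `models_to_inspect` is a hypothetical helper.

```python
# Rule code -> dbt model to inspect first, taken from the root causes in this post.
FIRST_MODEL_TO_INSPECT = {
    "EBA.v0001": "corep_c0300.sql",
    "EBA.v4789": "int_capital_by_tier.sql",
    "EBA.v5501": "corep_c0100.sql",
    "EBA.v2314": "corep_c0200.sql",
    "EBA.v3301": "int_lcr_hqla.sql",
    "EBA.v8823": "int_lcr_outflows.sql",
    "EBA.v6102": "corep_c0300.sql",
}


def models_to_inspect(codes: list[str]) -> list[str]:
    """Return the sorted, de-duplicated models for the given error codes.

    Codes without an entry (structural or calculation codes) are skipped.
    """
    known = {FIRST_MODEL_TO_INSPECT[c] for c in codes if c in FIRST_MODEL_TO_INSPECT}
    return sorted(known)


if __name__ == "__main__":
    print(models_to_inspect(["EBA.v0001", "EBA.v6102", "EBA.v3301"]))
```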

✓ Systematic Fix Strategy for Each Error Class

9. Systematic Fix Strategy

9.1 Fixing Calculation Inconsistencies (`xbrlCalcs:inconsistency`)

-- Step 1: Find the discrepancy in Trino
-- Compare C 01.00 own_funds against manual sum
SELECT
    own_funds                              AS reported_c0010,
    cet1_capital + at1_capital + t2_capital AS computed_sum,
    own_funds - (cet1_capital + at1_capital + t2_capital) AS diff
FROM mart.corep_c0100;
-- If diff != 0 → fix corep_c0100.sql
-- The own_funds column must be: cet1_capital + at1_capital + t2_capital
-- NOT independently summed from raw.capital_instruments

-- Step 2: Compare C 02.00 row sum against C 03.00 total_rwa
SELECT
    c0200_total.rwa   AS c0200_total_rwa,
    c0300.total_rwa   AS c0300_total_rwa,
    c0200_total.rwa - c0300.total_rwa AS diff
FROM (
    SELECT SUM(rwa) AS rwa
    FROM mart.corep_c0200
    WHERE exposure_class != 'TOTAL'
) c0200_total
CROSS JOIN mart.corep_c0300 c0300;
-- If diff != 0 → fix: c0300.total_rwa must reference int_rwa_by_exposure_class
-- which is the same source as c0200 rows. Use a ref() in dbt, not a duplicate calculation.

9.2 The Golden Rule for Calculation-Clean XBRL

✓ Golden Rule: Never Compute a Parent Independently of Its Children

In your dbt mart models, every parent concept must be derived from its children using a SUM — never independently computed. Concretely:

corep_c0100.own_funds = cet1_capital + at1_capital + t2_capital (never re-sum from raw)
corep_c0200.rwa WHERE exposure_class='TOTAL' = SUM(rwa) FROM corep_c0200 WHERE exposure_class != 'TOTAL'
corep_c0300.total_rwa = reference to the same CTE that produces corep_c0200’s total row
corep_c4700.hqla_buffer = hqla_level1 + hqla_level2a_adjusted + hqla_level2b_adjusted

The calculation linkbase check is arithmetic. The only way to guarantee it passes is to make the parent a deterministic function of the children in the same SQL query.

9.3 Fixing Cross-Template Ratio Failures

-- Fix for EBA.v0001: CET1 ratio must use the EXACT same RWA total as C 02.00
-- In corep_c0300.sql, replace any independent total_rwa calculation with:

WITH rwa_total AS (
    -- This CTE must be IDENTICAL to the TOTAL row computation in corep_c0200.sql
    SELECT SUM(rwa) AS total_rwa
    FROM {{ ref('int_rwa_by_exposure_class') }}
),
capital AS (
    SELECT cet1_capital, at1_capital, t2_capital, own_funds
    FROM {{ ref('corep_c0100') }}
)
SELECT
    capital.own_funds,
    capital.cet1_capital,
    rwa_total.total_rwa,
    -- Ratios: use NULLIF to prevent division by zero, ROUND to 6dp
    ROUND(capital.cet1_capital::numeric / NULLIF(rwa_total.total_rwa, 0), 6) AS cet1_ratio,
    ROUND((capital.cet1_capital + capital.at1_capital)::numeric / NULLIF(rwa_total.total_rwa, 0), 6) AS tier1_ratio,
    ROUND(capital.own_funds::numeric / NULLIF(rwa_total.total_rwa, 0), 6) AS total_capital_ratio
FROM rwa_total
CROSS JOIN capital;

-- Key points:
-- 1. rwa_total CTE uses int_rwa_by_exposure_class — the SAME model as corep_c0200
-- 2. No ROUND() on intermediate values — only terminal ROUND on the ratio
-- 3. NULLIF(total_rwa, 0) turns a zero denominator into NULL, so the ratio is NULL instead of a division-by-zero error
-- 4. tier1_ratio = (cet1 + at1) / rwa — not independently summed from instruments

10. Integrating Validation into the Airflow DAG

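# dags/corep_pipeline_dag.py: validation branch logic
import logging

from airflow.operators.python import BranchPythonOperator, PythonOperator

log = logging.getLogger(__name__)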

def _xbrl_validation_branch(**context) -> str:
    """Return next task based on XBRL validation result stored in XCom."""
    validation_status = context["task_instance"].xcom_pull(
        task_ids="run_xbrl_validation", key="validation_status"
    )
    if validation_status == "PASS":
        return "prepare_submission_package"
    return "quarantine_failed_xbrl"   # archives the XBRL + report to MinIO/quarantine/


def _run_xbrl_validation(**context) -> None:
    """Run xbrl_valid module and push status to XCom."""
    from modules.xbrl_valid import XbrlValidModule, XbrlValidationError
    ti = context["task_instance"]
    try:
        mod = XbrlValidModule()
        mod.run()
        ti.xcom_push(key="validation_status", value="PASS")
    except XbrlValidationError as exc:
        log.error("XBRL validation FAILED: %s", exc)
        ti.xcom_push(key="validation_status", value="FAIL")
        # Do not re-raise — let BranchPythonOperator handle routing


run_xbrl_validation = PythonOperator(
    task_id="run_xbrl_validation",
    python_callable=_run_xbrl_validation,
)

branch_on_xbrl_validation = BranchPythonOperator(
    task_id="branch_on_xbrl_validation",
    python_callable=_xbrl_validation_branch,
)

# Full pipeline DAG chain at this point:
(
    run_ingest
    >> run_quality_layer1
    >> run_dbt_staging >> run_dbt_intermediate >> run_dbt_mart
    >> run_quality_layer2
    >> run_catalog
    >> run_security
    >> run_xbrl_gen
    >> run_xbrl_validation         # ← Day 12
    >> branch_on_xbrl_validation
)
branch_on_xbrl_validation >> prepare_submission_package  # Day 13
branch_on_xbrl_validation >> quarantine_failed_xbrl
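
The branch callable is plain Python, so its routing can be exercised without a running Airflow instance. The sketch below re-implements the same decision against a stand-in task instance. `FakeTaskInstance` and `xbrl_validation_branch` are hypothetical names for the test; only the task ids and XCom key come from the DAG above.

```python
class FakeTaskInstance:
    """Stand-in for Airflow's TaskInstance that exposes only xcom_pull."""

    def __init__(self, status):
        self._status = status

    def xcom_pull(self, task_ids, key):
        # Only answers for the task and key the DAG actually uses.
        if task_ids == "run_xbrl_validation" and key == "validation_status":
            return self._status
        return None


def xbrl_validation_branch(**context) -> str:
    """Same decision as the DAG's branch callable: anything but PASS is quarantined."""
    status = context["task_instance"].xcom_pull(
        task_ids="run_xbrl_validation", key="validation_status"
    )
    if status == "PASS":
        return "prepare_submission_package"
    return "quarantine_failed_xbrl"


if __name__ == "__main__":
    for status in ("PASS", "FAIL", None):
        print(status, "->", xbrl_validation_branch(task_instance=FakeTaskInstance(status)))
```

The `None` case is the one worth testing. If the validation task dies before pushing a status, the branch still routes to quarantine rather than to submission.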

11. What “Passes Validation” Actually Means to the Regulator

Validation check passed	What it proves	Regulatory significance
All structural checks	XBRL file is technically well-formed and uses valid EBA concepts	NCA filing portal accepts the file for processing
Calculation linkbase	Every parent concept equals the sum of its children — no arithmetic gaps	Internal consistency: your own funds components add up to total own funds
Cross-template ratio rules (v0001, v4789)	Capital ratios in C 03.00 are arithmetically consistent with the capital components in C 01.00 and C 02.00	Supervisory comparability: the ratios the ECB monitors are derived from the correct inputs
Regulatory floor rules (v8823, v6102)	Reported LCR ≥ 100%, leverage ratio ≥ 3%	Confirms the bank meets minimum requirements — or the submission accurately reports a breach that triggers supervisory action
Level 2 cap rule (v3301)	HQLA composition respects the Del. Reg. 2015/61 concentration limits	Ensures the liquidity buffer is genuine — not loaded with lower-quality assets
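
The floor and cap rows of the table can be approximated with a cheap pre-check on the mart values before Arelle runs. This is a simplified sketch under stated assumptions: `precheck` and its parameter names are hypothetical, ratios are passed as decimal fractions, HQLA amounts are assumed to be post-haircut, and the cap test uses the plain 40% Level 2 / 15% Level 2B composition limits from Del. Reg. 2015/61 without the unwind adjustments that the full calculation requires. It returns findings rather than raising, because a reported breach can still be an accurate submission.

```python
from decimal import Decimal

LCR_FLOOR = Decimal("1.00")       # 100% liquidity coverage ratio
LEVERAGE_FLOOR = Decimal("0.03")  # 3% leverage ratio
LEVEL2_CAP = Decimal("0.40")      # Level 2 (2A + 2B) share of the buffer
LEVEL2B_CAP = Decimal("0.15")     # Level 2B share of the buffer


def precheck(
    lcr: Decimal,
    leverage_ratio: Decimal,
    level1: Decimal,
    level2a: Decimal,
    level2b: Decimal,
) -> list[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings: list[str] = []
    if lcr < LCR_FLOOR:
        findings.append(f"LCR {lcr} is below the 100% floor")
    if leverage_ratio < LEVERAGE_FLOOR:
        findings.append(f"Leverage ratio {leverage_ratio} is below the 3% floor")

    total = level1 + level2a + level2b
    if total > 0:
        # Simplified composition test on the reported buffer amounts.
        if level2a + level2b > LEVEL2_CAP * total:
            findings.append("Level 2 assets exceed 40% of the HQLA buffer")
        if level2b > LEVEL2B_CAP * total:
            findings.append("Level 2B assets exceed 15% of the HQLA buffer")
    return findings
```

A non-empty result is a prompt to investigate the mart data before generation, not a substitute for the EBA rules themselves.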

🔒 Validation Is Not Optional Even for Internal Reporting

Even if your organisation uses the XBRL output for internal management reporting rather than direct NCA submission (because you go through a third-party filing agent), run Arelle validation. The validation rules encode the EBA’s understanding of what the numbers mean. A rule failure is a data quality finding — the same one your filing agent will catch, but you will catch it three weeks earlier at zero cost.

📚 Day 12 Key Takeaways

Two classes of failure — structural errors mean your generator is broken; business rule errors mean your mart data is wrong. They require fixes in completely different places.
Cross-template rules are the hardest to debug: EBA.v0001 typically fails when C 03.00 and C 02.00 use slightly different RWA computations. The fix is to make C 03.00’s total_rwa a direct reference to C 02.00’s source model, never an independent calculation.
The Golden Rule: never compute a parent independently of its children. Every summation parent in XBRL must be the arithmetic sum of its children in the same dbt model, not re-derived from a raw source.
Rounding strategy must be consistent — choose one rounding mode (ROUND_HALF_UP) and apply it only once, at the terminal XBRL generation step. Never round in intermediate dbt models if the rounded value will be summed again.
The validation report is audit evidence — a JSON file timestamped before submission, uploaded to MinIO, that proves EBA validation was run and passed. It is the technical counterpart to the GX data docs from Day 8.
Warnings are not errors — xbrlCalcs:insignificantRounding warnings are acceptable when the difference is within ±0.5 of the reported unit. They do not cause NCA rejection.
Next: Day 13 — Building the COREP submission package: bundling the XBRL instance, validation report, and lineage evidence into a submission-ready archive with a covering note.

Published: May 07, 2026

Updated: May 07, 2026
