Day 11 produced a well-formed XBRL instance document that passes the ET.parse() check — the XML is valid. But XML validity and XBRL validity are entirely different things.
The EBA’s validation engine applies over 8,000 validation rules to a COREP submission. These rules check that your numbers are internally consistent — not just within a single template, but across all four templates simultaneously. The CET1 ratio in C 03.00 must equal CET1 capital from C 01.00 divided by RWA total from C 02.00. The total RWA row in C 02.00 must equal the sum of all exposure-class rows. The HQLA buffer in C 47.00 must equal Level 1 + Level 2A + Level 2B with the Level 2 cap applied.
These are not optional suggestions. They are EBA Validation Rules. A single failure means the NCA’s filing portal rejects the entire submission with a cryptic error code like v4789 and no further explanation. Your job is to ensure every one of these rules passes before the file ever leaves your pipeline.
This post implements xbrl_valid.py using Arelle’s validation API, explains the two classes of validation failure, maps the critical cross-template consistency rules to their source in the EBA taxonomy, and shows you how to read and fix every error code you will encounter.
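To make the opening distinction concrete, here is a minimal sketch of a document that parses cleanly as XML yet could never be valid XBRL, because its one fact points at a context that is never declared. The instance, the `ei` namespace URI, and the helper name `dangling_context_refs` are all hypothetical and heavily simplified; a real EBA instance carries far more structure.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical, simplified instance: the fact references "ctx_missing",
# but only "ctx_2026" is declared.
INSTANCE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:ei="http://example.com/hypothetical/eba">
  <context id="ctx_2026"/>
  <ei:c0010 contextRef="ctx_missing" decimals="0">902000</ei:c0010>
</xbrl>"""

XBRLI = "{http://www.xbrl.org/2003/instance}"


def dangling_context_refs(xml_text: str) -> List[str]:
    """Return contextRef values that do not match any declared context id."""
    root = ET.fromstring(xml_text)  # succeeds: the XML is well-formed
    declared = {ctx.get("id") for ctx in root.iter(f"{XBRLI}context")}
    return sorted(
        el.get("contextRef")
        for el in root.iter()
        if el.get("contextRef") is not None
        and el.get("contextRef") not in declared
    )


print(dangling_context_refs(INSTANCE))  # ['ctx_missing']
```

The parser raises nothing, which is exactly why an XML-level check cannot stand in for taxonomy-aware validation.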
1. The Two Classes of XBRL Validation Failure
Understanding the two classes is essential because they require completely different fixes.
| Class | What it checks | Source in taxonomy | Fix lives in | Severity |
|---|---|---|---|---|
| Class 1 — Structural | XBRL syntax: required attributes, valid concept names, unit types, context references, namespace correctness | XSD schema files (.xsd) | xbrl_gen.py — the generator produced bad XML | Fatal — filing rejected immediately |
| Class 2 — Business Rule | Arithmetic consistency: summations, cross-template ratios, regulatory minimum floors, conditional rules | Calculation linkbase (-cal.xml) + EBA formula linkbase (_val.xml) | dbt mart models or source data — the numbers themselves are wrong | Blocking — submission rejected by NCA |
When Arelle reports a Class 2 validation failure such as Calculation inconsistency: c0060 reported 4800 expected 4750, the fix is not in your XBRL generator. The generator faithfully wrote what was in your mart table. The problem is that your mart table’s total_rwa does not equal the sum of exposure-class RWAs in corep_c0200. The fix is in a dbt model — likely a rounding issue, a missing row, or a join that double-counts.
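The routing decision described above can be sketched as a small lookup on the message code. This assumes the code prefixes used elsewhere in this post (`xbrl:` for structural findings, `xbrlCalcs:` and `EBA.v` for business rules); the function name and the returned labels are illustrative, not part of Arelle.

```python
def failure_class(code: str) -> str:
    """Map a validation message code to the place where the fix most likely lives."""
    # Class 2: the numbers are wrong, so look at dbt marts or source data.
    if code.startswith("xbrlCalcs:") or code.startswith("EBA.v"):
        return "class2_business_rule"
    # Class 1: the generator produced bad XBRL, so look at xbrl_gen.py.
    if code.startswith("xbrl:"):
        return "class1_structural"
    return "unknown"


for code in ("xbrlCalcs:inconsistency", "EBA.v4789", "xbrl:schemaImportMissing"):
    print(code, "->", failure_class(code))
```

Anything that falls into `unknown` deserves a manual look before it is assigned to either team.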
2. Validation Architecture

    Generated XBRL instance
    output/xbrl/COREP_2026-03-31_*.xbrl
              │
              ▼
    ┌─────────────────────────────────────────────────────────┐
    │  Arelle Validation Engine                               │
    │                                                         │
    │  Step 1: Load taxonomy (entry-point XSD)                │
    │  Step 2: Load instance document                         │
    │  Step 3: CLASS 1: Structural validation                 │
    │    • Concept names valid in namespace?                  │
    │    • Required attributes present (contextRef, unitRef)? │
    │    • Unit types match concept types?                    │
    │    • Context entity/period well-formed?                 │
    │  Step 4: CLASS 2: Calculation validation                │
    │    • c0010 = c0020 + c0030 + c0040? (C 01.00)           │
    │    • c0060_total = SUM(c0060_rows)? (C 02.00)           │
    │    • c0050 = c0020_c0100 / c0060? (cross-template)      │
    │    • c0090 = hqla_buffer / outflows? (C 47.00)          │
    │  Step 5: CLASS 2: Formula linkbase (EBA business rules) │
    │    • v0001: cet1_ratio ≥ 0.045?                         │
    │    • v4789: tier1_ratio ≥ cet1_ratio?                   │
    │    • v5678: total_rwa > 0?                              │
    │    • v8823: lcr_ratio ≥ 1.0?                            │
    └─────────────────────────────────────────────────────────┘
              │
              ├─── ALL PASS → pipeline continues → Airflow marks success
              │
              └─── ANY FAIL → XbrlValidationError raised
                            → BranchPythonOperator routes to quarantine
                            → Arelle error report uploaded to MinIO
                            → audit.pipeline_run_log entry: status=FAIL

3. The Cross-Template Consistency Rules That Catch Most Failures
The most important validation rules are the ones that span multiple templates. These are impossible to catch in a single-table GX quality gate on Day 8, because they require comparing values across two separate mart tables. The XBRL validator is the first and only place where these are checked end-to-end.
3.1 The Capital Ratio Identity (C 01.00 ↔ C 02.00 ↔ C 03.00)
The Basel III capital ratio formula is:
$$\text{CET1 Ratio} = \frac{\text{CET1 Capital (C 01.00, r020)}}{\text{Total RWA (C 02.00, total row)}}$$

In your XBRL instance, Arelle checks that the value of concept c0050 (CET1 ratio, from C 03.00) equals this fraction computed from c0020 (C 01.00) and c0060 (C 02.00 total RWA). A discrepancy of even one unit in the reported thousands can trigger this rule.
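The identity can be pre-checked in plain Python before Arelle ever sees the file. This is a minimal sketch, assuming the ratio is reported to 6 decimal places (the precision used in the SQL later in this post); the function name and the sample figures are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def cet1_ratio_consistent(cet1_capital: str, total_rwa: str,
                          reported_ratio: str, places: int = 6) -> bool:
    """True if reported_ratio equals cet1_capital / total_rwa at `places` decimals."""
    rwa = Decimal(total_rwa)
    if rwa == 0:
        return False  # a zero denominator can never satisfy the identity
    quantum = Decimal(10) ** -places  # e.g. 0.000001 for 6 places
    expected = (Decimal(cet1_capital) / rwa).quantize(quantum, rounding=ROUND_HALF_UP)
    reported = Decimal(reported_ratio).quantize(quantum, rounding=ROUND_HALF_UP)
    return expected == reported


# Hypothetical figures: 450 / 4000 = 0.1125
print(cet1_ratio_consistent("450", "4000", "0.1125"))  # True
print(cet1_ratio_consistent("450", "4000", "0.1126"))  # False
```

Passing the values as strings keeps them exact; converting floats to `Decimal` would reintroduce binary representation noise.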
Root cause: rounding in dbt
corep_c0300.sql applied before the denominator total_rwa was finalized in corep_c0200.sql.
int_capital_by_tier.sql intermediate model.
corep_c0200 must equal the arithmetic sum of all individual exposure-class rows. This is enforced by the calculation linkbase, not a formula rule. A rounding inconsistency of 1 (one thousand euros) triggers it.
int_lcr_hqla.sql must enforce this cap. If the synthetic data generates a case where Level 2 exceeds 40%, the rule triggers.
corep_c0300.sql uses a consistent source for both numerator and denominator.
4. The Rounding Problem: How One Unit of Error Breaks Seven Rules
The most common source of cross-template validation failures is rounding inconsistency between dbt models. Here is the exact failure chain:

    In mart.corep_c0200 (RWA by exposure class):
      exposure_class='corporates'   rwa = 1,234,567.4 → rounded to 1,234,567
      exposure_class='retail'       rwa = 2,345,678.6 → rounded to 2,345,679
      exposure_class='real_estate'  rwa =   987,654.3 → rounded to   987,654
      SUM of rows = 4,567,900  (correct sum of rounded rows)

    But in mart.corep_c0300 (Capital Ratios, total_rwa column):
      total_rwa comes from int_capital_by_tier, which sums BEFORE rounding:
      total_rwa = ROUND(1,234,567.4 + 2,345,678.6 + 987,654.3, 0)
                = ROUND(4,567,900.3, 0) = 4,567,900

    # Looks identical! But what about a fourth exposure class?
      exposure_class='institutions' rwa = 198,765.5 → rounded to 198,766
      SUM of rows now = 4,567,900 + 198,766 = 4,766,666
      corep_c0300.total_rwa = ROUND(4,567,900.3 + 198,765.5, 0)
                            = ROUND(4,766,665.8, 0) = 4,766,666  (same!)

    # Still matches. 198,765.5 is an exact tie, but ROUND_HALF_UP and
    # ROUND_HALF_EVEN both send it to 198,766 because the even neighbour
    # is also the upper one. The two modes only diverge on ties whose
    # lower neighbour is even.
    # Now add one more row with rwa = 543,210.5:
      exposure_class='equity'  rwa = 543,210.5 → ROUND_HALF_UP   → 543,211
                                               → ROUND_HALF_EVEN → 543,210

    # If one layer rounds half-to-even (banker's rounding, which is also what
    # Python's round() does) but _format_value() in xbrl_gen.py uses
    # ROUND_HALF_UP, the XBRL file reports 543,211 for the row while the
    # total was built from 543,210.
    # Difference: 1 (one thousand euros).
    # v2314 FAILS. v0001 MAY FAIL (ratio denominator is off by 1).

Use one rounding strategy everywhere and apply it at the latest possible point — in the XBRL generator, not in the dbt mart models. Your dbt mart models should store full precision (NUMERIC(20,6)). The XBRL generator’s _format_value() method applies the terminal rounding once, using ROUND_HALF_UP consistently, immediately before writing the XBRL element. The summation then uses the already-rounded values — exactly what the calculation linkbase verification expects.
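The two effects at play here can be reproduced with Python's `decimal` module. This is a sketch for illustration only: the helper `round_units` and the three-row sample are hypothetical, and the sample values were chosen so that rounding each row and then summing differs from summing and then rounding.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

ONE = Decimal("1")


def round_units(value: str, mode: str = ROUND_HALF_UP) -> Decimal:
    """Round a decimal string to whole reporting units with an explicit mode."""
    return Decimal(value).quantize(ONE, rounding=mode)


# Effect 1: the same tie value rounds differently under the two modes.
tie = "543210.5"
print(round_units(tie, ROUND_HALF_UP))    # 543211
print(round_units(tie, ROUND_HALF_EVEN))  # 543210

# Effect 2: round-then-sum is not the same as sum-then-round.
rows = ["1234567.4", "2345678.4", "987654.4"]  # hypothetical full-precision RWAs
sum_of_rounded = sum(round_units(r) for r in rows)
rounded_sum = round_units(str(sum(Decimal(r) for r in rows)))
print(sum_of_rounded, rounded_sum)  # 4567899 4567900
```

Either effect alone is enough to produce a one-unit gap between a reported parent and the sum of its reported children, which is why the rounding mode and the rounding point both need to be fixed in one place.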
5. Arelle Validation API Deep Dive
Arelle exposes three validation modes. You need all three in sequence.
| Validation mode | What it runs | Arelle API call |
|---|---|---|
| XBRL validation | Structural checks: schema conformance, required attributes, unit type correctness | modelXbrl.modelManager.cntlr.run() with validate=True |
| Calculation validation | Summation checks from the calculation linkbase: every parent concept = sum of children | ValidateXbrlCalcs.validate(modelXbrl) |
| Formula validation | EBA business rules from the formula linkbase: cross-template ratios, regulatory minimums, conditional rules | ValidateXbrlDimensions.validate(modelXbrl) + formula plugin |

    # xbrl/validate_instance.py: standalone validation function
    # Used by both the xbrl_valid.py module and manual testing
    import logging
    import os
    import re
    from dataclasses import dataclass, field
    from pathlib import Path
    from typing import List

    from arelle import Cntlr, ModelManager, ValidateXbrl
    from arelle.ValidateXbrlCalcs import ValidateXbrlCalcs

    log = logging.getLogger(__name__)

    TAXONOMY_ENTRY = os.environ.get(
        "EBA_TAXONOMY_ENTRY",
        "data/taxonomy/eba_3.3/www.eba.europa.eu/eu/fr/xbrl/crr/fws/corep/cor/2024-01-31/mod/corep-full-entry-point.xsd",
    )

    # Arelle message severity levels
    SEVERITY_ERROR = "ERROR"
    SEVERITY_WARNING = "WARNING"
    SEVERITY_INFO = "INFO"


    @dataclass
    class ValidationMessage:
        severity: str
        code: str      # message code, e.g. "xbrlCalcs:inconsistency" or an EBA v-code
        message: str
        template: str  # e.g. "C 01.00", extracted from the message if possible
        concept: str   # e.g. "ei:c0010"
        value: str     # reported value
        expected: str  # computed expected value (for calc errors)


    @dataclass
    class ValidationResult:
        passed: bool
        error_count: int
        warn_count: int
        messages: List[ValidationMessage] = field(default_factory=list)

        @property
        def errors(self):
            return [m for m in self.messages if m.severity == SEVERITY_ERROR]

        @property
        def warnings(self):
            return [m for m in self.messages if m.severity == SEVERITY_WARNING]


    def validate_xbrl_instance(instance_path: Path) -> ValidationResult:
        """
        Run full XBRL validation on an instance document:
          1. Structural (schema conformance)
          2. Calculation linkbase (summation consistency)
          3. Formula linkbase (EBA business rules, requires formula plugin)

        Returns ValidationResult with all messages categorised.
        """
        messages: List[ValidationMessage] = []

        # ── Custom log handler to capture Arelle messages ─────────────
        class _ArelleLogHandler(logging.Handler):
            def emit(self, record):
                msg_text = self.format(record)
                if record.levelno >= logging.ERROR:
                    severity = SEVERITY_ERROR
                elif record.levelno >= logging.WARNING:
                    severity = SEVERITY_WARNING
                else:
                    severity = SEVERITY_INFO
                code = getattr(record, "messageCode", "")
                messages.append(ValidationMessage(
                    severity=severity,
                    code=code,
                    message=msg_text,
                    template=_extract_template(msg_text),
                    concept=_extract_concept(msg_text),
                    value=_extract_value(msg_text, "reported"),
                    expected=_extract_value(msg_text, "expected"),
                ))

        log_handler = _ArelleLogHandler()
        arelle_logger = logging.getLogger("arelle")
        arelle_logger.addHandler(log_handler)

        # ── Run Arelle ────────────────────────────────────────────────
        cntlr = Cntlr.Cntlr()
        modelMgr = ModelManager.ModelManager(cntlr)
        try:
            log.info("[xbrl_valid] Loading taxonomy...")
            modelXbrl = modelMgr.load(TAXONOMY_ENTRY)

            log.info("[xbrl_valid] Loading instance: %s", instance_path)
            instance = modelMgr.load(str(instance_path), modelXbrl=modelXbrl)

            # Step 1: Structural XBRL validation
            # The validator is constructed with the model it reports against.
            log.info("[xbrl_valid] Running structural validation...")
            ValidateXbrl.ValidateXbrl(instance).validate(instance)

            # Step 2: Calculation linkbase validation
            log.info("[xbrl_valid] Running calculation validation...")
            ValidateXbrlCalcs(instance).validate()

            # Step 3: Formula linkbase (EBA business rules)
            # Requires the Arelle formula plugin. Confirm this import path
            # against your installed Arelle version; if it is missing, the
            # EBA business rules are skipped with a warning.
            try:
                from arelle.plugin import formulaXbrl
                formulaXbrl.run(instance)
                log.info("[xbrl_valid] Formula validation complete.")
            except ImportError:
                log.warning("[xbrl_valid] Formula plugin not available; skipping EBA business rules.")
        finally:
            modelMgr.close()
            cntlr.close()
            arelle_logger.removeHandler(log_handler)

        error_count = sum(1 for m in messages if m.severity == SEVERITY_ERROR)
        warn_count = sum(1 for m in messages if m.severity == SEVERITY_WARNING)
        return ValidationResult(
            passed=error_count == 0,
            error_count=error_count,
            warn_count=warn_count,
            messages=messages,
        )


    def _extract_template(msg: str) -> str:
        # Matches template codes such as "C 01.00"
        m = re.search(r"C \d{2}\.\d{2}", msg)
        return m.group(0) if m else ""


    def _extract_concept(msg: str) -> str:
        # Matches "c0010" with an optional "ei:" prefix
        m = re.search(r"(?:ei:)?c\d{4}", msg)
        return m.group(0) if m else ""


    def _extract_value(msg: str, label: str) -> str:
        # Matches e.g. "reported 4800" or "expected: 4750"
        m = re.search(rf"{re.escape(label)}[:\s=]+([0-9.\-]+)", msg, re.IGNORECASE)
        return m.group(1) if m else ""

6. The xbrl_valid.py Module

    """
    modules/xbrl_valid.py: validate the generated XBRL instance against the EBA taxonomy.

    Runs three validation passes:
      1. Structural (schema conformance)
      2. Calculation linkbase (summation consistency)
      3. Formula linkbase (EBA business rules, v-codes)

    On any ERROR-level finding, raises XbrlValidationError.
    Warnings are logged but do not halt the pipeline.
    Validation report is uploaded to MinIO as audit evidence.
    """
    import json
    import logging
    import os
    from datetime import datetime, timezone
    from pathlib import Path

    from modules.base import BaseModule
    from xbrl.validate_instance import validate_xbrl_instance, ValidationResult

    log = logging.getLogger(__name__)

    XBRL_OUTPUT_DIR = Path(os.environ.get("XBRL_OUTPUT_DIR", "output/xbrl"))
    REPORT_DIR = Path(os.environ.get("XBRL_REPORT_DIR", "output/validation_reports"))


    class XbrlValidationError(RuntimeError):
        """Raised when the XBRL instance fails EBA validation rules."""


    class XbrlValidModule(BaseModule):
        MODULE_NAME = "xbrl_valid"

        def input_check(self) -> None:
            """Verify at least one XBRL file exists in the output directory."""
            xbrl_files = list(XBRL_OUTPUT_DIR.glob("*.xbrl"))
            if not xbrl_files:
                raise RuntimeError(
                    f"[xbrl_valid] No .xbrl files found in {XBRL_OUTPUT_DIR}.\n"
                    "Run xbrl_gen first: python pipeline.py --module xbrl_gen"
                )
            # Validate the most recently generated file
            self._instance_path = max(xbrl_files, key=lambda p: p.stat().st_mtime)
            log.info("[xbrl_valid] Target instance: %s", self._instance_path)
            REPORT_DIR.mkdir(parents=True, exist_ok=True)

        def _execute(self) -> None:
            """Run full Arelle validation and raise on any ERROR-level finding."""
            log.info("[xbrl_valid] Starting validation: %s", self._instance_path)
            result: ValidationResult = validate_xbrl_instance(self._instance_path)

            # ── Log summary ──────────────────────────────────────────────
            log.info(
                "[xbrl_valid] Validation complete: %s | errors=%d | warnings=%d",
                "PASS" if result.passed else "FAIL",
                result.error_count,
                result.warn_count,
            )

            # ── Log individual errors ─────────────────────────────────────
            for msg in result.errors:
                log.error(
                    "[xbrl_valid] ERROR [%s] template=%s concept=%s | %s",
                    msg.code, msg.template, msg.concept, msg.message,
                )

            # ── Log warnings (informational, do not halt) ─────────────────
            for msg in result.warnings:
                log.warning("[xbrl_valid] WARN [%s] %s", msg.code, msg.message)

            # ── Write JSON validation report ──────────────────────────────
            ts = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
            report_path = REPORT_DIR / f"validation_report_{ts}.json"
            report_data = {
                "instance": str(self._instance_path),
                "validated_at": ts,
                "passed": result.passed,
                "error_count": result.error_count,
                "warn_count": result.warn_count,
                "errors": [
                    {"code": m.code, "template": m.template, "concept": m.concept,
                     "reported": m.value, "expected": m.expected, "message": m.message}
                    for m in result.errors
                ],
                "warnings": [
                    {"code": m.code, "message": m.message}
                    for m in result.warnings
                ],
            }
            report_path.write_text(json.dumps(report_data, indent=2))
            log.info("[xbrl_valid] Validation report written: %s", report_path)

            # The report is persisted and audited before raising, so a FAIL
            # run still leaves evidence behind.
            self._report_path = report_path
            self._upload_report_to_minio(report_path)
            self._write_audit(report_data)

            # ── Raise on errors ───────────────────────────────────────────
            if not result.passed:
                error_summary = "; ".join(
                    f"{m.code}({m.template})" for m in result.errors[:5]
                )
                raise XbrlValidationError(
                    f"[xbrl_valid] {result.error_count} EBA validation error(s): {error_summary}. "
                    f"See report: {report_path}"
                )

        def _upload_report_to_minio(self, report_path: Path) -> None:
            try:
                from minio import Minio
                client = Minio(
                    os.environ.get("MINIO_ENDPOINT", "minio:9000"),
                    access_key=os.environ.get("MINIO_ROOT_USER", "minioadmin"),
                    secret_key=os.environ.get("MINIO_ROOT_PASSWORD", "minioadmin"),
                    secure=False,
                )
                bucket = "corep-xbrl-output"
                if not client.bucket_exists(bucket):
                    client.make_bucket(bucket)
                client.fput_object(
                    bucket,
                    f"validation_reports/{report_path.name}",
                    str(report_path),
                    content_type="application/json",
                )
                log.info("[xbrl_valid] Report uploaded → minio://%s/validation_reports/%s",
                         bucket, report_path.name)
            except Exception as exc:
                log.warning("[xbrl_valid] MinIO upload failed (non-fatal): %s", exc)

        def _write_audit(self, report_data: dict) -> None:
            from modules.base import _pg_conn
            conn = _pg_conn()
            try:
                cur = conn.cursor()
                cur.execute(
                    """
                    INSERT INTO audit.pipeline_run_log
                        (run_id, module_name, status, metadata, ran_at)
                    VALUES (%s, 'xbrl_valid', %s, %s, now())
                    """,
                    (
                        self._run_id,
                        "PASS" if report_data["passed"] else "FAIL",
                        json.dumps({
                            "error_count": report_data["error_count"],
                            "warn_count": report_data["warn_count"],
                            "instance": report_data["instance"],
                            "report": str(self._report_path),
                        }),
                    ),
                )
                conn.commit()
            finally:
                conn.close()

        def emit_lineage(self) -> None:
            # Validation is read-only: input is the XBRL file, no new data written
            log.info("[xbrl_valid] No lineage event: validation is read-only.")

        def output_check(self) -> None:
            """Verify the validation report JSON was written successfully."""
            report_path = getattr(self, "_report_path", None)
            if not report_path or not Path(report_path).exists():
                raise RuntimeError("[xbrl_valid] Validation report file was not written.")
            data = json.loads(Path(report_path).read_text())
            if not data.get("passed"):
                raise RuntimeError(
                    f"[xbrl_valid] output_check: validation report shows FAIL "
                    f"with {data.get('error_count')} error(s)."
                )
            log.info("[xbrl_valid] output_check: PASS (%s)", report_path)

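Once a report exists, a quick grouping by template shows which mart model to open first. The sketch below assumes the JSON layout written by the module above (an `errors` list whose entries carry `code` and `template`); the sample report and the helper name `errors_by_template` are hypothetical.

```python
from collections import Counter


def errors_by_template(report: dict) -> Counter:
    """Count ERROR entries per COREP template in a validation report dict."""
    return Counter(
        err.get("template") or "(unknown)"
        for err in report.get("errors", [])
    )


# Hypothetical report content, shaped like the JSON the module writes.
sample_report = {
    "passed": False,
    "error_count": 3,
    "errors": [
        {"code": "xbrlCalcs:inconsistency", "template": "C 01.00"},
        {"code": "EBA.v4789", "template": "C 03.00"},
        {"code": "EBA.v0001", "template": "C 03.00"},
    ],
}

print(errors_by_template(sample_report).most_common())
# [('C 03.00', 2), ('C 01.00', 1)]
```

In practice the dict would come from `json.loads(report_path.read_text())` rather than a literal.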
7. Reading Arelle Error Messages
Arelle’s raw error messages are dense. Here is how to decode the most common ones:

    # Run validation manually (outside the pipeline module)
    python pipeline.py --module xbrl_valid

    # ── PASS output ──────────────────────────────────────────────────────
    INFO  [xbrl_valid] Loading taxonomy...
    INFO  [xbrl_valid] Loading instance: output/xbrl/COREP_2026-03-31_20260507T081433Z.xbrl
    INFO  [xbrl_valid] Running structural validation...
    INFO  [xbrl_valid] Running calculation validation...
    INFO  [xbrl_valid] Formula validation complete.
    INFO  [xbrl_valid] Validation complete: PASS | errors=0 | warnings=2
    WARN  [xbrl_valid] WARN [xbrlCalcs:insignificantRounding] C 02.00 ei:c0060 ...
    INFO  [xbrl_valid] Validation report written: output/validation_reports/validation_report_20260507T081445Z.json
    INFO  [xbrl_valid] output_check: PASS

    # ── FAIL output: calculation inconsistency ───────────────────────────
    ERROR [xbrl_valid] ERROR [xbrlCalcs:inconsistency] template=C 01.00 concept=ei:c0010
          Calculation inconsistency in {http://www.eba.europa.eu/xbrl/crr/dict/con}c0010
          reported sum 902000 computed sum 900000
          Difference: 2000 (0.22%)
          Context: C_2026-03-31_instant

    # Fix: c0010 in your instance = 902000 but c0020+c0030+c0040 = 900000.
    # Check corep_c0100.sql: the own_funds column must be the arithmetic sum
    # of cet1_capital + at1_capital + t2_capital, not independently computed.

    # ── FAIL output: EBA formula rule ────────────────────────────────────
    ERROR [xbrl_valid] ERROR [EBA.v4789] template=C 03.00
          Formula assertion failed: {ei}c0040 >= {ei}c0050
          tier1_ratio (c0040) = 0.089000 but cet1_ratio (c0050) = 0.112583

    # Fix: tier1_ratio < cet1_ratio is mathematically impossible.
    # Root cause: corep_c0300.sql reads tier1_ratio from int_capital_by_tier,
    # which uses a different denominator (total_assets) than cet1_ratio,
    # which uses total_rwa. Align both ratios to use total_rwa as denominator.

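Pulling the numbers out of a calculation-inconsistency message is a small regular-expression job. The sketch below targets the message layout shown in the sample output above ("reported sum … computed sum …"); that layout is taken from this post, so treat the pattern as an assumption to verify against what your Arelle version actually logs. The function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Pattern for the sample layout: "...}c0010 reported sum 902000 computed sum 900000"
_CALC_RE = re.compile(
    r"\}(?P<concept>c\d{4})\s+reported sum\s+(?P<reported>-?\d+(?:\.\d+)?)"
    r"\s+computed sum\s+(?P<computed>-?\d+(?:\.\d+)?)"
)


def parse_calc_inconsistency(message: str) -> Optional[Tuple[str, float, float, float]]:
    """Return (concept, reported, computed, difference) or None if no match."""
    match = _CALC_RE.search(message)
    if match is None:
        return None
    reported = float(match["reported"])
    computed = float(match["computed"])
    return match["concept"], reported, computed, reported - computed


sample = (
    "Calculation inconsistency in "
    "{http://www.eba.europa.eu/xbrl/crr/dict/con}c0010 "
    "reported sum 902000 computed sum 900000"
)
print(parse_calc_inconsistency(sample))
# ('c0010', 902000.0, 900000.0, 2000.0)
```

Returning `None` on a non-match keeps the caller honest: an unparsed message should be shown raw, not silently dropped.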
8. EBA Validation Rule Code Reference
| Arelle code | EBA rule | Template | What it checks | Typical root cause |
|---|---|---|---|---|
| xbrlCalcs:inconsistency | Calculation linkbase | Any | Parent concept ≠ sum of children | Rounding inconsistency between mart model and individual rows |
| xbrlCalcs:insignificantRounding | Calculation linkbase | Any | Difference ≤ 0.5 (within rounding tolerance) | Warning only, acceptable |
| EBA.v0001 | v0001 | C 03.00 | CET1 ratio = CET1 / RWA | Different RWA denominator in C 03.00 vs C 02.00 total row |
| EBA.v4789 | v4789 | C 03.00 | Tier 1 ratio ≥ CET1 ratio | Incorrect tier assignment in staging: AT1 counted in CET1 |
| EBA.v5501 | v5501 | C 01.00 | Own Funds = CET1 + AT1 + T2 | Gap in capital_instruments source data: missing tier rows |
| EBA.v2314 | v2314 | C 02.00 | Total RWA = sum of exposure class RWAs | TOTAL row computed separately from exposure-class rows |
| EBA.v3301 | v3301 | C 47.00 | Level 2 HQLA ≤ 40% of total buffer | Synthetic data has too many Level 2 assets; cap not applied in dbt |
| EBA.v8823 | v8823 | C 47.00 | LCR ratio ≥ 1.0 AND ratio = HQLA / outflows | Stressed outflow calculation wrong in int_lcr_outflows.sql |
| EBA.v6102 | v6102 | C 03.00 | Leverage ratio = Tier 1 / exposure measure | Exposure measure not consistent with Tier 1 capital denominator |
| xbrl:schemaImportMissing | Structural | Any | Taxonomy entry-point schema not reachable | Taxonomy path wrong or taxonomy files not downloaded |
| xbrl:elementNotInSubstitutionGroup | Structural | Any | Concept not valid in the XBRL substitution group | Wrong EBA taxonomy version: concept IDs changed between versions |
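The 40% condition listed for v3301 has a handy closed form. If Level 2 must be at most 40% of the total buffer, then L2 ≤ 0.4 × (L1 + L2), which rearranges to L2 ≤ (2/3) × L1. The sketch below applies only that single cap; it deliberately ignores haircuts, the separate Level 2B limit, and any unwind adjustments, and the function name is hypothetical.

```python
from decimal import Decimal


def cap_level2(level1: Decimal, level2: Decimal) -> Decimal:
    """Cap Level 2 HQLA so it is at most 40% of (Level 1 + capped Level 2)."""
    # L2 <= 0.4 * (L1 + L2)  <=>  L2 <= (2/3) * L1
    max_level2 = level1 * Decimal(2) / Decimal(3)
    return min(level2, max_level2)


level1 = Decimal("600")
capped = cap_level2(level1, Decimal("500"))   # excess Level 2 is cut back
buffer_total = level1 + capped
print(capped, buffer_total, capped / buffer_total)  # 400 1000 0.4

print(cap_level2(level1, Decimal("100")))     # 100 (under the cap, unchanged)
```

Whatever the real implementation in the liquidity model looks like, a check of this shape is a cheap assertion to run on the mart output before generating XBRL.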
9. Systematic Fix Strategy
9.1 Fixing Calculation Inconsistencies (xbrlCalcs:inconsistency)

    -- Step 1: Find the discrepancy in Trino
    -- Compare C 01.00 own_funds against manual sum
    SELECT
        own_funds                                            AS reported_c0010,
        cet1_capital + at1_capital + t2_capital              AS computed_sum,
        own_funds - (cet1_capital + at1_capital + t2_capital) AS diff
    FROM mart.corep_c0100;

    -- If diff != 0 → fix corep_c0100.sql
    -- The own_funds column must be: cet1_capital + at1_capital + t2_capital
    -- NOT independently summed from raw.capital_instruments

    -- Step 2: Compare C 02.00 row sum against C 03.00 total_rwa
    SELECT
        c0200_total.rwa                   AS c0200_total_rwa,
        c0300.total_rwa                   AS c0300_total_rwa,
        c0200_total.rwa - c0300.total_rwa AS diff
    FROM (
        SELECT SUM(rwa) AS rwa
        FROM mart.corep_c0200
        WHERE exposure_class != 'TOTAL'
    ) c0200_total
    CROSS JOIN mart.corep_c0300 c0300;

    -- If diff != 0 → fix: c0300.total_rwa must reference int_rwa_by_exposure_class,
    -- which is the same source as the c0200 rows.
    -- Use a ref() in dbt, not a duplicate calculation.

9.2 The Golden Rule for Calculation-Clean XBRL
In your dbt mart models, every parent concept must be derived from its children using a SUM — never independently computed. Concretely:
- corep_c0100.own_funds = cet1_capital + at1_capital + t2_capital (never re-sum from raw)
- corep_c0200.rwa WHERE exposure_class='TOTAL' = SUM(rwa) FROM corep_c0200 WHERE exposure_class != 'TOTAL'
- corep_c0300.total_rwa = reference to the same CTE that produces corep_c0200's total row
- corep_c4700.hqla_buffer = hqla_level1 + hqla_level2a_adjusted + hqla_level2b_adjusted
The calculation linkbase check is arithmetic. The only way to guarantee it passes is to make the parent a deterministic function of the children in the same SQL query.
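The same principle, expressed outside SQL: build the TOTAL row from the child rows so that a mismatch is impossible by construction. This is a minimal sketch with hypothetical helper names; the row values reuse the rounded exposure-class figures from the rounding example earlier in this post.

```python
from typing import Dict, List

Row = Dict[str, object]


def with_derived_total(rows: List[Row]) -> List[Row]:
    """Drop any existing TOTAL row and append one computed from the children."""
    children = [r for r in rows if r["exposure_class"] != "TOTAL"]
    total = sum(int(r["rwa"]) for r in children)
    return children + [{"exposure_class": "TOTAL", "rwa": total}]


def parent_matches_children(rows: List[Row]) -> bool:
    """True if the TOTAL row equals the sum of all other rows."""
    total = next(int(r["rwa"]) for r in rows if r["exposure_class"] == "TOTAL")
    return total == sum(int(r["rwa"]) for r in rows if r["exposure_class"] != "TOTAL")


rows = [
    {"exposure_class": "corporates", "rwa": 1234567},
    {"exposure_class": "retail", "rwa": 2345679},
    {"exposure_class": "real_estate", "rwa": 987654},
    {"exposure_class": "TOTAL", "rwa": 4567901},  # independently computed, off by 1
]

print(parent_matches_children(rows))                      # False
print(parent_matches_children(with_derived_total(rows)))  # True
```

The first call shows how an independently computed parent drifts; the second shows that deriving it from the children removes the failure mode entirely.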
9.3 Fixing Cross-Template Ratio Failures

    -- Fix for EBA.v0001: CET1 ratio must use the EXACT same RWA total as C 02.00
    -- In corep_c0300.sql, replace any independent total_rwa calculation with:
    WITH rwa_total AS (
        -- This CTE must be IDENTICAL to the TOTAL row computation in corep_c0200.sql
        SELECT SUM(rwa) AS total_rwa
        FROM {{ ref('int_rwa_by_exposure_class') }}
    ),
    capital AS (
        SELECT cet1_capital, at1_capital, t2_capital, own_funds
        FROM {{ ref('corep_c0100') }}
    )
    SELECT
        capital.own_funds,
        capital.cet1_capital,
        rwa_total.total_rwa,
        -- Ratios: use NULLIF to prevent division by zero, ROUND to 6dp
        ROUND(capital.cet1_capital::numeric
              / NULLIF(rwa_total.total_rwa, 0), 6) AS cet1_ratio,
        ROUND((capital.cet1_capital + capital.at1_capital)::numeric
              / NULLIF(rwa_total.total_rwa, 0), 6) AS tier1_ratio,
        ROUND(capital.own_funds::numeric
              / NULLIF(rwa_total.total_rwa, 0), 6) AS total_capital_ratio
    FROM rwa_total
    CROSS JOIN capital;

    -- Key points:
    -- 1. rwa_total CTE uses int_rwa_by_exposure_class, the SAME model as corep_c0200
    -- 2. No ROUND() on intermediate values, only a terminal ROUND on the ratio
    -- 3. NULLIF(total_rwa, 0) turns a zero denominator into NULL instead of a
    --    division-by-zero error
    -- 4. tier1_ratio = (cet1 + at1) / rwa, not independently summed from instruments

10. Integrating Validation into the Airflow DAG

    # dags/corep_pipeline_dag.py: validation branch logic
    import logging

    from airflow.operators.python import BranchPythonOperator, PythonOperator

    log = logging.getLogger(__name__)


    def _xbrl_validation_branch(**context) -> str:
        """Return next task based on XBRL validation result stored in XCom."""
        validation_status = context["task_instance"].xcom_pull(
            task_ids="run_xbrl_validation", key="validation_status"
        )
        if validation_status == "PASS":
            return "prepare_submission_package"
        # Archives the XBRL + report to MinIO/quarantine/
        return "quarantine_failed_xbrl"


    def _run_xbrl_validation(**context) -> None:
        """Run xbrl_valid module and push status to XCom."""
        from modules.xbrl_valid import XbrlValidModule, XbrlValidationError

        ti = context["task_instance"]
        try:
            mod = XbrlValidModule()
            mod.run()
            ti.xcom_push(key="validation_status", value="PASS")
        except XbrlValidationError as exc:
            log.error("XBRL validation FAILED: %s", exc)
            ti.xcom_push(key="validation_status", value="FAIL")
            # Do not re-raise: the task must succeed so that the
            # BranchPythonOperator downstream can route to quarantine.


    run_xbrl_validation = PythonOperator(
        task_id="run_xbrl_validation",
        python_callable=_run_xbrl_validation,
    )

    branch_on_xbrl_validation = BranchPythonOperator(
        task_id="branch_on_xbrl_validation",
        python_callable=_xbrl_validation_branch,
    )

    # Full pipeline DAG chain at this point:
    (
        run_ingest
        >> run_quality_layer1
        >> run_dbt_staging
        >> run_dbt_intermediate
        >> run_dbt_mart
        >> run_quality_layer2
        >> run_catalog
        >> run_security
        >> run_xbrl_gen
        >> run_xbrl_validation        # ← Day 12
        >> branch_on_xbrl_validation
    )
    branch_on_xbrl_validation >> prepare_submission_package   # Day 13
    branch_on_xbrl_validation >> quarantine_failed_xbrl

11. What “Passes Validation” Actually Means to the Regulator
| Validation check passed | What it proves | Regulatory significance |
|---|---|---|
| All structural checks | XBRL file is technically well-formed and uses valid EBA concepts | NCA filing portal accepts the file for processing |
| Calculation linkbase | Every parent concept equals the sum of its children — no arithmetic gaps | Internal consistency: your own funds components add up to total own funds |
| Cross-template ratio rules (v0001, v4789) | Capital ratios in C 03.00 are arithmetically consistent with the capital components in C 01.00 and C 02.00 | Supervisory comparability: the ratios the ECB monitors are derived from the correct inputs |
| Regulatory floor rules (v8823, v6102) | Reported LCR ≥ 100%, leverage ratio ≥ 3% | Confirms the bank meets minimum requirements — or the submission accurately reports a breach that triggers supervisory action |
| Level 2 cap rule (v3301) | HQLA composition respects the Del. Reg. 2015/61 concentration limits | Ensures the liquidity buffer is genuine — not loaded with lower-quality assets |
Even if your organisation uses the XBRL output for internal management reporting rather than direct NCA submission (because you go through a third-party filing agent), run Arelle validation. The validation rules encode the EBA’s understanding of what the numbers mean. A rule failure is a data quality finding — the same one your filing agent will catch, but you will catch it three weeks earlier at zero cost.
📚 Day 12 Key Takeaways
- Two classes of failure — structural errors mean your generator is broken; business rule errors mean your mart data is wrong. They require fixes in completely different places.
- Cross-template rules are the hardest to debug — EBA.v0001 fails because C 03.00 and C 02.00 use slightly different RWA computations. The fix is to make C 03.00’s total_rwa a direct reference to C 02.00’s source model, never an independent calculation.
- The Golden Rule: never compute a parent independently of its children. Every summation parent in XBRL must be the arithmetic sum of its children in the same dbt model, not re-derived from a raw source.
- Rounding strategy must be consistent — choose one rounding mode (ROUND_HALF_UP) and apply it only once, at the terminal XBRL generation step. Never round in intermediate dbt models if the rounded value will be summed again.
- The validation report is audit evidence — a JSON file timestamped before submission, uploaded to MinIO, that proves EBA validation was run and passed. It is the technical counterpart to the GX data docs from Day 8.
- Warnings are not errors: xbrlCalcs:insignificantRounding warnings are acceptable when the difference is within ±0.5 of the reported unit. They do not cause NCA rejection.
- Next: Day 13 covers building the COREP submission package: bundling the XBRL instance, validation report, and lineage evidence into a submission-ready archive with a covering note.

