The pipeline is complete. The data flows from source CSVs through dbt transformations, quality gates, XBRL generation, Arelle validation, and into a timestamped submission package. Fourteen days of engineering now produce validated COREP numbers every month — automatically.
But the COREP submission is not the only consumer of these numbers. The CFO needs to see whether the CET1 ratio is trending toward the minimum requirement. The risk team needs to see which exposure classes are driving RWA growth. The data governance team needs to know whether every pipeline run has passed quality gates, whether any XBRL validation errors have appeared in the last quarter, and whether the submission packages have been generated on schedule.
These are two different audiences with two different questions, but both are answered by data that already exists in your mart tables and audit log. The answer to both is Apache Superset, pointed at Trino, with two dashboards — one for capital management, one for governance operations.
This post connects Superset to Trino, builds every chart from SQL, assembles both dashboards, and shows you how to export the dashboard definitions as JSON so they can be version-controlled and reproduced in any environment.
1. Why Superset Connects to Trino — Not Directly to PostgreSQL
On Day 10 we established that Apache Ranger enforces RBAC at the Trino layer. Column masking (hashing PII), row filters (restricting to allowed legal entities), and access controls are all applied when a query passes through Trino. If Superset connected directly to PostgreSQL, it would bypass Ranger entirely — a dashboard user could query raw.counterparties.name and see unhashed LEI codes. Connecting Superset to Trino means every query — whether from a Python module, a data engineer’s notebook, or an executive’s dashboard — goes through the same enforcement point.
| Superset → PostgreSQL (direct) | Superset → Trino → PostgreSQL (this pipeline) |
|---|---|
| Bypasses Ranger — no column masking | All queries masked and filtered by Ranger |
| Dashboard user sees raw PII columns | PII columns hashed transparently — dashboard user sees HASH(lei) |
| No row-level filtering by legal entity | Row filter: entity_lei = current_user_lei() applied automatically |
| Cannot query Iceberg tables (Nessie catalog) | Trino federated: PostgreSQL + Iceberg in one FROM clause |
| Postgres-specific SQL dialect | ANSI SQL — portable across catalog backends |
2. Adding the Trino Database Connection
2.1 Via the Superset UI
# Step 1: Open http://localhost:8088 # Step 2: Settings → Database Connections → + Database # Step 3: Select "Trino" from the supported databases list # Step 4: Enter the SQLAlchemy URI: trino://corep_reporting@trino:8080/postgresql # URI breakdown: # trino:// — driver prefix (pip install trino sqlalchemy-trino) # corep_reporting — Trino user (mapped to Ranger role in ranger-trino-security.xml) # @trino:8080 — Trino coordinator (Docker service name + port) # /postgresql — Trino catalog name (from docker/trino/catalog/postgresql.properties) # Step 5: Advanced → Other → Extra JSON config: { "engine_params": { "connect_args": { "http_scheme": "http", "verify": false, "session_properties": { "query_max_run_time": "5m", "distributed_sort": true } } }, "metadata_params": {}, "metadata_cache_timeout": {"schema_cache_timeout": 3600} } # Step 6: Click "Test Connection" — should return "Connection looks good!" # Step 7: Save with display name: "COREP Trino"
2.2 Via the Superset CLI (for reproducible setup)
# superset/setup_connections.py — run once inside the Superset container # docker exec corep-superset python /app/superset/setup_connections.py import json from superset import app, db from superset.models.core import Database with app.app_context(): existing = db.session.query(Database).filter_by(database_name="COREP Trino").first() if existing: print("Connection already exists — skipping.") else: conn = Database( database_name="COREP Trino", sqlalchemy_uri="trino://corep_reporting@trino:8080/postgresql", expose_in_sqllab=True, allow_run_async=True, allow_dml=False, # READ-ONLY — dashboards never write allow_csv_upload=False, extra=json.dumps({ "engine_params": { "connect_args": { "http_scheme": "http", "verify": False, } }, "metadata_cache_timeout": {"schema_cache_timeout": 3600}, }), ) db.session.add(conn) db.session.commit() print("COREP Trino connection created.")
After adding the connection, go to SQL Lab → SQL Editor, select “COREP Trino” as the database and “mart” as the schema. Superset calls Trino’s SHOW TABLES IN postgresql.mart internally. You may need to explicitly select the catalog + schema as postgresql.mart in the schema dropdown — Trino exposes PostgreSQL schemas under the catalog name. Once discovered, the four mart tables will appear in the left panel of SQL Lab for drag-and-drop query building.
3. Virtual Datasets — SQL Queries That Power Every Chart
Superset’s “virtual dataset” feature lets you define a named SQL query that acts as a table for chart building. This is the correct approach for regulatory dashboards because it lets you encode the business logic — period filtering, ratio calculations, unit conversions — in one place rather than repeating it in every chart.
Dataset 1: Capital Ratios (C 03.00) — for trend lines and gauges
-- Superset Virtual Dataset: "ds_capital_ratios" -- Source: postgresql.mart.corep_c0300 -- Unit: ratios stored as decimals (0.1125 = 11.25%); multiply × 100 for display SELECT reporting_date, ROUND(cet1_ratio * 100, 2) AS cet1_ratio_pct, ROUND(tier1_ratio * 100, 2) AS tier1_ratio_pct, ROUND(total_cap_ratio * 100, 2) AS total_cap_ratio_pct, ROUND(leverage_ratio * 100, 2) AS leverage_ratio_pct, -- Regulatory minimums as reference lines 4.5 AS cet1_minimum, 6.0 AS tier1_minimum, 8.0 AS total_cap_minimum, 3.0 AS leverage_minimum, -- Capital conservation buffer (CRR2 Art. 128 — 2.5% above minimum) 7.0 AS cet1_conservation_buffer, -- Status flags for conditional formatting CASE WHEN cet1_ratio * 100 >= 7.0 THEN 'ABOVE_BUFFER' WHEN cet1_ratio * 100 >= 4.5 THEN 'ABOVE_MINIMUM' ELSE 'BELOW_MINIMUM' END AS cet1_status, total_rwa AS total_rwa_eur, ROUND(total_rwa / 1000000000, 2) AS total_rwa_bn_eur FROM postgresql.mart.corep_c0300 ORDER BY reporting_date
Dataset 2: Capital Components (C 01.00) — for bar charts and breakdowns
-- Superset Virtual Dataset: "ds_capital_components" -- Source: postgresql.mart.corep_c0100 -- Values in EUR billions for readability SELECT reporting_date, ROUND(own_funds / 1000000000, 3) AS own_funds_bn, ROUND(cet1_capital / 1000000000, 3) AS cet1_capital_bn, ROUND(at1_capital / 1000000000, 3) AS at1_capital_bn, ROUND(t2_capital / 1000000000, 3) AS t2_capital_bn, -- CET1 as % of own funds (capital quality indicator) ROUND(cet1_capital * 100.0 / NULLIF(own_funds, 0), 1) AS cet1_quality_pct FROM postgresql.mart.corep_c0100 ORDER BY reporting_date
Dataset 3: RWA by Exposure Class (C 02.00) — for treemap
-- Superset Virtual Dataset: "ds_rwa_exposure" -- Source: postgresql.mart.corep_c0200 -- Excludes the TOTAL row — Superset computes totals in-chart SELECT reporting_date, exposure_class, ROUND(rwa / 1000000000, 3) AS rwa_bn, ROUND(exposure_value / 1000000000, 3) AS exposure_bn, -- Risk density: RWA / Exposure (a key supervisory metric) ROUND(rwa * 100.0 / NULLIF(exposure_value, 0), 1) AS risk_density_pct FROM postgresql.mart.corep_c0200 WHERE exposure_class != 'TOTAL' ORDER BY reporting_date, rwa DESC
Dataset 4: LCR Components (C 47.00) — for gauge and waterfall
-- Superset Virtual Dataset: "ds_lcr" -- Source: postgresql.mart.corep_c4700 SELECT reporting_date, ROUND(lcr_ratio * 100, 2) AS lcr_ratio_pct, ROUND(hqla_buffer / 1000000000, 3) AS hqla_buffer_bn, ROUND(hqla_level1 / 1000000000, 3) AS hqla_level1_bn, ROUND(hqla_level2a / 1000000000, 3) AS hqla_level2a_bn, ROUND(hqla_level2b / 1000000000, 3) AS hqla_level2b_bn, ROUND(net_outflows / 1000000000, 3) AS net_outflows_bn, -- LCR surplus = how far above 100% we are ROUND((lcr_ratio - 1.0) * 100, 2) AS lcr_surplus_pct, CASE WHEN lcr_ratio >= 1.20 THEN 'COMFORTABLE' WHEN lcr_ratio >= 1.00 THEN 'COMPLIANT' ELSE 'BREACH' END AS lcr_status FROM postgresql.mart.corep_c4700 ORDER BY reporting_date
Dataset 5: Governance Audit Log — for pipeline health dashboard
-- Superset Virtual Dataset: "ds_governance_audit" -- Source: postgresql.audit.pipeline_run_log -- Note: Trino accesses the "audit" schema via the postgresql catalog SELECT module_name, status, ran_at, DATE(ran_at) AS run_date, CAST(metadata->>>'reporting_date' AS VARCHAR) AS reporting_date, CAST(metadata->>>'error_count' AS INTEGER) AS error_count, CAST(metadata->>>'warn_count' AS INTEGER) AS warn_count, run_id, CASE WHEN status = 'PASS' THEN 1 ELSE 0 END AS pass_flag, CASE WHEN status = 'FAIL' THEN 1 ELSE 0 END AS fail_flag FROM postgresql.audit.pipeline_run_log ORDER BY ran_at DESC
4. Dashboard 1 — COREP Capital & Liquidity
Here is what the completed capital dashboard looks like. The layout follows a standard regulatory management information (MI) report structure: headline ratios at top, trend over time in the middle, breakdown at the bottom.
18.2bn
12.8bn
8.1bn
4.2bn
4.1 Building the CET1 Ratio Trend Line Chart
# Superset chart configuration (JSON — paste into the "Advanced" chart editor) # Chart type: Line Chart # Dataset: ds_capital_ratios { "viz_type": "echarts_timeseries_line", "datasource": "ds_capital_ratios", "time_column": "reporting_date", "metrics": [ {"expressionType": "SIMPLE", "column": {"column_name": "cet1_ratio_pct"}, "aggregate": "MAX", "label": "CET1 Ratio %"}, {"expressionType": "SIMPLE", "column": {"column_name": "tier1_ratio_pct"}, "aggregate": "MAX", "label": "Tier 1 Ratio %"}, {"expressionType": "SIMPLE", "column": {"column_name": "cet1_minimum"}, "aggregate": "MAX", "label": "CET1 Minimum (4.5%)"}, {"expressionType": "SIMPLE", "column": {"column_name": "cet1_conservation_buffer"}, "aggregate": "MAX", "label": "CET1 Buffer (7.0%)"} ], "color_scheme": "airbnb", "extra_form_data": {}, "rich_tooltip": true, "show_legend": true, "legendType": "scroll", "legendOrientation": "top", "y_axis_format": ",.1f", "y_axis_title": "Ratio (%)", "seriesOverrides": { "CET1 Minimum (4.5%)": {"lineStyle": {"type": "dashed"}, "itemStyle": {"color": "#e53935"}}, "CET1 Buffer (7.0%)": {"lineStyle": {"type": "dashed"}, "itemStyle": {"color": "#f9a825"}} } }
4.2 Building the RWA Treemap
# Chart type: Treemap (ECharts) # Dataset: ds_rwa_exposure # Filter: reporting_date = latest available (use a native filter at dashboard level) { "viz_type": "treemap_v2", "datasource": "ds_rwa_exposure", "groupby": ["exposure_class"], "metric": { "expressionType": "SIMPLE", "column": {"column_name": "rwa_bn"}, "aggregate": "SUM", "label": "RWA (€bn)" }, "labelType": "key_value", "color_scheme": "blue_to_white", "show_labels": true, "number_format": ",.2f" } # Add a second metric for tooltip: risk_density_pct # This shows the risk density on hover — essential for risk managers # who need to understand not just the SIZE of each exposure class # but how capital-intensive it is relative to gross exposure.
4.3 Building the LCR Gauge
# Chart type: Gauge Chart (ECharts) # Dataset: ds_lcr # Shows LCR as a gauge from 0-200%, with colour bands: # 0-100%: red (breach), 100-120%: yellow (compliant, low headroom), 120%+: green { "viz_type": "gauge_chart", "datasource": "ds_lcr", "metric": { "expressionType": "SIMPLE", "column": {"column_name": "lcr_ratio_pct"}, "aggregate": "MAX", "label": "LCR (%)" }, "min_val": 0, "max_val": 200, "start_angle": 225, "end_angle": -45, "color_range": [ {"gt": 0, "lte": 100, "color": "#e53935"}, {"gt": 100, "lte": 120, "color": "#f9a825"}, {"gt": 120, "lte": 200, "color": "#34a853"} ], "number_format": ",.1f", "show_pointer": true, "show_split_line": true }
5. Dashboard 2 — Governance Audit & Pipeline Health
The second dashboard is for the data governance team. It answers “is the pipeline running correctly, are all quality gates passing, and are submissions being produced on schedule?” This is the operational view of the governance infrastructure — as important to a Chief Data Officer as the capital ratios are to the CFO.
| Reporting Date | Module | Status | XBRL Errors | Ran At |
|---|---|---|---|---|
| 2026-03-31 | pipeline_complete | PASS | 0 | 2026-03-31 01:19:03 |
| 2026-02-28 | pipeline_complete | PASS | 0 | 2026-02-28 01:17:44 |
| 2026-01-31 | pipeline_complete | PASS | 0 | 2026-01-31 01:21:09 |
5.1 The Pipeline Run Log Chart SQL
-- Chart: Pipeline Run Log Table -- Chart type: Table (with conditional row colouring) -- Dataset: ds_governance_audit -- This is the SQL that drives the audit run history table SELECT reporting_date, module_name, status, error_count, warn_count, TO_CHAR(ran_at, 'YYYY-MM-DD HH24:MI:SS') AS ran_at_fmt, EXTRACT(EPOCH FROM (ran_at - LAG(ran_at) OVER ( PARTITION BY reporting_date ORDER BY ran_at ))) / 60 AS minutes_since_prev_module FROM postgresql.audit.pipeline_run_log WHERE module_name = 'pipeline_complete' ORDER BY ran_at DESC LIMIT 12; -- Superset conditional formatting (Table chart settings): -- If status = 'PASS' → row background: rgba(52, 168, 83, 0.15) [green tint] -- If status = 'FAIL' → row background: rgba(229, 57, 53, 0.15) [red tint] -- Apply via: Chart → Customize → Conditional Formatting → Add Rule
5.2 The Module Pass Rate Heatmap SQL
-- Chart: Module Pass Rate Heatmap -- Chart type: Heatmap -- Dataset: ds_governance_audit SELECT TO_CHAR(ran_at, 'YYYY-MM') AS run_month, module_name, ROUND( SUM(pass_flag) * 100.0 / COUNT(*), 0 ) AS pass_rate_pct FROM postgresql.audit.pipeline_run_log WHERE module_name NOT IN ('pipeline_complete', 'quarantine_raw', 'quarantine_mart', 'quarantine_xbrl') AND ran_at >= NOW() - INTERVAL '90 days' GROUP BY 1, 2 ORDER BY run_month, module_name; -- Heatmap config: -- X-axis: run_month Y-axis: module_name Value: pass_rate_pct -- Color scale: 0% = #e53935 (red) → 100% = #34a853 (green) -- Linear gradient scale: 0→50→100
6. Native Filters — Making Both Dashboards Interactive
Superset’s Native Filters allow a single date selector or dropdown to update every chart on the dashboard simultaneously. For a regulatory dashboard, you need two filters: a reporting date selector and a period type filter (monthly vs quarterly).
# Dashboard → Filters → + Add/Edit Filters # Configure these two filters to apply to ALL charts on the dashboard: # Filter 1: Reporting Date Range Name: "Reporting Period" Filter type: Time range Default value: Last quarter Target: All datasets (applies to both capital and audit datasets) # Filter 2: Reporting Date Point (for KPI scorecards — show single period) Name: "As-of Date" Filter type: Value (single select) Column: reporting_date Default: MAX(reporting_date) # shows latest by default Target: ds_capital_ratios, ds_capital_components, ds_rwa_exposure, ds_lcr # The KPI scorecard Big Number charts use the "As-of Date" filter # so they always show a single value — the most recent reporting date # (or whichever date the user selects). # The trend line chart uses the "Reporting Period" filter # so it shows the full 12-month window. # This is the correct pattern for regulatory MI: # - Headline scorecards: point-in-time value for the selected reporting date # - Trend lines: rolling 12-month history # - Breakdown charts (treemap, stacked bar): point-in-time composition
7. Exporting Dashboard Definitions to JSON
Superset dashboards are exportable as a ZIP archive containing JSON definitions for charts, datasets, and the dashboard layout. This means your dashboards can be version-controlled in Git and reproduced in any environment by importing the ZIP.
# Export via Superset CLI inside the container: docker exec corep-superset superset export-dashboards \ --dashboard-ids 1,2 \ --output /app/superset/exports/corep_dashboards_$(date +%Y%m%d).zip # Copy the export to your project's version-controlled directory: docker cp corep-superset:/app/superset/exports/corep_dashboards_20260331.zip \ superset/dashboard_exports/ # Import in a new environment: docker exec corep-superset superset import-dashboards \ --path /app/superset/exports/corep_dashboards_20260331.zip # Via Superset API (for programmatic use in CI/CD): curl -X POST \ -H "Authorization: Bearer ${SUPERSET_TOKEN}" \ -F "formData=@superset/dashboard_exports/corep_dashboards_20260331.zip" \ "http://localhost:8088/api/v1/assets/import/"
# superset/create_datasets.py — programmatic dataset creation via Superset API # Run this during environment bootstrap to avoid manual UI steps import json, os, urllib.request, urllib.parse SUPERSET_URL = os.environ.get("SUPERSET_URL", "http://localhost:8088") ADMIN_USER = os.environ.get("SUPERSET_ADMIN_USER", "admin") ADMIN_PASS = os.environ.get("SUPERSET_ADMIN_PASS", "admin") VIRTUAL_DATASETS = [ { "dataset_name": "ds_capital_ratios", "sql": """ SELECT reporting_date, ROUND(cet1_ratio * 100, 2) AS cet1_ratio_pct, ROUND(tier1_ratio * 100, 2) AS tier1_ratio_pct, ROUND(total_cap_ratio * 100, 2) AS total_cap_ratio_pct, ROUND(leverage_ratio * 100, 2) AS leverage_ratio_pct, 4.5 AS cet1_minimum, 7.0 AS cet1_conservation_buffer, total_rwa FROM postgresql.mart.corep_c0300 ORDER BY reporting_date """, }, { "dataset_name": "ds_governance_audit", "sql": """ SELECT module_name, status, ran_at, CAST(metadata->>'reporting_date' AS VARCHAR) AS reporting_date, CAST(metadata->>'error_count' AS INTEGER) AS error_count, CASE WHEN status = 'PASS' THEN 1 ELSE 0 END AS pass_flag FROM postgresql.audit.pipeline_run_log ORDER BY ran_at DESC """, }, ] def get_token() -> str: data = json.dumps({ "username": ADMIN_USER, "password": ADMIN_PASS, "provider": "db", "refresh": True, }).encode() req = urllib.request.Request( f"{SUPERSET_URL}/api/v1/security/login", data=data, headers={"Content-Type": "application/json"}, ) resp = json.loads(urllib.request.urlopen(req, timeout=15).read()) return resp["access_token"] def get_db_id(token: str, db_name: str = "COREP Trino") -> int: req = urllib.request.Request( f"{SUPERSET_URL}/api/v1/database/?q=(filters:!((col:database_name,opr:eq,val:'{db_name}')))", headers={"Authorization": f"Bearer {token}"}, ) resp = json.loads(urllib.request.urlopen(req, timeout=15).read()) return resp["result"][0]["id"] def create_virtual_dataset(token: str, db_id: int, name: str, sql: str) -> None: payload = json.dumps({ "database": db_id, "schema": "mart", "sql": sql.strip(), "table_name": name, }).encode() req = urllib.request.Request( f"{SUPERSET_URL}/api/v1/dataset/", data=payload, headers={ "Authorization": f"Bearer {token}", "Content-Type": "application/json", }, ) urllib.request.urlopen(req, timeout=15) print(f"Created dataset: {name}") if __name__ == "__main__": token = get_token() db_id = get_db_id(token) for ds in VIRTUAL_DATASETS: create_virtual_dataset(token, db_id, ds["dataset_name"], ds["sql"])
8. Row-Level Security in Superset — Ensuring Dashboard Users Only See Their Data
Superset has its own Row Level Security (RLS) feature, but in this pipeline it is redundant with Ranger. Since all Superset queries go through Trino and Ranger’s row filter policies apply automatically (e.g. entity_lei = current_user_lei()), the data scoping happens at the query engine layer. You do not need to duplicate the logic in Superset RLS. The correct approach is to leave Superset RLS disabled and rely entirely on Ranger — this ensures a user who bypasses Superset and queries Trino directly also sees only their permitted data.
| Superset role | Trino user mapped to | Ranger role | Dashboard access |
|---|---|---|---|
corep_exec | corep_reporting | corep_reporting | Capital & Liquidity dashboard — read-only, all mart tables |
corep_risk | risk_analyst | risk_analyst | Capital & Liquidity dashboard — mart views only, no raw access |
corep_governance | data_engineer | data_engineer | Both dashboards — capital + governance audit log |
Admin | corep_admin | All roles | Full access — SQL Lab, dashboard editing, dataset management |
📚 Day 15 Key Takeaways
- Superset connects to Trino only — never directly to PostgreSQL. This single decision ensures that Ranger RBAC, column masking, and row filters apply to every dashboard query. A report that bypasses Ranger is not a governed report, regardless of how good the charts look.
- Virtual datasets are the right abstraction for regulatory dashboards. Encode the business logic — unit conversions (÷1,000,000,000), percentage formatting (×100), status flags, regulatory minimum reference values — in the dataset SQL once. Every chart reuses it without duplication.
- Two dashboards serve two different audiences. The capital dashboard (CET1 trend, RWA treemap, LCR gauge) is for risk and finance. The governance audit dashboard (pipeline run log, module pass rate heatmap, XBRL error count) is for the data governance team. Both use the same infrastructure but answer completely different questions.
- Native Filters with a time range + point-in-time pair is the correct pattern for regulatory MI: scorecard KPIs show the latest value; trend lines show rolling history. Both update together from a single date selector.
- Export dashboards as ZIP and commit to Git. Dashboard definitions are infrastructure — version-controlled, reviewable, reproducible in any environment via
superset import-dashboards. Never treat dashboards as manual UI artefacts. - The governance audit dashboard closes the loop on BCBS 239 P11 (Distribution). The data governance team can see at a glance that every pipeline run passed quality gates, that no XBRL validation errors occurred, and that submission packages were produced on schedule — without querying any database directly.
- Next: Day 16 — Running the complete 18-day pipeline end-to-end: from a cold Docker stack to a validated XBRL submission package, checking every artefact in MinIO, and reading the final audit log.

