TRF Benchmark Dashboard

A daily-refreshed view of Total Reach & Frequency campaign benchmarks for the MSCI team. This is the operating model you're inheriting as the dashboard's owner.

For: Landis Current From: Sid Dani Date: 2026-05-20 Meeting: Wed 10:15 PT — Dash Follow Up

Why this document. This dashboard started as a side-quest — Sid built v1 (March snapshot) and the v2 daily-refresh pipeline because the data needed a home. You are the actual product owner. This brief is here so you can carry it forward without re-discovering what's already been decided.

1. What this is

The TRF Benchmark Dashboard answers questions like "how does this campaign's reach compare to other Auto / 30-day / Medium-sized campaigns?" for the MSCI measurement constituency.

Two versions exist in the world right now:

Version	URL	Data freshness	Notes
v1.5 (March snapshot)	`trf-benchmark.pages.dev`	2026-03-31, frozen	Static. Honest header text. What MSCI sees today.
v2 (dev pipeline)	`trf-benchmark-dev.pages.dev`	2026-05-07, 19-campaign sample	Live pipeline. Daily refresh built but first full-universe run blocked (see §8).

2. Data lifecycle

The whole loop runs once per day at 03:00 UTC, fully unattended. You don't trigger it manually under normal conditions.

┌──────────────────┐ 1. ASK ┌─────────────────┐ 2. STORE ┌──────────────────┐ │ Currency API │ ───────────▶ │ Cloud Run job │ ───────────▶ │ GCS audit │ │ (msci-mcp side) │ │ trf-ingestion │ │ bucket │ │ │ │ │ │ (raw JSON, 30d) │ │ ~6,874 TRF │ │ Runs daily at │ └──────────────────┘ │ oracles │ │ 03:00 UTC │ │ │ (campaigns) │ │ │ ▼ └──────────────────┘ │ 20 campaigns │ ┌──────────────────┐ │ in parallel │ 3. CLEAN │ BigQuery │ │ │ ───────────▶ │ trf_benchmarks │ │ 7 quality │ │ (3 tables) │ │ rules applied │ └──────────────────┘ │ │ │ │ │ ▼ │ 5-check │ 4. BUILD ┌──────────────────┐ │ validation │ ───────────▶ │ HTML dashboard │ │ gate │ │ (regenerated) │ │ │ └──────────────────┘ │ │ │ │ │ 5. PUBLISH ┌──────────────────┐ │ │ ───────────▶ │ Cloudflare │ └─────────────────┘ │ Pages │ └──────────────────┘

Each stage has its own guardrails. If anything fails, the dashboard stays at its previous state and the failure is logged in the audit trail (see §6).

3. Data sources — what's an oracle, what's a job

Thing	What it is	Where it lives
Oracle	A single TRF campaign (e.g. "Choice Hotels Comp Analysis")	Currency API — msci-mcp side
Job	One execution of a campaign's analytics. Each oracle has many jobs over time; we use the latest `COMPLETE` one.	Currency API
Metrics tables	16 per-job tables: `overall`, `daily`, `hml`, `bucket_freq`, `on_target`, `exp_overlap`, `universe`, plus 9 `platform_by_*` cuts (state, dma, gender, age, property, device, creative, tactic, lineitem)	Pulled by the daily job
Industry	A campaign's industry (Auto, Retail, Pharma…). Hand-classified — no live source. Currently covers ~12% of campaigns; the rest fall to "Uncategorized."	Joined from v1 BQ historical table at clean time

Discovery from M3: The Currency API exposes 6,874 TRF oracles with lastJobStatus=COMPLETE — 8× larger than the ~857 in the v1 xlsx-derived dataset. The xlsx was a hand-curated subset; the live API exposes the full universe.

4. Where everything is stored

Location	Purpose	Retention
`gs://ai-workflows-trf-audit-dev/pulls/YYYY-MM-DD/`	Raw API responses, one JSON file per oracle per day	30 days, auto-deleted
`ai-workflows-459123.trf_benchmarks_dev.fact_campaigns_raw`	Everything that came in, before quality filtering	Indefinite, day-partitioned
`ai-workflows-459123.trf_benchmarks_dev.fact_campaigns`	The clean published set after 7 quality rules	Indefinite, day-partitioned
`ai-workflows-459123.trf_benchmarks_dev.audit_log`	One row per daily run — when it ran, how many in/out, did validation pass	Indefinite
Cloudflare Pages `trf-benchmark-dev`	The actual HTML dashboard	Last few deploys kept; rollback in CF dashboard

5. What happens daily, automatically

Time (UTC)	What runs	Who triggers it
03:00	Cloud Scheduler fires `trf-ingestion-daily`	automatic
03:00–04:30	Cloud Run pulls ~6,874 campaigns, archives raw JSON to GCS, normalizes, writes clean BQ rows, regenerates HTML, deploys to Cloudflare Pages	automatic
~04:30	One `audit_log` row written: `status=success` or `status=failed` with reason	automatic
30 days later	GCS lifecycle deletes that day's `pulls/` folder	automatic

If the job fails, Cloud Run retries once automatically. If both attempts fail, the dashboard stays at its previous state and the audit_log records the failure for triage.

6. How you know it's healthy

Three checks, in increasing depth. The first one is sufficient 99% of the time.

Check 1 — Dashboard renders fresh data (10 seconds)

Visit trf-benchmark-dev.pages.dev and look at the header. It should say something like:

"N campaigns · Refreshed YYYY-MM-DD · v2 daily ingestion"

The date should be today or yesterday. If it's more than 2 days old, something stopped.

Check 2 — The audit log shows success (30 seconds)

In the Google Cloud console → BigQuery, run:

SELECT started_at, status, oracles_completed, records_ingested, validation_passed, failure_reason FROM `ai-workflows-459123.trf_benchmarks_dev.audit_log` ORDER BY started_at DESC LIMIT 5

You want to see the top row with status="success", validation_passed=true, failure_reason=null, and started_at being this morning.

Check 3 — The job execution is green (1 minute)

Google Cloud console → Cloud Run → Jobs → trf-ingestion. The Executions tab shows every run. The most recent one should be green ✅. Click in to see logs if red ❌.

7. The 7 quality rules + 5-check validation

Not every campaign that comes back from the API makes it to the dashboard. Two layers of guardrails:

Quality rules (per-campaign filtering)

is_no_data — API returned {"message": "Not enough data to generate this report"} → exclude from published set
is_test — campaign name matches test heuristics ("test", "demo", internal-test patterns) → exclude
is_unclassified — industry is null + name doesn't auto-resolve → flag
is_pending — always false (no live source for pending state)
industry_canonicalization — map raw industry strings to canonical taxonomy ("Pharma" → "Pharma & Healthcare")
has_digital — non-zero digital reach metric
with_industry — has a non-null canonical industry

5-check validation gate (per-run sanity)

gcs_archives_exist — every published campaign has a raw JSON in GCS
dashboard_matches_bq — rendered HTML campaign count matches fact_campaigns row count
no_duplicate_oracle_job_in_raw — no duplicate (oracle_id, job_id) pairs in fact_campaigns_raw
no_null_last_run_in_fact_campaigns — every published campaign has a real last-run timestamp
no_oracle_drift_vs_api — the set of oracles in BQ matches the set the API said it had today

If any check fails, the deploy is aborted and the failure is logged in audit_log.failure_reason. The dashboard stays at the previous version.

8. Things that can go wrong

Symptom	What it means	First-aid
Dashboard date is >2 days old	Cron didn't run successfully	Check `audit_log` → Cloud Run Executions → root-cause from logs
`audit_log.failure_reason` mentions "Auth0 token request failed"	Currency API credentials rotated	Ping Sid to refresh AUTH0_* secrets
`audit_log.failure_reason` mentions "validation check X failed"	Data quality changed upstream	Capture run_id, ping Sid — needs investigation
`audit_log.failure_reason` mentions "cloudflare deploy failed: 401"	CF API token rotated/revoked	Ping Sid to regenerate `cloudflare-api-token` secret
Container terminated on signal 9	Out-of-memory kill	This is the current open issue — see callout below

Known open issue (as of 2026-05-20): The first full-universe production run OOM-killed at ~3,800 of 6,874 oracles. The pull phase accumulates raw API responses in memory; ~3.4 GB working set exceeded the 2 GiB container cap. The Cloud Scheduler is currently paused so tomorrow's 03:00 UTC cron doesn't fire and OOM unattended. Three recovery paths are on the table (bump memory, limit scope, or refactor to stream) — to be decided in this meeting or shortly after.

9. What you own going forward

Operation	Who	How often
Verify daily run succeeded	you (any team member)	Once weekly; ad-hoc when MSCI flags stale data
Decide what changes to v2 architecture	you	As needed
Approve feature additions (Edits, Downloadables, etc.)	you	As needed
Triage failed runs with engineering	you + Sid initially, then your engineering partner long-term	When `audit_log` shows failed
Rotate secrets (Auth0, Cloudflare)	Sid for now; engineering team long-term	Yearly or when rotated
Add new industry to canonicalization map	Engineering (small PR to `normalize.py`)	When MSCI taxonomy adds a category
Stakeholder comms / MSCI questions	you	As they arise

What you can do today without engineering help: All the read-only health checks in §6. Open Google Cloud console, view BigQuery, view Cloud Run executions, view Cloud Scheduler. You don't need a terminal or code editor.

10. Escalation

Needs code change / GCP config / secret rotation: Sid Dani — sid.dani@samba.tv
Data correctness ("the metric is wrong"): Sid first, who'll route to MSCI (Alyson Sprague / Ankitha Ananda) or Currency API team if upstream
Dashboard access / SSO: Sid (Cloudflare Access policy pending — ATF-458)
Architecture decisions about the dashboard itself: you — that's the handoff