ATRSA Production-Readiness Audit

Full multi-reviewer convergence audit · mvp_v1 @ 1dbe5e9 · 2026-05-20 · v2

Reviewer Panel

#	Reviewer	Method	Verdict
1	Codex GPT-5.5	R1: 110 findings, 6 categories	NO-GO
2	Grok 4.3	R1: 5 blockers + 5 systemic risks → R3 binding	GO-WITH-CONDITIONS (R1) → NO-GO (R3)
3	Claude main session	R1: manual security red-team	GO-WITH-CONDITIONS
4	code-quality agent	arch + type safety + schema	defects
5	test-coverage agent	80-file suite vs 598-file codebase	defects
6	deploy-infra agent	Dockerfile + fly + Sentry + DB	defects
7	Gemini 2.5-pro	—	BLOCKED (monthly cap)
8	feature-readiness agent	FEATURES_STATUS.md vs reality (retry)	defects
9	/red-team Agent 1	Input validation & injection	5 critical / 7 high
10	/red-team Agent 2	Process & resource safety	5 critical / 7 high
11	/red-team Agent 3	Trust boundary & identity (6 verified + 19 new)	defects
12	/red-team Agent 4	Dependency & supply chain	3 critical / 8 high
13	deepseek-r1:32b (local)	Replaced quota-blocked Gemini	NO-GO
—	R3 Codex co-CTO binding	All 21 blockers bound	NO-GO
—	R3 Grok co-CTO binding	All 21 blockers bound	NO-GO

The 21 Bound Blockers

Each blocker was bound independently by both co-CTOs. Where they disagreed on severity, the higher severity was taken (the most conservative rule for a regulated financial product). All file:line citations are source-verified.

CRITICAL (12)

B1 · API v2 transfer IDOR — no owner filter

Codex: CRITICAL · Grok: CRITICAL · BIND

app/api/v2/transfers/[id]/route.ts:62-63, 186-188 — findById(id, environment) with no owner check. Any API key with transfers:read can read any transfer in the same environment; transfers:write can accept/reject/retry any transfer.
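
A minimal sketch of the owner-scoped lookup this finding calls for. It assumes an `ownerId` column that does not exist in the audited schema (that gap is B10); `TransferRepo`, `ApiKeyContext` and `findOwnedById` are hypothetical names, and an in-memory array stands in for the database.

```typescript
// Hypothetical shapes: `ownerId` is NOT in the audited schema (that gap is B10).
type Environment = "test" | "live";

interface Transfer {
  id: string;
  environment: Environment;
  ownerId: string;
}

interface ApiKeyContext {
  ownerId: string;
  environment: Environment;
}

// In-memory stand-in for the repository so the sketch runs without a database.
class TransferRepo {
  private readonly rows: Transfer[];

  constructor(rows: Transfer[]) {
    this.rows = rows;
  }

  // Owner and environment are both part of the lookup predicate, so another
  // owner's id is indistinguishable from a missing id (the caller returns 404).
  findOwnedById(id: string, ctx: ApiKeyContext): Transfer | null {
    const hit = this.rows.find(
      (t) =>
        t.id === id &&
        t.environment === ctx.environment &&
        t.ownerId === ctx.ownerId,
    );
    return hit ?? null;
  }
}
```

Returning null for a foreign id, rather than a distinct "forbidden" result, avoids confirming to the caller that the id exists.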

B2 · Customer erasure uses transfers:write scope, ignores LegalHold

Codex: CRITICAL · Grok: CRITICAL · BIND

app/api/v2/customers/[id]/erase/route.ts:83-93, 123 — irreversible KYC destruction authorized by transfers:write; LegalHold model never consulted. grep confirms zero LegalHold references in the entire file.
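
The two missing checks can be sketched as a pure authorization function. The `customers:erase` scope name comes from the Sprint 1 roadmap; the `LegalHold` fields (`customerId`, `releasedAt`), the status codes and `authorizeErase` itself are assumptions for illustration, not the audited model.

```typescript
// Hypothetical LegalHold shape; the real Prisma model may differ.
interface LegalHold {
  customerId: string;
  releasedAt: Date | null; // null means the hold is still active
}

type EraseDecision =
  | { allowed: true }
  | { allowed: false; status: 403 | 409; reason: string };

const ERASE_SCOPE = "customers:erase";

function authorizeErase(
  customerId: string,
  scopes: readonly string[],
  holds: readonly LegalHold[],
): EraseDecision {
  // A dedicated scope: transfers:write alone must never authorize erasure.
  if (!scopes.includes(ERASE_SCOPE)) {
    return { allowed: false, status: 403, reason: "missing customers:erase scope" };
  }
  // Any unreleased hold on this customer blocks the irreversible operation.
  const held = holds.some(
    (h) => h.customerId === customerId && h.releasedAt === null,
  );
  if (held) {
    return { allowed: false, status: 409, reason: "customer is under legal hold" };
  }
  return { allowed: true };
}
```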

B3 · Veriscope inbound — unsigned, state-mutating, no body cap, zip-bomb DoS

Codex: CRITICAL · Grok: CRITICAL · BIND

app/api/system/trp/veriscope/incoming/route.ts:391-421 mutates KYC templates without sender verification; middleware.ts:96-97 + csrf.ts:151-153 bypass both auth and CSRF; attestation-decoder.ts uses unbounded inflateSync on attacker payload (zip-bomb DoS with worker concurrency-1 → entire compliance pipeline stalls).
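
A sketch of the two input-side guards, using only Node built-ins. The HMAC-SHA256 signature is an assumption made for illustration; the scheme Veriscope peers actually use may differ (for example an asymmetric signature). The size limits and the names `verifySignature` and `safeInflate` are likewise hypothetical. `maxOutputLength` is a standard Node zlib option that makes the convenience methods throw once the output would exceed the cap.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";
import { deflateSync, inflateSync } from "node:zlib";

// Illustrative limits, not values taken from the audited code.
const MAX_BODY_BYTES = 256 * 1024;
const MAX_INFLATED_BYTES = 1024 * 1024;

// Assumed scheme: hex-encoded HMAC-SHA256 over the raw request body.
function verifySignature(rawBody: Buffer, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const given = Buffer.from(signatureHex, "hex");
  // Length check first: timingSafeEqual throws on unequal lengths.
  return given.length === expected.length && timingSafeEqual(given, expected);
}

function safeInflate(compressed: Buffer): Buffer {
  if (compressed.length > MAX_BODY_BYTES) {
    throw new Error("compressed payload exceeds body cap");
  }
  // zlib throws instead of growing the output past the cap.
  return inflateSync(compressed, { maxOutputLength: MAX_INFLATED_BYTES });
}

// Usage: a small payload round-trips through deflate and the capped inflate.
const roundTrip = safeInflate(deflateSync(Buffer.from("hello")));
console.log(roundTrip.toString("utf8")); // prints "hello"
```

Signature verification has to run on the raw bytes before any decoding, so that an unauthenticated sender never reaches the decompressor.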

B5 · Compliance checks run AFTER IVMS exchange (regulatory sequencing failure)

Codex: CRITICAL · Grok: HIGH → bound CRITICAL · BIND

app/lib/jobs/transfer-queue.ts:211-213 — the engineer's own comment: "Compliance checks do NOT run here. They run after both IVMS are exchanged." Originator PII transmits to counterparty BEFORE sanctions screening. If a transfer would be sanctions-blocked, the disclosure already happened.
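
The required ordering can be sketched as a gate in front of the IVMS send. `releaseIvms`, `Screening` and the injected `screen` and `send` callbacks are hypothetical; they stand in for the real sanctions-screening and provider-dispatch code, which this sketch does not model.

```typescript
type Screening = "clear" | "blocked";

interface OutgoingTransfer {
  id: string;
  originatorName: string; // stands in for the originator IVMS payload
}

// `screen` and `send` are injected so the ordering is the only thing shown.
function releaseIvms(
  transfer: OutgoingTransfer,
  screen: (t: OutgoingTransfer) => Screening,
  send: (t: OutgoingTransfer) => void,
): "sent" | "held" {
  // Screening runs first; a non-clear result returns before any PII leaves.
  if (screen(transfer) !== "clear") {
    return "held";
  }
  send(transfer);
  return "sent";
}
```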

B7 · Middleware admin gate dead code + withRBAC fails open + admin pages no RBAC

Codex: CRITICAL · Grok: HIGH → bound CRITICAL · BIND

Triple failure: (a) middleware.ts:184-193 gates non-existent /api/v1/*, (b) rbac.ts:91-95 returns allowed:true for unmapped paths (17 routes in ROUTE_PERMISSIONS vs 22 admin sections + 17 API routes), (c) admin server components call only requireAuth(). Net effect: a viewer-role user can read every admin page.
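
A fail-closed default for part (b) might look like the sketch below. The two route entries and the `analyst` role are hypothetical; only the `viewer` role and the `ROUTE_PERMISSIONS` name come from the findings above.

```typescript
type Role = "viewer" | "analyst" | "admin";

// Hypothetical entries; the real ROUTE_PERMISSIONS table lives in rbac.ts.
const ROUTE_PERMISSIONS = new Map<string, readonly Role[]>([
  ["/admin/users", ["admin"]],
  ["/admin/transfers", ["admin", "analyst"]],
]);

interface AccessDecision {
  allowed: boolean;
  reason: string;
}

function checkAccess(path: string, role: Role): AccessDecision {
  const permitted = ROUTE_PERMISSIONS.get(path);
  // Fail closed: a route nobody registered is denied, never allowed.
  if (permitted === undefined) {
    return { allowed: false, reason: `no permission mapping for ${path}` };
  }
  if (!permitted.includes(role)) {
    return { allowed: false, reason: `role ${role} not permitted on ${path}` };
  }
  return { allowed: true, reason: "role permitted" };
}
```

With this default, forgetting to register a new admin route produces a visible denial instead of silent exposure.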

B10 · NO tenant/ownership model — STRUCTURAL ★ UNANIMOUS #1

Codex: CRITICAL · Grok: CRITICAL · BIND · both co-CTOs' top priority

prisma/schema.prisma — TravelRuleTransfer and Customer have NO userId ownership field. ATRSA's authorization model assumes ONE trust domain per environment. This is the structural gap behind every IDOR finding. B1, B14, B2 cannot be fixed correctly without B10 first.

B11 · AuditLog has no hash chain + v2 routes skip AuditLog writes

Codex: CRITICAL · Grok: HIGH → bound CRITICAL · BIND

prisma/schema.prisma:194-211 — no prevHash / recordHash / signature columns. AUDIT_SIGNING_KEY + AUDIT_PUBLIC_KEY declared in env but never used (grep: 0 hits outside env.ts). /api/v2/** — 10 of 10 v2 routes write only Activity, not AuditLog. FEATURES_STATUS.md falsely claims "Immutable Audit Trail — Hash chain, content hashing, auto-lock" + "Digital Signatures — RSA-SHA256 signed audit exports."
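
For reference, a hash chain over audit records can be sketched in a few lines. The `prevHash` and `recordHash` names mirror the missing columns named above; the record fields, the JSON encoding and the all-zero genesis value are assumptions, and signing with AUDIT_SIGNING_KEY is deliberately left out.

```typescript
import { createHash } from "node:crypto";

interface AuditRecord {
  action: string;
  actor: string;
  at: string; // ISO timestamp
  prevHash: string;
  recordHash: string;
}

const GENESIS_HASH = "0".repeat(64);

function computeHash(prevHash: string, action: string, actor: string, at: string): string {
  // JSON array encoding keeps field boundaries unambiguous.
  const material = JSON.stringify([prevHash, action, actor, at]);
  return createHash("sha256").update(material).digest("hex");
}

function appendRecord(
  chain: readonly AuditRecord[],
  action: string,
  actor: string,
  at: string,
): AuditRecord[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.recordHash : GENESIS_HASH;
  const recordHash = computeHash(prevHash, action, actor, at);
  return [...chain, { action, actor, at, prevHash, recordHash }];
}

// Recomputes every link; an edited or removed record breaks the chain.
function verifyChain(chain: readonly AuditRecord[]): boolean {
  let expectedPrev = GENESIS_HASH;
  for (const r of chain) {
    if (r.prevHash !== expectedPrev) return false;
    if (r.recordHash !== computeHash(r.prevHash, r.action, r.actor, r.at)) return false;
    expectedPrev = r.recordHash;
  }
  return true;
}
```

A chain like this detects edits and deletions in the middle but not truncation of the tail; anchoring or signing the latest hash would be needed for that.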

B12 · Peer-controlled SSRF via TrustAnchor API_URL (cloud-metadata exfil)

Codex: CRITICAL · Grok: CRITICAL · BIND

app/lib/adapters/veriscope/webhook-dispatcher.ts:94 — outbound peer dispatcher SKIPS the SSRF guard that webhook-service.ts:76-135 applies. A peer who publishes a TrustAnchor API_URL pointing at 169.254.169.254 (AWS metadata service) extracts encrypted IVMS via the outbound webhook. The encryption is irrelevant — the attacker controls the destination.
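
A sketch of the kind of destination check the dispatcher is missing. It is not the guard in webhook-service.ts, whose contents are not shown in this report; `isInternalIPv4` and `assertSafeDestination` are hypothetical names. The sketch handles literal addresses only, and it assumes the caller separately resolves DNS names and applies the same check to every resolved address before connecting.

```typescript
import { isIP } from "node:net";

// Loopback, private, link-local (cloud metadata) and carrier-grade NAT ranges.
function isInternalIPv4(ip: string): boolean {
  const octets = ip.split(".").map(Number);
  const a = octets[0] ?? 0;
  const b = octets[1] ?? 0;
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 100 && b >= 64 && b <= 127)
  );
}

function assertSafeDestination(rawUrl: string): URL {
  const url = new URL(rawUrl);
  if (url.protocol !== "https:") {
    throw new Error("destination must use https");
  }
  // URL.hostname keeps the brackets around IPv6 literals; strip them.
  const host = url.hostname.replace(/^\[|\]$/g, "");
  const family = isIP(host);
  if (family === 6) {
    // Simplification for this sketch: IPv6 literals are refused outright.
    throw new Error("IPv6 literal destinations are rejected");
  }
  if (family === 4 && isInternalIPv4(host)) {
    throw new Error(`destination ${host} is an internal address`);
  }
  // DNS names pass here; resolved addresses still need the same check.
  return url;
}
```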

B13 · Crypto-proof validator fails OPEN when Python verifier missing

Codex: CRITICAL · Grok: HIGH → bound CRITICAL · BIND

app/lib/adapters/veriscope/transition-validation.ts:299-301 — when the external Python verifier process is unavailable (likely on fly.io alpine container), the code silently APPROVES every BE_CRYPTO_PROOF_VERIFIED transition. Combined with B16 (env-leak via execFile), this is a compound failure: validator present = secrets leak; absent = validation fail-open.
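
Failing closed here means treating "verifier unavailable" as a third outcome that never approves. A minimal sketch follows; `runVerifier`, `allowTransition` and the injected `verify` callback are hypothetical stand-ins for the real subprocess call.

```typescript
type VerifierOutcome =
  | { kind: "valid" }
  | { kind: "invalid" }
  | { kind: "unavailable"; detail: string };

// `verify` stands in for the external verifier; it throws when the
// verifier process cannot be started or crashes.
function runVerifier(verify: () => boolean): VerifierOutcome {
  try {
    return verify() ? { kind: "valid" } : { kind: "invalid" };
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    return { kind: "unavailable", detail };
  }
}

// Only an explicit "valid" approves; "unavailable" is a denial, not a pass.
function allowTransition(outcome: VerifierOutcome): boolean {
  return outcome.kind === "valid";
}
```

Keeping "unavailable" distinct from "invalid" lets operations alert on a missing verifier without the transition being approved in the meantime.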

B14 · Customer deactivate/reactivate — same IDOR + wrong-scope as B2

Codex: CRITICAL · Grok: CRITICAL · BIND

app/api/v2/customers/[id]/deactivate/route.ts + reactivate/route.ts — parallel pattern to B2. transfers:write scope, no owner filter, environment-only lookup.

B15 · Key rotation misses 3 of 7 encrypted entity classes — PERMANENT DATA LOSS

Codex: CRITICAL · Grok: CRITICAL · BIND

app/lib/utils/key-rotation.ts:519-537 — performKeyRotation covers TravelRuleProvider config, Integration config, User 2FA, TravelRuleTransfer IVMS. NOT covered: Customer.keys (per-customer secp256k1), webhook signing secrets, CTR regulatory fields (customerName, customerDob, customerIdNumber, conductingPersonName). After rotation + old-key removal, all customer veriscope keypairs become permanently undecryptable. Script reports success.
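
One way to make this class of omission impossible to miss is a registry that refuses to start rotation unless every encrypted entity class has a rotator. The sketch below assumes this design; the seven labels are hypothetical names for the classes listed in the finding, and `rotateAll`, `Rotator` and `Reencrypt` do not exist in the audited code.

```typescript
// Hypothetical labels for the seven encrypted entity classes in the finding.
const ENCRYPTED_ENTITIES = [
  "TravelRuleProvider.config",
  "Integration.config",
  "User.twoFactorSecret",
  "TravelRuleTransfer.ivms",
  "Customer.keys",
  "Webhook.signingSecret",
  "CTR.regulatoryFields",
] as const;

type EncryptedEntity = (typeof ENCRYPTED_ENTITIES)[number];
type Reencrypt = (ciphertext: string) => string;
// A rotator re-encrypts every row of its entity and returns the row count.
type Rotator = (reencrypt: Reencrypt) => number;

function rotateAll(
  registry: ReadonlyMap<EncryptedEntity, Rotator>,
  reencrypt: Reencrypt,
): Record<string, number> {
  // Completeness is checked before anything is touched.
  const missing = ENCRYPTED_ENTITIES.filter((e) => !registry.has(e));
  if (missing.length > 0) {
    throw new Error(`rotation refused, no rotator for: ${missing.join(", ")}`);
  }
  const counts: Record<string, number> = {};
  for (const entity of ENCRYPTED_ENTITIES) {
    const rotate = registry.get(entity);
    if (rotate) counts[entity] = rotate(reencrypt);
  }
  return counts;
}
```

The regression test named in the roadmap would still be needed to confirm that the list itself matches every encrypted column in the schema.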

B4 · 2FA secret asymmetry — broken 2FA in production

Codex: HIGH · Grok: HIGH · bound HIGH, but counted with the CRITICAL group because it is production-breaking · BIND

app/lib/actions/auth.ts:127,134 signs with NEXTAUTH_SECRET; :184 verifies with TWO_FA_JWT_SECRET || NEXTAUTH_SECRET. In production where TWO_FA_JWT_SECRET is required AND different from NEXTAUTH_SECRET, sign uses key A and verify expects key B → tokens fail. Source-verified via direct read. NOTE: auth/config.ts:48-92 has a parallel 2FA path with consistent secret usage — two code paths exist.
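
The asymmetry disappears if signing and verification resolve the secret through one function. In the sketch below a plain HMAC token stands in for the real JWT, and `twoFaSecret`, `signToken` and `verifyToken` are hypothetical names; only the two environment variable names and the fallback order come from the finding.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type Env = Record<string, string | undefined>;

// The ONLY place the 2FA secret is resolved; sign and verify both call it.
function twoFaSecret(env: Env): string {
  const secret = env["TWO_FA_JWT_SECRET"] || env["NEXTAUTH_SECRET"];
  if (!secret) {
    throw new Error("no 2FA signing secret configured");
  }
  return secret;
}

function mac(payload: string, env: Env): Buffer {
  return createHmac("sha256", twoFaSecret(env)).update(payload).digest();
}

// Simplified token format "<payload>.<hex mac>", standing in for a JWT.
function signToken(payload: string, env: Env): string {
  return `${payload}.${mac(payload, env).toString("hex")}`;
}

function verifyToken(token: string, env: Env): boolean {
  const dot = token.lastIndexOf(".");
  if (dot < 0) return false;
  const payload = token.slice(0, dot);
  const given = Buffer.from(token.slice(dot + 1), "hex");
  const expected = mac(payload, env);
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```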

HIGH (8)

B6 · CSRF v4 cookie names; v5 uses authjs.* (runtime verify required)

BIND

csrf.ts:197-202

B8 · Migration advisory lock illusory (separate psql sessions)

BIND

start.sh:27-40

B9 · Login rate-limit fail-open on Redis (DB fallback in limiter mitigates)

BIND

auth/config.ts:297-306 — limiter has DB fallback per Agent 2 correction; fix the auth-route .catch(()=>null), not the limiter.

B16 · Secrets leak to Python subprocess via execFile (full env passed)

BIND

NEXTAUTH_SECRET, CONFIG_ENCRYPTION_KEY, DATABASE_URL all leak on every cross-VASP transition.
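
A sketch of an allowlisted child environment. The three allowlisted names are assumptions about what the Python verifier needs, and `buildChildEnv` is a hypothetical helper; the result would be passed as the `env` option that Node's `execFile` accepts, so the subprocess no longer inherits the parent environment.

```typescript
// Assumed minimum for a Python subprocess; adjust to what the verifier needs.
const CHILD_ENV_ALLOWLIST = ["PATH", "LANG", "PYTHONPATH"] as const;

// Copies only allowlisted variables; everything else (secrets included)
// is dropped because it is never named here.
function buildChildEnv(
  parent: Record<string, string | undefined>,
): Record<string, string> {
  const child: Record<string, string> = {};
  for (const name of CHILD_ENV_ALLOWLIST) {
    const value = parent[name];
    if (value !== undefined) {
      child[name] = value;
    }
  }
  return child;
}
```

An allowlist is preferred over a denylist of known secrets because a secret added later is excluded by default.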

B18 · Sentry replays 1.0 + no PII scrubber + raw process.env read

BIND

instrumentation-client.ts:11 — captures IVMS PII forms on every error event. No DPIA.
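
A PII scrubber for error-event payloads could start from a recursive redaction like the sketch below, called for example from the SDK's `beforeSend` hook. The key list is hypothetical (the CTR field names are borrowed from B15, `ivms` is a guess at a payload key), and `scrubPii` is not an existing function. This covers event payloads only; session replay records the page itself and would need to be disabled or masked separately.

```typescript
// Hypothetical key list; the CTR field names are borrowed from finding B15.
const PII_KEYS = new Set([
  "customerName",
  "customerDob",
  "customerIdNumber",
  "conductingPersonName",
  "ivms",
]);

// Returns a deep copy with the value of every PII-named key replaced.
function scrubPii(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => scrubPii(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = PII_KEYS.has(key) ? "[redacted]" : scrubPii(inner);
    }
    return out;
  }
  return value;
}
```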

B19 · V1 doc-vs-code drift — FEATURES_STATUS describes 13 non-existent routes

BIND

Doc inverts V1/V2 maturity claim. V1 endpoints in doc all 404.

B20 · LOG_SHIPPING_S3_* prod-required but unused

BIND

4 prod-required env vars; 0 code references. "FATF 7-year retention" claim fictional.

B21 · ws@8.17.1 CVE GHSA-58qx-3vcg-4xpx silenced by --audit-level=high

BIND

Known CVE bypassed in CI.

MEDIUM (1)

B17 · @auth/prisma-adapter@2.11.1 peer-dep does not declare Prisma 7 support

BIND

Adapter behavior silently undefined with @prisma/client@7.8.0.

Co-CTO Top Priorities

	Codex GPT-5.5	Grok 4.3	Convergence
#1 must-fix	B10 (tenant model)	B10 (tenant model)	B10 ✓ unanimous
#2 must-fix	B3 (Veriscope)	B1 (IDOR)	split — both bound CRITICAL
#3 must-fix	B5 (compliance order)	B12 (SSRF)	split — both bound CRITICAL

Sprint Roadmap (revised v2)

Sprint 0 — Hold deploy

0-2 days

Update FEATURES_STATUS.md to mark V1 doc claims as obsolete
Disable Sentry session replay until DPIA approves (1-line config)
Verify B6 runtime (5 min — document.cookie in dev session)
Verify B4 (run the 2FA flow with split secrets in staging)

Sprint 1 — Stop the bleeding (12 CRITICAL)

1-2 weeks · no external traffic until done

B10 first (unanimous #1): add userId ownership to TravelRuleTransfer, Customer, Wallet, Webhook. Backfill rows.
B1 + B14: ownership filter on every findById after B10 lands
B2: customers:erase scope + consult LegalHold model
B3: Veriscope inbound signature verification + body cap + inflate size limit
B5: compliance check BEFORE provider cascade
B7: invert withRBAC default to fail-CLOSED; register every route; remove viewer from adminUiRoles
B12: SSRF guard on webhook-dispatcher.ts
B13: fail-CLOSED when Python verifier missing
B15: rotator-registry walking every encryptSecret( call site + regression test
B11: AuditLog hash-chain columns + always write from v2 routes

Sprint 2 — Cover the cliff

2 weeks · test coverage on 15 untested API routes

Integration test per route × {happy, no-auth, wrong-scope, malformed, rate-limited}
middleware.ts branch tests (231 LOC currently untested)
ECIES round-trip tests vs Veriscope reference vectors
IDOR negative tests as regression fence
coverage.all: true in vitest, coverage.include: ['app/**/*.ts']

Sprint 3 — Compliance + Production hygiene (8 HIGH)

1-2 weeks

B6 (CSRF v5 cookies), B8 (migration lock), B9 (rate-limit fail-open in auth), B16 (env-leak), B18 (Sentry PII), B19 (doc reconcile), B20 (LOG_SHIPPING), B21 (ws CVE)
Split Fly process groups: web and worker
AAD context binding in AES-GCM
Separate PII_ENCRYPTION_KEY from CONFIG_ENCRYPTION_KEY
Daily Postgres backup + monthly verified restore drill

Sprint 4 — Architectural cleanup

ongoing · debt

Split transfer-workflow.ts (1862 LOC) into per-workflow modules
Split veriscope/incoming/route.ts (1334 LOC) into thin route + service
Migrate /api/v2/transfers POST to createOutgoing
next-auth module augmentation; remove all as any on session.user.role
Replace db:push with proper Prisma migrations

Sprint 5 — Regulator readiness

2+ weeks

Wire AUDIT_SIGNING_KEY to AuditLog hash-chain (B11 completion)
Implement retention executor + legal-hold enforcement
Persist FX rate evidence with each transfer
Server-derived MiCA isUnhosted flag

Optimization Opportunities (for your CTO, beyond blockers)

De-duplicate v2 POST transfers — re-implements 540 LOC that exists in transfer-workflow.ts. PATCH route already migrated.
CI integration job spins up Postgres+Redis to run 2 test files — invest more or simplify.
Coverage gate is currently meaningless — coverage.all: false excludes untested files. 1-line fix.
Server-actions-first is a strength — but the doc describes a REST product. 1-2 day reconciliation.
Single Fly process group — split web and worker for independent scaling.
Sentry session replay 1.0 captures customer PII — disable until DPIA approves.
Rate-limiter has DB fallback (Agent 2 correction) — limiter itself is sound; fix the auth-route catch.
Test/Live env isolation has real teeth — API-key environment binding closes the env-switch attack. Keep intact through any refactor.

Co-CTO Conditions (Verbatim)

No external traffic until tenancy/ownership is modeled and enforced, Veriscope inbound authentication/body limits are fixed, compliance checks gate exchange before IVMS release, destructive customer actions honor legal hold and correct scopes, RBAC is enforced on real routes/admin pages, outbound webhook SSRF protection is applied, crypto-proof validation fails closed, audit logging is immutable and complete, and key rotation covers every encrypted entity class with recovery tests.

— Codex GPT-5.5 (R3 binding)

B10 tenant model + B1/B12/B15 fixes + B3 inbound hardening required before any prod traffic or on-chain data.

— Grok 4.3 (R3 binding)

Closing — re-evaluated post-red-team

ATRSA is technically impressive for one engineer. The defects are not laziness — they are the architectural drift that happens when one person writes 125K LOC over months without a second pair of eyes. However, the multi-reviewer convergence surfaced systemic gaps that the engineer's own FEATURES_STATUS.md does NOT acknowledge:

The product has NO multi-tenant ownership model (B10)
The audit-log integrity story is documented but the code doesn't deliver (B11)
The encryption rotation script will silently destroy customer data on first rotation (B15)
The Veriscope outbound dispatcher can be tricked into exfiltrating IVMS to AWS metadata (B12)

These are not polish items. They are structural decisions that need a 1-2-week sprint to fix correctly before any external user touches the system. The good news: most fixes are well-scoped. The bad news: B10 requires a schema change that ripples through every repository.

Audit v2 produced 2026-05-20 · Co-CTO convergence: Codex GPT-5.5 + Grok 4.3 + 11 other reviewers · 21 blockers bound · NO-GO
Files at research/tracks/atrsa/audit-2026-05-20/ · v1 superseded by this report
