Context Map — Audit Trail Platform¶

The context map is the single, authoritative view of how the Audit Trail Platform is partitioned into bounded contexts and how those contexts collaborate. It makes responsibilities and seams explicit—showing ownership, upstream/downstream dependencies, collaboration styles, contract touchpoints, and tenancy/security boundaries across Audit.Gateway, Audit.Ingestion, Audit.Policy, Audit.Integrity, Audit.Projection, Audit.Query, Audit.Search, Audit.Export, and Audit.Admin. Use it to align design, reviews, and incident triage.

This page is written for platform and domain architects, microservice owners, SRE/operations, and security/compliance reviewers (and is a quick on-ramp for new contributors). Start with “Bounded Contexts (at a glance)” to learn what each context owns, then follow labeled edges to see collaboration styles and jump into contracts/events for exact schemas.

Bounded Contexts (at a glance)¶

The table captures responsibility, I/O surfaces, persistence anchors, and how tenant isolation is enforced. One-line Published Language entries define the canonical nouns/verbs used inside each context.

Context	Responsibility	Primary Interfaces (OHS/Contracts)	Persistence (authoritative)	Tenancy notes
Audit.Gateway	Front-door for append/query; enforces authn/z, rate limits, edition/feature checks; normalizes requests and emits intent.	HTTP OHS: `/audit/append`, `/audit/query` · Async out: `audit.append` (intent) · Policies consulted per call.	Stateless; relies on downstream.	Per-tenant authentication + edition gates; injects `TenantId`, `CorrelationId`; rejects cross-tenant access.
Audit.Ingestion	Validates and accepts append requests; deduplicates via idempotency keys; persists canonical audit entries.	Async in: `audit.append` · HTTP/gRPC internal: `AcceptAppend`, `Replay` · Async out: `audit.appended`.	Append-only event store (immutability), durable queue for staging.	Partitioned by `TenantId`; idempotency scope `{TenantId, ProducerAppId, Source, IdempotencyKey}`; strict RLS on store.
Audit.Policy	Centralizes policy: retention, PII classification/redaction, schema/edition gates; “can I append/query?” decisions.	gRPC/HTTP: `CheckPolicy`, `GetRetention`, `Classify` · Async: `policy.changed`.	Config DB for policies + KV cache; policy versioning tracked.	Policies scoped by tenant and edition; defaults at platform → overridden per tenant.
Audit.Integrity	Produces tamper-evidence (hash chains, Merkle roots, anchors); verifies integrity on demand; evidence ledger.	Async in: `audit.appended` · HTTP/gRPC: `VerifyEntry`, `VerifyRange`, `GetEvidence`.	Evidence ledger (hash chain/Merkle) + anchor journal (e.g., periodic root).	Evidence keyed by `TenantId`; cross-tenant proofs disallowed; verification requires tenant context.
Audit.Projection	Builds/maintains read models/materialized views for fast query; handles rebuild/replay.	Async in: `audit.appended`, `policy.changed` · gRPC internal: `Rebuild`, `Checkpoint`.	Read models (per-tenant tables) + checkpoint store.	Per-tenant physical/logical partitioning; rebuilds isolated per tenant; back-pressure aware.
Audit.Query	Query and retrieval over projections; pagination, filtering, joins with evidence on demand.	HTTP/gRPC OHS: `/audit/search`, `/audit/records/{id}`, `/audit/verify` · Async out: request→Export.	Query DB (authoritative for reads) referencing projections; optional cache.	Server-enforced tenant filters; ABAC claims (tenant, role, scope); no cross-tenant joins.
Audit.Search	Full-text and faceted search over audit content; powers free-text and advanced filters.	HTTP internal: `SearchIndex`, `Suggest` · Async in: `audit.appended` · Async out: `search.indexed`.	Search index (per-tenant index/partition) + queue for indexing.	Separate index per tenant or partitioned by `TenantId`; query-time filters enforced.
Audit.Export	Asynchronous export of query results (CSV/JSON/Parquet); packaging, signing, delivery & callbacks.	HTTP OHS: `/export/jobs` · Async in: export requests · Webhook: `export.completed` (signed).	Staging store + object storage for artifacts; job metadata DB.	Exports scoped to requesting tenant; artifacts stored in tenant bucket/prefix; time-bound signed URLs.
Audit.Admin	Tenant onboarding, keys/credentials, contract catalogs, schema registry pointers, operational toggles.	HTTP OHS: `/admin/tenants`, `/admin/contracts`, `/admin/policies` · Async out: `tenant.updated`.	Admin DB for tenants, client apps, contract metadata.	Tenant registry is authoritative; ensures residency/region and edition applied across contexts.

Published Language — one-liners¶

Audit.Gateway — “Append entries; submit queries; enforce edition/tenant guardrails.”
Audit.Ingestion — “Accept intent; deduplicate by idempotency key; persist canonical AuditEntry.”
Audit.Policy — “Classify fields; decide (allow/deny); retain/expire by RetentionPolicy; redact by DataClass.”
Audit.Integrity — “Hash entries; link chains; anchor roots; verify evidence for a ProofRange.”
Audit.Projection — “Project events; materialize views; checkpoint progress; rebuild deterministically.”
Audit.Query — “Filter and retrieve AuditRecords; paginate; optionally verify on read.”
Audit.Search — “Index content; tokenize; rank and facet results for SearchQuery.”
Audit.Export — “Package results; sign artifacts; deliver via ExportJob with callback.”
Audit.Admin — “Onboard tenants; register contracts; configure keys and govern editions.”

Collaboration Styles per Edge¶

Mini-catalog (with ATP examples)¶

Open Host Service (OHS) — A well-documented API a context exposes to the world.
Example: Audit.Query offers /audit/search for read models via Gateway.
Published Language (PL) — A shared, versioned vocabulary/schema used between contexts.
Example: Audit.Admin publishes TenantUpdated with canonical fields (tenantId, edition, region).
Customer–Supplier (CS) — Downstream (customer) drives expectations; upstream (supplier) commits to meet them.
Example: Audit.Query (customer) asks Audit.Integrity (supplier) to verify evidence.
Conformist (CONF) — Consumer voluntarily adopts the upstream model to reduce translation/latency.
Example: Audit.Gateway conforms to Audit.Query’s request/response shapes.
Anti-Corruption Layer (ACL) — Translation shield that isolates the domain model from a foreign one.
Example: External publishers → Audit.Gateway (adapters normalize into AppendIntent).
Event Choreography (CHOREO) — Asynchronous collaboration via events; no central orchestrator.
Example: Audit.Ingestion emits audit.appended; Integrity, Projection, and Search react.

Rule of one label per edge. Some edges mix interface and posture (e.g., OHS + Conformist). For clarity, we assign a single canonical style per edge below and reflect the dominant characteristic on the diagram label.

Canonical styles per pair (with rationale)¶

From → To	Style	Rationale (1-liner)
Gateway → Ingestion	CS	Gateway depends on Ingestion to accept intents and meet throughput/latency expectations.
Gateway → Query	CONF	Gateway adopts Query’s request/response shapes to stay thin and avoid translation.
Gateway → Export	CONF	Gateway forwards job creation using Export’s native API contract.
Ingestion → Policy	CONF	Ingestion conforms to Policy’s decision interface to make synchronous “allow/deny/redact” cheap.
Ingestion → Integrity	CHOREO	Integrity passively reacts to `audit.appended` to build hash chains without coupling.
Ingestion → Projection	CHOREO	Projection updates read models on `audit.appended`, enabling replay/rebuild.
Ingestion → Search	CHOREO	Search indexing is event-driven for elasticity and back-pressure handling.
Policy → Projection	CHOREO	Policy changes (`policy.changed`) drive projection rebuilds without sync coupling.
Query → Integrity	CS	Query requests on-read verification; Integrity commits to provide proofs.
Query → Export	CS	Query initiates long-running export jobs; Export provides job lifecycle guarantees.
Admin → Policy	PL	Admin publishes tenant/policy state in a canonical schema Policy consumes.
Admin → Gateway	PL	Admin’s tenant/edition updates propagate as canonical events that Gateway understands.
External Systems → Gateway	ACL	Gateway shields core domain by translating foreign payloads into `AppendIntent`.
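
The table above can also be kept as data, which makes the one-label-per-edge rule mechanically checkable. The following is a minimal sketch: the `EDGES` list mirrors the table rows, while `ALLOWED_STYLES`, `check_one_label_per_edge`, and `edges_from` are illustrative names that do not exist in the platform.

```python
from collections import Counter

# (from, to, style) triples copied from the canonical styles table.
EDGES = [
    ("Gateway", "Ingestion", "CS"),
    ("Gateway", "Query", "CONF"),
    ("Gateway", "Export", "CONF"),
    ("Ingestion", "Policy", "CONF"),
    ("Ingestion", "Integrity", "CHOREO"),
    ("Ingestion", "Projection", "CHOREO"),
    ("Ingestion", "Search", "CHOREO"),
    ("Policy", "Projection", "CHOREO"),
    ("Query", "Integrity", "CS"),
    ("Query", "Export", "CS"),
    ("Admin", "Policy", "PL"),
    ("Admin", "Gateway", "PL"),
    ("External Systems", "Gateway", "ACL"),
]

# Style abbreviations from the mini-catalog.
ALLOWED_STYLES = {"OHS", "PL", "CS", "CONF", "ACL", "CHOREO"}


def check_one_label_per_edge(edges):
    """Return a list of problems; an empty list means the rule holds."""
    problems = []
    counts = Counter((src, dst) for src, dst, _ in edges)
    for (src, dst), n in counts.items():
        if n > 1:
            problems.append(f"{src} -> {dst} has {n} labels")
    for src, dst, style in edges:
        if style not in ALLOWED_STYLES:
            problems.append(f"{src} -> {dst} uses unknown style {style}")
    return problems


def edges_from(context, edges=EDGES):
    """Outgoing edges of one context as {target: style}."""
    return {dst: style for src, dst, style in edges if src == context}
```

A check like this could run in CI against the diagram source so that the table and the Mermaid labels do not drift apart.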

Mermaid diagram (edge labels show the canonical style)¶

graph LR
  %% Legend: CS=Customer–Supplier, CONF=Conformist, ACL=Anti-Corruption Layer, CHOREO=Event Choreography, OHS=Open Host Service, PL=Published Language

  subgraph Clients
    U[API Clients / SDKs]
    EXT[External Systems]
  end

  subgraph Audit Trail Platform
    GW[Audit.Gateway]
    ING[Audit.Ingestion]
    POL[Audit.Policy]
    INT[Audit.Integrity]
    PRJ[Audit.Projection]
    QRY[Audit.Query]
    SRCH[Audit.Search]
    EXP[Audit.Export]
    ADM[Audit.Admin]
  end

  %% Client to Gateway (public API surface)
  U -->|OHS| GW

  %% Gateway collaborations
  GW -->|CS| ING
  GW -->|CONF| QRY
  GW -->|CONF| EXP

  %% Ingestion collaborations
  ING -->|CONF| POL
  ING -->|CHOREO| INT
  ING -->|CHOREO| PRJ
  ING -->|CHOREO| SRCH

  %% Policy broadcasts
  POL -->|CHOREO| PRJ

  %% Query collaborations
  QRY -->|CS| INT
  QRY -->|CS| EXP

  %% Admin publications
  ADM -->|PL| POL
  ADM -->|PL| GW

  %% External integrations
  EXT -->|ACL| GW

Notes

When an edge is CONF, the consumer follows the supplier’s contract as-is; API shape lives with the supplier’s OHS and is versioned there.
CHOREO edges imply replayability and idempotency requirements on consumers; see “Reliability Notes” for DLQ/back-pressure specifics.
PL edges point to canonical schemas (tenant, edition, residency) owned by Audit.Admin; versioning is additive-first with deprecation windows.

Upstream/Downstream Matrix & Criticality¶

This matrix makes directionality and dependency explicit for each context and tags the operational criticality of every edge.

Legend:

⚡ LS: latency-sensitive (inline call on hot path)
📦 TS: throughput-sensitive (sustained high volume)
⏳ BA: batch/async (queued or long-running)

Context	Upstream (depends on)	Downstream (depends on it)	Critical contracts (examples)
Audit.Gateway	Audit.Admin — PL ⏳	Audit.Ingestion — CS 📦; Audit.Query — CONF ⚡; Audit.Export — CONF ⚡	HTTP OHS: `/audit/append`, `/audit/query`, `/export/jobs`; async out: `audit.append` (intent); consumes `tenant.updated` (PL).
Audit.Ingestion	Audit.Gateway — CS 📦; Audit.Policy — CONF ⚡	Audit.Integrity — CHOREO ⏳; Audit.Projection — CHOREO ⏳; Audit.Search — CHOREO ⏳	Async in: `audit.append`; sync: `CheckPolicy` (gRPC/HTTP); async out: `audit.appended`.
Audit.Policy	Audit.Admin — PL ⏳	Audit.Ingestion — CONF ⚡; Audit.Projection — CHOREO ⏳	Sync: `CheckPolicy`, `GetRetention`, `Classify`; async out: `policy.changed`.
Audit.Integrity	Audit.Ingestion — CHOREO ⏳	Audit.Query — CS ⚡	Async in: `audit.appended`; HTTP/gRPC: `VerifyEntry`, `VerifyRange`, `GetEvidence`.
Audit.Projection	Audit.Ingestion — CHOREO ⏳; Audit.Policy — CHOREO ⏳	Audit.Query — (reads projections) ⚡	Async in: `audit.appended`, `policy.changed`; internal: `Rebuild`, `Checkpoint`; read models/tables (authoritative for reads).
Audit.Query	Audit.Projection — (read models) ⚡; Audit.Integrity — CS ⚡	Audit.Gateway — OHS ⚡; Audit.Export — CS ⏳	HTTP/gRPC OHS: `/audit/search`, `/audit/records/{id}`, `/audit/verify`; sync to Integrity on-read; async/sync to Export to start jobs.
Audit.Search	Audit.Ingestion — CHOREO ⏳	Audit.Query — OHS ⚡	Async in: `audit.appended`; HTTP internal: `SearchIndex`, `Suggest`; async out: `search.indexed`.
Audit.Export	Audit.Query — CS ⏳	Audit.Gateway — OHS ⚡; Webhook recipients — Webhook ⏳	HTTP OHS: `/export/jobs`, `/export/jobs/{id}`; webhook: `export.completed` (signed); artifacts in object storage.
Audit.Admin	—	Audit.Policy — PL ⏳; Audit.Gateway — PL ⏳	HTTP OHS: `/admin/tenants`, `/admin/contracts`, `/admin/policies`; async out: `tenant.updated`, `contract.updated`.

Operational guidance.
• Treat ⚡ LS edges as part of your p95/p99 SLO budgets (timeouts, retries, circuit breakers).
• For 📦 TS edges, prefer queue/bulk APIs, shard keys, and idempotency; measure drain rates.
• ⏳ BA edges must have DLQ/retry/replay documented (see “Reliability Notes: Hot Paths & Recovery”).

Contracts & Events: Where to Look¶

Source of truth for contracts lives under docs/domain/contracts/ and event semantics under docs/domain/events-catalog.md. This section only pins touchpoints so you can jump to the exact files.

Audit.Gateway¶

HTTP (OHS)
- POST /audit/append → contracts/gateway/http/append.v1.md
- POST /audit/query → contracts/gateway/http/query.v1.md
Async (Intent topic)
- audit.append → events-catalog.md#auditappend-intent
Notes
- Normalizes external payloads into AppendIntent (see ACL adapter stubs: contracts/gateway/acl/).

Audit.Ingestion¶

gRPC/HTTP (internal)
- AcceptAppend, Replay → contracts/ingestion/grpc/acceptappend.v1.proto
Async (domain events)
- audit.appended.v1 → events-catalog.md#auditappended
Idempotency
- Key = {tenantId, producerAppId, source, idempotencyKey} (see header/metadata spec: contracts/shared/idempotency.md).

Audit.Policy¶

gRPC/HTTP (decisions/config)
- CheckPolicy, GetRetention, Classify → contracts/policy/grpc/checkpolicy.v1.proto
Async
- policy.changed.v1 → events-catalog.md#policychanged

Audit.Integrity¶

HTTP/gRPC (verification)
- GET /integrity/entries/{id}/verify
- POST /integrity/ranges/verify → contracts/integrity/http/verify.v1.md
Async
- Consumes audit.appended.v1 (build chains) → events-catalog.md#auditappended

Audit.Projection¶

Async (builders)
- Consumes audit.appended.v1, policy.changed.v1
Internal ops
- Rebuild, Checkpoint → contracts/projection/internal/rebuild.v1.md

Audit.Query¶

HTTP/gRPC (OHS)
- POST /audit/search
- GET /audit/records/{id}
- POST /audit/verify → contracts/query/http/search.v1.md
Async (export trigger)
- Emits export.requested.v1 (optional) → events-catalog.md#exportrequested

Audit.Search¶

HTTP (internal)
- POST /search/index
- GET /search/suggest → contracts/search/http/index.v1.md
Async
- Consumes audit.appended.v1; emits search.indexed.v1 → events-catalog.md#searchindexed

Audit.Export¶

HTTP (OHS)
- POST /export/jobs
- GET /export/jobs/{id} → contracts/export/http/jobs.v1.md
Webhooks (signed)
- export.completed.v1 (HMAC-SHA256 over payload; header X-Export-Signature) → contracts/export/webhooks/export.completed.v1.md

Audit.Admin¶

HTTP (OHS)
- /admin/tenants, /admin/contracts, /admin/policies → contracts/admin/http/
Async (PL events)
- tenant.updated.v1, contract.updated.v1 → events-catalog.md#tenantupdated

Versioning & Discovery (watermark)¶

HTTP/gRPC: SemVer in Accept (e.g., application/vnd.connectsoft.audit.search+json;v=1) and/or path (/v1/...). Additive-first; breaking changes → new vN surface.
Events/Topics: Subject suffix .vN (e.g., audit.appended.v1). Schema ID carried in metadata (schemaId, schemaHash).
Deprecation windows: Minimum 180 days; both vN and vN+1 live in parallel; announce via contract.updated (Admin).
Registry & “current” pointers:
- Index: contracts/index.md lists all surfaces by context.
- Machine-readable: contracts/registry.json exposes latest versions and schema IDs.
- Event catalog: events-catalog.md is normative for names, required fields, and semantics.

Always link specs from code repos to these canonical files. If you must drift, open an ADR and reference it in Evolution & ADR Links.
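
To illustrate the `.vN` subject suffix convention, the sketch below splits an event subject into its base name and major version. It is a hedged example only: `parse_subject` is a hypothetical helper, and treating a subject without a suffix (such as the `audit.append` intent) as "unversioned" is an assumption, not a rule stated in the event catalog.

```python
import re

# Dotted lowercase name followed by ".v<major>", e.g. "audit.appended.v1".
_SUBJECT_RE = re.compile(r"^(?P<name>[a-z]+(?:\.[a-z]+)*)\.v(?P<major>[1-9][0-9]*)$")


def parse_subject(subject: str):
    """Return (base_name, major_version); version is None when no suffix is present."""
    match = _SUBJECT_RE.match(subject)
    if match is None:
        return subject, None
    return match.group("name"), int(match.group("major"))


def is_supported(subject: str, supported: dict) -> bool:
    """True when the consumer declares support for this subject's major version."""
    name, major = parse_subject(subject)
    return major is not None and major in supported.get(name, set())
```

A consumer running through a deprecation window would list both versions, for example `{"audit.appended": {1, 2}}`, and drop the older one once the window closes.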

Tenancy & Ownership¶

Each context declares what it owns (authoritative sources of truth) and what it derives (indices, projections, caches). Tenant isolation is enforced end-to-end via keys, partitioning, and RLS/filters.

Context	Owns (authoritative)	Indices / Projections (derived)	Tenant keying (partitioning / RLS / filters)	Cross-tenant rules
Audit.Gateway	None (stateless for business data). May persist AccessLog/RateLimit counters (operational).	—	All inbound calls must carry `TenantId`; Gateway injects `TenantId` & `CorrelationId` downstream; rejects missing/mismatched tenant claims.	Disallowed. Gateway enforces per-request tenant scoping; no cross-tenant fan-out.
Audit.Ingestion	AuditEntry (append-only, immutable), AppendReceipt (ack metadata).	Staging queue only (transient).	Physical/logical partition by `TenantId`; RLS on event store; idempotency scope `{TenantId, ProducerAppId, Source, IdempotencyKey}`.	Disallowed. Replay & rebuild are tenant-scoped jobs.
Audit.Policy	PolicyDefinition, RetentionPolicy, ClassificationPolicy (incl. versions).	KV/cache of compiled policies per tenant/version.	Policies keyed `{TenantId, Edition}` with platform defaults and per-tenant overrides; reads require tenant match.	No cross-tenant decisions; multi-tenant policy reads restricted to platform admins (read-only).
Audit.Integrity	EvidenceLink, ChainSegment, MerkleRootAnchor.	Verification caches (proof memoization) per tenant.	Separate chain per `TenantId` (and optionally per region/namespace); proof queries require matching `TenantId`.	Disallowed. No proofs across tenants; anchors never mix tenants.
Audit.Projection	None (does not create new business truth).	AuditRecordView (per-tenant tables), PolicyView; Checkpoint state.	Per-tenant schemas or table partition by `TenantId`; RLS on all views; rebuild jobs are tenant-scoped.	Disallowed. Aggregates and joins are tenant-bounded.
Audit.Query	No business truth; may own QueryAuditLog (requests), AccessAudit.	Uses Projection as source of truth for reads.	All queries require tenant filters injected server-side; ABAC checks on roles/scopes.	Disallowed by default. Exception: platform auditors with explicit `platform-admin` scope and “multi-tenant read” feature flag (read-only).
Audit.Search	No business truth.	AuditSearchDoc (per-tenant index/partition), suggest dictionaries.	Dedicated index per tenant or partition key `TenantId`; query-time tenant filter enforced.	Disallowed. No cross-tenant search indices or queries.
Audit.Export	ExportJob, ExportArtifact (metadata).	Temporary assembly areas; signed URLs; delivery receipts.	Artifact storage path `tenants/{TenantId}/exports/{JobId}`; keys/secrets resolved per tenant; webhook signing keys per tenant.	Disallowed. Exports cannot include records from multiple tenants.
Audit.Admin	Tenant registry, Edition mapping, ClientApp credentials, ContractDescriptor.	Catalog caches for quick lookup.	`TenantId` is authoritative here; residency/region and edition enforced at provisioning time for downstream stores.	Limited to platform staff; writes are tenant-scoped; reads across tenants only for admin console.
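
The "server-side tenant filter" rule in the table can be sketched as follows. This is an illustration under assumptions: the `Claims` shape, the `tenantId` filter key, and `scope_query` are hypothetical and stand in for whatever the Query service actually uses.

```python
from dataclasses import dataclass


class TenantScopeError(Exception):
    """Raised when a request hints at a tenant other than the authenticated one."""


@dataclass(frozen=True)
class Claims:
    tenant_id: str                    # taken from the verified token, never from the body
    scopes: frozenset = frozenset()


def scope_query(claims: Claims, client_filter: dict) -> dict:
    """Return a copy of the client's filter with the tenant forced from the claims."""
    requested = client_filter.get("tenantId")
    if requested is not None and requested != claims.tenant_id:
        raise TenantScopeError("cross-tenant access is disallowed")
    scoped = dict(client_filter)
    scoped["tenantId"] = claims.tenant_id  # injected server-side on every query
    return scoped
```

The point of the design is that the tenant value always comes from the authenticated context, so omitting the filter on the client side cannot widen the result set.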

Idempotency & Correlation (ingest/query requirements)¶

Idempotency (Append)
- Header/metadata: X-Idempotency-Key (opaque), plus producer metadata ProducerAppId, Source.
- Scope: {TenantId, ProducerAppId, Source, IdempotencyKey}.
- Retention: configurable; default ≥ 7 days to survive network retries and DLQ replays.
- Behavior: On duplicate, Ingestion returns the original AppendReceipt (HTTP 200) without writing a new AuditEntry.
Correlation & Tracing
- Accept X-Correlation-Id (sticky across contexts) and W3C traceparent (OpenTelemetry).
- Gateway creates values if missing, propagates via event metadata (correlationId, traceId) and request headers.
- All domain events (audit.appended, policy.changed, etc.) must include {tenantId, correlationId, causationId?, traceId}.
Tenant Header & Claims
- X-Tenant-Id (or embedded in client credentials) is mandatory at Gateway; downstream services do not trust caller-supplied tenant values—Gateway signs/enriches them.
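
The "Gateway creates values if missing" behavior can be sketched as below. The `traceparent` layout follows the W3C Trace Context format (version, 16-byte trace id, 8-byte parent id, flags). Using a UUID for the correlation id, the exact-case header lookup, and the helper names are assumptions made for the example.

```python
import re
import secrets
import uuid

# version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT_RE = re.compile(r"^[0-9a-f]{2}-([0-9a-f]{32})-([0-9a-f]{16})-[0-9a-f]{2}$")


def ensure_correlation(headers: dict) -> dict:
    """Return a copy of headers with X-Correlation-Id and traceparent guaranteed."""
    out = dict(headers)
    if not out.get("X-Correlation-Id"):
        out["X-Correlation-Id"] = str(uuid.uuid4())
    match = _TRACEPARENT_RE.match(out.get("traceparent", ""))
    # All-zero trace or parent ids are invalid, so they are replaced as well.
    if match is None or set(match.group(1)) == {"0"} or set(match.group(2)) == {"0"}:
        out["traceparent"] = f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"
    return out


def event_metadata(headers: dict, tenant_id: str) -> dict:
    """Build the metadata fields that every domain event must carry."""
    trace_id = headers["traceparent"].split("-")[1]
    return {
        "tenantId": tenant_id,
        "correlationId": headers["X-Correlation-Id"],
        "traceId": trace_id,
    }
```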

Residency & Data Locality (enforced by Admin)¶

Region binding (per tenant) drives storage/account selection for Ingestion store, Integrity ledger, Projection DB, Search index, and Export bucket/prefix.
Cross-region moves require Admin workflow (deactivate → migrate → reactivate) and explicit ADR.

Rule of thumb: If a context “creates original business facts,” it owns them; everything else is derived and must be reproducible from owned facts + policies.

Security Overlays (Zero-Trust)¶

We treat every hop as hostile by default. Controls are enforced at the edge (Gateway), inside the mesh (service→service), and at data layers. Workloads authenticate with workload identity; traffic is mTLS; requests are authorized with ABAC/RBAC and tenant/edition guards; sensitive fields are classified/redacted at policy checkpoints.

Overlay diagram¶


flowchart LR
  subgraph EDGE["Edge / Public Ingress"]
    C[Clients/SDKs]
  end

  subgraph GZ["Gateway Zone (mTLS ingress)"]
    GW[Audit.Gateway<br />AuthN: OAuth2/JWT<br />AuthZ: ABAC/RBAC<br />Rate limit<br />Schema & edition gates]
  end

  subgraph MESH["Service Mesh / mTLS"]
    ING[Audit.Ingestion<br />Idempotency + tenancy inject]
    POL[Audit.Policy<br />Classification & decisions]
    INT[Audit.Integrity<br />Hash/Merkle proofs]
    PRJ[Audit.Projection]
    QRY[Audit.Query]
    SRCH[Audit.Search]
    EXP[Audit.Export<br />Signed webhooks]
    ADM[Audit.Admin<br />Tenant/Edition registry]
  end

  C ---|TLS| GW
  GW ---|mTLS + ABAC| ING
  ING ---|mTLS - CONF| POL
  ING -. events (signed, tenant-scoped) .-> INT
  ING -. events (signed, tenant-scoped) .-> PRJ
  ING -. events (signed, tenant-scoped) .-> SRCH
  QRY ---|mTLS - CS| INT
  QRY ---|mTLS - CS| EXP
  ADM -. PL events .-> GW
  ADM -. PL events .-> POL

  subgraph DATA["KMS-protected Data Layers"]
    ES[(Event Store / Ledger)]
    RM[(Read Models)]
    IDX[(Search Index)]
    OBJ[(Object Storage — Exports)]
  end

  ING --- ES
  PRJ --- RM
  SRCH --- IDX
  EXP --- OBJ

Boundary policies (who enforces what)¶

Client → Gateway (public edge)

Transport: TLS 1.2+; HSTS at edge; strict ALPN/ciphers.
AuthN: OAuth2/JWT (aud/iss/exp/nbf checked); optional mTLS for partner apps.
AuthZ: ABAC (tenant, roles/scopes, edition features).
Tenancy: Gateway injects TenantId, rejects cross-tenant hints; correlates X-Correlation-Id.
Validation: JSON schema & contract version; edition gates; size limits; content scanning (optional).
Rate limiting: Token-bucket per tenant + client with burst/steady; 429 + Retry-After.
PII hooks: Request models pre-classified; drop disallowed fields before emit.
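
The rate-limiting rule above (token bucket per tenant and client, 429 with Retry-After) can be sketched as follows. The class names, the numeric rates, and passing `now` explicitly are assumptions chosen to keep the example deterministic; a real gateway would read a monotonic clock.

```python
import math


class TokenBucket:
    """Refills at rate_per_s up to burst; each request costs one token."""

    def __init__(self, rate_per_s: float, burst: int, now: float = 0.0):
        self.rate = rate_per_s
        self.burst = burst
        self.tokens = float(burst)
        self.updated = now

    def try_acquire(self, now: float):
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0
        # Whole seconds until one full token is available again.
        return False, math.ceil((1.0 - self.tokens) / self.rate)


class RateLimiter:
    """Keeps one bucket per (tenant, client) pair so tenants cannot starve each other."""

    def __init__(self, rate_per_s: float, burst: int):
        self.rate = rate_per_s
        self.burst = burst
        self._buckets = {}

    def check(self, tenant_id: str, client_id: str, now: float):
        key = (tenant_id, client_id)
        if key not in self._buckets:
            self._buckets[key] = TokenBucket(self.rate, self.burst, now)
        allowed, retry_after = self._buckets[key].try_acquire(now)
        if allowed:
            return 200, {}
        return 429, {"Retry-After": str(retry_after)}
```

Separate write and read buckets, as required by the controls matrix, would be two `RateLimiter` instances with different settings.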

Gateway → Ingestion (hot path)

mTLS + workload identity: service-to-service with SPIFFE/SPIRE or AAD Workload Identity.
AuthZ: ABAC check (tenant/edition) survives hop via signed claims.
Idempotency: Required X-Idempotency-Key; scope {TenantId, ProducerAppId, Source, Key}.
Observability: W3C traceparent propagated; structured audit log.
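
The idempotency requirement on this hop can be sketched with an in-memory store keyed by the full scope. This is a simplified illustration: `Ingestion`, `accept_append`, and the `AppendReceipt` fields are stand-ins, and key retention/expiry is deliberately left out.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class AppendReceipt:
    entry_id: int
    tenant_id: str


class Ingestion:
    def __init__(self):
        self._receipts = {}          # idempotency scope -> original receipt
        self.entries = []            # append-only list standing in for the event store
        self._ids = itertools.count(1)

    def accept_append(self, tenant_id, producer_app_id, source, idempotency_key, payload):
        """Return (receipt, duplicate); a duplicate never writes a new entry."""
        scope = (tenant_id, producer_app_id, source, idempotency_key)
        existing = self._receipts.get(scope)
        if existing is not None:
            return existing, True
        receipt = AppendReceipt(entry_id=next(self._ids), tenant_id=tenant_id)
        self.entries.append((receipt.entry_id, tenant_id, payload))
        self._receipts[scope] = receipt
        return receipt, False
```

Because the tenant is part of the scope, two tenants that happen to reuse the same key value still get independent entries.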

Ingestion ⇢ {Integrity, Projection, Search} (event edges)

Transport: Broker auth (SAS/AAD), topic-level ACLs; mTLS where supported.
Envelope: Events signed (producer key id + hash) and include {tenantId, schemaId, correlationId}.
PII: Fields already classified/redacted by Policy at accept time.
Replay/DLQ: Tenant-scoped DLQ; poison-pill quarantine; ordered replays per partition key.

Ingestion → Policy (decision call)

mTLS + workload identity; timeout budget small (latency-sensitive).
Cache: Negative/positive decision caching with TTL; eTags/version.

Query → Integrity (on-read verify)

mTLS; AuthZ requires same-tenant proof requests.
Proofs: Range proofs returned with evidenceHash, chainId, window.
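
A per-tenant hash chain and a range check over it can be sketched as below. The source does not specify the hash function or the entry encoding, so SHA-256 over sorted-key JSON is an assumption, as are the function names. The sketch covers only the chain, not Merkle roots or anchoring.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor for the first entry of a tenant's chain


def entry_hash(prev_hash: str, entry: dict) -> str:
    """Hash an entry together with its predecessor's hash."""
    body = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(prev_hash.encode() + body).hexdigest()


def build_chain(entries):
    """Return the list of chained hashes, one per entry, in order."""
    hashes, prev = [], GENESIS
    for entry in entries:
        prev = entry_hash(prev, entry)
        hashes.append(prev)
    return hashes


def verify_range(entries, hashes, start: int, end: int) -> bool:
    """Recompute hashes for entries[start:end] and compare with the stored evidence.

    The hash just before `start` is trusted as the anchor for the range.
    """
    prev = GENESIS if start == 0 else hashes[start - 1]
    for i in range(start, end):
        if entry_hash(prev, entries[i]) != hashes[i]:
            return False
        prev = hashes[i]
    return True
```

Changing any entry inside the verified window changes its hash and therefore fails the comparison, which is the tamper-evidence property the proofs rely on.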

Query → Export; Export → Webhook recipients

Export API: mTLS; ABAC on job scope; encryption configuration set by tenant policy.
Artifacts: KMS envelope encryption; per-tenant bucket/prefix.
Webhooks: HMAC-SHA256 signature over canonical payload; headers X-Export-Signature, X-Export-Timestamp; 5-min skew window; retries with exponential backoff.
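
A receiver-side check of the signed webhook could look like the sketch below. The header names and the 5-minute skew window come from the text above. The signed message layout (`timestamp.payload`), hex encoding, and Unix-seconds timestamps are assumptions for illustration, and the normative canonical form is whatever contracts/export/webhooks/export.completed.v1.md defines.

```python
import hashlib
import hmac

MAX_SKEW_SECONDS = 300  # 5-minute skew window


def sign(secret: bytes, timestamp: int, payload: bytes) -> str:
    """HMAC-SHA256 over an assumed canonical form: b"<timestamp>.<payload>"."""
    message = str(timestamp).encode() + b"." + payload
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, headers: dict, payload: bytes, now: int) -> bool:
    """Reject missing headers, stale timestamps, and signature mismatches."""
    try:
        timestamp = int(headers["X-Export-Timestamp"])
        signature = headers["X-Export-Signature"]
    except (KeyError, ValueError):
        return False
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False
    expected = sign(secret, timestamp, payload)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Including the timestamp in the signed message means a captured request cannot be replayed with a fresh timestamp header.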

Admin → {Gateway, Policy} (PL events)

Publisher: Admin signs tenant.updated, contract.updated; consumers verify signature + version.
Controls: Edition/residency changes require multi-party approval (break-glass logged).

Mandatory controls matrix¶

Area	Control	Enforced by
Transport	TLS at edge; mTLS in mesh	Ingress/Gateway, Mesh/Sidecars
Workload identity	SPIFFE/SPIRE or AAD Workload Identity; no static secrets in pods	Platform IAM
AuthN/AuthZ	OAuth2/JWT (clients); ABAC/RBAC (services)	Gateway, Services
Tenancy	Server-enforced tenant filter; signed tenant claims	Gateway, All services
PII/Classification	Policy classifies; Ingestion applies redaction before persist	Policy, Ingestion
Rate limiting	Per-tenant/client; separate write vs read buckets	Gateway
Data-at-rest	KMS keys per store (event store, read models, search, exports)	Platform/KMS
Secrets	Central vault; short-lived tokens; no inline secrets	Platform
Logging	Structured, redacted logs; no PII beyond policy	All services
Integrity	Hash chains, Merkle roots, anchored periodically	Integrity

Notes & defaults¶

Default-deny on network and IAM. Only declared edges are allowed.
Additive-first versioning; breaking changes require new vN and ADR.
Residency and region enforced by Admin at provisioning; data paths are tenant-scoped.
Back-pressure policies (429/deferral/DLQ) must not leak cross-tenant timing channels.

See also: Tenancy & Ownership for partitioning/RLS, and Reliability Notes for DLQ/replay guarantees.

Reliability Notes: Hot Paths & Recovery¶

This section names the golden paths, sets p95 targets, and documents back-pressure + replay entry points so SREs and service owners share the same operational contract.

SLIs & scope¶

Availability (per OHS): successful responses / total, excluding client 4xx (except 429).
Latency (p95) measured server-side within the same region; excludes client/network RTT.
Durability of append: an append is “accepted” once persisted to the Ingestion event store; downstream consumers are eventually consistent.

Golden Path A — Append → Accept → (event) Project/Search/Integrity¶

User intent: client app appends an audit entry.

Targets

Gateway POST /audit/append (sync mode): p95 ≤ 150 ms, p99 ≤ 300 ms.
If Gateway sheds to async mode (intent enqueue): p95 ≤ 60 ms for 202 Ack; Ingestion accept within p95 ≤ 2 s end-to-end (queue to persist).

Back-pressure & controls

Gateway: per-tenant token buckets; on breach → 429 with Retry-After. Adaptive mode: switches from sync to async (enqueue audit.append intent) when the local queue depth exceeds a threshold.
Gateway → Ingestion (sync path): if Policy/Ingress budget exceeded → 503 with Retry-After; clients retry with the same X-Idempotency-Key.
Ingestion: write throttles to event store; if pressure → stage to durable queue; consumers slowed via prefetch & concurrency limits.
Events (Ingestion → {Projection, Search, Integrity}): Azure Service Bus (ASB) DLQ after N attempts (default 10, exponential backoff); deferral for out-of-order ranges; tenant-partitioned subscriptions.
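
The consumer-side policy in the last item (retry with exponential backoff, then dead-letter after N attempts) can be sketched without any broker dependency. Only the default of 10 attempts and the exponential shape come from the text. The base delay, the cap, the `tenantId` event field, and the function names are assumptions.

```python
def backoff_delays(max_attempts=10, base_s=0.5, cap_s=60.0):
    """Delays between attempts: base * 2**n, capped; N attempts need N-1 delays."""
    return [min(cap_s, base_s * 2 ** n) for n in range(max_attempts - 1)]


def consume(event, handler, dead_letters, max_attempts=10, sleep=lambda seconds: None):
    """Run handler with retries; on exhaustion, park the event in a tenant-scoped DLQ.

    `sleep` defaults to a no-op so the sketch runs instantly; pass time.sleep
    (or a scheduler hook) in a real consumer.
    """
    delays = backoff_delays(max_attempts)
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            handler(event)
            return True
        except Exception as exc:  # broad on purpose: any failure counts as an attempt
            last_error = exc
            if attempt < max_attempts:
                sleep(delays[attempt - 1])
    dead_letters.setdefault(event["tenantId"], []).append(
        {"event": event, "attempts": max_attempts, "error": repr(last_error)}
    )
    return False
```

Keying the dead-letter store by tenant keeps one tenant's poison messages from blocking replay for another.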

Replay / rebuild

Re-ingest (safe): Ingestion.Replay(tenantId, fromOffset, toOffset?) — idempotent by {TenantId, ProducerAppId, Source, IdempotencyKey}.
Projection: Projection.Rebuild(tenantId, checkpoint?) — drains from event store; checkpoints per tenant.
Search: Search.Reindex(tenantId, range?) — idempotent on documentId + version.
Integrity: Integrity.Repair(tenantId, gapRange) — recomputes chains and anchors; no cross-tenant proofs.
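
A checkpointed, tenant-scoped rebuild in the spirit of `Projection.Rebuild(tenantId, checkpoint?)` can be sketched as a fold over the event log. The event fields (`tenantId`, `entryId`, `action`), the 1-based offsets, and the view shape are assumptions for the example.

```python
def rebuild(events, tenant_id, view=None, checkpoint=0):
    """Apply events after `checkpoint` to the tenant's view.

    Returns (view, new_checkpoint). Offsets are 1-based positions in `events`.
    Writes are keyed by entryId, so re-applying an event is harmless.
    """
    view = dict(view or {})
    for offset, event in enumerate(events, start=1):
        if offset <= checkpoint:
            continue  # already applied in an earlier run
        if event["tenantId"] == tenant_id:
            view[event["entryId"]] = {"action": event["action"], "offset": offset}
        checkpoint = offset
    return view, checkpoint
```

A rebuild is deterministic when resuming from a checkpoint yields the same view as replaying from offset zero, which is the property operators rely on when they choose between a full and an incremental rebuild.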

Observability

Required dimensions: {tenantId, route, mode(sync|async), status, idempotencyHit(bool)}; export drain rate and oldest message age for audit.append.

Golden Path B — Query → (read models) → (optional) Verify¶

User intent: list/filter audit records, optionally verify integrity on read.

Targets

POST /audit/search: p95 ≤ 250 ms for page sizes ≤ 100.
GET /audit/records/{id} + POST /audit/verify: p95 ≤ 200 ms for typical proof windows (≤ 1,000 entries).

Back-pressure & controls

Gateway: per-tenant read buckets; expensive queries are shaped (server-side limits) or return 429 with guidance.
Query: protects read models with max rows / timeouts; caches hot filters; coalesces repeated requests by tenantId+hash(query) within a window.
Query → Integrity: if verify exceeds budget → return partial set + verificationPending=true (optional), or 202 to async verify with callback.

Replay / rebuild

Source of truth is Projection; if gaps detected → trigger Projection.Rebuild(tenantId).
Integrity verification can be deferred: VerifyRangeAsync(ticket); result cached by (tenantId, rangeHash).

Observability

Export SLIs: qps, hitRatio(cache), p95, timeouts, verifyRate, verifyLatency.
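
The request coalescing by tenantId+hash(query) listed under the back-pressure controls for this path can be sketched as a small time-windowed cache. SHA-256 over sorted-key JSON, the window length, and the `Coalescer` class are assumptions. The sketch also ignores concurrent in-flight requests, which a real implementation would need to handle.

```python
import hashlib
import json


def coalesce_key(tenant_id: str, query: dict) -> str:
    """Same tenant + logically equal query -> same key, regardless of key order."""
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode()).hexdigest()
    return f"{tenant_id}:{digest}"


class Coalescer:
    """Reuses a result for identical queries that arrive within window_s seconds."""

    def __init__(self, window_s: float):
        self.window_s = window_s
        self._cache = {}  # key -> (stored_at, result)

    def get_or_run(self, tenant_id, query, run, now):
        key = coalesce_key(tenant_id, query)
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self.window_s:
            return hit[1]
        result = run(query)
        self._cache[key] = (now, result)
        return result
```

Prefixing the key with the tenant keeps cached results from ever being shared across tenants, even when two tenants send byte-identical queries.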

Golden Path C — Export (long-running)¶

User intent: export query results to CSV/JSON/Parquet with signed delivery.

Targets

POST /export/jobs: p95 ≤ 120 ms to accept & enqueue.
Job start SLA: p95 ≤ 60 s under steady load; completion depends on dataset size; progress exposed via /export/jobs/{id}.

Back-pressure & controls

Bounded concurrency per tenant; queue length SLO and age alerts.
If object storage throttles → exponential backoff; partial chunks checkpointed.
Webhook retries (HMAC signed) with jittered backoff; DLQ on receiver 4xx/5xx after budget.

Replay / rebuild

Export.Resume(jobId): idempotent chunking; artifacts content-addressed.
Regenerate artifact: Export.Rerun(jobId) stores new version, previous retained by retention policy.

Observability

SLIs per tenant: queuedJobs, runningJobs, meanChunkTime, artifactSize, webhookFailures.

Golden Path D — Admin Updates (tenant/edition/policy)¶

User intent: update tenant/edition or policy; propagate safely.

Targets

POST /admin/policies or /admin/tenants: p95 ≤ 200 ms to persist and emit *.updated.
Downstream consumption SLO: p95 ≤ 30 s for Projection/Search to reflect policy changes.

Back-pressure & controls

Admin changes rate-limited; circuit prevents mass invalidation storms.
Consumers apply changes with version gating; if behind → queue defers until consistent.
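
A minimal sketch of version gating on the consumer side follows, assuming each message carries the policy version it was produced under (the `policyVersion` field and the `VersionGate` class are hypothetical).

```python
class VersionGate:
    """Hypothetical consumer-side gate keyed on a monotonically increasing version."""

    def __init__(self):
        self.applied_version = 0
        self.deferred = []

    def apply_update(self, version: int) -> bool:
        """Apply a policy update only if it is newer than what is already applied."""
        if version <= self.applied_version:
            return False  # stale or duplicate delivery, safe to ignore
        self.applied_version = version
        return True

    def admit(self, message: dict) -> str:
        """Defer messages that need a policy version this consumer has not seen yet."""
        if message["policyVersion"] > self.applied_version:
            self.deferred.append(message)
            return "defer"
        return "process"

    def release(self) -> list:
        """Return deferred messages that are now consistent with the applied version."""
        ready = [m for m in self.deferred if m["policyVersion"] <= self.applied_version]
        self.deferred = [m for m in self.deferred if m["policyVersion"] > self.applied_version]
        return ready
```

Ignoring stale versions also makes a rebroadcast harmless for consumers that are already up to date.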

Replay / rebuild

Admin.Rebroadcast(tenantId, type, version) seeds re-delivery.
Projection.Rebuild if classification/retention policy changed (ensures derived state consistency).

Observability

Track policy version adoption per service; alert on lag (e.g., > 2 versions).
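
The lag alert can be expressed as a small check over per-service adopted versions. The function name and the shape of the input are assumptions; the threshold of 2 mirrors the example above.

```python
def lagging_services(latest_version: int, adopted: dict, max_lag: int = 2) -> list:
    """Return (service, lag) pairs whose lag exceeds max_lag, worst first."""
    lags = [(name, latest_version - version) for name, version in adopted.items()]
    over = [item for item in lags if item[1] > max_lag]
    return sorted(over, key=lambda item: (-item[1], item[0]))
```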

Back-pressure & DLQ Matrix (summary)¶

Edge	Mechanism	Default policy
Client → Gateway	429 + `Retry-After`; adaptive degrade to async appends	Per-tenant token bucket; separate write/read buckets
Gateway ↔ Policy (sync)	503 + retry with jitter; small timeouts; circuit breaker	Budget 50 ms median; fallback = enqueue intent
Gateway → Ingestion (sync)	503 + retry w/ idempotency; or enqueue	Idempotent keys; drop duplicates by scope
Ingestion → Event Bus	Outbox + publish retry; handoff DLQ	Max attempts 10; DLQ w/ diagnostics and sample payload
Bus →	Consumer retry/backoff; DLQ/deferral	Tenant partition key; replay safe
Query → Integrity	Timeout budget; partial results or async verify	SLA ties to page size/proof window
Export → Webhook	Signed retries; DLQ after budget	6 attempts, exponential backoff, 5-minute max skew
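
The Client → Gateway row (per-tenant token bucket, separate write/read buckets, 429 + `Retry-After`) could be sketched as below. Capacity and refill rate are placeholder values, and `AdmissionController` is a hypothetical name.

```python
import math


class TokenBucket:
    """Classic token bucket; time is passed in so the sketch stays deterministic."""

    def __init__(self, capacity: float, refill_per_second: float, now: float):
        self.capacity = float(capacity)
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.updated_at = now

    def try_acquire(self, now: float):
        """Return (status, retry_after_seconds)."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 200, 0
        # Seconds until one full token is available, rounded up for Retry-After.
        return 429, math.ceil((1.0 - self.tokens) / self.refill_per_second)


class AdmissionController:
    """One bucket per (tenant, kind), where kind is "read" or "write"."""

    def __init__(self, capacity: float = 2, refill_per_second: float = 1.0):
        self._capacity = capacity
        self._refill = refill_per_second
        self._buckets = {}

    def admit(self, tenant_id: str, kind: str, now: float):
        key = (tenant_id, kind)
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, now)
            self._buckets[key] = bucket
        return bucket.try_acquire(now)
```

Keeping reads and writes in separate buckets means a burst of expensive queries cannot starve appends for the same tenant.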

Operator playbook pointers¶

Rebuild: Start with Projection, then Search, then re-issue VerifyRange if needed.
Drain: Pause producers for a tenant using Gateway admission control, then Replay from last good checkpoint.
Hot shard: Enable producer-side backoff; increase consumer concurrency for that partition only; avoid global scale-out first.

All paths rely on idempotency + tenant partitioning to make replay safe. If any component cannot guarantee this, file an ADR and link it in Evolution & ADR Links.
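
To see why idempotency plus tenant partitioning makes replay safe, consider this in-memory sketch. It uses the `{TenantId, Source, Key}` scope listed for Audit.Ingestion; the `IdempotentLog` class is hypothetical and stands in for the real store.

```python
class IdempotentLog:
    """Append-only log partitioned by tenant, deduplicated by idempotency scope."""

    def __init__(self):
        self._seen = set()
        self.partitions = {}  # tenant_id -> list of entries

    def append(self, tenant_id: str, source: str, key: str, entry: dict) -> bool:
        """Return True if the entry was stored, False if it was a duplicate."""
        scope = (tenant_id, source, key)
        if scope in self._seen:
            return False
        self._seen.add(scope)
        self.partitions.setdefault(tenant_id, []).append(entry)
        return True
```

Replaying the same batch any number of times leaves every tenant partition unchanged after the first pass, and the same key under another tenant is treated as a distinct entry.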

Evolution & ADR Links¶

We evolve edges additive-first and treat the context map as a governed contract. Any deviation or breaking move must carry an ADR.

Versioning & change rules¶

APIs (HTTP/gRPC)
- Allowed (non-breaking): add optional fields; add new endpoints; widen enum with default; increase limits with server-side caps.
- Breaking → new major vN: remove/rename fields; change semantics; tighten validation; change status codes.
- Surface major in path (/v2/...) or Accept header (v=2); run vN and vN+1 in parallel for ≥ 180 days.
Events/Topics
- Name as domain.event.vN; additive changes stay within vN; breaking changes → new subject .vN+1.
- Carry schemaId, schemaHash, tenantId, correlationId.
- Dual-publish during migrations; consumers opt-in per version.
Contracts registry
- Update contracts/index.md and contracts/registry.json with each change; bump SemVer.
- Emit contract.updated.v1 from Admin announcing availability/deprecation window.
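
The naming and envelope rules for events lend themselves to a simple lint. The sketch below assumes lowercase subjects of the form domain.event.vN; the exact character set allowed in each segment is an assumption.

```python
import re

# Assumed subject grammar: <domain>.<event>.v<major>, lowercase segments.
SUBJECT_PATTERN = re.compile(
    r"^(?P<domain>[a-z][a-z0-9_-]*)\.(?P<event>[a-z][a-z0-9_-]*)\.v(?P<major>[1-9][0-9]*)$"
)
REQUIRED_FIELDS = ("schemaId", "schemaHash", "tenantId", "correlationId")


def parse_subject(subject: str):
    """Return (domain, event, major) or None if the subject is not versioned."""
    match = SUBJECT_PATTERN.match(subject)
    if match is None:
        return None
    return match.group("domain"), match.group("event"), int(match.group("major"))


def missing_envelope_fields(envelope: dict) -> list:
    """List the required envelope fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not envelope.get(name)]
```

For example, `parse_subject("contract.updated.v1")` yields the domain, event, and major version, while an unversioned subject is rejected.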

Edge evolution policies (what’s allowed)¶

Add a new edge: allowed if tenancy, security, and SLO budget are documented (see checklist below).
Change edge style (e.g., CONF → ACL): require ADR with rationale (coupling, translation needs, upstream churn).
Remove an edge: only after the deprecation window closes and all consumers are verified as migrated (evidence via dashboards).
Raise criticality (e.g., BA → LS): needs capacity/SLO analysis and load test artifacts.
Residency/region impact: requires migration plan + data path verification; Admin gated.

When to introduce an ACL (decision cues)¶

Introduce an Anti-Corruption Layer if any of the following hold:

Upstream “Published Language” conflicts with our domain terms or policy model.
Upstream schema churns frequently or has weak versioning guarantees.
Security/classification requirements differ (need pre-ingest redaction/validation).
We need canary or translation for rollout without exposing internals.

If none apply and latency is critical, prefer Conformist.
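
The decision cues can be encoded as a small helper for review tooling. The field names and the "UNDECIDED" outcome are assumptions; this page does not say what to pick when no cue applies and latency is not critical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeAssessment:
    """Hypothetical answers to the decision cues for one proposed edge."""
    language_conflict: bool = False
    schema_churn: bool = False
    security_mismatch: bool = False
    needs_translation_or_canary: bool = False
    latency_critical: bool = False


def recommend_style(edge: EdgeAssessment) -> str:
    """Return "ACL" if any cue holds, "CONF" if none hold and latency is critical."""
    cues = (
        edge.language_conflict,
        edge.schema_churn,
        edge.security_mismatch,
        edge.needs_translation_or_canary,
    )
    if any(cues):
        return "ACL"
    if edge.latency_critical:
        return "CONF"
    return "UNDECIDED"  # not covered by the cues; needs a human call
```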

“Propose a new edge” — PR checklist (copy into PR description)¶

Schema & Contracts
- Contract file(s) under docs/domain/contracts/... with examples and validation (JSON Schema/Proto).
- Event subject named x.y.vN; schema registered in contracts/registry.json.
- Compatibility statement (additive vs breaking) and deprecation plan (if replacing an existing edge).
Tenancy & Data
- tenantId propagation from Gateway; server-enforced filters/RLS noted.
- Idempotency scope defined (if write/append): {TenantId, ProducerAppId, Source, Key}.
- Residency/region path (storage/index/bucket prefix).
Security
- AuthN (OAuth2/JWT or workload identity) and AuthZ posture (ABAC/RBAC).
- PII classification hooks; redaction policy at accept time.
- KMS usage for any data-at-rest or artifact produced; secret source (Vault).
Observability
- W3C trace propagation (traceparent); correlation fields in events.
- Metrics: QPS, p95, error rate, DLQ age / drain rate; logs redaction noted.
- Dashboards/alerts updated (golden signals).
SLO & Reliability
- Proposed p95/p99; capacity estimate; back-pressure behavior (429/503, deferral).
- Retry policy, DLQ policy, replay entry points; idempotency verified.
Docs & Governance
- context-map.md edges/labels updated with style & criticality.
- events-catalog.md entry added/updated.
- Runbook link (rebuild/replay).
- ADR linked (see below).

ADRs for deviations (link examples)¶

docs/adr/2025-10-22-gw-query-conformist.md — Adopt Conformist on Gateway→Query to minimize latency.
docs/adr/2025-10-22-introduce-acl-external-publishers.md — Add ACL adapters at Gateway for external systems.
docs/adr/2025-10-22-event-versioning-v2-for-audit-appended.md — Promote audit.appended to v2 with new required field evidenceHint.

Tip: Use the ADR template (docs/adr/_template.md) and tag with context-map, contracts, security, tenancy.

Governance hooks (automation)¶

CI checks:
- Contract lint (schema validity), breaking-change detector (forbidden field deletes), link checker for docs.
- “Contracts registry up-to-date” gate; CODEOWNERS require sign-off from Domain, SRE, Security.
Release:
- Dual-run vN & vN+1; publish adoption dashboards; emit contract.updated.
- Set deprecation tombstones with removal date; auto-open tracking issue.
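
A breaking-change detector for the CI gate might start from something like the sketch below. It assumes contracts are JSON Schema-style dicts with top-level `properties` and `required`, and it only covers field removal and newly required fields; a real gate would need to handle nested schemas and type changes.

```python
def breaking_changes(old_schema: dict, new_schema: dict) -> list:
    """Compare two schema versions and list changes that require a new major."""
    problems = []
    old_props = old_schema.get("properties", {})
    new_props = new_schema.get("properties", {})
    for name in sorted(old_props):
        if name not in new_props:
            problems.append(f"removed field: {name}")
    old_required = set(old_schema.get("required", []))
    new_required = set(new_schema.get("required", []))
    for name in sorted(new_required - old_required):
        problems.append(f"new required field: {name}")  # tightened validation
    return problems
```

An empty result means the change is additive under these two checks and can stay within the current vN.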

If a PR passes this section’s checklist and CI gates, reviewers can approve without additional architecture meetings.