Data Residency & Retention - Audit Trail Platform (ATP)¶
Place data where required, keep it only as long as needed, and prove both continuously.
Purpose & Scope¶
- Define residency controls for tenant data placement and movement across regions.
- Specify retention, deletion, and legal hold policies for audit evidence lifecycle.
- Align with ConnectSoft principles: security-first, policy-as-code, immutability, observability.
Applies to
- Data planes: Hot append store (event/evidence), Warm read models/search, Cold archives/exports.
- Control planes: Policy/guard engine, residency catalog, key management, legal holds, retention scheduler.
- All bounded contexts that persist or move audit evidence (Gateway, Ingestion, Integrity, Query, Export).
Out of scope
- Re-defining identity/authorization primitives (see security-compliance.md).
- Performance/SLO targets for specific services (see deployment-views.md).
- Domain entity semantics (see data-model.md).
Assurance
- All residency and retention decisions are logged, emitted as metrics, and provable (auditor-ready evidence packs).
Residency Model Overview¶
Residency defines where tenant data may live and how it may move. These primitives are enforced by guards on every write, read, export, replication, and migration.
Primitives (Glossary)¶
| Name | Type | Example | Purpose |
|---|---|---|---|
| `RegionCode` | Enum | `US`, `EU`, `IL` | Jurisdiction umbrella for policy decisions. |
| `CloudRegion` | String | `eastus`, `westeurope` | Cloud provider region used for placement. |
| `DataSiloId` | String/UUID | `silo-7c1a…` | Logical partition binding storage/index/messaging to a tenant in-region. |
| `ResidencyProfileId` | String | `gdpr-standard`, `us-hipaa` | Bundle of constraints: allowed regions, export routes, DR posture. |
| `SovereigntyTags` | Set | `["EEA","UK"]` | Hints for cross-border evaluation. |
| `CrossBorderRules` | Enum | `deny`, `same_code`, `allow` | High-level constraint for cross-code flows. |
| `MovementMode` | Enum | `none`, `replicate_read`, `migrate_once` | If/how data may replicate/migrate. |
| `FailoverPosture` | Enum | `read_only`, `full`, `blocked` | What the platform may do during DR. |
| `ExportRoute` | Enum | `in_region`, `same_code`, `global` | Where DSAR/eDiscovery may land. |
| `PolicyVersion` | SemVer | `2.3.0` | Versioned policy for reproducible evaluations. |
Binding rule: each tenant has one authoritative `CloudRegion` per environment. Movement from that region is deny-by-default unless explicitly allowed by its `ResidencyProfileId`.
Placement Strategy¶
- Write locality. All appends/writes occur in the tenant's authoritative `CloudRegion`.
- In-region reads first. Read models/search indices are built per region; cross-region queries are blocked unless the profile allows it.
- Resource naming. `atp-<svc>-<env>-<cloudregion>` (e.g., `atp-ingest-prod-westeurope`) with tags: `tenantId`, `regionCode`, `residencyProfileId`.
- Residency Catalog. Global, read-only map: `tenantId → {CloudRegion, RegionCode, ResidencyProfileId, DataSiloId}`.
- Jurisdiction-aware services. Gateway, Export, and Query services evaluate residency before routing, and stamp decisions (policy version, reason) into logs/metrics.
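The catalog binding can be modeled as a small read-only lookup that every guard consults first. A minimal Python sketch; the `ResidencyBinding` type and the sample tenant IDs are illustrative, not part of the platform API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ResidencyBinding:
    """Read-only catalog entry: one authoritative region per tenant/environment."""
    cloud_region: str          # e.g. "westeurope"
    region_code: str           # e.g. "EU"
    residency_profile_id: str  # e.g. "gdpr-standard"
    data_silo_id: str          # e.g. "silo-7c1a"

# Global, read-only residency catalog: tenantId -> binding
CATALOG = {
    "tenant-a": ResidencyBinding("westeurope", "EU", "gdpr-standard", "silo-7c1a"),
    "tenant-b": ResidencyBinding("eastus", "US", "us-hipaa", "silo-9d2b"),
}

def resolve_binding(tenant_id: str) -> ResidencyBinding:
    """Every write/read/export guard starts from this authoritative lookup."""
    try:
        return CATALOG[tenant_id]
    except KeyError:
        # Unknown tenant: deny-by-default, never guess a placement.
        raise PermissionError(f"no residency binding for {tenant_id}")
```

Data-plane services would cache this map and revalidate it on each operation, as described under Control Plane & Catalog.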
Movement Rules¶
- Replication. `replicate_read` permits read-only replicas in additional allowed regions within the same `RegionCode` family (e.g., EU↔EU). No cross-family replication without an explicit profile override and approval trail.
- Failover. On regional incident, default to `read_only` in an approved replica region until regulatory confirmation promotes to `full`. All break-glass actions are time-bound, least-privilege, fully audited.
- Migration.
- Planned moves follow copy → verify → cutover → decommission.
- Proofs/manifests are re-anchored with provenance (old/new region, key IDs); originals retained per policy.
Deterministic Decision Order¶
1. Resolve tenant binding → authoritative `CloudRegion`.
2. Load `ResidencyProfileId` → allowed moves/exports & DR posture.
3. Evaluate operation type → write/read/export/replicate/migrate.
4. Classify data category → evidence vs derived/read-model vs export.
5. Check service/network guards → in-region placement, private links, deny-by-default egress.
6. Produce result → allow / deny / quarantine, with policy version, reason, and evidence record.
flowchart TD
A[Incoming Operation] --> B{Operation Type?}
B -->|Write/Append| C[Resolve tenant binding → CloudRegion]
C --> D{Region allowed?}
D -->|No| X[DENY 403 • GuardViolation • Log+Metric]
D -->|Yes| E[Place in-region DataSilo • Tag lineage]
B -->|Read/Query| F[Check in-region availability]
F --> G{Cross-region allowed by profile?}
G -->|No| X
G -->|Yes| H[Route via approved replica • Log purpose]
B -->|Export/DSAR| I[Validate ExportRoute against profile]
I -->|Disallow| X
I -->|Allow| J[Produce signed manifest • Deliver]
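The decision order and flowchart above can be condensed into a guard function. A minimal sketch, assuming simplified `Profile` and `Decision` shapes (the real policy engine evaluates versioned bundles with many more constraints):

```python
from dataclasses import dataclass

@dataclass
class Profile:
    allowed_regions: set           # CloudRegions this profile permits for reads/replicas
    export_route: str = "in_region"  # in_region | same_code | global

@dataclass
class Decision:
    result: str                    # allow | deny
    reason: str
    policy_version: str = "2.3.0"  # stamped for reproducible evaluations

def evaluate(op: str, tenant_region: str, target_region: str,
             profile: Profile) -> Decision:
    """Deterministic order: binding -> profile -> operation type -> guards -> result."""
    if op in ("write", "append") and target_region != tenant_region:
        return Decision("deny", "writes must land in authoritative region")
    if op in ("read", "replicate") and target_region not in profile.allowed_regions:
        return Decision("deny", f"{target_region} not allowed by profile")
    if op == "export" and profile.export_route == "in_region" \
            and target_region != tenant_region:
        return Decision("deny", "export route restricted to in_region")
    return Decision("allow", "all residency guards passed")
```

Every `Decision` would be logged with its policy version and reason, matching the "Log+Metric" branch of the flowchart.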
Examples¶
Policy fragment (residency)
residency:
regionCode: EU
cloudRegion: westeurope
residencyProfileId: gdpr-standard
movement:
replicate: replicate_read
failoverPosture: read_only
exportRoute: in_region
policyVersion: 2.3.0
Access claims & headers (sketch)
Authorization: Bearer <jwt with tenant_id, region_code=EU, residency_profile=gdpr-standard>
X-Tenant-Id: <uuid>
X-Region-Code: EU
X-Correlation-Id: <uuid>
Regional Topologies & Placement¶
Design resources to be region-first and tenant-scoped, with a strict network perimeter. Control planes are regionally overlaid and discovered via a global catalog; data planes operate only within approved boundaries.
Storage & Indexing Topology¶
- Per-region partitions. Authoritative append stores, read-model indices, and archives are region-scoped (e.g., buckets/tables/indices per `CloudRegion`), never shared across regions.
- Tenant scoping. Every physical artifact is tagged with `tenantId`, `dataSiloId`, `regionCode`, and `residencyProfileId`. Access controls and queries must include these tags.
- Hot/Warm/Cold layout.
  - Hot (append/WORM): region-local event/evidence store.
  - Warm (query/search): region-local projections and indices, rebuilt from hot when needed.
  - Cold (archive/export): region-compliant object storage with lifecycle rules.
- No cross-region writes. Writes land only in the tenant's authoritative `CloudRegion`. Cross-region read replicas exist only if policy allows (see Residency Model Overview).
Control Plane & Catalog¶
- Region overlays. Configuration is layered: `base → env → region`. Region overlays (e.g., `values.westeurope.yaml`) bind SKUs, quotas, encryption keys, and feature gates.
- Global residency catalog. Read-only catalog maps `tenantId → {CloudRegion, RegionCode, ResidencyProfileId, DataSiloId}`; mutated only by vetted admin workflows. Data-plane services cache and validate on each operation.
- Policy distribution. Versioned policy bundles are pushed per region; services evaluate using the active policy version and stamp it into decision logs.
- Rollout safety. Blue/green or canary at the region edge (gateway/APIM/ingress) before enabling internals; control-plane drift detection blocks out-of-sync regions.
Network Boundaries & Egress¶
- VPC/VNet segmentation. Each region uses a dedicated VPC/VNet with default-deny east-west rules; inter-region peering is disabled by default.
- Private service endpoints. Databases, KMS/HSM, object storage, and search are exposed via Private Link/Endpoints only; no public IPs on stateful services.
- mTLS service mesh. East-west traffic terminates in the mesh; identities are SPKI-pinned; per-tenant `tenantId` and `regionCode` are propagated as headers/claims.
- Controlled egress. Centralized egress via NAT + allow-list; DNS is restricted to approved resolvers; outbound to external SaaS is gated by residency policy.
- Break-glass path. Time-boxed rules for incident response (read-only by default); every exception emits an auditable event and is auto-revoked on TTL expiry.
Resource Naming & Tagging¶
Naming convention: `atp-<svc>-<env>-<cloudregion>` (e.g., `atp-ingest-prod-westeurope`)
Required tags
| Key | Value example | Purpose |
|---|---|---|
| `tenantId` | `b3f2…` | Authorization & data isolation |
| `dataSiloId` | `silo-7c1a…` | Physical partition mapping |
| `regionCode` | `EU` | Jurisdiction/sovereignty evaluation |
| `cloudRegion` | `westeurope` | Placement and routing |
| `residencyProfileId` | `gdpr-standard` | Policy evaluation |
| `policyVersion` | `2.3.0` | Evidence reproducibility |
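Since the data plane rejects operations with missing or invalid tags (see the guardrails below), a tag check runs on every artifact. A minimal validator sketch; the function name and violation format are illustrative:

```python
# The six required tags from the table above.
REQUIRED_TAGS = {"tenantId", "dataSiloId", "regionCode", "cloudRegion",
                 "residencyProfileId", "policyVersion"}

def validate_tags(tags: dict) -> list:
    """Return a list of violations; the data plane rejects when non-empty."""
    problems = [f"missing tag: {k}" for k in sorted(REQUIRED_TAGS - tags.keys())]
    problems += [f"empty tag: {k}" for k in sorted(REQUIRED_TAGS & tags.keys())
                 if not str(tags[k]).strip()]
    return problems
```

A rejected operation would emit the violation list into the decision log alongside the active policy version.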
Placement Examples¶
- Ingestion (Hot). `atp-ingest-<env>-<cloudregion>` → region-local append store; DLQ is region-local queue/topic.
- Query (Warm). `atp-query-<env>-<cloudregion>` → region-local search/index; cross-region queries blocked unless policy allows.
- Export (Cold). `atp-export-<env>-<cloudregion>` → region-local object storage with lifecycle → Glacier/Archive per policy.
- Integrity. `atp-integrity-<env>-<cloudregion>` → region-local segment sealing; optional external timestamping endpoints are egress-allow-listed per region.
flowchart LR
subgraph EU[westeurope VNet]
GW[Gateway/APIM]-->ING[Ingestion Svc]
ING-->HOT[(Hot Append Store)]
QRY[Query Svc]-->IDX[(Search/Index)]
EXP[Export Svc]-->COLD[(Cold Object Storage)]
ING-.Private Link.->PL1[(DB/KV/HSM)]
end
subgraph US[eastus VNet]
GWU[Gateway/APIM]-->INGU[Ingestion Svc]
INGU-->HOTU[(Hot Append Store)]
end
EU <-. no peering by default .-> US
Guardrails (quick checklist)¶
- Regions must be independently operable; no hidden dependencies across regions.
- Any route leaving a region must pass policy evaluation, network allow-lists, and purpose/justification logging.
- Control-plane drift blocks deployments; data-plane rejects operations with missing/invalid tags.
- Public endpoints on stateful services are forbidden; egress is deny-by-default.
Replication & Disaster Recovery¶
Design for authoritative single-writer per tenant region, with carefully scoped replicas that respect residency. DR actions are policy-driven, audited, and reversible.
Durability & Replication Scope¶
- Intra-region durability. Multi-zone deployments for services and state; append stores use zone-redundant storage; queues/topics have DLQ and replay.
- Authoritative writer. Hot append (evidence) has one writer region per tenant; replicas never accept writes unless a controlled promotion occurs.
- Warm replicas. Read models/search indices may be rebuilt or replicated to other allowed regions for low-latency reads (policy-gated).
- Cold replicas. Archives/exports can be asynchronously copied to additional allowed regions for survivability, subject to residency profile.
Replication Modes¶
| Mode | Write posture | Read posture | Typical use | Notes |
|---|---|---|---|---|
| Intra-region HA | Single writer (region) | Local reads | All tiers | Zone redundancy; zero cross-region dependency. |
| Active/Passive | Single writer (primary) | Optional read on passive (policy) | Standard/Business | Async replication of cold (and optionally warm) to a passive region; failover requires promotion. |
| Active/Active (read) | Single writer (primary) | Read replicas in multiple regions | Enterprise | Hot remains single-writer; warm/cold are replicated to additional allowed regions. |
| Active/Active (full) | Multi-writer | Multi-region reads/writes | Not default | Only by exception where residency allows cross-border writes; requires conflict and legal basis controls. |
Cross-region replication is limited to regions allowed by the tenant’s Residency Profile (e.g., EU↔EU). Cross-family flows (EU↔US) are deny-by-default.
Residency-Safe Failover¶
- Default posture: enter read-only in the recovery region until regulatory confirmation authorizes write promotion.
- Promotion guardrails: time-bound approval, purpose justification, least-privilege roles, and full audit trail.
- Rollback ready: promotion creates a reversible change set; return to original region when healthy and allowed.
- Break-glass: emergency override path with automatic TTL and mandatory post-mortem.
Failover flow (sketch)
flowchart TD
A[Incident detected] --> B{Residency allows target region?}
B -- No --> X[Block • Incident continues in primary • Notify compliance]
B -- Yes --> C[Enter read-only in target]
C --> D{Regulatory approval?}
D -- No --> C
D -- Yes --> E[Promote writer • Update catalog]
E --> F[Stamp policy version • Emit dr.failover_completed]
RPO/RTO Targets by Tenant Tier (examples)¶
| Tier | RPO target | RTO target | Replication scope (default) |
|---|---|---|---|
| Standard | ≤ 30 minutes | ≤ 4 hours | Cold → passive; warm rebuilt on demand |
| Business | ≤ 15 minutes | ≤ 2 hours | Cold + selected warm → passive |
| Enterprise | ≤ 5 minutes | ≤ 30 minutes | Cold + warm to multiple read replicas |
Targets are per-edition defaults. Tenants may negotiate stricter objectives; stricter settings increase cost and require capacity planning.
Verification & Drills¶
- Lag SLO: alert if replication lag exceeds 50% of RPO for 2 consecutive intervals.
- Warm restore drills: monthly sample tenant/time-window restore from replicas; compare checksums/counts and re-anchor proofs if needed.
- Cold restore drills: quarterly region-level restore from archives; verify manifests, signatures, and replayability.
- Failover exercises: semiannual controlled failover/promotion in a sandbox or designated window; publish results.
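The lag SLO above (alert when replication lag exceeds 50% of RPO for two consecutive intervals) is easy to express as a streaming check. A minimal sketch; the function name and sample values are illustrative:

```python
def lag_alerts(lag_samples_s, rpo_s, threshold=0.5, consecutive=2):
    """Return indices where lag exceeded threshold*RPO for N consecutive samples."""
    alerts, streak = [], 0
    for i, lag in enumerate(lag_samples_s):
        # Count consecutive breaches; any compliant sample resets the streak.
        streak = streak + 1 if lag > threshold * rpo_s else 0
        if streak >= consecutive:
            alerts.append(i)
    return alerts
```

For a Business-tier tenant (RPO ≤ 15 minutes, i.e. 900 s), two back-to-back samples above 450 s of lag would trigger the alert.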
Evidence pack (per drill)
- Run ID, policy version, regions, datasets, start/stop times.
- RPO/RTO achieved, replication lag histogram, errors/retries.
- Manifest of restored objects (counts, byte sizes, checksums, proof references).
- Approvals and operator actions (who/when/why).
Policy Fragment (example)¶
dr:
mode: active_passive
writerRegion: westeurope
replicas:
- region: northeurope
scope: [cold, warm] # hot remains single-writer
rpo: PT15M
rto: PT2H
failover:
defaultPosture: read_only
approvalRequired: true
ttl: PT4H # auto-revoke promotion if not confirmed
residency:
allowedRegionCodes: [EU] # no cross-family replication
Runbook Triggers & Events¶
- Triggers: health SLO breach, storage outage, operator signal, regulator directive.
- Events: `replication.lag_exceeded`, `dr.failover_initiated`, `dr.writer_promoted`, `dr.failback_completed`.
- Catalog updates: writer/replica roles are updated atomically and versioned; data-plane services must revalidate on change.
Guardrails (quick checklist)¶
- Single authoritative writer per tenant; no silent multi-writer.
- Replication and failover respect Residency Profile and network allow-lists.
- Promotions are time-boxed, audited, and reversible.
- Drills are scheduled, measured against RPO/RTO, and produce evidence packs.
Encryption & Keying per Region/Tenant¶
All persisted artifacts are encrypted at rest using envelope encryption. Keys are tenant-scoped and region-anchored to comply with residency and separation-of-duties requirements. Integrity/signature keys are isolated from encryption keys.
Key Hierarchy & Anchors¶
- Envelope model.
- DEK (Data Encryption Key): per-segment/partition/file; rotated frequently; never leaves the service in plaintext.
- KEK (Key Encryption Key): per-tenant and per-region; managed in KMS/HSM; wraps DEKs.
- Root/Anchor keys: region-scoped KMS/HSM masters; used to create/rotate KEKs; never exported.
- Isolation. Encryption KEKs and Integrity Signing Keys live under separate HSM partitions and roles.
- Tagging. Every artifact/manifest includes: `keyId`, `keyVersion`, `alg`, `cloudRegion`, `policyVersion`, and `residencyProfileId`.
KMS/HSM Residency & Access¶
- Region-scoped KMS/HSM. Keys used for `westeurope` never leave EU HSM boundaries; admin operations occur from the same region.
- Private endpoints only. KMS/HSM access is via Private Link; no public network exposure.
- mTLS & least privilege. Services authenticate with workload identities; KMS policies restrict calls by `tenantId`, `regionCode`, and operation type.
Rotation, Rekey & Destruction¶
- Cadence (defaults).
- DEK: rotate daily for hot stores; per object for cold archives.
- KEK: rotate quarterly (or on-demand on compromise).
- In-place rewrap. DEKs are rewrapped to the new KEK without re-encrypting payloads; manifests updated with new `keyVersion`.
- Sunset windows. Old KEK versions stay enabled for 7 days for rewrap completion, then disabled and later destroyed following a retention policy.
- Forensic hold. KEK destruction is denied while any dependent artifact is under legal hold.
Escrow & Recovery¶
- Escrow vault (same jurisdiction). KEK material is escrowed as wrapped blobs with an M-of-N recovery policy; escrow HSM stays within the same `RegionCode` (e.g., EU).
- Recovery drill. Quarterly: simulate KEK loss → restore via escrow → verify DEK unwrap → produce evidence pack.
- Cross-border prohibition. Escrow copies cannot be replicated outside the jurisdiction unless an explicit break-glass policy exists.
Separation of Duties (SoD)¶
| Role | Capabilities | Exclusions |
|---|---|---|
| Key Admin | Create/rotate/disable KEKs, manage HSM partitions, configure escrow | Cannot read data; cannot sign integrity |
| Security Officer | Approve rotations, grant time-bound access, initiate recovery drills | Cannot generate KEKs or export escrow blobs |
| Auditor | Read-only: view key events, policies, drill results | No operational access to KMS/HSM |
| Service Principal | DEK generate/wrap/unwrap for its tenant/region scope | No KEK admin; no cross-region operations |
All sensitive actions require dual approval (Key Admin + Security Officer) and produce immutable audit events.
Auditability & Evidence¶
- Events. `key.created`, `key.rotated`, `key.disabled`, `key.destroyed`, `key.unwrap_denied`, `key.rewrap_completed`, `escrow.restore_test_passed`.
- Logs. Every KMS call stamped with `{tenantId, regionCode, keyId, keyVersion, purpose, policyVersion, correlationId}`.
- Dashboards. Rotation currency, disabled-key backlog, unwrap-denied counts, escrow drill pass rate.
Service Flow (envelope encryption)¶
sequenceDiagram
autonumber
participant S as Service (Ingestion/Export)
participant K as KMS/HSM (Region)
participant ST as Storage (Region)
Note over S: Acquire DEK for segment
S->>K: GenerateDataKey(tenantId, regionCode)
K-->>S: { DEK_plain, DEK_wrapped_by_KEK }
S->>ST: Encrypt(payload, DEK_plain) → store(ciphertext)
S->>ST: Store manifest{ keyId, keyVersion, DEK_wrapped, alg, policyVersion }
S->>S: Zeroize(DEK_plain)
Note over S: On read: fetch DEK_wrapped → KMS Unwrap → decrypt
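The sequence above can be sketched end to end. This uses the same toy cipher idea as a stand-in for KMS `GenerateDataKey`/`Unwrap` and an AEAD payload cipher (illustrative only; `store_segment`/`read_segment` and the manifest fields mirror the diagram, not a real API):

```python
import hashlib, secrets

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in symmetric cipher (SHA-256 counter keystream XOR). Illustrative
    only; a real service would use an AEAD cipher and the regional KMS/HSM."""
    out, ctr = bytearray(), 0
    while len(out) < len(data):
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return bytes(b ^ k for b, k in zip(data, out))

KEK = secrets.token_bytes(32)  # per-tenant, per-region key held inside the HSM

def store_segment(payload: bytes) -> dict:
    """Envelope encrypt: fresh DEK per segment; DEK is never persisted in plaintext."""
    dek = secrets.token_bytes(32)              # KMS GenerateDataKey
    record = {
        "ciphertext": toy_cipher(dek, payload),
        "dek_wrapped": toy_cipher(KEK, dek),   # wrapped DEK goes in the manifest
        "keyId": "kek-tenant-a", "keyVersion": 3,
        "alg": "toy-xor", "policyVersion": "2.3.0",
    }
    del dek  # drop the plaintext DEK reference (a real service would zeroize memory)
    return record

def read_segment(record: dict) -> bytes:
    dek = toy_cipher(KEK, record["dek_wrapped"])  # KMS Unwrap
    return toy_cipher(dek, record["ciphertext"])
```

Note how the manifest carries `keyId`/`keyVersion`/`alg`/`policyVersion`, matching the tagging rule in Key Hierarchy & Anchors.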
Example Policy Snippets¶
Key policy (per region/tenant)
keys:
region: westeurope
tenantScope: true
algorithms:
encrypt: AES-256-GCM
wrap: RSA-OAEP-256
sign: ECDSA-P256-SHA256
rotation:
dek: P1D
kek: P90D
sunset: P7D
escrow:
enabled: true
jurisdiction: EU
recovery:
quorum: { m: 2, n: 4 } # M-of-N recovery
restrictions:
crossBorder: deny
breakGlass:
allowed: false # set true only with regulator-approved profile
Service permission (sketch)
iam:
principal: atp-ingest
allow:
- kms:GenerateDataKey
- kms:Decrypt
conditions:
tenantId: <tenant>
regionCode: EU
deny:
- kms:ScheduleKeyDeletion
- kms:CreateKey
Guardrails (quick checklist)¶
- Keys are region-anchored; no cross-border unwrap.
- DEKs are short-lived; KEKs rotate on schedule and on incident.
- Integrity signing keys are separate from encryption keys.
- All key ops are dual-approved, logged, and evidenced.
- Escrow and recovery stay within the same jurisdiction; drills are regular and reported.
Retention Policy Model¶
Retention is policy-as-code: declarative, versioned, and evaluated deterministically on every lifecycle window. Defaults exist at the platform level; tenants may override within bounded ranges and at stream/category granularity.
Concepts¶
- Category taxonomy. Each record/segment belongs to one category: `evidence.hot` (append/WORM), `evidence.manifest`, `readmodel.warm`, `archive.cold`, `integrity.proof`, `export.bundle`, `ops.dlq`, `ops.log`.
- Modes.
  - `PURGE` — physically delete when eligible.
  - `REDACT_TOMBSTONE` — remove sensitive payloads, keep tombstone + lineage.
  - `ARCHIVE_THEN_PURGE` — export to cold storage with signed manifest, then purge source.
- Windows. ISO-8601 durations (e.g., `P7Y`, `P90D`). Effective window is computed per record using the active policy version at write time unless a stricter policy is later applied (see “Stricter-Only Evolution”).
- Eligibility. A record becomes Eligible when `createdAt + retentionWindow ≤ now` and there is no active hold or compliance block.
Defaults & Allowed Overrides¶
| Category | Default | Tenant Min | Tenant Max | Mode |
|---|---|---|---|---|
| `evidence.hot` | `P7Y` | `P1Y` | `P10Y` | `ARCHIVE_THEN_PURGE` |
| `evidence.manifest` | `P10Y` | `P7Y` | `P15Y` | `REDACT_TOMBSTONE` |
| `readmodel.warm` | `P90D` | `P30D` | `P1Y` | `PURGE` |
| `archive.cold` | `P10Y` | `P7Y` | `P15Y` | `PURGE` |
| `integrity.proof` | `P15Y` | `P10Y` | `P20Y` | `REDACT_TOMBSTONE` |
| `export.bundle` | `P2Y` | `P90D` | `P5Y` | `PURGE` |
| `ops.dlq` | `P30D` | `P7D` | `P90D` | `PURGE` |
| `ops.log` | `P180D` | `P30D` | `P2Y` | `PURGE` |
Tenants may set stricter retention (shorter window) within the allowed bounds. Loosening beyond platform maxima requires a governance exception.
Policy-as-Code Bundle (example)¶
policyVersion: 3.1.0
retention:
defaults:
evidence.hot: { window: P7Y, mode: ARCHIVE_THEN_PURGE }
evidence.manifest: { window: P10Y, mode: REDACT_TOMBSTONE }
readmodel.warm: { window: P90D, mode: PURGE }
archive.cold: { window: P10Y, mode: PURGE }
integrity.proof: { window: P15Y, mode: REDACT_TOMBSTONE }
export.bundle: { window: P2Y, mode: PURGE }
ops.dlq: { window: P30D, mode: PURGE }
ops.log: { window: P180D, mode: PURGE }
bounds:
min:
evidence.hot: P1Y
readmodel.warm: P30D
max:
evidence.hot: P10Y
archive.cold: P15Y
integrity.proof: P20Y
tenantOverrides:
# Example tenant with stricter windows
7c1a3a1d-...:
evidence.hot: { window: P5Y } # stricter than default
export.bundle: { window: P1Y }
streamOverrides:
# Specific stream/category override (e.g., financial evidence)
stream: "fin.ar"
category: evidence.hot
window: P10Y
mode: ARCHIVE_THEN_PURGE
exceptions:
legalHold: true
regulatorExtension: true
investigationHold: true
dryRun:
enabled: true
sampleRate: 0.05
Deterministic Evaluation¶
1. Resolve category for the artifact.
2. Load effective policy = defaults ⊕ tenantOverride ⊕ streamOverride (most specific wins), constrained by bounds.
3. Compute eligibleAt = `createdAt + window`.
4. Check exceptions (holds/blocks). If any, deny purge and record reason.
5. Apply mode when eligible: `PURGE` / `REDACT_TOMBSTONE` / `ARCHIVE_THEN_PURGE`.
6. Emit decision record with `policyVersion`, `reason`, `evidenceIds`.
Decision table (sketch)
| Condition | Action | Event |
|---|---|---|
| `now < eligibleAt` | No-op | `retention.window_pending` |
| `hold.active = true` | Block | `retention.purge_blocked` |
| `mode = ARCHIVE_THEN_PURGE` | Export + Purge | `retention.archive_emitted` → `retention.purged` |
| `mode = REDACT_TOMBSTONE` | Redact + Tombstone | `retention.redacted` |
| `mode = PURGE` | Physical delete | `retention.purged` |
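The evaluation steps and decision table can be sketched deterministically in a few lines. A minimal Python version, assuming a simplified ISO-8601 duration parser (years approximated as 365 days, months as 30; the real scheduler would use calendar-aware arithmetic):

```python
import re
from datetime import datetime, timedelta, timezone

def parse_duration(iso: str) -> timedelta:
    """Minimal ISO-8601 duration parser for the windows used here (PnY/PnM/PnD)."""
    m = re.fullmatch(r"P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?", iso)
    if not m or iso == "P":
        raise ValueError(f"unsupported duration: {iso}")
    years, months, days = (int(g or 0) for g in m.groups())
    return timedelta(days=years * 365 + months * 30 + days)  # approximation

def decide(created_at: datetime, window: str, mode: str,
           hold_active: bool, now: datetime) -> str:
    """Apply the decision table; returns the emitted event name."""
    if now < created_at + parse_duration(window):
        return "retention.window_pending"
    if hold_active:
        return "retention.purge_blocked"   # holds always override retention
    return {"PURGE": "retention.purged",
            "REDACT_TOMBSTONE": "retention.redacted",
            "ARCHIVE_THEN_PURGE": "retention.archive_emitted"}[mode]
```

Each returned event would be recorded with `policyVersion`, `reason`, and `correlationId`, per the enforcement-evidence requirements below.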
Lifecycle & States¶
stateDiagram-v2
[*] --> Active
Active --> Eligible: now >= eligibleAt
Eligible --> Blocked: LegalHold / RegExtension / InvestigationHold
Eligible --> Redacted: Mode=REDACT_TOMBSTONE
Eligible --> Archived: Mode=ARCHIVE_THEN_PURGE (export complete)
Archived --> Purged: Source removed
Eligible --> Purged: Mode=PURGE
Blocked --> Eligible: Hold released
Redacted --> [*]
Purged --> [*]
Evidence of Enforcement¶
- Jobs & cadence.
  - Scheduler computes eligibility windows daily; worker executes purge/redact/export in small, idempotent batches.
  - Staggered windows by tenant to avoid hotspots; per-tenant rate limits.
- Logs. Every decision includes `{tenantId, streamId, category, mode, policyVersion, eligibleAt, reason, correlationId}`.
- Metrics. `retention.eligible.count`, `retention.purged.count`, `retention.redacted.count`, `retention.blocked.count`, `retention.archive.bytes`, `retention.lag.seconds`.
- Audit records. Immutable ledger entries for policy changes (`retention.policy_changed`), executions, and exceptions; evidence packs contain manifests, signatures, and batch checksums.
Stricter-Only Evolution¶
- Forward tightening permitted. New policies can shorten windows for new data immediately.
- For existing data, tightening applies at the next evaluation but never violates legal/contractual minima.
- Loosening (longer windows) requires exception approval and is non-retroactive unless explicitly granted.
Dry-Run & Preview¶
- Dry-run mode. Compute would-be actions without mutating data; emit `retention.dryrun.stats` by tenant/category.
- Preview API (sketch). `GET /retention/preview?tenantId=...&category=...&window=...` returns counts, byte sizes, and earliest/latest `eligibleAt` for the filter.
Export-Before-Delete (for ARCHIVE_THEN_PURGE)¶
- Manifest fields. `{manifestId, policyVersion, tenantId, category, batchRange, counts, bytes, checksum, proofRefs[], keyId, keyVersion}`.
- Storage. Write-once class storage with lifecycle; signed and timestamped.
- Verification. Post-export validation compares counts/hashes; purge proceeds only after a green check.
Guardrails (quick checklist)¶
- Retention windows are declarative, bounded, and versioned.
- Decisions are idempotent, audited, and explainable.
- Holds always override retention; purges never run during active holds.
- `integrity.proof` and `evidence.manifest` retain enough lineage for future verification.
- Dry-run is available and on by default in new regions until validated.
- Purge workers respect tenant rate limits and produce signed manifests when archiving.
Legal Hold, DSAR, and Exceptions¶
Legal holds freeze lifecycle actions for matching data. DSAR pipelines export in-region and, where lawful, delete/redact. When policies collide, legal hold wins, with auditable escalation.
Semantics & Scope¶
- Immutable holds. Created once; never edited. Adjustments are additive (new hold) or by release.
- Targeting. Holds and DSAR requests can scope by: `tenantId` (required), `streamId`/category, time-range (from/to), attributes filter (KQL/SQL-lite), and purpose (`caseId`, `regulatorRef`).
- Effect. While active, the hold blocks purge/redact/delete but allows reads/exports as policy permits.
Hold Types¶
| Type | Typical trigger | Blocks | Expires |
|---|---|---|---|
| LegalHold | Litigation/Discovery | Purge/Redact/Delete | Manual release only |
| RegulatorExtension | Regulator directive (e.g., retention+) | Purge/Delete (may allow Redact) | Date- or directive-bound |
| InvestigationHold | Security/forensics | Purge/Delete (optional Redact) | Time-bound with review |
Holds are region-aware: created and enforced in the tenant's authoritative `CloudRegion`. Replicated reads must honor hold state.
Record Model (sketch)¶
hold:
id: hold_01HZXK...
tenantId: <uuid>
type: LegalHold | RegulatorExtension | InvestigationHold
scope:
category: evidence.hot|...
streamIds: [ "aud.gateway", "fin.ar" ]
time:
from: 2022-01-01T00:00:00Z
to: 2024-12-31T23:59:59Z
filter: "actorId = 'svc.export' AND status = 'accepted'"
purpose:
caseId: "CASE-2025-0412"
regulatorRef: null
description: "Discovery request ACME vs Contoso"
createdBy: <user/principal>
createdAt: 2025-10-29T09:10:00Z
policyVersion: 3.1.0
region: westeurope
status: active | released
releasedBy: null
releasedAt: null
Lifecycle & Events¶
stateDiagram-v2
[*] --> Draft
Draft --> Active: create(dual-approval)
Active --> Released: release(dual-approval)
Released --> [*]
Events
- `hold.created` (immutable record), `hold.released`, `hold.expired` (if time-bound), `hold.scope_conflict_detected` (on overlapping holds), `hold.violation_attempted` (blocked action).
Audit
- Every decision stores `{holdId, type, scopeHash, policyVersion, reason, correlationId}` in an immutable log.
Decision Precedence¶
When multiple rules apply to the same artifact:
- LegalHold (highest)
- RegulatorExtension
- DSAR Delete/Erasure
- Retention Window / Purge
Precedence table (sketch)
| Condition | Result | Event |
|---|---|---|
| Hold active | Block delete | retention.purge_blocked |
| No hold, DSAR delete lawful | Delete/redact | dsar.delete_completed |
| RegulatorExtension with redact-only | Redact | retention.redacted |
| None of the above, retention eligible | Purge | retention.purged |
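The precedence order and table above reduce to a small resolver. A minimal sketch; `resolve`/`action_for` and the action labels are illustrative names, not platform APIs:

```python
# Highest-priority rule wins; mirrors the Decision Precedence list.
PRECEDENCE = ["LegalHold", "RegulatorExtension", "DSAR", "Retention"]

def resolve(active_rules: set) -> str:
    """Return the winning rule for an artifact; LegalHold outranks everything."""
    for rule in PRECEDENCE:
        if rule in active_rules:
            return rule
    return "none"

def action_for(active_rules: set, retention_eligible: bool) -> str:
    winner = resolve(active_rules)
    if winner == "LegalHold":
        return "block_delete"           # -> retention.purge_blocked
    if winner == "RegulatorExtension":
        return "redact_only"            # where the directive permits redaction
    if winner == "DSAR":
        return "delete_or_redact"       # mode-aware, per category
    return "purge" if retention_eligible else "noop"
```

A DSAR delete arriving while a LegalHold is active thus resolves to `block_delete`, triggering the operator escalation workflow below.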
DSAR Pipeline (Export & Delete)¶
Export path (always residency-aware)
1. Intake & verify identity (KYC-level appropriate); record `requestId`, requester, lawful basis.
2. Scope resolution (tenant/streams/time/filter) → preview counts/bytes.
3. Collect in-region: query authoritative region first; replicas only if allowed.
4. Transform: field-level redaction/minimization (policy-driven), PII dictionary expansion if configured.
5. Package: NDJSON/Parquet + signed manifest (hashes, counts, key lineage, policy version).
6. Deliver via approved `ExportRoute` (e.g., `in_region`): time-boxed URL, 2FA retrieval.
7. Log: `dsar.export_completed` with metrics and manifest URL.
flowchart TD
A[Intake]-->B[Identity Verify]
B-->C[Scope/Preview]
C-->D[Collect - in-region]
D-->E[Redact/Minimize]
E-->F[Package + Sign]
F-->G[Deliver via allowed route]
G-->H[Log & Evidence Pack]
Deletion/Erasure path (where lawful)
- Evaluate conflicts: if any hold/regulator block → deny with reason and schedule re-check on release.
- Mode selection by category:
  - `evidence.hot`, `integrity.proof`, `evidence.manifest`: use `REDACT_TOMBSTONE` (preserve lineage & proofs).
  - `readmodel.warm`, `ops.*`, `export.bundle`: `PURGE`.
- Idempotency: re-running the same `requestId` must not duplicate effects.
- Events: `dsar.delete_requested` → `dsar.delete_completed` or `dsar.delete_denied`.
API Sketch (control-plane)¶
POST /holds # body: hold record; requires dual-approval
POST /holds/{id}/release
GET /holds?tenantId=...&status=active
POST /dsar/export/preview # returns counts/bytes by scope
POST /dsar/export/start # kicks off export job; returns manifest link on completion
POST /dsar/delete/start # attempts erasure per policy; returns decision log
GET /dsar/requests/{id} # status, evidence pack links
Operator Workflow (Escalation)¶
- Conflict detected (DSAR delete vs active hold):
- Notify Data Protection Officer + Legal with full context.
- Provide DSAR export (if requested) but suppress deletion.
- Record escalation outcome (`approved`, `denied`, `clarification`) with timestamps.
- Auto-create re-check task on `hold.released`.
Policy Snippets¶
Holds policy
holds:
dualApproval: true
redactAllowedDuringInvestigation: false
regionScope: authoritative_only
overlapStrategy: additive # multiple holds accumulate scope
logging:
evidenceLevel: full # include scope hash & reason codes
DSAR policy
dsar:
export:
formats: [ ndjson, parquet ]
route: in_region
redact:
enabled: true
templates:
- name: pii-default
rules:
- field: email
action: hash
- field: phone
action: hash
- field: ssn
action: remove
delete:
modesByCategory:
evidence.hot: REDACT_TOMBSTONE
integrity.proof: REDACT_TOMBSTONE
readmodel.warm: PURGE
export.bundle: PURGE
conflictPrecedence: [ LegalHold, RegulatorExtension, DSAR, Retention ]
Metrics, Logs & Evidence¶
- Metrics: `holds.active.count`, `holds.scope.bytes`, `dsar.exports.count`, `dsar.deletes.count`, `dsar.denied.count`, `dsar.export.latency.seconds`.
- Logs: every decision stamped with `{tenantId, scopeHash, policyVersion, reason, holdIds[], dsarRequestId, correlationId}`.
- Evidence packs: export manifests, signature receipts, deletion redaction maps, approval records.
Guardrails (quick checklist)¶
- Holds are immutable; only release ends their effect.
- DSAR exports never cross disallowed borders; delivery enforces MFA + TTL.
- Deletion is mode-aware and idempotent; sensitive categories prefer redaction + tombstone.
- Precedence is enforced; conflicts escalate with full audit trail.
- All actions are region-scoped, policy-versioned, and provable.
Deletion & Purge Workflows¶
Purge removes only those artifacts that are eligible under retention and not protected by holds or exceptions. Hot stores remain WORM (append-only); purge affects eligible segments/objects, while read models reconcile via a purge ledger.
Eligibility & Decisioning¶
- Eligibility conditions
  - `now ≥ eligibleAt` (from retention policy evaluation)
  - No active hold (Legal/Regulator/Investigation)
  - Approvals satisfied (if category requires explicit approval)
  - No pending DSAR delete conflict (blocked until resolved)
- Decision order
  1. Resolve retention (window & mode)
  2. Check holds/exceptions
  3. Validate residency/export route (for `ARCHIVE_THEN_PURGE`)
  4. Enforce approvals (if required)
  5. Execute per mode
Decision table (concise)
| Mode | Condition met | Action | Events |
|---|---|---|---|
| `PURGE` | Eligible & unheld | Physical delete | `retention.purged` |
| `REDACT_TOMBSTONE` | Eligible & unheld | Redact payload, write tombstone | `retention.redacted` |
| `ARCHIVE_THEN_PURGE` | Eligible & unheld | Export bundle → verify → purge | `retention.archive_emitted` → `retention.purged` |
| Any | Hold active / approval missing | Block & reschedule | `retention.purge_blocked` / `retention.approval_pending` |
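The decision order and table can be condensed into a single function. This is a minimal sketch: the `Artifact` shape and its field names are illustrative assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Artifact:
    """Hypothetical view of one purge candidate at decision time."""
    eligible_at: datetime
    mode: str                      # PURGE | REDACT_TOMBSTONE | ARCHIVE_THEN_PURGE
    holds: list = field(default_factory=list)
    approval_required: bool = False
    approvals: list = field(default_factory=list)
    dsar_delete_pending: bool = False

def decide(artifact: Artifact, now: datetime) -> tuple:
    """Apply the decision order: retention window -> holds -> approvals -> mode."""
    if now < artifact.eligible_at:
        return ("skip", [])
    if artifact.holds or artifact.dsar_delete_pending:
        return ("block", ["retention.purge_blocked"])
    if artifact.approval_required and not artifact.approvals:
        return ("block", ["retention.approval_pending"])
    events = {
        "PURGE": ["retention.purged"],
        "REDACT_TOMBSTONE": ["retention.redacted"],
        "ARCHIVE_THEN_PURGE": ["retention.archive_emitted", "retention.purged"],
    }
    return ("execute", events[artifact.mode])
```

Note that holds and pending DSAR conflicts block before the approval check, matching the precedence order defined earlier.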
Batch Execution Model¶
- Scheduler selects eligible items by tenant/category in small batches (idempotent; deterministic ordering by `(tenantId, category, eligibleAt, artifactId)`).
- Workers perform export/redact/purge with per-tenant rate limits and back-pressure (retries with jitter).
- Ledger-first: every action writes a Purge Ledger record before mutating state; mutations carry the ledger’s `ledgerId` for traceability.
- Reconciliation: warm read models subscribe to `retention.purged`/`retention.redacted` to remove/compact derived rows.
sequenceDiagram
autonumber
participant SCH as Scheduler
participant WRK as Purge Worker
participant EXP as Export Service
participant ST as Storage
SCH->>WRK: Fetch eligible batch (tenant/category)
WRK->>WRK: Evaluate holds/approvals/mode
WRK->>WRK: Write Purge Ledger (Pending)
alt ARCHIVE_THEN_PURGE
WRK->>EXP: Start export(jobId, scope, policyVersion)
EXP-->>WRK: Manifest{hashes, counts, signature}
end
WRK->>ST: Execute redact/purge with preconditions
WRK->>WRK: Update Purge Ledger (Committed)
WRK-->>SCH: Emit events & metrics
Purge Ledger (record schema)¶
ledgerId: plg_01HT2...
tenantId: <uuid>
category: evidence.hot | readmodel.warm | ...
mode: PURGE | REDACT_TOMBSTONE | ARCHIVE_THEN_PURGE
scope:
streamId: "aud.gateway"
timeRange: { from: 2023-01-01T00:00:00Z, to: 2023-12-31T23:59:59Z }
artifacts: [ "seg_0001", "seg_0002", ... ] # optional when summarized by range
policy:
version: 3.1.0
reason: "retention.window_elapsed"
holds: [] # active hold IDs at decision time
export:
manifestId: man_01GZ...
url: s3://... (write-once class)
checksums: [sha256, blake3]
signature: cms/rfc3161
result:
status: pending | committed | blocked | failed | retried
counts: { purged: 1543, redacted: 0, archivedBytes: 1_024_000_000 }
startedAt: 2025-10-29T08:10:00Z
completedAt: 2025-10-29T08:12:41Z
audit:
correlationId: 6b3f...
operator: "svc-retention" # workload identity
approvals: [ "apr_..." ]
region: westeurope
Export-Before-Delete (for ARCHIVE_THEN_PURGE)¶
- Manifest content
  - `manifestId`, `tenantId`, `category`, `batchRange`, `counts`, `bytes`, `hashes[]`, `proofRefs[]`, `keyId`, `keyVersion`, `policyVersion`, `createdAt`.
- Storage class
- Write-once/immutable with lifecycle transitions to archive tier (meets residency).
- Verification
- Compare counts & hashes with source; require signature & timestamp receipt.
- Gate
  - Purge proceeds only after green verification; otherwise rollback and mark `failed`.
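The verification gate reduces to a count-and-hash comparison against the manifest. A minimal sketch, with signature and timestamp-receipt checks omitted and all names illustrative:

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify_export(source: dict, manifest: dict) -> str:
    """Gate for ARCHIVE_THEN_PURGE: compare per-artifact SHA-256 digests and counts.

    source:   {artifact_id: payload bytes} read back from the archive tier
    manifest: {artifact_id: expected hex digest} from the signed export manifest
    """
    if len(source) != len(manifest) or source.keys() != manifest.keys():
        return "failed"                      # rollback; mark ledger entry failed
    for artifact_id, payload in source.items():
        if sha256_hex(payload) != manifest[artifact_id]:
            return "failed"
    # Signature (CMS) and TSA receipt validation would follow here.
    return "verified"                        # purge may proceed
```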
WORM & Physical Delete¶
- WORM guarantee
- Hot append stores are append-only; no in-place updates.
- Physical delete
- Removes entire eligible segment/object atomically; index pointers and warm projections are invalidated via purge events.
- Tombstones
  - When mode is `REDACT_TOMBSTONE`, write a minimal record with lineage: `{artifactId, removedAt, reason, ledgerId, proofRefs[]}`.
Idempotency, Concurrency & Safety¶
- Idempotency keys: `(tenantId, category, batchKey, policyVersion)`; safe to retry.
- Optimistic preconditions: storage deletes with `if-match`/generation checks.
- Leases: workers hold short leases per batch; lost leases re-queue.
- Shard hot-spot avoidance: stagger by `eligibleAt % window` and per-tenant budgets.
- Failure handling: exponential backoff with jitter; circuit-breaker on repeated provider errors; escalation after `N` failures.
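The idempotency key and the decorrelated-jitter backoff can be sketched as below; the hash construction and the jitter base/cap values are assumptions, not platform constants.

```python
import hashlib
import random

def idempotency_key(tenant_id: str, category: str, batch_key: str,
                    policy_version: str) -> str:
    """Stable retry key: the same batch under the same policy version dedupes."""
    raw = "|".join([tenant_id, category, batch_key, policy_version])
    return hashlib.sha256(raw.encode()).hexdigest()

def decorrelated_jitter(previous_sleep: float, base: float = 1.0,
                        cap: float = 60.0) -> float:
    """Decorrelated jitter: sleep = min(cap, uniform(base, previous * 3))."""
    return min(cap, random.uniform(base, previous_sleep * 3))
```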
Eligibility Query (example, SQL-ish pseudocode)¶
SELECT artifact_id
FROM evidence_segments
WHERE tenant_id = @tenant
AND category = 'evidence.hot'
AND eligible_at <= @now
AND hold_active = FALSE
AND approval_required = FALSE
ORDER BY eligible_at, artifact_id
LIMIT @batch_size;
Control-Plane APIs (sketch)¶
GET /retention/eligible?tenantId=...&category=...&limit=...
POST /retention/purge/start # starts a batch; returns ledgerId(s)
GET /retention/ledger/{id} # status & evidence links
POST /retention/purge/approve # add approvals to a pending batch
Metrics, Logs & Events¶
- Metrics
  - `retention.eligible.count`, `retention.purged.count`, `retention.redacted.count`, `retention.archive.bytes`, `retention.blocked.count`, `purge.duration.seconds`, `purge.retry.count`, `purge.fail.rate`.
- Logs
  - Decision logs with `{tenantId, category, mode, policyVersion, eligibleAt, ledgerId, reason, counts}`.
- Events
  - `retention.window_elapsed`, `retention.archive_emitted`, `retention.redacted`, `retention.purged`, `retention.purge_blocked`, `retention.approval_pending`, `retention.export_verification_failed`.
Policy Snippets¶
purge:
schedule:
window: "00:00-06:00 local" # quiet hours per region
cadence: "PT15M"
batch:
size: 500
maxConcurrentTenants: 5
perTenantRps: 50
approvals:
requiredFor: [ "evidence.hot", "integrity.proof" ]
dual: true
export:
enabled: true
formats: [ ndjson, parquet ]
writeOnceClass: true
verify: { hashes: [sha256, blake3], signature: cms, timestamp: rfc3161 }
safety:
dryRun: false # can be toggled region-by-region
maxRetries: 8
backoff: "decorrelated_jitter"
Guardrails (quick checklist)¶
- Never purge with active holds or unresolved DSAR conflicts.
- Export verification must pass before any archive-then-purge action.
- All mutations reference a Purge Ledger record and emit events.
- Operations are region-local and comply with residency & network boundaries.
- Purge workers are idempotent, rate-limited, and audited end-to-end.
Evidence Immutability & Integrity¶
Evidence is append-only and organized into hash-linked segments per stream. Each segment is sealed with a region-local anchor and (optionally) an external timestamp. Proofs are verifiable on read and on schedule, with migration procedures that re-anchor while preserving provenance.
Model & Invariants¶
- Leaves → Segment → Stream.
  - Leaf: normalized record payload → `h = H(payload || meta)`.
  - Segment: bounded set of leaves with a Merkle root `R_i`; includes the previous segment’s root `R_{i-1}` in its header to form a chain.
  - Stream: ordered chain of sealed segments: `R_0 ⇒ R_1 ⇒ … ⇒ R_n`.
- WORM. Leaves and sealed segments are immutable; only new segments may be appended.
- Deterministic hashing. Canonical serialization (stable field order, normalized encodings); algorithms are versioned (e.g., `sha256@1`).
- Anchor. A signed structure `{streamId, segmentId, R_i, prev=R_{i-1}, keyId, keyVersion, policyVersion, createdAt}` stored in-region and optionally timestamped externally.
flowchart LR
L1[Leaf h1] --> M((Merkle Tree))
L2[Leaf h2] --> M
L3[Leaf h3] --> M
M --> R[Segment Root R_i]
R --> A["Anchor A_i (signed)"]
A --> C{Chain}
C -->|"prev=R_{i-1}"| Aprev["Anchor A_{i-1}"]
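Leaf hashing and Merkle-root construction can be sketched as follows. This is illustrative only: JSON with sorted keys stands in for the platform's actual canonical encoding, and the odd-node-carried-up pairing rule is an assumption.

```python
import hashlib
import json

def leaf_hash(payload: dict, meta: dict) -> bytes:
    """h = H(payload || meta) over a canonical serialization (stable key order)."""
    canon = json.dumps([payload, meta], sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canon.encode()).digest()

def merkle_root(leaves: list) -> bytes:
    """Hash pairs level by level up to a single root; an odd node is carried up."""
    level = leaves[:]
    while len(level) > 1:
        nxt = [hashlib.sha256(level[i] + level[i + 1]).digest()
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

Canonical serialization matters here: two writers must derive byte-identical leaves from the same record, or chain verification fails spuriously.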
Anchoring & Timestamping¶
- Region-local anchor store. Anchors are written using a sign-only HSM key in the tenant’s authoritative region.
- Seal cadence. Segments seal on size or time threshold (e.g., `100k leaves` or `PT15M`), whichever comes first.
- External timestamp (optional). Submit `R_i` (or an anchor digest) to a trusted timestamping service (e.g., RFC 3161 TSA or public blockchain notarization). Store the receipt in the anchor.
- Key lineage. Anchors include `{keyId, keyVersion, alg}`; rotations update future anchors without invalidating the chain.
Anchor (example)
{
"streamId": "aud.gateway",
"segmentId": "seg-000142",
"root": "b64:R_i",
"prev": "b64:R_{i-1}",
"alg": "sha256@1",
"keyId": "hsm-eu-01",
"keyVersion": "7",
"policyVersion": "3.1.0",
"createdAt": "2025-10-29T08:30:12Z",
"tsa": {
"type": "rfc3161",
"token": "b64:...",
"tsaCertId": "tsa-eu-01"
},
"sign": "cms:b64..."
}
Verification Procedures & Cadence¶
- On-write checks
  - Recompute leaf hashes, verify the Merkle root, ensure `prev` matches the last sealed root, validate the HSM signature.
- On-read policy
  - Default: verify-on-read for sensitive categories; otherwise sampled verification with a sliding window.
- Scheduled audits
  - Daily: verify the last `N` segments per active stream (sampled).
  - Weekly: full verification of all segments sealed in the last week.
  - Quarterly: pick a tenant cohort and verify end-to-end (leaves → anchors → external timestamps).
- Failure handling
  - Any mismatch → quarantine the stream, raise `integrity.violation_detected`, and freeze exports for that scope until resolved.
  - Anchors with expired/invalid TSA receipts trigger `integrity.tsa_invalid`.
sequenceDiagram
autonumber
participant Q as Query Service
participant P as Proof Verifier
participant S as Anchor Store (Region)
participant T as TSA
Q->>P: Verify(streamId, segmentId, proofs?)
P->>S: Fetch Anchor A_i + Segment Manifest
P->>P: Rebuild Merkle Root from leaves
P->>P: Check A_i signature (HSM key)
alt Has TSA
P->>T: Validate TSA token (offline/OCSP cache)
end
P-->>Q: {status: ok|fail, reason, evidenceRefs}
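The verifier's core checks (rebuild the root, compare the `prev` link) can be sketched as below. HSM signature and TSA validation are represented only by a comment, and the anchor shape is an illustrative assumption.

```python
import hashlib

def verify_segment(anchor: dict, leaves: list, prev_root: bytes) -> dict:
    """Rebuild the Merkle root from leaves; check root and prev-link in the anchor."""
    level = leaves[:]
    while len(level) > 1:
        nxt = [hashlib.sha256(level[i] + level[i + 1]).digest()
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])       # odd node carried up unchanged
        level = nxt
    if level[0] != anchor["root"]:
        return {"status": "fail", "reason": "root_mismatch"}
    if anchor["prev"] != prev_root:
        return {"status": "fail", "reason": "chain_broken"}
    # HSM signature check and RFC 3161 TSA token validation would follow here.
    return {"status": "ok", "reason": None}
```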
Migration Integrity (Region Move)¶
When moving a tenant to a new region:
- Freeze window. Stop-seal at a segment boundary; last anchor = `A_n`.
- Copy & verify. Copy segments and anchors to the target region; verify all anchors and roots.
- Create bridge anchor. In the target region, create `A_bridge` with:
  - `prev = R_n` (last source root),
  - `root = H("MIGRATION|" || tenantId || srcRegion || dstRegion || time || R_n)`,
  - signed by the target region HSM and timestamped.
- Resume sealing. New segments in the target chain reference `A_bridge.root` as `prev`.
- Provenance manifest. Emit a signed manifest explaining source/target regions, key IDs, counts, byte sizes, and verification results; retain source anchors per retention.
flowchart TD
S[Source Chain: R_0..R_n] --> B["Bridge Anchor A_bridge(prev=R_n)"]
B --> T[Target Chain: R'_1..R'_m]
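The bridge-anchor root formula can be sketched as below. The `|` separators and field encoding are assumptions, since the formula above only specifies concatenation; any real implementation must fix an unambiguous encoding.

```python
import hashlib

def bridge_root(tenant_id: str, src_region: str, dst_region: str,
                cutover_at: str, last_source_root: bytes) -> bytes:
    """root = H("MIGRATION|" || tenantId || srcRegion || dstRegion || time || R_n),
    with '|' used as an (assumed) unambiguous field separator."""
    preimage = ("MIGRATION|" + "|".join([tenant_id, src_region, dst_region,
                                         cutover_at]) + "|").encode()
    return hashlib.sha256(preimage + last_source_root).digest()
```

Because the bridge root binds both regions and the cutover time to `R_n`, any later tampering with the migration metadata invalidates the target chain's first `prev` link.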
Bridge manifest (excerpt)
migration:
tenantId: 7c1a-...
from: westeurope
to: northeurope
cutoverAt: 2025-11-05T03:00:00Z
lastSourceRoot: "b64:R_n"
bridge:
anchorId: "A_bridge"
keyId: "hsm-eu-nor-01"
tsa: "rfc3161"
verification:
segmentsChecked: 124
leavesChecked: 3_482_901
result: pass
Proof Storage & Manifests¶
- Segment manifest. `{segmentId, leaves[], root, prev, counts, bytes, createdAt, alg, keyId, keyVersion}` stored alongside data in write-once class storage.
- Proof references. Each export bundle includes `proofRefs[]` pointing to segment/anchor IDs that cover the exported data range.
- Compression & batching. Large leaf sets store compact Merkle paths; verification reconstructs roots from the supplied paths.
Policy Snippets¶
integrity:
hash:
algorithm: sha256@1
canonical: json-c14n-v2
segment:
maxLeaves: 100000
maxInterval: PT15M
anchoring:
hsmPartition: integrity
regionLocal: true
tsa:
enabled: true
type: rfc3161
timeout: PT5S
verify:
onRead: selective
sampleRate: 0.1 # 10% sampled verification
dailySegments: 50
weeklyFull: true
migration:
requireBridgeAnchor: true
provenanceManifest: true
freezeAtBoundary: true
Metrics, Logs & Events¶
- Metrics
  - `integrity.anchors.created.count`, `integrity.verify.ok.count`, `integrity.verify.fail.count`, `integrity.tsa.latency.ms`, `integrity.segment.size.bytes`, `integrity.chain.depth`.
- Logs
  - Each anchor: `{streamId, segmentId, root, prev, keyId, tsaStatus, policyVersion, correlationId}`.
  - Each verification: `{scope, sampleSize, ok, failures[], reasonCodes[]}`.
- Events
  - `integrity.anchor_sealed`, `integrity.verify_passed`, `integrity.violation_detected`, `integrity.tsa_invalid`, `integrity.migration_bridged`.
Guardrails (quick checklist)¶
- Hashing is canonical and versioned; changes are additive with dual-run validation.
- Anchors are region-local and signed with HSM; optional TSA for external time attestation.
- Verification runs on read (for sensitive paths) and on scheduled audits; failures quarantine scope.
- Migrations re-anchor with a bridge anchor and provenance manifest; source anchors are retained per policy.
- Export bundles always include proof references covering the data range.
Residency-Aware Access Controls¶
Access is deny-by-default and evaluated with ABAC: the decision combines subject claims, request context, and resource attributes (region/silo/category). Cross-region access is blocked unless a policy-approved exception applies. All exceptions are time-bound, least-privilege, and fully audited.
ABAC Model¶
Subjects (who)
- Users/Services authenticated via OIDC/workload identity.
- Token carries residency hints and purpose:
  - `tenant_id`, `region_code`, `residency_profile`, `data_silo_id`
  - `scopes` (read/write/export/admin)
  - `purpose` (e.g., `dsar_export`, `ops_triage`, `dr_failover_review`)
  - `break_glass` (bool), `approval_id` (when applicable)
Resources (what)
- Evidence segments, read-model indices, archives/exports, anchors.
- Annotated with: `tenantId`, `dataSiloId`, `cloudRegion`, `regionCode`, `category`.
Context (how/why)
- Operation: `write`, `read`, `export`, `replicate`, `migrate`, `admin`.
- Network posture: in-region vs cross-region, private vs public path.
- Residency profile + policy version in effect.
Gateway & Service Enforcement¶
- Gateway (PEP-1) evaluates coarse residency rules:
  - Reject if `token.region_code != resource.regionCode` and no policy exception applies.
  - Enforce purpose binding (e.g., only `dsar_export` may call DSAR endpoints).
  - Propagate decision context to services (`x-policy-decision`, `policyVersion`, `reason`).
- Service (PEP-2) enforces fine-grained checks:
  - Validate `tenantId`/`dataSiloId` alignment.
  - For `export`, verify `exportRoute` is allowed by `residency_profile`.
  - For `read` across regions, require `replicate_read` in the profile.
  - Persist the decision record with correlation IDs.
flowchart TD
A[Request + JWT] --> G[Gateway PEP-1]
G -->|Allow + x-policy-decision| S[Service PEP-2]
G -->|Deny| X[403 GuardViolation]
S -->|Allow| R[Resource Access]
S -->|Deny| X
Claims & Headers (sketch)¶
{
"sub": "svc-query",
"tenant_id": "7c1a-...",
"region_code": "EU",
"residency_profile": "gdpr-standard",
"data_silo_id": "silo-7c1a...",
"scopes": ["evidence.read", "index.query"],
"purpose": "default",
"break_glass": false,
"policy_version": "3.1.0"
}
X-Policy-Decision: allow|deny|quarantine
X-Policy-Reason: cross-region-read-blocked|export-route-denied|ok
X-Policy-Version: 3.1.0
X-Region-Code: EU
X-Data-Silo-Id: silo-7c1a...
Residency Rules (essentials)¶
- Same-region first. `write` and `append` must be in the authoritative CloudRegion.
- Cross-region reads are allowed only if `residency_profile.movement = replicate_read` and the target region’s RegionCode is permitted.
- Exports must land on an allowed `exportRoute` (`in_region`, `same_code`, `global`).
- Admin/replicate/migrate restricted to operator identities with purpose-scoped approval.
Example Policy (declarative sketch)¶
abac:
default: deny
rules:
- id: write_in_region
when:
op: write
token.region_code == resource.regionCode
allow: true
- id: read_in_region
when:
op: read
token.tenant_id == resource.tenantId
token.region_code == resource.regionCode
allow: true
- id: read_cross_region_if_profile_allows
when:
op: read
token.tenant_id == resource.tenantId
token.region_code != resource.regionCode
profile.movement == replicate_read
profile.allowedRegionCodes contains token.region_code
allow: true
- id: export_route_guard
when:
op: export
token.purpose in [dsar_export, ediscovery]
export.route in profile.allowedExportRoutes
allow: true
- id: break_glass
when:
token.break_glass == true
token.approval_id != null
approval.ttl_valid == true
approval.scope matches {tenantId, category, regionCode}
allow: true
log_as: break_glass_used
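As a sketch, the first four rules above can be expressed as a deny-by-default evaluator. The claim and attribute names follow the sketch; the control flow, the `profile` dict shape, and carrying the export route on the resource are assumptions, and break-glass is omitted.

```python
def evaluate(op: str, token: dict, resource: dict, profile: dict) -> tuple:
    """Deny-by-default ABAC; first matching rule wins (returns decision, rule id)."""
    same_tenant = token["tenant_id"] == resource["tenantId"]
    same_region = token["region_code"] == resource["regionCode"]
    if op == "write" and same_region:
        return ("allow", "write_in_region")
    if op == "read" and same_tenant and same_region:
        return ("allow", "read_in_region")
    if (op == "read" and same_tenant and not same_region
            and profile.get("movement") == "replicate_read"
            and token["region_code"] in profile.get("allowedRegionCodes", [])):
        return ("allow", "read_cross_region_if_profile_allows")
    if (op == "export" and token.get("purpose") in ("dsar_export", "ediscovery")
            and resource.get("exportRoute") in profile.get("allowedExportRoutes", [])):
        return ("allow", "export_route_guard")
    return ("deny", "default")
```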
Break-Glass (controlled exception)¶
- Least privilege: narrowed to specific tenant/category/region and operation.
- Time-bound: short TTL (e.g., ≤ 4h) with automatic revocation.
- Dual approval: Security + Data Protection Officer (or on-call Legal).
- Full audit: record `{who, why, scope, ttl, evidence}`; trigger a post-mortem task.
- No persistence bypass: does not disable retention/holds; only bypasses the residency guard for the scoped action.
Purpose Binding¶
- Every call carries a purpose claim; services log and enforce purpose:
  - `dsar_export` → DSAR endpoints only; auto-log link to the DSAR request ID.
  - `ops_triage` → read-only diagnostic endpoints; no export.
  - `dr_failover_review` → read replicas; no writes unless promotion is approved.
Decision Cache & Revocation¶
- Short-lived decision cache (seconds) at PEPs to reduce latency.
- Revocation channels: on hold changes, residency profile updates, or approval revocations, PEPs flush caches and re-evaluate.
- Clock skew protections: accept tokens only within tight `nbf`/`exp` windows; prefer confidential client flows for services.
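The short-lived decision cache with wholesale flush on revocation can be sketched as below; the TTL value, key shape, and class name are assumptions.

```python
import time

class DecisionCache:
    """Short-lived PEP decision cache; flushed wholesale on policy/hold changes."""

    def __init__(self, ttl_seconds: float = 5.0):
        self.ttl = ttl_seconds
        self._entries = {}   # key -> (inserted_at, decision)

    def get(self, key: tuple):
        hit = self._entries.get(key)
        if hit is None or time.monotonic() - hit[0] > self.ttl:
            return None      # miss or expired: caller re-evaluates policy
        return hit[1]

    def put(self, key: tuple, decision: str) -> None:
        self._entries[key] = (time.monotonic(), decision)

    def flush(self) -> None:
        """Revocation channel handler: hold change, profile update, approval revoked."""
        self._entries.clear()
```

Flushing the whole cache on any revocation event is deliberately coarse: with a TTL of seconds, the cost of a full re-evaluation burst is small compared to the risk of serving a stale allow.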
Test Matrix (examples)¶
| Scenario | Expect |
|---|---|
| User EU token reading EU resource | Allow |
| Service EU token reading US resource | Deny |
| Service EU token reading EU replica (EU↔EU) with `replicate_read` | Allow |
| Export with `exportRoute=global` under `gdpr-standard` | Deny |
| Break-glass with valid approval TTL | Allow + `break_glass_used` event |
| Token missing `data_silo_id` | Deny |
Metrics, Logs & Events¶
- Metrics: `abac.allow.count`, `abac.deny.count`, `abac.quarantine.count`, `break_glass.used.count`, `policy.decision.latency.ms`.
- Logs: `{tenantId, dataSiloId, op, subject, resource.regionCode, token.region_code, policyVersion, decision, reason, purpose, approvalId?}`.
- Events: `abac.decision_denied`, `abac.policy_updated`, `break_glass.used`, `break_glass.revoked`.
Guardrails (quick checklist)¶
- Deny by default; purpose is mandatory on sensitive routes.
- Region/silo attributes must be present in both token and resource.
- Cross-region access requires explicit profile allowance.
- Break-glass is time-boxed, dual-approved, and audited; no blanket overrides.
- Policy and decision telemetry are immutable and tied to a policy version.
Backups, Restore & eDiscovery¶
Backups are region-coherent, immutable, and encrypted; restores are tenant/stream/time-scoped and fully logged; eDiscovery exports use standard formats with signed manifests and residency-compliant delivery.
Backup Strategy¶
- Region coherence. Each tenant’s authoritative data is backed up in the same CloudRegion using write-once classes. Optional replication is permitted only within allowed RegionCodes per residency profile.
- Classes & cadence.
- Hot (append/WORM): periodic segment snapshots + delta logs (e.g., object snapshots + queue offsets). Cadence: hourly incrementals, daily full.
- Warm (read models): rebuild-first policy (prefer re-projection from hot). Optional weekly snapshots for faster RTO.
- Cold (archives/exports): object storage with immutable retention policies; lifecycle transitions to colder tiers.
- Encryption & keys. Backup artifacts are encrypted with per-tenant KEKs anchored in region-scoped KMS/HSM. Manifests carry `keyId`/`keyVersion`.
- Consistency.
- Application-consistent fences: short quiesce at seal boundaries for a consistent cut.
- Point-in-time: capture journal positions (e.g., LSN/offset) to enable exact time-based restores.
- Catalog. A regional Backup Catalog records backup sets and references to manifests; control-plane is read-only to data-plane services.
Backup manifest (excerpt)
backup:
id: bkp_01HZ...
tenantId: 7c1a-...
region: westeurope
startedAt: 2025-10-28T02:00:00Z
completedAt: 2025-10-28T02:07:43Z
classes:
hot:
type: full
segments: { from: seg-000120, to: seg-000145 }
journalOffset: 912345
warm:
type: skip_rebuild_first
cold:
type: catalog_only
encryption:
keyId: hsm-eu-01
keyVersion: "8"
checksums:
algo: sha256
files: [{ path: ".../seg-000145.snap", hash: "..." }]
policyVersion: 3.1.0
Restore Workflows¶
- Scopes. Restores are requested by tenant, optional stream/category, and time-range or point-in-time (`restoreAt`).
- Landing zone. Data is restored to a quarantine namespace (tenant-local) with read-only posture until validated.
- Integrity verification. Post-restore, verify:
- segment checksums and Merkle roots (proof refs),
- anchor signatures & TSA receipts,
- journal continuity (no gaps/overlaps).
- Warm rebuild. Read models are re-projected from restored hot segments; cached indices snap to the restored window.
- Immutability & evidence. Every restore produces an immutable Restore Log with scope, artifacts, hashes, operator identity, and results.
sequenceDiagram
autonumber
participant OP as Operator/Service
participant CAT as Backup Catalog (Region)
participant STO as Backup Store (Write-Once)
participant RST as Restore Controller
participant INT as Integrity Verifier
OP->>CAT: Request restore {tenant, stream?, time/window}
CAT-->>OP: Best set + manifests
OP->>RST: Start restore(jobId, set)
RST->>STO: Fetch artifacts (Private Link)
RST->>INT: Verify checksums, anchors, TSA
INT-->>RST: ok/fail + evidence
RST-->>OP: Restore to quarantine namespace (read-only)
RST-->>CAT: Append immutable Restore Log
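The journal-continuity check from the verification step (no gaps, no overlaps) can be sketched over per-segment offset ranges; `fromOffset`/`toOffset` are hypothetical field names for whatever journal positions (LSN/offset) the backup manifests record.

```python
def journal_continuous(segments: list) -> bool:
    """True iff restored segments cover one contiguous offset range.

    Each segment is a dict with inclusive 'fromOffset'/'toOffset' positions.
    A gap (next from > prev to + 1) or overlap (next from <= prev to) fails.
    """
    ordered = sorted(segments, key=lambda s: s["fromOffset"])
    for prev, cur in zip(ordered, ordered[1:]):
        if cur["fromOffset"] != prev["toOffset"] + 1:
            return False
    return True
```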
Restore log (immutable)
restore:
id: rst_01HZX...
tenantId: 7c1a-...
region: westeurope
scope: { streams: ["aud.gateway"], time: { from: 2025-10-01, to: 2025-10-07 } }
sourceBackupId: bkp_01HZ...
landing: { namespace: "tenant-7c1a-rst-20251029", posture: read_only }
verification:
segmentsChecked: 26
anchorsChecked: 26
tsaValidated: true
result: pass
approvals: ["apr_..."] # if required
startedAt: 2025-10-29T07:10Z
completedAt: 2025-10-29T07:24Z
policyVersion: 3.1.0
operator: svc-restore
Safety rails
- Restore is region-local unless a policy-approved migration/DR path exists.
- Any Legal Hold or RegulatorExtension applies to the restored scope (no purge/redact during validation).
- Promotion from quarantine to active requires checks to pass + explicit approval.
eDiscovery Exports¶
- Formats. NDJSON (line-delimited JSON) and Parquet (columnar) with stable schemas; include dictionaries for enumerations.
- Residency routes. Delivery must comply with `exportRoute` (`in_region`, `same_code`, `global`). Default for GDPR-like profiles: `in_region`.
- Redaction/minimization. Apply policy-driven transformations (hash/remove) for PII/secret fields; include the redaction map in evidence.
- Signing & manifests. Every export bundle includes a signed manifest with:
  - `manifestId`, `tenantId`, `region`, `format`, `schemaVersion`, `counts`, `bytes`,
  - `hashes[]`, `proofRefs[]` (segment/anchor coverage),
  - `keyId`/`keyVersion`, `policyVersion`, timestamp receipt (TSA optional).
- Delivery. Time-bound, MFA-protected download URLs; checksums verified client-side; optional delivery to an approved in-region bucket.
Export manifest (excerpt)
ediscovery:
manifestId: man_01HZK...
tenantId: 7c1a-...
region: westeurope
format: parquet
schemaVersion: 2.4.1
counts: { rows: 5_423_188, files: 12 }
bytes: 87_331_002_881
hashes: [{ file: "part-0000.parquet", sha256: "..." }]
proofRefs: [{ stream: "aud.gateway", fromSeg: "000130", toSeg: "000145" }]
encryption: { keyId: "hsm-eu-01", keyVersion: "8" }
policyVersion: 3.1.0
tsa: { type: rfc3161, token: "b64:..." }
Policies & Runbooks¶
Policy snippets
backup:
regionLocal: true
hot:
full: P1D
incremental: PT1H
warm:
strategy: rebuild_first
cold:
immutability: true
lifecycle: { toArchiveAfter: P30D }
restore:
quarantine:
posture: read_only
ttl: P7D
approvals:
requiredFor: [ "tenant_wide", "cross_region" ]
ediscovery:
formats: [ ndjson, parquet ]
exportRoute: in_region
redactTemplates: [ pii-default ]
sign: { cms: true, tsa: true }
Operator runbook (high level)
- Backup: monitor job SLOs; validate manifests on completion.
- Restore: request scope → land in quarantine → verify → approve promote → reproject warm.
- eDiscovery: validate scope → generate bundle → deliver via approved route → attach manifest to ticket.
Metrics, Logs & Events¶
- Metrics
  - `backup.success.count`, `backup.duration.seconds`, `backup.bytes.total`, `restore.success.count`, `restore.duration.seconds`, `restore.verify.fail.count`, `ediscovery.exports.count`, `ediscovery.bytes.total`.
- Logs
  - Backup/restore decisions include `{tenantId, region, scope, policyVersion, counts, bytes, hashes[], proofRefs[]}`.
- Events
  - `backup.completed`, `backup.failed`, `restore.started`, `restore.completed`, `restore.failed`, `restore.verification_failed`, `ediscovery.export_completed`.
Guardrails (quick checklist)¶
- Backups are region-coherent, immutable, and encrypted with region-anchored keys.
- Restores land in quarantine (read-only) and are promoted only after integrity verification.
- eDiscovery exports follow residency routes, apply redaction, and include signed manifests with proof refs.
- All operations are audited and produce evidence packs suitable for external review.
Cost & Performance Considerations¶
Design for in-region performance first, with tiered storage and explicit cost guardrails. Cross-region actions are blocked or flagged by policy. Quotas and rate plans prevent noisy neighbors from degrading SLOs.
Tiering Strategy (Hot/Warm/Cold)¶
- Hot (append/WORM) — authoritative segments and recent anchors; low latency, high IOPS, highest cost.
- Warm (read/search) — projections and indexes optimized for query; rebuilt from hot as needed; medium cost.
- Cold (archive/export) — immutable object storage with long retention; bulk throughput; lowest cost.
Lifecycle transitions
- Policy gates. Move artifacts from Hot → Warm → Cold only when residency and retention windows allow; transitions emit lifecycle.transitioned.
- Hydration. On re-ingest/re-index, hydrate from Cold (if allowed) to Warm; cache results in-region.
- Hysteresis. Minimum dwell time per tier to avoid thrashing (e.g., Warm ≥ 7 days before Cold).
Example policy (tiering)
tiering:
hot:
targetWindow: P14D
warm:
targetWindow: P90D
rebuildFirst: true # prefer re-project vs snapshot storage
cold:
storageClass: archive_immutable
lifecycle:
transitionAfter: P90D
deleteAfter: P10Y
residency:
crossRegionHydrate: deny # never hydrate across region families
Query Acceleration (In-Region)¶
- Locality: Route reads to in-region indices; cross-region queries deny-by-default or add a cost flag requiring approval.
- Projections: Maintain category-specific materialized views (e.g., latest per subject, time-bucket summaries).
- Caches: Per-tenant query caches with TTL & invalidation on `retention.*` and `integrity.*` events.
- Indexes: Time+tenant composite keys; segment-time partitioning for predictable pruning.
- Batching: Encourage windowed queries (e.g., 7d/30d) with pagination tokens; throttle ad-hoc unbounded scans.
Cross-Region Cost Controls¶
- Block by default. Any cross-region data path (`read`, `hydrate`, `export`) is blocked unless the profile explicitly allows it.
- Cost flags. If permitted, compute the estimated egress (bytes × provider egress rate) and require user confirmation or approval budget.
- Budget events. Emit `cost.budget_check_failed` if the request exceeds the tenant’s monthly cap.
costGuards:
crossRegion:
default: deny
estimateBeforeExecute: true
requireApprovalAboveBytes: 50GB
egressRates:
EU->EU: 0.00 # within same region code (if allowed) may be discounted
EU->US: 0.09 USD/GB
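The estimate-before-execute guard can be sketched as below; the rates and the 50 GB approval threshold mirror the snippet above, while the function shape and return keys are assumptions.

```python
def estimate_egress(bytes_requested: int, route: str, rates_usd_per_gb: dict,
                    approval_threshold_bytes: int = 50 * 1024**3) -> dict:
    """Estimate egress cost for a cross-region request; deny unknown routes,
    and require approval above the configured byte threshold."""
    rate = rates_usd_per_gb.get(route)
    if rate is None:
        return {"decision": "deny", "reason": "route_not_allowed"}
    gb = bytes_requested / 1024**3
    decision = ("allow_with_approval" if bytes_requested > approval_threshold_bytes
                else "allow")
    return {"decision": decision, "estimatedUsd": round(gb * rate, 2)}
```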
Quotas & Rate Plans (Per Tenant Tier)¶
| Tier | Hot Storage Quota | Warm Index Quota | Cold Archive Quota | Egress Cap / mo | Export Window | Default Concurrency |
|---|---|---|---|---|---|---|
| Standard | 500 GB | 300 GB | 5 TB | 50 GB | Off-peak only | 10 workers |
| Business | 2 TB | 1 TB | 20 TB | 200 GB | Off-peak + on-demand | 25 workers |
| Enterprise | 10 TB | 5 TB | 100 TB | 1 TB | Any (policy-gated) | 60 workers |
- Burst buckets: Short-term bursts are allowed (e.g., 20% over cap for ≤ 24h) with auto-throttle afterward.
- Soft vs hard caps: Soft cap triggers alerts & throttling; hard cap blocks new writes/exports (excluding integrity/holds metadata).
Noisy-Neighbor Mitigation¶
- Per-tenant budgets: CPU, IOPS, and QPS budgets enforced at gateway and service layer (token bucket + leaky bucket).
- Hot shard fairness: Adaptive shard balancing and back-pressure (HTTP 429 with `retry-after`).
- Query governors: Limit concurrent long-running queries; require query hints or preview for heavy scans.
- Export governors: Max concurrent export jobs per tenant; size-based chunking with incremental manifests.
governors:
qps:
perTenant: 500
perStream: 200
queries:
maxConcurrent: 8
maxRuntime: PT2M
requirePreviewAboveRows: 5_000_000
exports:
maxConcurrent: 2
maxBundleSize: 50GB
chunkSize: 2GB
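The per-tenant token-bucket budget mentioned above can be sketched minimally as below (refill math only; distributed coordination and the leaky-bucket smoothing layer are out of scope).

```python
import time

class TokenBucket:
    """Per-tenant QPS budget: refill at `rate` tokens/s up to `capacity`;
    an empty bucket maps to HTTP 429 with retry-after at the gateway."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, clamped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```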
Cost Levers (Top 10)¶
- Hot retention window (days in Hot before Warm).
- Warm index granularity (daily vs hourly partitions).
- Export frequency & size (number of eDiscovery/DSAR bundles).
- Cross-region reads/exports (egress).
- Cache TTL & hit rate (reduces Warm compute).
- Segment size/seal cadence (affects metadata overhead & verification cost).
- Index cardinality (number of fields indexed & distinct values).
- Compression & encoding (Parquet snappy/zstd; JSONL gzip).
- Rebuild-first vs snapshot retention for Warm.
- Purge cadence/batch size (storage & compute churn).
Quick Calculator (sketch)¶
Use this to estimate monthly cost per tenant before approval.
Inputs
- `events_per_day`, `avg_event_bytes`, `hot_days`, `warm_days`, `export_gb_per_mo`, `cross_region_gb_per_mo`
Pseudo-formulas
hot_gb = events_per_day * avg_event_bytes * hot_days / (1024^3)
warm_gb = events_per_day * avg_event_bytes * warm_days / (1024^3) * warm_index_factor
storage_cost = hot_gb*rate_hot + warm_gb*rate_warm + archive_tb*rate_cold
egress_cost = cross_region_gb_per_mo * rate_egress
export_cost = export_gb_per_mo * rate_export_io
total = storage_cost + egress_cost + export_cost + verify_cost + rebuild_cost
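Under assumed per-GB rates, the pseudo-formulas become runnable; the rate keys are placeholders, and the archive, verify, and rebuild terms from the full formula are omitted here for brevity.

```python
def monthly_cost(events_per_day: int, avg_event_bytes: int, hot_days: int,
                 warm_days: int, export_gb_per_mo: float,
                 cross_region_gb_per_mo: float, rates: dict) -> dict:
    """Runnable version of the pseudo-formulas (storage + egress + export only)."""
    gb = 1024 ** 3
    hot_gb = events_per_day * avg_event_bytes * hot_days / gb
    warm_gb = (events_per_day * avg_event_bytes * warm_days / gb
               * rates["warm_index_factor"])
    storage_cost = hot_gb * rates["hot_per_gb"] + warm_gb * rates["warm_per_gb"]
    egress_cost = cross_region_gb_per_mo * rates["egress_per_gb"]
    export_cost = export_gb_per_mo * rates["export_io_per_gb"]
    return {"hot_gb": round(hot_gb, 2), "warm_gb": round(warm_gb, 2),
            "total_usd": round(storage_cost + egress_cost + export_cost, 2)}
```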
Observability (Cost & Perf)¶
- Metrics
  - `cost.estimate.usd`, `cost.egress.gb`, `storage.hot.gb`, `storage.warm.gb`, `storage.cold.gb`, `qps.per_tenant`, `queries.p95.ms`, `exports.bytes.total`, `cache.hit_rate`.
- Dashboards
- Per-tenant cost overview, Top cost drivers, Hot shard hotspots, Cross-region attempts (blocked/allowed).
- Alerts
- Budget ≥ 80%, ≥ 95%; cache hit rate < 60%; p95 query > SLO; replication lag > 50% of RPO.
API Sketch (Advisory & Guard)¶
GET /cost/estimate?tenantId=...&window=P30D
GET /cost/levers?tenantId=...
POST /cost/budget/set { cap_gb: 200, cap_usd: 500 }
POST /guard/cross-region/estimate { bytes: 75GB, route: EU->US } # returns cost & approval required
Examples & Defaults¶
- Standard tier default knobs
  - `hot_days=14`, `warm_days=90`, `warm_index_factor=0.35`, `cache_ttl=15m`
  - `max_long_query_runtime=2m`, `export_chunk=2GB`, `egress_cap=50GB/mo`
- Enterprise fast-path
  - `replicate_read` across the same RegionCode, query acceleration via nightly pre-aggregations, `cache_ttl=30m`.
Guardrails (quick checklist)¶
- Reads and exports must stay in-region unless explicitly approved with a cost estimate.
- Tier transitions follow policy gates and maintain immutability (no rewrites).
- Quotas and governors are per-tenant with clear soft/hard caps and burst limits.
- Query and export jobs are bounded, paged, and cancelable; heavy scans require preview.
- Cost telemetry is first-class: every cross-region attempt yields a cost record and policy decision.
Observability & Compliance Evidence¶
Everything that touches residency, retention, holds, integrity, DR, and exports emits measurable, queryable telemetry. Evidence is aggregated per tenant and region, producing auditor-ready packs on demand.
Metrics (golden set)¶
- Residency & Access
  - `residency.placement.success.count`/`.fail.count`
  - `abac.allow.count`, `abac.deny.count`, `break_glass.used.count`
- Retention & Purge
  - `retention.eligible.count`, `retention.purged.count`, `retention.redacted.count`
  - `retention.archive.bytes`, `retention.lag.seconds`
- Holds & DSAR
  - `holds.active.count`, `holds.scope.bytes`
  - `dsar.exports.count`, `dsar.deletes.count`, `dsar.denied.count`, `dsar.export.latency.seconds`
- Integrity
  - `integrity.anchors.created.count`, `integrity.verify.ok.count`, `integrity.verify.fail.count`
  - `integrity.tsa.latency.ms`, `integrity.chain.depth`
- DR & Backups
  - `replication.lag.seconds`, `dr.failover.count`, `backup.success.count`, `restore.success.count`
- Cost & Egress (advisory)
  - `cost.egress.gb`, `cost.estimate.usd`, `exports.bytes.total`
Dimensions: `tenantId`, `regionCode`, `category`, `streamId`, `policyVersion`, `edition`, `op` (read/write/export), `result`.
Logs (decision-grade)¶
All guard/retention/hold/DR decisions produce structured logs with stable fields:
{
"ts": "2025-10-29T08:31:22.115Z",
"tenantId": "7c1a-...",
"regionCode": "EU",
"service": "retention-worker",
"operation": "purge",
"category": "evidence.hot",
"decision": "allow|deny|quarantine",
"reason": "eligible|hold_active|export_verification_failed|cross_region_blocked",
"policyVersion": "3.1.0",
"claims": { "purpose": "default", "residency_profile": "gdpr-standard" },
"resource": { "streamId": "aud.gateway", "artifactId": "seg-000145" },
"correlationId": "6b3f...",
"ledgerId": "plg_01HT2...",
"proofRefs": ["A_000145"],
"durMs": 42
}
Privacy & integrity
- Logs contain no raw PII; sensitive fields are hashed/redacted per log policy.
- Integrity-sensitive logs (e.g., anchor/seal) include key lineage (keyId, keyVersion) and are written to append-only sinks.
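Because decision logs feed audits, a schema check at the sink is worth having. A minimal validator for the log shape shown above; the required-field list is an assumption drawn from the example, not a published schema:

```python
# Validate a decision-log entry against the example shape. The field list
# below mirrors the sample entry and is an assumption, not a formal schema.
REQUIRED = {"ts", "tenantId", "regionCode", "service", "operation",
            "decision", "reason", "policyVersion", "correlationId"}
DECISIONS = {"allow", "deny", "quarantine"}

def validate_decision_log(entry: dict) -> list:
    errors = [f"missing:{f}" for f in sorted(REQUIRED - entry.keys())]
    if entry.get("decision") not in DECISIONS:
        errors.append("bad:decision")
    return errors

log = {"ts": "2025-10-29T08:31:22.115Z", "tenantId": "7c1a-...",
       "regionCode": "EU", "service": "retention-worker", "operation": "purge",
       "decision": "allow", "reason": "eligible", "policyVersion": "3.1.0",
       "correlationId": "6b3f..."}
assert validate_decision_log(log) == []
```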
Events (control-plane)¶
Emit immutable events for key lifecycle points:
- Residency/ABAC: abac.decision_denied, abac.policy_updated, break_glass.used
- Retention: retention.policy_changed, retention.window_elapsed, retention.purged, retention.redacted, retention.purge_blocked
- Holds & DSAR: hold.created, hold.released, dsar.export_completed, dsar.delete_completed
- Integrity: integrity.anchor_sealed, integrity.verify_passed, integrity.violation_detected
- DR/Backup: replication.lag_exceeded, dr.failover_initiated, backup.completed, restore.completed
Dashboards (tenant/region views)¶
- Residency & Access
- Allow/Deny trends, top deny reasons, cross-region attempts (blocked/allowed), break-glass usage with TTL and scope.
- Retention & Holds
- Eligible vs purged/redacted over time, active holds by type/bytes, DSAR exports latency percentiles, purge lag.
- Integrity & DR
- Anchor creation rate, verification pass/fail heatmap, TSA latency; replication lag vs RPO, drill outcomes vs targets.
- Cost
- Storage tiers (Hot/Warm/Cold) by tenant, egress GB & USD, export volumes by route.
Evidence Packs (auditor-ready)¶
Generated on demand or schedule, per tenant/region and time window.
Pack manifest (excerpt)
evidencePack:
id: evp_01HZ...
tenantId: 7c1a-...
region: westeurope
scope: { from: 2025-10-01, to: 2025-10-31 }
includes:
metrics:
- residency.placement.success.count
- retention.purged.count
- integrity.verify.ok.count
- dsar.exports.count
logs:
queries:
- name: residency_denies
filter: decision=="deny" AND operation IN ["read","export"]
- name: purge_decisions
filter: service=="retention-worker"
documents:
- type: policy
policyVersion: 3.1.0
- type: drillReport
id: drq3-2025
proofs:
- anchors: [ "A_000130..A_000145" ]
tsaReceipts: true
signatures:
keyId: "hsm-eu-01"
algorithm: "cms"
tsa: { type: rfc3161, token: "b64:..." }
createdAt: 2025-11-01T00:15:00Z
Delivery & immutability
- Stored in write-once class, encrypted with region KMS; delivered via in-region route.
- Each pack is signed; verifier CLI/API re-checks hashes and signatures offline.
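The offline re-check the verifier performs can be sketched as hash recomputation against the manifest. CMS/TSA signature validation is omitted here and would use a cryptography library in practice; this shows only the hash step:

```python
import hashlib
import json

# Recompute each artifact's SHA-256 and compare against the manifest digest.
# Signature/TSA checks are intentionally out of scope for this sketch.
def verify_pack(manifest: dict, artifacts: dict) -> bool:
    for name, expected in manifest["sha256"].items():
        data = artifacts.get(name)
        if data is None or hashlib.sha256(data).hexdigest() != expected:
            return False
    return True

blob = json.dumps({"metric": "retention.purged.count", "value": 42}).encode()
manifest = {"sha256": {"metrics.json": hashlib.sha256(blob).hexdigest()}}
assert verify_pack(manifest, {"metrics.json": blob})
assert not verify_pack(manifest, {"metrics.json": blob + b"tampered"})
```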
Policy Snippets¶
observability:
pii:
redact: [ "email", "phone", "ssn" ]
hashAlgo: "sha256"
logs:
format: json
sink: append_only
retention: P180D
metrics:
cardinalityGuards:
maxTenantSeries: 5000
maxLabelValues: 200
evidencePacks:
schedule: "R/P1M" # monthly
scope: "previous_month"
sign: { cms: true, tsa: true }
delivery: in_region
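The cardinalityGuards knobs above imply an admission check at the telemetry sink. A toy enforcement sketch; the interface is an assumption, and maxLabelValues is simplified to a per-series label cap:

```python
# Toy cardinality guard: reject new series once a tenant hits the series cap,
# but keep admitting samples for series already seen. Interface is illustrative.
class CardinalityGuard:
    def __init__(self, max_tenant_series=5000, max_labels=200):
        self.max_tenant_series = max_tenant_series
        self.max_labels = max_labels  # simplification of maxLabelValues
        self.series = {}  # tenantId -> set of label tuples

    def admit(self, tenant_id: str, labels: dict) -> bool:
        if len(labels) > self.max_labels:
            return False
        key = tuple(sorted(labels.items()))
        seen = self.series.setdefault(tenant_id, set())
        if key not in seen and len(seen) >= self.max_tenant_series:
            return False  # new series beyond the per-tenant cap
        seen.add(key)
        return True

guard = CardinalityGuard(max_tenant_series=2)
assert guard.admit("t-eu", {"op": "read"})
assert guard.admit("t-eu", {"op": "write"})
assert not guard.admit("t-eu", {"op": "export"})  # third distinct series rejected
assert guard.admit("t-eu", {"op": "read"})        # existing series still admitted
```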
Queries (examples)¶
- Blocked decisions by reason (pseudo-KQL)
decisions
| where decision == "deny" and reason endswith "_blocked"
| summarize count() by reason, regionCode, tenantId
Evidence Generation Flow¶
flowchart TD
A[Schedule/On-Demand Request] --> B[Collect metrics/logs/events]
B --> C[Attach policy artifacts & drill reports]
C --> D[Resolve proofRefs & TSA receipts]
D --> E[Assemble manifest + hashes]
E --> F["Sign (CMS/HSM) + Timestamp (TSA)"]
F --> G[Store in write-once, deliver in-region]
Guardrails (quick checklist)¶
- Telemetry uses stable schemas, no raw PII, and region-local sinks.
- Every decision ties to a policy version, tenantId, regionCode, and correlationId.
- Evidence packs are signed, timestamped, and stored immutably; verification is offline-capable.
- Cardinality guards protect telemetry backends; dashboards remain fast and scoped per tenant/region.
- Break-glass and cross-region actions always produce explicit evidence entries.
Testing & Verification¶
Continuous verification proves that residency, retention, integrity, and DR controls behave exactly as declared. Tests run pre-deploy (CI), post-deploy (canary), and on a fixed cadence (drills), producing immutable evidence.
Strategy & Layers¶
- Unit/Contract — policy evaluation, ABAC rules, retention calculators, integrity hashing.
- Integration — region overlays, Private Link enforcement, export routes, KMS/HSM paths.
- Chaos/Resilience — KMS/TSA latency, storage throttling, replica lag, network partitions.
- Operational Drills — residency conformance, retention dry-runs, DR failover exercises.
Residency Conformance Suite¶
Use synthetic tenants per region with distinct residency profiles to verify allow/deny paths.
Fixtures (sketch)
tenants:
t-eu: { regionCode: EU, cloudRegion: westeurope, residencyProfileId: gdpr-standard }
t-us: { regionCode: US, cloudRegion: eastus, residencyProfileId: us-standard }
t-il: { regionCode: IL, cloudRegion: israelcentral, residencyProfileId: il-sovereign }
profiles:
gdpr-standard:
movement: replicate_read
allowedRegionCodes: [EU]
exportRoutes: [in_region]
us-standard:
movement: none
allowedRegionCodes: [US]
exportRoutes: [in_region, same_code]
Test matrix (examples)
| Case | Subject | Resource | Expect | Notes |
|---|---|---|---|---|
| Write in authoritative region | t-eu token (EU) | EU hot append | Allow | Write locality |
| Cross-region read EU→US | t-eu token (EU) | US warm index | Deny | Cross-family blocked |
| Cross-region read EU↔EU | t-eu token (EU) | EU-replica index | Allow | replicate_read allowed |
| Export global under GDPR | t-eu token | any | Deny | Route not allowed |
| Missing data_silo_id claim | any | any | Deny | ABAC attribute required |
| Break-glass with TTL | operator token + approval | cross-region read | Allow + event | Time-bound, least privilege |
Automation
- Synthetic traffic: write/read/export across regions; assert 403 GuardViolation vs 200 OK and inspect X-Policy-* headers.
- Cache revocation: update residency profile → ensure gateways/services flush decision caches and deny previously allowed paths.
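The conformance matrix lends itself to table-driven tests. Below is a toy guard evaluator mirroring the fixtures above; the decision logic is deliberately simplified and is an assumption, not the platform's actual ABAC engine:

```python
# Simplified guard decision for conformance-style assertions. Profile data
# mirrors the fixtures; real evaluation would involve full ABAC context.
PROFILES = {
    "gdpr-standard": {"allowed": {"EU"}, "movement": "replicate_read"},
    "us-standard":   {"allowed": {"US"}, "movement": "none"},
}

def guard(token_region, token_profile, resource_region, has_silo_claim=True):
    if not has_silo_claim:
        return ("deny", "abac_attribute_required")
    profile = PROFILES[token_profile]
    if resource_region in profile["allowed"]:
        return ("allow", "in_region")
    return ("deny", "cross_region_blocked")

# Rows from the test matrix
assert guard("EU", "gdpr-standard", "EU") == ("allow", "in_region")
assert guard("EU", "gdpr-standard", "US") == ("deny", "cross_region_blocked")
assert guard("EU", "gdpr-standard", "EU", has_silo_claim=False)[0] == "deny"
```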
Retention Simulations (Dry-Run)¶
Run dry-run purge windows to estimate impact before enforcement.
Process
- Select scope: tenant, category, time window, policyVersion.
- Compute eligibleAt for sampled artifacts; exclude active holds.
- Emit retention.dryrun.stats with counts/bytes and a per-mode breakdown.
- Compare to prior runs; flag anomalies (> 20% deviation).
API (sketch)
POST /retention/dryrun
{
"tenantId":"7c1a-...",
"category":"evidence.hot",
"policyVersion":"3.1.0",
"sampleRate":0.1,
"window":{"from":"2025-09-01","to":"2025-09-30"}
}
Expected output
{
"eligibleCount": 124331,
"eligibleBytes": 1834221912,
"blockedByHold": 1882,
"modeSplit": { "PURGE": 0, "REDACT_TOMBSTONE": 0, "ARCHIVE_THEN_PURGE": 124331 },
"lagSecondsP95": 840,
"policyVersion": "3.1.0"
}
Thresholds
- eligibleBytes change vs prior month ≤ ±20% (warn above, investigate above ±35%).
- blockedByHold / eligibleCount ratio > 10% triggers legal/data-ops review.
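The thresholds above can be packaged as a post-dry-run check, for example:

```python
# Flag dry-run anomalies per the documented thresholds: ±20% warn, ±35%
# investigate on eligibleBytes deviation; >10% hold ratio triggers review.
def dryrun_flags(current: dict, prior: dict) -> list:
    flags = []
    delta = abs(current["eligibleBytes"] - prior["eligibleBytes"]) / prior["eligibleBytes"]
    if delta > 0.35:
        flags.append("investigate:eligibleBytes")
    elif delta > 0.20:
        flags.append("warn:eligibleBytes")
    if current["blockedByHold"] / current["eligibleCount"] > 0.10:
        flags.append("review:holds_ratio")
    return flags

current = {"eligibleBytes": 1_834_221_912, "eligibleCount": 124_331, "blockedByHold": 1882}
prior = {"eligibleBytes": 1_500_000_000}
assert dryrun_flags(current, prior) == ["warn:eligibleBytes"]  # ~22% deviation
```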
DR Drills (Failover Exercises)¶
Prove RPO/RTO and residency-safe posture with scripted region incidents.
Scenarios
- Read-only posture: promote read replicas without writes until approval.
- Writer promotion: time-bound approval → update catalog → resume writes.
- Failback: return to primary when healthy; ensure reconciliation and no data loss.
sequenceDiagram
autonumber
participant OP as DR Orchestrator
participant CAT as Residency Catalog
participant SVC as Services
participant VER as Verifier
OP->>SVC: Inject incident (region=EU)
SVC-->>OP: Health degraded
OP->>CAT: Set posture=read_only (replica=EU2)
OP->>SVC: Run synthetic reads (p95 check)
OP->>CAT: Approve promotion (TTL=4h)
CAT-->>SVC: Role change event
OP->>SVC: Write & read validation
OP->>VER: Data parity & anchor checks
VER-->>OP: Evidence (ok/fail)
Targets (examples)
| Tier | RPO | RTO | Drill cadence |
|---|---|---|---|
| Standard | ≤ 30m | ≤ 4h | Semiannual |
| Business | ≤ 15m | ≤ 2h | Quarterly |
| Enterprise | ≤ 5m | ≤ 30m | Quarterly (+ ad-hoc) |
Evidence
- Drill report with: timestamps, regions, RPO/RTO achieved, replication lag histogram, anchor verification results, approval records, and post-mortem if breached.
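Evaluating a drill against the tier targets reduces to a simple comparison; the helper below expresses the table's targets in minutes and is a sketch, not a drill framework:

```python
# RPO/RTO targets per tier, in minutes, taken from the targets table above.
TARGETS = {"Standard": (30, 240), "Business": (15, 120), "Enterprise": (5, 30)}

def drill_result(tier, rpo_min, rto_min):
    rpo_target, rto_target = TARGETS[tier]
    return {
        "rpoMet": rpo_min <= rpo_target,
        "rtoMet": rto_min <= rto_target,
        "breached": rpo_min > rpo_target or rto_min > rto_target,
    }

assert drill_result("Enterprise", 4, 25) == {"rpoMet": True, "rtoMet": True, "breached": False}
assert drill_result("Business", 20, 90)["breached"]  # RPO target (15m) missed
```

A breached result would feed the post-mortem requirement noted above.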
Chaos & Negative Tests¶
- KMS latency/outage → services must degrade safely (queue, retry, no plaintext).
- TSA timeout → anchors still seal; TSA receipt deferred, flagged integrity.tsa_invalid if not received within SLA.
- Network egress blocked → exports remain in-region; cross-region attempts denied with cost/route reason.
- Hot shard surge → back-pressure (HTTP 429 + retry-after); no SLO breach for other tenants.
CI/CD Gates & Canary¶
- Pre-deploy (PR): unit/contract tests, policy lints, schema diffs (no breaking changes).
- Pre-prod canary: residency conformance for synthetic tenants in each region overlay.
- Post-deploy: retention dry-run sample; ABAC deny rate baseline; integrity verify sample.
- Automatic rollback on: deny spike > +5 pp, integrity failures > 0.1% sample, or DR canary lag > 50% RPO.
Pipeline gate (pseudo-YAML)
gates:
- name: residency-conformance
passIf: abac.deny.count{reason="cross_region_blocked"} >= 1 && abac.allow.count >= 1
- name: retention-dryrun
passIf: retention.dryrun.stats.count > 0 && retention.dryrun.anomaly == false
- name: integrity-sample
passIf: integrity.verify.fail.rate < 0.001
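A gate evaluator in the spirit of the pseudo-YAML might look like this sketch; metric names follow the gates above, while the flattened metrics dictionary and evaluation API are assumptions:

```python
# Evaluate the three documented gates against a flattened metrics snapshot.
def evaluate_gates(metrics: dict) -> dict:
    return {
        "residency-conformance":
            metrics["abac.deny.cross_region_blocked"] >= 1
            and metrics["abac.allow.count"] >= 1,
        "retention-dryrun":
            metrics["retention.dryrun.count"] > 0
            and not metrics["retention.dryrun.anomaly"],
        "integrity-sample":
            metrics["integrity.verify.fail.rate"] < 0.001,
    }

metrics = {"abac.deny.cross_region_blocked": 3, "abac.allow.count": 120,
           "retention.dryrun.count": 14, "retention.dryrun.anomaly": False,
           "integrity.verify.fail.rate": 0.0002}
assert all(evaluate_gates(metrics).values())  # deployment may proceed
```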
Reporting & Evidence Packs¶
- Test artifacts stored in write-once object storage with signatures.
- Monthly Verification Pack includes: conformance results, dry-run stats, drill reports, and selected logs/metrics snapshots; signed & timestamped.
Verification pack (manifest excerpt)
verificationPack:
id: vpk_2025_10
tenantScope: ["*"] # platform-wide with per-tenant slices
regions: ["westeurope","eastus","israelcentral"]
includes: [ "residencyTests", "retentionDryRun", "drDrill" ]
signatures: { keyId: "hsm-eu-01", tsa: true }
Metrics, Logs & Events (testing)¶
- Metrics: test.residency.pass.count, test.residency.fail.count, test.retention.dryrun.bytes, test.dr.rpo.met.count, test.dr.rto.met.count.
- Logs: each test logs {suite, caseId, tenantId, regionCode, policyVersion, result, reason, durationMs}.
- Events: test.suite_started, test.case_failed, dr.drill_started, dr.drill_completed.
Guardrails (quick checklist)¶
- Synthetic tenants cover each region code and profile permutation.
- Dry-runs precede any retention policy rollout and are compared to historical baselines.
- DR drills are scripted, scheduled, and evidenced; failures trigger post-mortems and remediation tickets.
- Chaos tests confirm safe degradation without violating residency or integrity guarantees.
- CI/CD gates block deployment on policy or integrity regressions and produce auditor-ready artifacts.
Governance, Risk & Compliance Mapping¶
This section crosswalks platform controls to major frameworks and laws. It focuses on where data lives (residency), how long it stays (retention), how it’s protected (encryption & access), and how we prove it (observability & evidence).
GDPR (EU)¶
Key themes & articles → Platform controls
| GDPR focus | Article(s) | Platform control(s) | Evidence produced |
|---|---|---|---|
| Territorial scope & cross-border transfer | Art. 3, 44–49 | Residency profiles, RegionCode, export routes, ABAC cross-region deny-by-default, DR read-only posture | Residency allow/deny logs, abac.deny.count, break-glass events |
| Storage limitation & data minimization | Art. 5(1)(c), 5(1)(e) | Retention policy-as-code, tiering gates, dry-run simulations, purge/redact modes | retention.* metrics, purge ledgers, signed archive manifests |
| Integrity & confidentiality | Art. 32 | Envelope encryption (tenant/region KEKs), HSM anchors, mTLS mesh, Private Link | KMS/HSM audit logs, key rotation dashboards, anchor signatures |
| Privacy by design/default | Art. 25 | Deny-by-default ABAC, purpose binding, least-privilege tokens, redaction templates on exports | ABAC decision logs, policy versioning, redaction maps |
| Data subject rights (access/erasure/portability) | Arts. 15, 17, 20 | DSAR pipeline, residency-aware in-region export, erasure via mode-aware delete | DSAR manifests (NDJSON/Parquet), dsar.* events, deletion maps |
| Records of processing (ROPA) | Art. 30 | Residency & retention catalogs, policy registry, export schema registry | Catalog snapshots in evidence packs |
| DPIA triggers & risk mgmt | Art. 35 | DPIA checklist on new geographies/features; exceptions register; change-control ADRs | DPIA record, ADR links in evidence packs |
DPIA triggers (use before enabling)
- New cross-region route, new export destination, new data category (PII/health/financial), or new profiling/aggregation flow.
Third-country transfers
- Only via explicit export routes plus contractual bases (SCCs/IDTA); technical measures: in-region encryption keys, no cross-border unwrap.
HIPAA (US)¶
Security Rule mapping (selected safeguards)
| Safeguard | 45 CFR | Platform control(s) | Evidence |
|---|---|---|---|
| Risk analysis & management | 164.308(a)(1) | Risk register, DPIA-like reviews, residency catalog | Risk log, review cadence in evidence packs |
| Workforce security & training | 164.308(a)(3) | Role-based access, purpose binding, break-glass approvals | Access reviews, approval artifacts |
| Access control | 164.312(a) | ABAC, per-tenant dataSiloId, deny-by-default | ABAC logs, abac.allow/deny metrics |
| Audit controls | 164.312(b) | Structured decision logs, append-only sinks, immutable ledgers | Log integrity checks, retention of logs |
| Integrity | 164.312(c) | Hash-chained segments, signed anchors, TSA receipts | integrity.* metrics, verification reports |
| Person/entity authentication | 164.312(d) | OIDC/workload identities, mTLS, SPKI pinning | AuthN logs, cert rotation logs |
| Transmission security | 164.312(e) | mTLS mesh, Private Link, egress allow-lists | Network policy attestations |
| Contingency plans (backup/DR) | 164.308(a)(7) | Region-coherent backups, DR runbooks, read-only failover | Backup/restore manifests, DR drill reports |
BAA considerations
- Define responsibility matrix (key custody, DSAR handling, breach notices), restrict PHI exports to approved in-region routes, and document MFA delivery of eDiscovery artifacts.
SOC 2 (Trust Services Criteria) & ISO/IEC 27001¶
Control crosswalk (high level)
| Domain | SOC 2 TSC | ISO/IEC 27001:2022 (examples) | Platform control(s) |
|---|---|---|---|
| Security (Common Criteria) | CC1–CC9 | A.5 Organizational, A.8 Technological | ABAC, least privilege, decision logs, change control |
| Availability | A1 | A.5.30, A.5.31 (ICT continuity) | DR tiers, RPO/RTO, drill evidence |
| Processing Integrity | PI1–PI4 | A.5.34 (data masking), A.8 (secure coding/tech) | Integrity chains, verification on read/schedule |
| Confidentiality | C1–C2 | A.8.24 (use of cryptography), A.5.10 (information deletion) | Envelope encryption, KEK rotation, retention & purge |
| Privacy | P1–P6 | A.5.36 (privacy by design), A.5.12 (PII in logs) | DSAR pipeline, redaction, log policies (no raw PII) |
| Change control | CC8.1 | A.5.23 (change management) | ADRs, CI/CD gates, policy versioning, dry-runs |
| Logging & monitoring | CC7 | A.8.15 (monitoring activities) | Metrics, structured logs, evidence packs |
Change management artifacts
- ADRs for residency routes, encryption algorithms, key rotation cadence, DR modes; gates in CI/CD (policy lints, conformance tests).
- Exceptions registry with time-boxed approvals (e.g., break-glass).
Control Owners & RACI (example)¶
| Control | Owner | Approver | Informed |
|---|---|---|---|
| Residency profile changes | Platform Security | DPO/Legal | SRE, Tenant Success |
| Retention policy changes | Data Governance | Legal | SRE, Product |
| Key rotation cadence | Security Ops | CISO delegate | Platform Security, SRE |
| DR posture & drills | SRE | Platform Security | Product, Legal |
| DSAR handling | DPO | Legal | Security, Tenant Success |
Policy Registry (pointers)¶
policies:
residency:  { version: 3.1.0, owners: [Platform Security, Legal] }
retention:  { version: 3.1.0, owners: [Data Governance, Legal] }
keys:       { version: 2.4.2, owners: [Security Ops] }
abac:       { version: 2.2.1, owners: [Platform Security] }
ediscovery: { version: 1.7.0, owners: [Legal, DPO] }
dr:         { version: 2.0.0, owners: [SRE] }
Risk Register (snapshot template)¶
| Risk | Description | Inherent | Treatment | Residual | Control references |
|---|---|---|---|---|---|
| Cross-border leakage | Unapproved EU→US export | High | ABAC + export routes deny-by-default; break-glass with TTL | Low | Residency profiles, ABAC logs |
| Key compromise | Tenant KEK exposure | High | HSM custody, dual-approval, escrow in-jurisdiction | Low | KMS/HSM logs, rotation evidence |
| DR promotion error | Writes in disallowed region | Medium | Read-only posture until legal approval; reversible promotion | Low | DR runbooks, event trail |
| Excess retention | Data kept past window | Medium | Policy-as-code, dry-run previews, purge ledgers | Low | retention.* metrics, audits |
Evidence & Reporting¶
- Auditor-ready packs (monthly/quarterly): residency decisions, retention outcomes, DSAR/export logs, DR drills, key events, policy versions.
- Signatures & TSA: packs and manifests are CMS-signed; optional RFC 3161 timestamps; stored in write-once class and delivered in-region.
Change Control & Exceptions¶
- Normal: ADR + PR with green gates (residency tests, retention dry-run, integrity sample).
- Emergency (break-glass): dual approval, scoped TTL, full audit trail, post-mortem within 5 business days.
Guardrails (quick checklist)¶
- Residency and export routes are policy-driven; cross-family flows require explicit legal basis.
- Retention is bounded and enforced; legal holds always override deletes.
- Encryption keys are region-anchored; no cross-border unwrap.
- DR promotions default to read-only until legal approval; drills produce measurable RPO/RTO evidence.
- Logs and packs contain no raw PII; everything is versioned, signed, and provable.
Migration & Evolution¶
Migrations move tenants between regions without breaking immutability or residency guarantees. Evolution changes policies and schemas in an additive-first way. Both produce evidence and have a clear rollback.
Region Migration Playbook (authoritative writer move)¶
Goals
- Zero data loss, minimal downtime; single-writer guarantee preserved.
- Residency-compliant: no disallowed cross-border flows; all steps audited.
Phases
1. Plan
   - Validate ResidencyProfile permits the destination RegionCode (or obtain legal approval).
   - Generate scope manifest: tenant, streams, expected bytes/segments, keys, holds, DSAR in-flight.
   - Pre-create target resources (VNet, KMS/HSM keys, storage, indexes) with tags & policy version.
2. Pre-copy & Warm
   - Copy hot segments and anchors as of T0 to target region (private endpoints only).
   - Warm indexes in target by rebuilding from hot; do not accept reads yet.
   - Run integrity verification (Merkle roots, anchor signatures, TSA receipts) on a sample ≥ 10%.
3. Freeze & Catch-up
   - Freeze writes at a segment boundary (short maintenance window).
   - Emit last source anchor A_n; copy delta segments since T0.
   - Verify end-to-end consistency (counts, bytes, hashes).
4. Bridge & Cutover
   - Create Bridge Anchor in target referencing R_n (see Evidence Immutability & Integrity).
   - Update Residency Catalog (atomically): set authoritative CloudRegion = target, policyVersion++.
   - Flip Gateway routing → target region; hold source region read-only for grace period.
5. Monitor & Reconcile
   - Monitor p95 latencies, error budget, and replication lag (should be zero).
   - Reconcile warm/read models and caches; drain old endpoints.
6. Decommission
   - After grace TTL expires and audits pass, decommission source artifacts (per retention).
   - Keep source anchors and migration manifest per policy.
flowchart TD
P[Plan] --> W[Pre-copy & Warm]
W --> F[Freeze & Catch-up]
F --> B[Bridge Anchor]
B --> C[Catalog Cutover]
C --> M[Monitor & Reconcile]
M --> D[Decommission]
Downtime strategy
- Writes paused only during Freeze & Catch-up (segment boundary). Reads can continue from source; optional read-only mirror in target for smoke tests.
Rollback
- If post-cutover checks fail within grace window: set catalog back to source → route writes to source → investigate. Bridge anchor remains as provenance; a reverse bridge may be issued if needed.
Migration manifest (excerpt)
migration:
tenantId: 7c1a-...
from: westeurope
to: northeurope
plannedAt: 2025-11-05T00:00:00Z
freezeAt: 2025-11-05T03:00:00Z
lastSourceRoot: "b64:R_n"
bridgeAnchorId: "A_bridge"
hot:
segments: { from: seg-000120, to: seg-000188 }
bytes: 4_231_009_882
warm:
rebuilt: true
verification:
leavesChecked: 3_482_901
anchorsChecked: 69
result: pass
catalogUpdateId: "cat_01JK..."
approvals: ["legal_approved", "security_change_window"]
policyVersion: 3.2.0
Policy snippet
migrationPolicy:
allowedRegionCodes: [ EU ] # example for GDPR tenants
require:
bridgeAnchor: true
freezeAtBoundary: true
dualApproval: true
evidencePack: true
gracePeriod: PT4H
rollbackWindow: PT4H
Policy Evolution (residency & retention catalogs)¶
Evolve policies with versioned catalogs and stricter-first principles. Every decision logs the policy version.
- Catalogs
  - residencyCatalog vN: tenantId → { CloudRegion, RegionCode, ResidencyProfileId, DataSiloId }
  - retentionCatalog vN: defaults/bounds, tenant and stream overrides.
- Change types
- Additive (safe by default): new profiles/regions/routes; new categories; stricter windows.
- Tightening: shorter windows for new & eligible existing data (see “Stricter-Only Evolution”).
- Breaking (rare): export route reductions, algorithm deprecations → require ADR + dual approval + staged rollout.
- Rollout path
- Lint & dry-run in CI (retention dry-run, ABAC conformance).
- Canary in one region/tenant cohort.
- Staged enablement with decision telemetry watch (deny/allow rates within baseline).
- Version pinning for services; upgrade only when evidence is green.
- Deprecations
  - Publish end-of-support dates; provide auto-translation maps (e.g., same_code → in_region).
Policy registry update
policies:
retention: { version: 3.2.0, staged: ["westeurope"], target: "global", dryRun: true }
residency: { version: 3.2.0, effectiveAt: "2025-11-10T00:00:00Z" }
Backwards-Compatibility & Reindexing (minimal downtime)¶
Warm/read models change more often than hot evidence. Use online reindex with alias switch and idempotent replay from hot.
Pattern
- Build new index (vNext) in target region with new schema.
- Backfill by replaying hot segments (idempotent projections).
- Dual-write only if policy permits and needed (prefer not; otherwise buffer mutations until swap).
- Compare counts, key stats, sampled queries (p95, top N).
- Swap alias current → vNext atomically; keep vPrev for rollback TTL.
- Retire vPrev after TTL if parity holds.
Index alias example
index:
name: "aud_gateway"
aliases:
current: "aud_gateway_v3"
write: "aud_gateway_v3"
next:
build: "aud_gateway_v4"
swapWhen:
parityDeltaPctMax: 1.0
p95SlowdownPctMax: 5.0
rollbackTtl: P7D
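The swapWhen conditions translate directly into a swap decision, for example:

```python
# Decide whether the alias may swap to vNext: parity delta and p95 slowdown
# must both stay within the configured maxima (percentages from the example).
def should_swap(parity_delta_pct, p95_current_ms, p95_next_ms,
                parity_max=1.0, slowdown_max=5.0):
    slowdown_pct = (p95_next_ms - p95_current_ms) / p95_current_ms * 100
    return parity_delta_pct <= parity_max and slowdown_pct <= slowdown_max

assert should_swap(parity_delta_pct=0.4, p95_current_ms=120, p95_next_ms=124)
assert not should_swap(parity_delta_pct=0.4, p95_current_ms=120, p95_next_ms=140)
```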
Reprojection & compaction
- Replayers consume purge/redact events to keep vNext consistent during backfill.
- Compactors merge small segments to meet vNext partitioning rules.
Online schema guards
- New fields are additive; removals require mapping/derivation.
- Feature flags gate query paths to avoid mixed-schema surprises.
Change Control & Evidence¶
- ADRs for migrations, DR posture changes, encryption/algorithm updates.
- Tickets & approvals linked to manifests and evidence packs.
- Events: migration.started, migration.bridge_created, migration.cutover, migration.rollback, catalog.updated, index.swap_completed.
Metrics
migration.bytes.copied, migration.freeze.seconds, migration.cutover.latency.ms, index.backfill.rate.rows_s, index.parity.delta_pct, policy.version.adoption.rate.
Logs
- Structured entries for each phase with {tenantId, from, to, policyVersion, counts, bytes, reason, correlationId}.
Guardrails (quick checklist)¶
- Single authoritative writer; cross-region writes only during approved cutover window.
- Bridge anchor and provenance manifest are mandatory; integrity verified end-to-end.
- Catalog updates are atomic and versioned; gateways/services revalidate immediately.
- Policy evolution is additive-first with dry-run and canary stages; every decision stamps policyVersion.
- Reindexing uses online swap with rollback TTL; projections are idempotent and purge-aware.
- Rollback is defined, tested, and time-bound; all steps produce auditor-ready evidence.
Appendix A — Example Policy Bundle (Sketch)¶
policyVersion: 3.3.0
residency:
regionCode: EU # Jurisdiction umbrella
cloudRegion: westeurope # Authoritative writer region
residencyProfileId: gdpr-standard
allowedRegionCodes: [ EU ] # Cross-family flows denied by default
exportRoutes: [ in_region ] # Allowed delivery routes for eDiscovery/DSAR
movement:
replicate: replicate_read # Read-only replicas within same RegionCode
failoverPosture: read_only # Default posture during DR until approval
migrationApproval: required # Dual approval for region moves
network:
privateEndpointsOnly: true
crossRegionEgress: deny
retention:
defaults:
evidence.hot: { window: P7Y, mode: ARCHIVE_THEN_PURGE }
evidence.manifest: { window: P10Y, mode: REDACT_TOMBSTONE }
readmodel.warm: { window: P90D, mode: PURGE }
archive.cold: { window: P10Y, mode: PURGE }
integrity.proof: { window: P15Y, mode: REDACT_TOMBSTONE }
export.bundle: { window: P2Y, mode: PURGE }
bounds:
min:
evidence.hot: P1Y
readmodel.warm: P30D
max:
evidence.hot: P10Y
archive.cold: P15Y
integrity.proof: P20Y
exceptions:
legalHold: true
regulatorExtension: true
investigationHold: true
dryRun:
enabled: true
sampleRate: 0.05
holds:
dualApproval: true
regionScope: authoritative_only
overlapStrategy: additive
dsar:
export:
formats: [ ndjson, parquet ]
route: in_region
redactTemplate: pii-default
sign: { cms: true, tsa: true }
delete:
modesByCategory:
evidence.hot: REDACT_TOMBSTONE
integrity.proof: REDACT_TOMBSTONE
readmodel.warm: PURGE
abac:
default: deny
requirePurpose: true
breakGlass:
enabled: true
ttl: PT4H
dualApproval: true
keys:
tenantScoped: true
regionAnchored: true
algorithms: { encrypt: AES-256-GCM, wrap: RSA-OAEP-256, sign: ECDSA-P256-SHA256 }
rotation: { dek: P1D, kek: P90D, sunset: P7D }
escrow:
enabled: true
jurisdiction: EU
dr:
mode: active_passive
writerRegion: westeurope
replicas:
- region: northeurope
scope: [ warm, cold ]
rpo: PT15M
rto: PT2H
promotion:
defaultPosture: read_only
approvalRequired: true
backup:
regionLocal: true
hot: { full: P1D, incremental: PT1H }
warm: { strategy: rebuild_first }
cold: { immutability: true, lifecycle: { toArchiveAfter: P30D } }
observability:
logs:
format: json
sink: append_only
retention: P180D
evidencePacks:
schedule: R/P1M
sign: { cms: true, tsa: true }
costGuards:
crossRegion:
default: deny
estimateBeforeExecute: true
tiering:
hot: { targetWindow: P14D }
warm: { targetWindow: P90D, rebuildFirst: true }
cold: { storageClass: archive_immutable }
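A policy linter could verify that the retention defaults above stay within the declared bounds. A sketch that coarsely converts ISO-8601 date durations to days (Y=365, M=30) for comparison only; the helper and its interface are assumptions:

```python
import re

# Approximate an ISO-8601 date duration (PnYnMnD) as a day count.
# Good enough for ordering comparisons; not calendar-accurate.
def to_days(p):
    m = re.fullmatch(r"P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?", p)
    y, mo, d = (int(g or 0) for g in m.groups())
    return y * 365 + mo * 30 + d

def check_bounds(defaults, bounds):
    violations = []
    for category, window in defaults.items():
        lo = bounds.get("min", {}).get(category)
        hi = bounds.get("max", {}).get(category)
        if lo and to_days(window) < to_days(lo):
            violations.append(f"{category}:below_min")
        if hi and to_days(window) > to_days(hi):
            violations.append(f"{category}:above_max")
    return violations

defaults = {"evidence.hot": "P7Y", "readmodel.warm": "P90D"}
bounds = {"min": {"evidence.hot": "P1Y", "readmodel.warm": "P30D"},
          "max": {"evidence.hot": "P10Y"}}
assert check_bounds(defaults, bounds) == []  # bundle defaults are within bounds
```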
Appendix B — Purge Decision Table (Sketch)¶
| Condition | Action | Outcome / Events |
|---|---|---|
| Active legal hold | Block | 409 Conflict, emit retention.purge_blocked{reason=hold_active} |
| Regulator extension applies | Block or Redact | Deny purge or allow REDACT_TOMBSTONE per directive |
| DSAR delete requested and no holds | Apply DSAR mode | dsar.delete_completed (redact/purge by category), log decision |
| Cutoff not reached | Skip | No-op; schedule next window, retention.window_pending |
| Eligible with export flag | Export+Purge | Emit manifest; verify checksums/signature; then retention.purged |
| Export verification failed | Rollback & Block | retention.export_verification_failed; requeue batch with backoff |
| Approval required but missing | Block | retention.approval_pending; await dual approval |
| Residency/route would be violated | Block | retention.purge_blocked{reason=route_denied} |
| Storage precondition (ETag/generation) mismatch | Retry | Idempotent retry with jitter; circuit-break on repeated provider errors |
| All checks passed (PURGE mode) | Physical delete | retention.purged; write Purge Ledger → commit |
| All checks passed (REDACT_TOMBSTONE mode) | Redact + tombstone | retention.redacted; lineage preserved with ledgerId and proofRefs[] |
Pseudocode (evaluation order, sketch)
def decide_purge(artifact, policy, holds, dsar, approvals, residency):
    if holds.active_for(artifact): return Block("hold_active")
    if residency.route_denied_for(artifact): return Block("route_denied")
    mode = dsar.mode_for(artifact) or policy.mode_for(artifact)
    # DSAR deletes bypass the retention window (per the decision table);
    # policy-driven purges must wait for eligibility.
    if not dsar.mode_for(artifact) and not policy.eligible(artifact):
        return Skip("window_pending")
    if policy.requires_approval(artifact) and not approvals.ok():
        return Block("approval_pending")
    if mode == "ARCHIVE_THEN_PURGE":
        manifest = export_and_sign(artifact)
        if not verify(manifest): return Block("export_verification_failed")
        return Purge(manifest)
    if mode == "REDACT_TOMBSTONE": return RedactWithTombstone()
    return Purge()
Notes
- Every mutation is preceded by a Purge Ledger entry and emits structured decision logs with policyVersion, tenantId, regionCode, category, ledgerId, and reason codes.
- All actions are region-local and respect network boundaries and rate limits.