Message Schemas - Audit Trail Platform (ATP)
Messages as contracts: ATP's message schemas define precise, versioned, and testable async contracts using AsyncAPI specifications, JSON Schema validation, and at-least-once delivery semantics for an event-driven architecture.
📋 Documentation Generation Plan
This document will be generated in 12 cycles. Current progress:
| Cycle | Topics | Estimated Lines | Status |
|---|---|---|---|
| Cycle 1 | AsyncAPI Fundamentals & Message Architecture (1-2) | ~2,500 | ⏳ Not Started |
| Cycle 2 | Ingestion Domain Events (3-4) | ~2,500 | ⏳ Not Started |
| Cycle 3 | Storage & Stream Domain Events (5-6) | ~2,500 | ⏳ Not Started |
| Cycle 4 | Classification & PII Events (7-8) | ~2,000 | ⏳ Not Started |
| Cycle 5 | Retention & Lifecycle Events (9-10) | ~2,000 | ⏳ Not Started |
| Cycle 6 | Policy & Tenant Events (11-12) | ~2,000 | ⏳ Not Started |
| Cycle 7 | Export & Integrity Events (13-14) | ~2,000 | ⏳ Not Started |
| Cycle 8 | Integration Events Catalog (15-16) | ~2,500 | ⏳ Not Started |
| Cycle 9 | Message Envelope & Metadata (17-18) | ~2,000 | ⏳ Not Started |
| Cycle 10 | Schema Versioning & Evolution (19-20) | ~2,000 | ⏳ Not Started |
| Cycle 11 | Message Reliability & Patterns (21-22) | ~2,000 | ⏳ Not Started |
| Cycle 12 | Testing & Observability (23-24) | ~2,000 | ⏳ Not Started |
Total Estimated Lines: ~26,000
Purpose & Scope
This document provides complete AsyncAPI specifications for all asynchronous message contracts in the Audit Trail Platform (ATP), defining domain events, integration events, message envelopes, schemas, routing rules, delivery guarantees, and versioning strategies to ensure reliable, traceable, and compliant event-driven communication across all ATP services and external integrations.
Key Message Contract Principles

- AsyncAPI 2.6+: Industry-standard async API specification format
- JSON Schema: Message payload validation with strict typing
- Immutable Messages: Events never change after publishing
- Versioned Schemas: All messages have explicit schema versions
- At-Least-Once Delivery: Every message is delivered at least once and may be delivered more than once
- Idempotency Required: All consumers must handle duplicate messages
- Ordered Within Partition: Messages ordered by partition key (TenantId)
- CloudEvents Standard: Envelope follows the CloudEvents specification
What this document covers
- Establish ATP's message architecture: Event bus topology, topics, subscriptions, routing
- Define AsyncAPI specifications: Complete specs for all message types with operations, channels, schemas
- Document domain events catalog: 50+ events within bounded contexts with complete schemas
- Specify integration events catalog: 20+ cross-context events with published language
- Detail message envelope structure: Metadata, headers, correlation, causation per CloudEvents
- Describe message routing: Topic architecture, subscription filters, dead-letter queues
- Outline schema versioning: Evolution rules, compatibility, migration strategies
- Document delivery guarantees: At-least-once, ordering, idempotency, retry, DLQ
- Specify message serialization: JSON (primary), Protocol Buffers, Avro (alternatives)
- Detail message validation: Schema validation, envelope validation, content validation
- Describe message observability: Distributed tracing, metrics, monitoring, debugging
- Document message testing: Schema validation tests, contract tests, replay tests
Out of scope (referenced elsewhere)
- REST API specifications (see rest-apis.md)
- Webhook specifications (see webhooks.md)
- Domain model and aggregates (see ../aggregates-entities.md)
- Event catalog overview (see ../events-catalog.md)
- Azure Service Bus infrastructure (see ../../implementation/messaging.md)
- Outbox/Inbox patterns (see ../../implementation/outbox-inbox-idempotency.md)
Readers & ownership
- Backend Developers (owners): Message publishing, handling, schema design, implementation
- Integration Engineers: Cross-service integration, message routing, external system integration
- Architects: Event-driven architecture, message flow design, bounded context integration
- QA/Test Engineers: Message contract testing, schema validation, replay testing
- Operations/SRE: Message monitoring, throughput analysis, DLQ management, incident response
- Compliance/Audit: Message audit trails, retention policies, compliance evidence
Artifacts produced
- AsyncAPI Specifications: Complete AsyncAPI 2.6 specs for all message types
- JSON Schema Definitions: Payload schemas for all events with validation rules
- Message Catalog: All domain and integration events with metadata
- Routing Configuration: Azure Service Bus topics, subscriptions, filters
- Envelope Specification: CloudEvents-compliant message envelope structure
- Schema Registry Setup: Centralized schema storage with versioning
- Validation Framework: Schema validation at publish and consume
- Message Flow Diagrams: Producer-consumer flows with sequence diagrams
- Versioning Guide: Schema evolution rules and migration procedures
- Contract Tests: Automated tests for all message schemas
- Monitoring Dashboards: Message metrics, throughput, latency, errors
- Troubleshooting Guides: Common message issues and resolutions
Acceptance (done when)
- All domain events (50+) have complete AsyncAPI specifications with JSON Schema payloads
- All integration events (20+) have complete specifications with routing rules
- Message envelope follows CloudEvents standard with all required metadata
- AsyncAPI document is valid and can be rendered in AsyncAPI tools
- Schema validation is implemented for all publishers and consumers
- Routing rules are documented for all topics and subscriptions in Azure Service Bus
- Versioning strategy is documented with compatibility rules and migration examples
- Delivery guarantees are specified: at-least-once, ordering, idempotency requirements
- Message monitoring is operational with metrics, dashboards, and alerts
- Contract tests validate all message schemas in CI/CD pipelines
- Troubleshooting guide covers common message issues and debugging procedures
- Documentation complete with comprehensive examples, diagrams, code samples, and cross-references
Detailed Cycle Plan
CYCLE 1: AsyncAPI Fundamentals & Message Architecture (~2,500 lines)
Topic 1: AsyncAPI and Message Architecture
What will be covered:

- What is AsyncAPI?
    - AsyncAPI specification overview
    - AsyncAPI vs OpenAPI (async vs sync)
    - AsyncAPI structure (info, servers, channels, operations, components)
    - AsyncAPI benefits (documentation, code generation, validation)
    - AsyncAPI tooling ecosystem
- Why AsyncAPI for ATP?
    - Document async message contracts
    - Generate code from specifications
    - Validate messages against schemas
    - Enable contract-first development
    - Provide discoverable documentation
- AsyncAPI Specification Structure

    ```yaml
    asyncapi: '2.6.0'
    info:
      title: ATP Message Contracts
      version: '1.0.0'
    servers:
      production:
        url: atp-servicebus.servicebus.windows.net
        protocol: amqp
    channels:
      audit.events.received:
        publish:
          message:
            $ref: '#/components/messages/AuditEventReceived'
    components:
      messages:
        AuditEventReceived:
          payload:
            $ref: '#/components/schemas/AuditEventReceivedPayload'
      schemas:
        AuditEventReceivedPayload:
          type: object
          properties: ...
    ```

- ATP Message Architecture
    - Azure Service Bus: Message backbone (topics, subscriptions, queues)
    - Topics: Logical channels for event categories
    - Subscriptions: Consumer-specific message streams with filters
    - Dead-Letter Queues: Failed message handling
    - Message Sessions: Ordered delivery within session
- ATP Topic Architecture

    ```
    Azure Service Bus Topics
    ├── atp.ingestion.events          # Ingestion domain events
    │   ├── storage-service-sub       # Storage service subscription
    │   └── classification-service-sub
    ├── atp.storage.events            # Storage domain events
    │   ├── query-service-sub
    │   └── integrity-service-sub
    ├── atp.classification.events     # Classification events
    │   ├── query-service-sub
    │   └── retention-service-sub
    ├── atp.policy.events             # Policy events
    │   └── all-services-sub          # Broadcast to all
    ├── atp.export.events             # Export events
    │   └── webhook-service-sub
    └── atp.integration.events        # Cross-context events
        └── external-integrations-sub
    ```

- Message Flow Patterns
    - Fan-Out: One event, multiple consumers (pub/sub)
    - Competing Consumers: Multiple instances share a subscription; each message is processed by exactly one of them
    - Priority Queue: High-priority events processed first
    - Request-Reply: Async request with correlation (rare)
- Message Delivery Guarantees
    - At-Least-Once: Every message delivered at least once (may duplicate)
    - Ordered Delivery: Within partition/session key
    - Durable: Messages persisted until acknowledged
    - TTL: Messages expire after time-to-live
- CloudEvents Standard
    - CloudEvents specification compliance (a CNCF specification)
    - Standard envelope attributes (id, source, specversion, type are required; time is optional in CloudEvents)
    - Extension attributes for ATP (tenantid, correlationid; CloudEvents attribute names are lowercase)
- Message Size Limits
    - Azure Service Bus: 256 KB per message (Standard); 1 MB by default on Premium, which can be configured for larger messages (up to 100 MB)
    - ATP recommendation: < 64 KB for performance
    - Large payloads: Reference to blob storage

Code Examples:

- Complete AsyncAPI specification (excerpt)
- Topic configuration (Azure Service Bus)
- Subscription with filters
- CloudEvents envelope (JSON)
- Message publishing code (C#)
- Message consuming code (C#)

Diagrams:

- AsyncAPI specification structure
- ATP message architecture
- Azure Service Bus topology
- Message flow patterns
- CloudEvents envelope structure

Deliverables:

- AsyncAPI fundamentals guide
- ATP message architecture spec
- Topic and subscription design
- CloudEvents envelope specification
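The size guidance in Topic 1 (keep messages under 64 KB and move large payloads to blob storage) can be illustrated with a small claim-check sketch. This is a hedged Python illustration, not ATP's implementation: `prepare_body`, the `dataRef` field name, and the `upload_blob` callback are hypothetical names introduced here.

```python
import json

# ATP recommendation from this section: keep message bodies under 64 KB.
MAX_INLINE_BYTES = 64 * 1024


def prepare_body(payload, upload_blob):
    """Return an inline body, or a blob reference when the payload is too large.

    `upload_blob` is a hypothetical callback: bytes -> URL string.
    """
    encoded = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(encoded) < MAX_INLINE_BYTES:
        return {"inline": True, "data": payload}
    # Claim check: store the bytes elsewhere and send only a reference.
    blob_url = upload_blob(encoded)
    return {"inline": False, "dataRef": blob_url, "sizeBytes": len(encoded)}
```

A consumer would check `inline` and, when it is false, fetch the payload from the referenced location before validating it against the schema.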
Topic 2: Message Envelope and Metadata
What will be covered:

- CloudEvents Envelope Structure

    ```json
    {
      "specversion": "1.0",
      "id": "01HQZXYZ123456789ABCDEFGHJ",
      "source": "atp://ingestion-api",
      "type": "com.connectsoft.atp.ingestion.AuditEventReceived",
      "time": "2024-10-30T10:30:00Z",
      "datacontenttype": "application/json",
      "subject": "tenant-abc-1234/event-xyz-789",
      "data": { /* event payload */ },
      "tenantid": "tenant-abc-1234",
      "correlationid": "corr-abc-123",
      "causationid": "event-prev-456",
      "schemaversion": "1.0.0"
    }
    ```

- Required Envelope Attributes
    - specversion: CloudEvents version ("1.0")
    - id: Unique event identifier (ULID format)
    - source: Event producer URI (atp://service-name)
    - type: Event type (fully qualified, reverse domain)
    - time: Event timestamp (ISO 8601, UTC); optional in CloudEvents 1.0 but required by ATP
- Optional Standard Attributes
    - datacontenttype: Payload format (application/json, application/protobuf)
    - dataschema: Schema URI for payload validation
    - subject: Context-specific subject identifier
- ATP Extension Attributes
    - tenantid: Tenant context (multi-tenancy isolation)
    - correlationid: Request correlation across services
    - causationid: Event that caused this event (event chain)
    - schemaversion: Message schema version (1.0.0, 2.1.3)
    - aggregateid: Aggregate instance identifier
    - aggregatetype: Aggregate type name
    - aggregateversion: Aggregate version (optimistic concurrency)
- Event ID Generation (ULID)
    - ULID format: 26 characters, time-ordered, unique
    - Benefits: Sortable by time, globally unique, compact
    - Generation algorithm
    - Collision prevention
- Timestamp Handling
    - Server-side generation (UTC, monotonic)
    - ISO 8601 format: `2024-10-30T10:30:00.123Z`
    - Timestamp precision (milliseconds)
    - Clock skew handling
- Correlation and Causation
    - CorrelationId: Tracks entire request flow across services
    - CausationId: Tracks event chains (Event A caused Event B)
    - Propagation through distributed system
    - Tracing and debugging with correlation
- Tenant Context Propagation
    - TenantId in every message (no default tenant)
    - Multi-tenancy enforcement
    - Tenant validation at consume
    - Cross-tenant message prevention
- Message Headers (Azure Service Bus)
    - User properties: Custom metadata
    - System properties: MessageId, EnqueuedTime, SequenceNumber
    - Session properties: SessionId for ordering
    - Correlation properties: CorrelationId, ReplyTo

Code Examples:

- Complete CloudEvents envelope (JSON)
- ULID generation code (C#)
- Correlation ID propagation
- Tenant context extraction
- Azure Service Bus message with properties

Diagrams:

- CloudEvents envelope structure
- Correlation and causation flow
- Tenant context propagation
- Message header mapping

Deliverables:

- CloudEvents envelope specification
- Metadata standards
- Correlation guide
- Tenant context requirements
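To make the envelope and ULID items in Topic 2 concrete, here is a minimal Python sketch (the plan itself targets C#). It assumes the standard ULID layout of a 48-bit millisecond timestamp followed by 80 random bits, encoded in Crockford base32; `new_ulid` and `build_envelope` are hypothetical helper names, and the attribute set mirrors the envelope listed in this topic.

```python
import os
import time
from datetime import datetime, timezone

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # no I, L, O, U


def new_ulid(timestamp_ms=None, randomness=None):
    """Encode 48 bits of time plus 80 random bits as 26 base32 characters."""
    ts = int(time.time() * 1000) if timestamp_ms is None else timestamp_ms
    rnd = os.urandom(10) if randomness is None else randomness
    value = (ts << 80) | int.from_bytes(rnd, "big")
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 0x1F])  # lowest 5 bits first
        value >>= 5
    return "".join(reversed(chars))


def build_envelope(event_type, source, tenant_id, correlation_id, data,
                   causation_id=None, schema_version="1.0.0"):
    """Assemble a CloudEvents 1.0 envelope with the ATP extension attributes."""
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    envelope = {
        "specversion": "1.0",
        "id": new_ulid(),
        "source": source,
        "type": event_type,
        "time": now.replace("+00:00", "Z"),
        "datacontenttype": "application/json",
        "tenantid": tenant_id,            # extension names are lowercase
        "correlationid": correlation_id,
        "schemaversion": schema_version,
        "data": data,
    }
    if causation_id is not None:
        envelope["causationid"] = causation_id
    return envelope
```

Because the timestamp occupies the most significant bits, identifiers generated in later milliseconds sort after earlier ones as plain strings. This sketch does not guarantee ordering for identifiers created within the same millisecond.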
CYCLE 2: Ingestion Domain Events (~2,500 lines)
Topic 3: AuditEventReceived Event
What will be covered:

- Event Overview
    - Name: `com.connectsoft.atp.ingestion.AuditEventReceived`
    - Type: Domain Event
    - Bounded Context: Ingestion
    - Producer: Ingestion API Service
    - Consumers: Storage Service, Classification Service, Monitoring Service
    - Description: Published when an audit event is successfully ingested and validated
- AsyncAPI Specification

    ```yaml
    channels:
      atp/ingestion/events/received:
        publish:
          operationId: publishAuditEventReceived
          summary: Publish audit event received notification
          description: |
            Published when an audit event is successfully ingested, validated,
            and accepted for processing. Consumers should store the event,
            apply classification, and update read models.
          message:
            $ref: '#/components/messages/AuditEventReceived'
          tags:
            - name: ingestion
            - name: domain-event
          bindings:
            amqp:
              expiration: 86400000  # 24 hours TTL
              priority: 5

    components:
      messages:
        AuditEventReceived:
          name: AuditEventReceived
          title: Audit Event Received
          summary: An audit event was successfully ingested
          contentType: application/json
          payload:
            $ref: '#/components/schemas/AuditEventReceivedPayload'
          correlationId:
            location: $message.header#/correlationId
          examples:
            - name: UserLoginEvent
              summary: User login audit event
              payload:
                eventId: "01HQZXYZ123456789ABCDEFGHJ"
                tenantId: "tenant-abc-1234"
                timestamp: "2024-10-30T10:30:00.123Z"
                actor: {...}
                action: "User.Login"
                resource: {...}

      schemas:
        AuditEventReceivedPayload:
          type: object
          required:
            - eventId
            - tenantId
            - timestamp
            - actor
            - action
            - resource
          properties:
            eventId:
              type: string
              description: Unique event identifier (ULID format)
              pattern: "^[0-9A-HJKMNP-TV-Z]{26}$"
              example: "01HQZXYZ123456789ABCDEFGHJ"
            tenantId:
              type: string
              description: Tenant identifier
              pattern: "^tenant-[a-z0-9-]{8,}$"
              example: "tenant-abc-1234"
            timestamp:
              type: string
              format: date-time
              description: When the action occurred (UTC)
              example: "2024-10-30T10:30:00.123Z"
            actor:
              $ref: '#/components/schemas/Actor'
            action:
              type: string
              description: Action performed (Noun.Verb format)
              example: "User.Login"
            resource:
              $ref: '#/components/schemas/Resource'
            context:
              type: object
              description: Additional contextual information
              additionalProperties: true
            payload:
              type: object
              description: Event-specific data (may be encrypted)
              additionalProperties: true
            correlationId:
              type: string
              format: uuid
              description: Request correlation identifier
            sourceSystem:
              type: string
              description: System that emitted the event
              example: "identity-service"
    ```

- Message Schema Details
    - Every property documented with description, type, format, example
    - Required vs optional fields
    - Validation rules (patterns, min/max, enums)
    - Nested object schemas (Actor, Resource)
    - Extensions for future compatibility
- Producer Implementation
    - When to publish (after successful validation and acceptance)
    - How to construct message (CloudEvents envelope + payload)
    - Where to publish (topic: atp.ingestion.events)
    - Error handling (publish failures)
    - Transactional outbox pattern
- Consumer Implementation
    - Subscribing to topic
    - Schema validation on receive
    - Idempotency checking (eventId)
    - Business logic processing
    - Acknowledgment or dead-letter
- Routing Rules
    - Topic: `atp.ingestion.events`
    - Subscription filters: All consumers receive (no filtering)
    - Session ID: TenantId (ordered per tenant)
    - Priority: Normal (5)
- Delivery Guarantees
    - At-least-once delivery
    - Ordered within tenant (session-based)
    - Idempotency required on consumer
    - TTL: 24 hours
    - Retry: 3 attempts with exponential backoff
    - DLQ: After max retries
- Monitoring and Observability
    - Metrics: Publish rate, consume rate, lag
    - Tracing: Correlation ID propagation
    - Logging: Event lifecycle logs
    - Alerts: Publish failures, consumer lag, DLQ growth

Code Examples:

- Complete AsyncAPI spec for AuditEventReceived
- Full JSON Schema with all validations
- Producer code (C# with MassTransit/NServiceBus)
- Consumer code with idempotency
- Schema validation code
- Azure Service Bus configuration

Diagrams:

- AuditEventReceived flow (sequence diagram)
- Producer-consumer architecture
- Routing topology
- Retry and DLQ flow

Deliverables:

- Complete AsyncAPI spec for AuditEventReceived
- JSON Schema definition
- Producer/consumer implementation guide
- Routing configuration
- Monitoring setup
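The consumer steps listed in Topic 3 (validate on receive, check idempotency by eventId, process, then acknowledge or dead-letter) could look roughly like the following Python sketch. It is an assumption-laden illustration: `IdempotentConsumer` is a hypothetical class, the validation covers only the required fields and the eventId pattern rather than the full JSON Schema, and the in-memory set stands in for a durable inbox store.

```python
import re

ULID_RE = re.compile(r"[0-9A-HJKMNP-TV-Z]{26}")
REQUIRED_FIELDS = ("eventId", "tenantId", "timestamp", "actor", "action", "resource")


class IdempotentConsumer:
    """Processes each eventId at most once and reports a settlement decision."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: a durable inbox table in a real service

    def receive(self, payload):
        missing = [name for name in REQUIRED_FIELDS if name not in payload]
        if missing or not ULID_RE.fullmatch(str(payload.get("eventId", ""))):
            return "dead-letter"      # invalid messages are never retried
        if payload["eventId"] in self._seen:
            return "duplicate"        # acknowledge without reprocessing
        self._handler(payload)
        self._seen.add(payload["eventId"])  # record only after success
        return "processed"
```

Recording the eventId only after the handler succeeds means a failed handler leaves the message eligible for redelivery, which matches the at-least-once guarantee described above.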
Topic 4: Ingestion Validation and Batch Events
What will be covered:

- AuditEventValidationFailed Event
    - Complete AsyncAPI specification
    - Schema with validation errors array
    - Producer: Ingestion API (validation logic)
    - Consumers: Monitoring, Alerting, Analytics
    - Use case: Track validation failures, improve data quality
- BatchIngestionStarted Event
    - Batch ingestion initiated
    - Schema: BatchId, TenantId, EventCount, Source
    - Producer: Batch Ingestion Service
    - Consumers: Monitoring, Progress Tracking
- BatchIngestionCompleted Event
    - Batch processing finished
    - Schema: BatchId, SuccessCount, FailureCount, Duration, Errors
    - Producer: Batch Ingestion Service
    - Consumers: Monitoring, Analytics, Reporting
- BatchIngestionFailed Event
    - Entire batch failed
    - Schema: BatchId, FailureReason, ErrorDetails
    - Producer: Batch Ingestion Service
    - Consumers: Alerting, Incident Management

Each event gets a complete specification at the same level of detail as Topic 3.

Deliverables:

- AsyncAPI specs for 4 ingestion events
- JSON Schemas for all payloads
- Implementation guides
- Routing configurations
CYCLE 3: Storage & Stream Domain Events (~2,500 lines)
Topic 5: Event Stream Lifecycle Events
What will be covered:

- EventStreamCreated Event
    - New stream initialized
    - Complete AsyncAPI specification
    - Schema: StreamId, TenantId, SubjectId, StreamType, CreatedAt
    - Producer: Storage Service
    - Consumers: Query Service, Integrity Service, Monitoring
- EventAddedToStream Event
    - Event appended to stream
    - Schema: StreamId, EventId, SequenceNumber, PreviousHash, AddedAt
    - Producer: Storage Service
    - Consumers: Integrity Service, Query Service
- EventStreamSealed Event
    - Stream finalized (no more events)
    - Schema: StreamId, FinalEventCount, MerkleRoot, SealedAt, SealedBy
    - Producer: Storage Service
    - Consumers: Integrity Service, Export Service, Archive Service
- EventStreamArchived Event
    - Stream moved to cold storage
    - Schema: StreamId, ArchiveLocation, ArchiveFormat, ArchivedAt

Complete specifications for each event

Deliverables:

- AsyncAPI specs for stream lifecycle events
- JSON Schemas
- Implementation guides
Topic 6: Storage and Integrity Events
What will be covered:

- EventPersisted Event
    - Event successfully stored
    - Schema: EventId, StorageLocation, StorageFormat, PersistedAt
- EventHashCalculated Event
    - Hash computed for event
    - Schema: EventId, Hash, HashAlgorithm, CalculatedAt
- HashChainUpdated Event
    - Stream hash chain updated
    - Schema: StreamId, PreviousHash, NewHash, EventCount
- IntegrityProofGenerated Event
    - Cryptographic proof created
    - Schema: StreamId, ProofType, ProofData, GeneratedAt

Complete specifications for each event

Deliverables:

- AsyncAPI specs for storage and integrity events
- JSON Schemas
- Implementation guides
CYCLE 4: Classification & PII Events (~2,000 lines)
Topic 7: Classification Events
What will be covered:

- EventClassified Event
    - Classification applied to event
    - Complete specification
    - Schema: EventId, Classification, Confidence, ClassificationRules, ClassifiedAt
- ClassificationUpdated Event
    - Classification changed (upgraded)
    - Schema: EventId, OldClassification, NewClassification, Reason
- AutoClassificationApplied Event
    - AI/ML classification result
    - Schema: EventId, Classification, Model, Confidence, Rules

Complete specifications

Deliverables:

- AsyncAPI specs for classification events
- JSON Schemas
Topic 8: PII Detection and Redaction Events
What will be covered:

- PIIDetected Event
    - PII found in event payload
    - Schema: EventId, PIITypes, Locations, Confidence, DetectedBy
- EventRedacted Event
    - PII redacted from event
    - Schema: EventId, RedactedFields, RedactionMethod, RedactedAt
- RedactionFailed Event
    - Redaction could not be applied
    - Schema: EventId, FailureReason, RequiresManualReview

Complete specifications

Deliverables:

- AsyncAPI specs for PII events
- JSON Schemas
CYCLE 5: Retention & Lifecycle Events (~2,000 lines)
Topic 9: Retention Policy Events
What will be covered:

- RetentionPolicyApplied Event
    - Retention policy set on event
    - Schema: EventId, PolicyId, RetentionPeriod, ExpiresAt
- RetentionPolicyUpdated Event
    - Retention changed for event
    - Schema: EventId, OldPolicy, NewPolicy, UpdateReason
- RetentionExpirationScheduled Event
    - Event scheduled for deletion
    - Schema: EventId, ExpiresAt, DeletionScheduledFor

Complete specifications
Topic 10: Lifecycle and Deletion Events
What will be covered:

- LegalHoldApplied Event
    - Legal hold placed (prevents deletion)
    - Schema: EventId, HoldId, HoldReason, AppliedBy, ExpiresAt
- LegalHoldReleased Event
    - Legal hold removed
    - Schema: EventId, HoldId, ReleasedBy, ReleasedAt
- EventTombstoned Event
    - Event soft-deleted (retention expired)
    - Schema: EventId, TombstoneReason, TombstonedAt, RecoverableUntil
- EventPermanentlyDeleted Event
    - Event hard-deleted (GDPR)
    - Schema: EventId, DeletionReason, DeletedBy, IrreversibleAt

Complete specifications
CYCLE 6: Policy & Tenant Events (~2,000 lines)
Topic 11: Policy Management Events
What will be covered:
- PolicyCreated Event
- PolicyUpdated Event
- PolicyDeleted Event
- PolicyRuleAdded Event
- PolicyRuleRemoved Event
Complete specifications for all policy events
Topic 12: Tenant Lifecycle Events
What will be covered:
- TenantOnboarded Event
- TenantConfigurationUpdated Event
- TenantSuspended Event
- TenantReactivated Event
- TenantDeleted Event
Complete specifications for all tenant events
CYCLE 7: Export & Integrity Events (~2,000 lines)
Topic 13: Export and eDiscovery Events
What will be covered:
- ExportRequested Event
- ExportStarted Event
- ExportProgressUpdated Event
- ExportCompleted Event
- ExportFailed Event
- ExportDelivered Event
Complete specifications
Topic 14: Integrity Verification Events
What will be covered:
- HashChainValidated Event
- HashChainValidationFailed Event
- SignatureVerified Event
- SignatureVerificationFailed Event
- IntegrityAuditCompleted Event
Complete specifications
CYCLE 8: Integration Events Catalog (~2,500 lines)
Topic 15: Cross-Context Integration Events
What will be covered:

- Integration Events vs Domain Events
    - Domain events: Internal to bounded context
    - Integration events: Published language across contexts
    - Translation at anti-corruption layer
- Tenant Integration Events
    - TenantOnboarded (published language for all contexts)
    - TenantSuspended, TenantReactivated, TenantDeleted
    - Complete AsyncAPI specs
- Export Integration Events
    - ExportRequested (Query → Export context)
    - ExportCompleted (Export → Webhook/SIEM context)
    - Complete AsyncAPI specs
- Integrity Integration Events
    - IntegrityVerified (Integrity → Query context)
    - HashChainValidated (Integrity → Export context)
    - Complete AsyncAPI specs

Complete specifications for all integration events
Topic 16: External System Integration Events
What will be covered:

- SIEMEventPublished Event
    - Event forwarded to SIEM system
    - Schema: EventId, SIEMSystem, DeliveredAt
- WebhookDelivered Event
    - Webhook successfully sent
    - Schema: WebhookId, URL, ResponseStatus, DeliveredAt
- WebhookDeliveryFailed Event
    - Webhook failed after retries
    - Schema: WebhookId, URL, FailureReason, Attempts
- ExternalSystemSyncCompleted Event
    - Sync with external system finished
    - Schema: SystemId, SyncedEventCount, Duration

Complete specifications
CYCLE 9: Message Envelope & Metadata (~2,000 lines)
Topic 17: Envelope Standards and Serialization
What will be covered:

- Serialization Formats
    - JSON: Primary format (human-readable, debugging-friendly)
    - Protocol Buffers: Alternative (compact, typed, fast)
    - Apache Avro: Alternative (schema evolution)
    - Format selection per use case
- JSON Serialization Rules
    - Property naming: camelCase
    - Date format: ISO 8601 with timezone
    - Numbers: No leading zeros, scientific notation
    - Nulls: Omit optional fields vs explicit null
    - Enums: String values (not integers)
- Message Compression
    - When to compress (> 16KB)
    - Compression algorithms (gzip, brotli)
    - Compression headers
- Message Encryption
    - Payload encryption for sensitive events
    - Envelope-level encryption (Azure Service Bus)
    - Key management (Azure Key Vault)
    - Field-level encryption for PII

Code Examples:

- JSON serialization settings
- Protocol Buffers message definition
- Encryption/decryption code
- Compression utilities
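The compression rule in Topic 17 (compress bodies larger than 16 KB and signal it in a header) might be sketched as follows. This is a Python illustration under stated assumptions: `encode_body`, `decode_body`, and the `content-encoding` property name are hypothetical, and gzip is chosen only because it ships with the standard library.

```python
import gzip
import json

COMPRESSION_THRESHOLD = 16 * 1024  # "when to compress" rule from this topic


def encode_body(payload):
    """Serialize to compact JSON; gzip the bytes when they exceed the threshold."""
    raw = json.dumps(payload, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    if len(raw) > COMPRESSION_THRESHOLD:
        # Hypothetical application property telling consumers how to decode.
        return gzip.compress(raw), {"content-encoding": "gzip"}
    return raw, {}


def decode_body(body, headers):
    """Reverse encode_body using the header set by the producer."""
    if headers.get("content-encoding") == "gzip":
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))
```

Carrying the encoding in a message property rather than inside the body lets a consumer decide how to decode before it parses anything.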
Topic 18: Message Partitioning and Ordering
What will be covered:

- Partitioning Strategy
    - Partition key: TenantId (tenant isolation + ordering)
    - Session ID: TenantId + StreamId (stream ordering)
    - Hash-based partitioning
- Ordering Guarantees
    - Within partition: Strict ordering
    - Across partitions: No ordering guarantee
    - Session-based ordering in Azure Service Bus
- Sequence Numbers
    - Azure Service Bus sequence numbers
    - Event sequence within stream
    - Gap detection
- Out-of-Order Handling
    - Buffering events
    - Reordering strategies
    - Timeout for missing events

Code Examples:

- Partition key configuration
- Session-based consumer
- Sequence number handling
- Reordering logic
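The buffering, reordering, and gap-detection items in Topic 18 can be sketched with a small reorder buffer keyed by the per-stream sequence number. This is a Python illustration, not ATP's design: `ReorderBuffer` is a hypothetical name, sequences are assumed to start at 1 and increase by 1, and the timeout for missing events is left out.

```python
class ReorderBuffer:
    """Releases events in sequence order, holding back anything after a gap."""

    def __init__(self, first_sequence=1):
        self._next = first_sequence   # next sequence number we may deliver
        self._pending = {}            # sequence -> buffered event

    def push(self, sequence, event):
        """Buffer one event and return the events that are now deliverable."""
        if sequence < self._next or sequence in self._pending:
            return []                 # already delivered or already buffered
        self._pending[sequence] = event
        ready = []
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next += 1
        return ready

    def missing(self):
        """Sequence numbers that are blocking delivery of buffered events."""
        if not self._pending:
            return []
        return [s for s in range(self._next, max(self._pending))
                if s not in self._pending]
```

A consumer would typically pair `missing()` with a timer, so that a gap which never fills leads to an alert or a replay request instead of an ever-growing buffer.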
CYCLE 10: Schema Versioning & Evolution (~2,000 lines)
Topic 19: Schema Versioning Strategy
What will be covered:

- Schema Version Field
    - `schemaversion` in envelope (1.0.0, 2.1.3)
    - Semantic versioning for schemas
    - Version compatibility matrix
- Breaking vs Non-Breaking Changes
    - Non-Breaking: Add optional fields, relax validation
    - Breaking: Remove fields, change types, add required fields
- Schema Evolution Patterns
    - Expand-contract pattern
    - Dual schema publishing
    - Schema transformation at consumer
- Backward Compatibility
    - New consumers read old messages
    - Graceful degradation
    - Default values for missing fields
- Forward Compatibility
    - Old consumers ignore unknown fields
    - Validation rules allow extensions

Code Examples:

- Versioned message schemas
- Schema compatibility tests
- Migration code
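The breaking versus non-breaking rules in Topic 19 lend themselves to an automated check. The Python sketch below compares two JSON Schema object definitions using only the rules listed in this topic (removed fields, changed types, and newly required fields are breaking); `classify_change` and `can_consume` are hypothetical helpers, and treating a matching major version as compatible is an assumption about how semantic versioning would be applied.

```python
def classify_change(old_schema, new_schema):
    """Return 'breaking' or 'non-breaking' for two object schemas."""
    old_props = old_schema.get("properties", {})
    new_props = new_schema.get("properties", {})
    removed = set(old_props) - set(new_props)
    retyped = {
        name for name in set(old_props) & set(new_props)
        if old_props[name].get("type") != new_props[name].get("type")
    }
    added_required = set(new_schema.get("required", [])) - set(old_schema.get("required", []))
    if removed or retyped or added_required:
        return "breaking"
    return "non-breaking"


def can_consume(consumer_version, message_version):
    """Assumption: schemas are compatible when their major versions match."""
    consumer_major = int(consumer_version.split(".")[0])
    message_major = int(message_version.split(".")[0])
    return consumer_major == message_major
```

A check like this could run in CI against the previous published schema, so that a breaking change forces a major version bump before it reaches consumers. It does not detect tightened validation such as narrower patterns or enums.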
Topic 20: Schema Migration Procedures
What will be covered:

- Adding New Event Types
    - Define AsyncAPI spec
    - Create JSON Schema
    - Publish to schema registry
    - Implement producer and consumers
- Modifying Existing Events
    - Impact analysis (who's affected)
    - Breaking vs non-breaking determination
    - Migration plan
    - Dual publishing (v1 and v2)
    - Consumer migration timeline
    - Sunset old version
- Deprecating Events
    - Deprecation notice (6 months minimum)
    - Mark as deprecated in AsyncAPI
    - Monitor usage of deprecated events
    - Provide migration guide
    - Sunset enforcement

Code Examples:

- Schema migration scripts
- Dual publishing code
- Deprecation annotations
- Usage monitoring
CYCLE 11: Message Reliability & Patterns (~2,000 lines)
Topic 21: Reliable Message Delivery
What will be covered:

- At-Least-Once Delivery
    - Guaranteed delivery semantics
    - Azure Service Bus guarantees
    - Duplicate message handling
- Idempotency Requirements
    - Why idempotency is critical
    - Idempotent consumer pattern
    - Idempotency key strategies
    - Deduplication window
- Retry Policies
    - Exponential backoff
    - Max retry attempts (3, 5, 10)
    - Retry delay configuration
    - Jitter to prevent thundering herd
- Dead-Letter Queue Handling
    - When messages go to DLQ (max retries, expired, rejected)
    - DLQ monitoring and alerting
    - DLQ reprocessing procedures
    - Poison message identification
    - Manual intervention triggers
- Message TTL (Time-To-Live)
    - Default TTL: 24 hours for events
    - Extended TTL for critical events
    - Expired message handling
- Transactional Outbox
    - Atomic message publishing with DB transaction
    - Outbox table schema
    - Outbox processor pattern
    - Effectively-once processing (at-least-once publishing combined with idempotent consumers)

Code Examples:

- Idempotent consumer implementation
- Retry policy configuration
- DLQ processor
- Outbox pattern implementation
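The retry items in Topic 21 (exponential backoff, a retry cap, jitter, and dead-lettering after the last attempt) can be sketched in a few lines. This Python example is a hedged illustration: `backoff_delay` and `process_with_retry` are hypothetical names, the 1 second base and 60 second cap are assumed values, and "full jitter" (a uniform random delay between zero and the exponential ceiling) is one common jitter strategy rather than a stated ATP choice.

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (1-based), with full jitter."""
    ceiling = min(cap, base * (2 ** (attempt - 1)))
    return rng() * ceiling


def process_with_retry(handler, message, max_attempts=3, sleep=lambda seconds: None):
    """Call the handler up to max_attempts times, then give up to the DLQ.

    `sleep` is injectable so tests do not actually wait; pass time.sleep in
    a real worker.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return "completed"
        except Exception:
            if attempt == max_attempts:
                return "dead-lettered"
            sleep(backoff_delay(attempt))
```

Randomizing the delay spreads retries from many consumers over time, which is the "thundering herd" concern named in the list above.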
Topic 22: Advanced Message Patterns
What will be covered:

- Publish-Subscribe Pattern
    - Multiple consumers per event
    - Topic fan-out
    - Subscription filters
- Competing Consumers Pattern
    - Multiple instances share a subscription; each message is processed by exactly one of them
    - Load balancing
    - Partition assignment
- Saga Pattern with Messages
    - Distributed transaction coordination
    - Compensation events
    - Saga state management
- Event Sourcing with Messages
    - Events as messages
    - Event store as message log
    - Replay from messages
- Request-Reply Pattern (Async)
    - Correlation ID for replies
    - Reply-to addressing
    - Timeout handling

Code Examples:

- Pub/sub implementation
- Competing consumers
- Saga coordinator
- Request-reply async
CYCLE 12: Testing & Observability (~2,000 lines)
Topic 23: Message Contract Testing
What will be covered:

- Schema Validation Tests
    - JSON Schema validation at producer
    - JSON Schema validation at consumer
    - Invalid message rejection
- Producer Contract Tests
    - Producer publishes valid messages
    - Schema compliance testing
    - Envelope compliance testing
- Consumer Contract Tests
    - Consumer handles all message versions
    - Consumer handles missing optional fields
    - Consumer rejects invalid messages
- Pact Contract Testing
    - Consumer-driven contracts
    - Pact broker setup
    - Pact verification on producer
- Message Replay Testing
    - Replay production messages in test
    - Test idempotency with replays
    - Test ordering with shuffled messages
- Chaos Testing
    - Duplicate message injection
    - Out-of-order message injection
    - Message loss simulation
    - Delayed message simulation

Code Examples:

- Schema validation tests
- Pact consumer tests
- Pact provider verification
- Replay test harness
- Chaos testing setup
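The replay and chaos items in Topic 23 (duplicate injection and shuffled delivery) can be exercised with a small harness that checks a consumer reaches the same state regardless of delivery noise. The Python sketch below uses a toy projection as the system under test; `apply_events`, `chaos_variants`, and the message fields are hypothetical and stand in for a real consumer and real recorded messages.

```python
import random


def apply_events(events):
    """Toy consumer: keep the highest-sequence classification per eventId."""
    seen_ids, state = set(), {}
    for event in events:
        if event["id"] in seen_ids:
            continue                      # idempotency: skip duplicates
        seen_ids.add(event["id"])
        current = state.get(event["eventId"])
        if current is None or event["sequence"] > current["sequence"]:
            state[event["eventId"]] = {
                "sequence": event["sequence"],
                "classification": event["classification"],
            }
    return state


def chaos_variants(events, seed=0, rounds=20):
    """Yield noisy deliveries: every message plus random duplicates, shuffled."""
    rng = random.Random(seed)             # seeded so failures are reproducible
    for _ in range(rounds):
        noisy = list(events) + rng.choices(events, k=len(events))
        rng.shuffle(noisy)
        yield noisy
```

A test then asserts that every variant produces the same state as the clean, ordered delivery. A failure points to a consumer that depends on ordering or is not idempotent.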
Topic 24: Message Observability
What will be covered:

- Distributed Tracing
    - W3C Trace Context propagation
    - Correlation ID in all messages
    - Trace visualization (Jaeger, Azure App Insights)
- Message Metrics
    - Publish rate per topic
    - Consume rate per subscription
    - Processing latency (E2E)
    - Queue depth and lag
    - DLQ message count
- Message Logging
    - Structured logging with context
    - Event lifecycle logs (published, consumed, failed)
    - Error logging with stack traces
- Message Monitoring Dashboards
    - Grafana dashboards for messages
    - Azure Monitor workbooks
    - Real-time monitoring
- Message Alerting
    - Publish failure alerts
    - Consumer lag alerts
    - DLQ growth alerts
    - Schema validation failure alerts

Code Examples:

- OpenTelemetry instrumentation for messages
- Metrics collection code
- Logging configuration
- Dashboard JSON (Grafana)
- Alert rule definitions

Diagrams:

- Distributed tracing flow
- Message monitoring architecture
- Alert routing
- Dashboard examples

Deliverables:

- Observability framework for messages
- Monitoring dashboards
- Alert configurations
- Troubleshooting guide
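For the W3C Trace Context item in Topic 24, the `traceparent` header has the form `version-traceid-parentid-flags`, with a 32-hex-digit trace id and a 16-hex-digit parent id. The Python sketch below shows how a consumer could continue a trace when it publishes a follow-up message. The helper names are hypothetical, and a real service would more likely rely on an OpenTelemetry SDK than hand-roll this.

```python
import re
import secrets

TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def new_traceparent(sampled=True):
    """Start a new trace: random trace id and span id, version 00."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def child_traceparent(parent):
    """Keep the incoming trace id and flags, but mint a new span id.

    Falls back to a fresh trace when the incoming value is malformed or
    uses the all-zero ids that the specification treats as invalid.
    """
    match = TRACEPARENT_RE.fullmatch(parent or "")
    if match is None or set(match.group(1)) == {"0"} or set(match.group(2)) == {"0"}:
        return new_traceparent()
    trace_id, _parent_span, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"
```

The value would travel as a message application property alongside the correlation id, so that traces and ATP correlation can be joined in the tracing backend.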
Message Schema Quick Reference
Domain Events by Context
Ingestion Context (5 events):

- AuditEventReceived, AuditEventValidationFailed, BatchIngestionStarted, BatchIngestionCompleted, BatchIngestionFailed

Storage Context (8 events):

- EventStreamCreated, EventAddedToStream, EventStreamSealed, EventStreamArchived, EventPersisted, EventHashCalculated, HashChainUpdated, IntegrityProofGenerated

Classification Context (6 events):

- EventClassified, ClassificationUpdated, AutoClassificationApplied, PIIDetected, EventRedacted, RedactionFailed

Retention Context (7 events):

- RetentionPolicyApplied, RetentionPolicyUpdated, RetentionExpirationScheduled, LegalHoldApplied, LegalHoldReleased, EventTombstoned, EventPermanentlyDeleted

Policy Context (5 events):

- PolicyCreated, PolicyUpdated, PolicyDeleted, PolicyRuleAdded, PolicyRuleRemoved

Tenant Context (5 events):

- TenantOnboarded, TenantConfigurationUpdated, TenantSuspended, TenantReactivated, TenantDeleted

Export Context (6 events):

- ExportRequested, ExportStarted, ExportProgressUpdated, ExportCompleted, ExportFailed, ExportDelivered

Integrity Context (5 events):

- HashChainValidated, HashChainValidationFailed, SignatureVerified, SignatureVerificationFailed, IntegrityAuditCompleted

Total: 47 domain events listed above (catalog target: 50+)
Integration Events
Cross-Context (10+ events):

- TenantOnboarded (Policy → All)
- ExportRequested (Query → Export)
- IntegrityVerified (Integrity → Query)
- SIEMEventPublished (Export → External)
- WebhookDelivered (Webhook → External)
Summary & Implementation Plan
Implementation Phases
Phase 1: Foundations (Cycles 1-2) - 2 weeks

- AsyncAPI fundamentals
- Ingestion events

Phase 2: Core Events (Cycles 3-5) - 3 weeks

- Storage, classification, retention events

Phase 3: Contexts (Cycles 6-7) - 2 weeks

- Policy, tenant, export, integrity events

Phase 4: Integration (Cycle 8) - 1 week

- Integration events catalog

Phase 5: Advanced (Cycles 9-10) - 2 weeks

- Envelope, versioning, evolution

Phase 6: Operations (Cycles 11-12) - 2 weeks

- Reliability, testing, observability

Success Metrics
- Event Coverage: 50+ domain events, 10+ integration events documented
- Schema Validation: 100% of messages have JSON Schema
- AsyncAPI Complete: Valid AsyncAPI 2.6 specification
- Contract Tests: 100% of schemas have automated tests
- Documentation: All events with examples and usage guides
Ownership & Maintenance
- Backend Developers: All cycles (event implementation)
- Architects: Cycles 1, 8-10 (architecture, integration, versioning)
- QA Engineers: Cycles 11-12 (testing, validation)
- Operations: Cycle 12 (observability, monitoring)
Document Status: ✅ Plan Approved - Ready for Content Generation
Target Start Date: Q2 2025
Expected Completion: Q3 2025 (3 months)
Owner: Backend Engineering Team
Last Updated: 2024-10-30