Idempotency Patterns - Audit Trail Platform (ATP)
Safe to retry — ATP's idempotency patterns ensure operations can be safely retried without unintended side effects, providing exactly-once semantics in a distributed, at-least-once delivery world.
📋 Documentation Generation Plan
This document will be generated in 5 cycles. Current progress:
| Cycle | Topics | Estimated Lines | Status |
|---|---|---|---|
| Cycle 1 | Idempotency Fundamentals & Principles (1-2) | ~2,500 | ⏳ Not Started |
| Cycle 2 | REST API Idempotency (3-4) | ~3,000 | ⏳ Not Started |
| Cycle 3 | Message Idempotency (5-6) | ~2,500 | ⏳ Not Started |
| Cycle 4 | Webhook & External System Idempotency (7-8) | ~2,000 | ⏳ Not Started |
| Cycle 5 | Implementation & Testing (9-10) | ~2,500 | ⏳ Not Started |
Total Estimated Lines: ~12,500
Purpose & Scope
This document defines comprehensive idempotency patterns for the Audit Trail Platform (ATP), establishing techniques, implementations, and best practices for ensuring exactly-once semantics across REST APIs, async messaging, webhooks, and external system integrations despite at-least-once delivery guarantees and network failures.
Key Idempotency Principles
- Safe to Retry: All operations can be safely retried without side effects
- Exactly-Once Semantics: Despite at-least-once delivery, operations execute only once
- Idempotency Keys: Unique identifiers for deduplication (client-generated or server-assigned)
- Idempotency Window: Time period for duplicate detection (24 hours default)
- Result Caching: Original result stored and returned for duplicates
- Distributed Consistency: Idempotency works across multiple service instances
- Performance: Minimal overhead for idempotency checks (<10ms)
What this document covers
- Establish idempotency fundamentals: What it is, why critical for ATP, mathematical definition
- Define REST API idempotency: Idempotency-Key header, duplicate detection, response caching
- Specify message idempotency: Event deduplication, MessageId usage, idempotent handlers
- Document webhook idempotency: Webhook ID deduplication, replay protection
- Detail idempotency implementation: Storage strategies (cache, database), key generation, TTL
- Describe idempotency patterns: Idempotent operations, non-idempotent to idempotent conversion
- Outline distributed idempotency: Cross-service coordination, distributed caching, consistency
- Specify idempotency testing: Replay tests, duplicate injection, race condition testing
- Document idempotency monitoring: Duplicate detection rate, cache hit rate, performance metrics
- Detail idempotency in event sourcing: Natural idempotency, event replay, projection idempotency
- Describe idempotency edge cases: Clock skew, cache invalidation, partial failures
- Outline idempotency best practices: Key generation, storage, TTL, error handling
Out of scope (referenced elsewhere)
- REST API specifications (see ../rest-apis.md)
- Message schemas (see ../message-schemas.md)
- Webhook specifications (see ../webhooks.md)
- Outbox/Inbox patterns (see ../../../implementation/outbox-inbox-idempotency.md)
- Domain model (see ../../aggregates-entities.md)
Readers & ownership
- Backend Developers (owners): Idempotency implementation, duplicate detection, caching
- Architects: Distributed idempotency design, consistency strategies
- Integration Engineers: Idempotency in external integrations, key management
- QA/Test Engineers: Idempotency testing, replay testing, race condition tests
- Operations/SRE: Idempotency monitoring, cache management, performance tuning
- Security Engineers: Idempotency key security, replay attack prevention
Artifacts produced
- Idempotency Pattern Library: Catalog of all idempotency patterns used in ATP
- Implementation Templates: Code templates for idempotent operations in C#
- REST API Idempotency Spec: Idempotency-Key header standard and usage
- Message Idempotency Spec: Event deduplication with MessageId
- Webhook Idempotency Spec: Webhook ID deduplication and replay protection
- Storage Strategies: Redis cache, database table, hybrid approaches
- Key Generation Guide: ULID, UUID, custom key formats
- Testing Framework: Replay tests, duplicate injection, race condition tests
- Monitoring Dashboards: Duplicate rate, cache performance, idempotency effectiveness
- Troubleshooting Guide: Common idempotency issues and debugging
- Best Practices Catalog: Do's and don'ts for idempotent design
- Migration Guide: Converting non-idempotent to idempotent operations
Acceptance (done when)
- Idempotency fundamentals are explained with clear definitions and examples
- REST API idempotency is specified with Idempotency-Key header and implementation
- Message idempotency is documented with event deduplication strategies
- Webhook idempotency is specified with webhook ID deduplication
- Implementation patterns are documented with code examples in C#
- Storage strategies are compared (Redis, database, hybrid) with recommendations
- Key generation is standardized (ULID for ATP)
- TTL policies are defined (24 hours default, configurable)
- Testing framework includes replay tests and duplicate injection
- Monitoring is operational with metrics and dashboards
- Troubleshooting guide covers common issues
- Best practices are documented with anti-patterns to avoid
- Documentation complete with comprehensive examples, diagrams, and code
Detailed Cycle Plan
CYCLE 1: Idempotency Fundamentals & Principles (~2,500 lines)
Topic 1: Idempotency Fundamentals
What will be covered:
- What is Idempotency?
- Mathematical Definition: An operation is idempotent if applying it multiple times has the same effect as applying it once
- f(f(x)) = f(x) for all x
- Software Definition: An idempotent operation produces the same result when called multiple times with the same input
- Examples:
- Idempotent: SET x = 5 (always results in x = 5)
- Idempotent: DELETE user WHERE id = 123 (delete once, delete again = already deleted)
- NOT Idempotent: x = x + 1 (increments each time)
- NOT Idempotent: INSERT INTO users VALUES (...) (creates duplicate)
- Why Idempotency Matters
- Network Failures: Client doesn't know if request succeeded (timeout, connection lost)
- At-Least-Once Delivery: Message systems guarantee delivery but may duplicate
- Retry Safety: Safe to retry failed operations without side effects
- Distributed Systems: Eventual consistency requires replay safety
- User Experience: Prevents duplicate charges, duplicate records, inconsistent state
- Idempotency in Distributed Systems
- Network partitions cause request duplicates
- Clients retry on timeout (can't distinguish failure from slow response)
- Load balancers may duplicate requests
- Message brokers deliver at-least-once
- Idempotency enables safe retries
- ATP's Need for Idempotency
- Compliance: Duplicate audit events violate integrity
- Data Quality: No duplicate records in tamper-evident store
- Operational Safety: Safe to replay messages during recovery
- External Integrations: Partners may retry requests
- Event Sourcing: Safe event replay for projections
- Multi-Tenant: Tenant isolation requires deterministic operations
- Idempotency vs Related Concepts
| Concept | Definition | Relationship to Idempotency |
|---|---|---|
| Idempotency | Same result on multiple calls | Core concept |
| Determinism | Same input → same output | Prerequisite for idempotency |
| Commutativity | Order doesn't matter (a+b = b+a) | Related but different |
| Immutability | Cannot be changed after creation | Enables natural idempotency |
| Deduplication | Remove duplicates | Technique to achieve idempotency |
| Exactly-Once | Delivered and processed once | Goal achieved via idempotency |
- Naturally Idempotent Operations
- HTTP GET: Reading data (no side effects)
- HTTP PUT: Replace entire resource (same result each time)
- HTTP DELETE: Remove resource (already deleted = idempotent)
- SET operations: Set value (absolute, not relative)
- Upsert: Insert or update (same final state)
- Non-Idempotent Operations (Require Special Handling)
- HTTP POST: Create new resource (duplicates possible)
- Increment/Decrement: Relative changes (x = x + 1)
- Append to list: Adds each time
- Generate ID: Different each time
- Current timestamp: Changes each call
- Random values: Different each time
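The distinction above can be checked mechanically. The following is a minimal sketch, not ATP code: the class and method names (`IdempotencyBasics`, `SetToFive`, `Increment`, `AddIfAbsent`, `IsIdempotentAt`) are hypothetical and exist only to illustrate the definition f(f(x)) = f(x) for an absolute write, a relative write, and a create with a caller-supplied ID.

```csharp
using System;
using System.Collections.Generic;

public static class IdempotencyBasics
{
    // Idempotent: an absolute assignment. Applying it twice equals applying it once.
    public static int SetToFive(int x) => 5;

    // Not idempotent: a relative change. Every application moves the value again.
    public static int Increment(int x) => x + 1;

    // Idempotent "create": the caller supplies the ID, so a retry targets the same slot.
    // Returns true only when the entry was actually added.
    public static bool AddIfAbsent(IDictionary<string, string> store, string id, string value)
    {
        if (store.ContainsKey(id))
        {
            return false; // duplicate: state is left unchanged
        }

        store[id] = value;
        return true;
    }

    // Checks the mathematical definition f(f(x)) == f(x) for one sample input.
    public static bool IsIdempotentAt(Func<int, int> f, int x) => f(f(x)) == f(x);
}
```

Passing `SetToFive` to `IsIdempotentAt` satisfies the definition for any input, while `Increment` fails it. `AddIfAbsent` shows the usual conversion for creates: move ID generation to the caller so that a retry cannot produce a second record.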
Code Examples: - Idempotent vs non-idempotent code examples - Mathematical examples - Natural idempotency examples - Converting non-idempotent to idempotent
Diagrams: - Idempotency concept visualization - Network failure scenario (retry needed) - At-least-once delivery problem - Idempotency solution
Deliverables: - Idempotency fundamentals guide - Problem scenarios documentation - ATP rationale for idempotency - Naturally idempotent operations catalog
Topic 2: Idempotency Patterns and Strategies
What will be covered:
- Idempotency Key Pattern
- Client generates unique key for each operation
- Server stores key with result
- Duplicate requests with same key return cached result
- Key expiration after window (24 hours)
- Idempotency Key Generation
- Client-Generated: Client creates UUID or ULID
- Server-Generated: Server creates from request hash
- Hybrid: Client provides, server validates or generates
- ATP approach: Client-generated ULID (recommended)
- Idempotency Storage Strategies
1. In-Memory Cache (Redis)
- Pros: Fast (<1ms), high throughput
- Cons: Cache eviction, not durable
- Use Case: Short TTL (minutes), high performance
- ATP Usage: Session-based operations
2. Database Table
- Pros: Durable, queryable, long retention
- Cons: Slower (10-50ms), database load
- Use Case: Long TTL (hours/days), compliance
- ATP Usage: Critical operations (ingestion, exports)
3. Hybrid (Cache + Database)
- Pros: Fast reads from cache, durable in DB
- Cons: Complexity, consistency challenges
- Use Case: Best of both worlds
- ATP Usage: Ingestion API (cache first, DB fallback)
- Idempotency Window (TTL)
- Definition: How long to remember processed keys
- ATP Default: 24 hours
- Rationale: Balance between safety and storage
- Configurable: Per operation type (ingestion: 24h, exports: 7 days)
- Cleanup: Expired keys removed automatically
- Result Caching
- Store original result with idempotency key
- Return cached result for duplicate requests
- Cache TTL matches idempotency window
- Cache size considerations
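The key pattern, the window, and result caching fit together as sketched below. This is an illustrative in-memory version under stated assumptions, not the ATP storage layer: `InMemoryIdempotencyStore`, `Execute`, and `RemoveExpired` are hypothetical names, the clock is injected so expiry can be exercised without waiting, and a single lock is held while the operation runs purely to keep the example short.

```csharp
using System;
using System.Collections.Generic;

public sealed class InMemoryIdempotencyStore<TResult>
{
    private readonly Dictionary<string, (TResult Result, DateTimeOffset ExpiresAt)> _entries = new();
    private readonly TimeSpan _window;
    private readonly Func<DateTimeOffset> _clock;
    private readonly object _gate = new();

    // window: how long a key is remembered (the document's default is 24 hours).
    // clock: injected time source (an assumption made here for testability).
    public InMemoryIdempotencyStore(TimeSpan window, Func<DateTimeOffset> clock)
    {
        _window = window;
        _clock = clock;
    }

    // Runs the operation at most once per key inside the window.
    // Duplicates receive the stored result and Duplicate = true.
    public (TResult Result, bool Duplicate) Execute(string key, Func<TResult> operation)
    {
        lock (_gate)
        {
            var now = _clock();
            if (_entries.TryGetValue(key, out var entry) && entry.ExpiresAt > now)
            {
                return (entry.Result, true);
            }

            var result = operation();
            _entries[key] = (result, now + _window); // result TTL matches the window
            return (result, false);
        }
    }

    // Cleanup pass: removes entries whose window has elapsed and returns how many were removed.
    public int RemoveExpired()
    {
        lock (_gate)
        {
            var now = _clock();
            var expired = new List<string>();
            foreach (var pair in _entries)
            {
                if (pair.Value.ExpiresAt <= now) expired.Add(pair.Key);
            }

            foreach (var expiredKey in expired) _entries.Remove(expiredKey);
            return expired.Count;
        }
    }
}
```

A second `Execute` call with the same key inside the window returns the first result without running the operation again. Once the window has passed, the key is treated as new, which is why the window length is a trade-off between safety and storage.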
- Idempotency for State Transitions
- State machine with idempotent transitions
- Transition validation (can only transition once)
- Duplicate transition requests ignored
- Example: EventStream Seal operation (idempotent)
- Idempotency for Side Effects
- External API calls (payment, email, webhooks)
- Idempotency token for external calls
- Track external operation results
- Don't repeat side effects on retry
Code Examples: - Idempotency key generation (ULID, UUID) - Redis cache implementation - Database table schema and queries - Hybrid cache+DB implementation - Result caching code - State machine with idempotency
Diagrams: - Idempotency key pattern flow - Storage strategy comparison - Hybrid cache+DB architecture - TTL and cleanup process - State transition idempotency
Deliverables: - Idempotency patterns catalog - Storage strategy guide - Key generation standards - TTL policy specifications - Implementation templates
CYCLE 2: REST API Idempotency (~3,000 lines)
Topic 3: Idempotency-Key Header Standard
What will be covered:
- Idempotency-Key Header Specification
- Header Name: Idempotency-Key (standard)
- Format: ULID (26 characters, time-ordered) or UUID (36 characters)
- Required: For write operations that are not naturally idempotent (POST); recommended for PUT, PATCH, DELETE
- Optional: For read operations (GET) - not needed
- Scope: Per tenant (keys unique within tenant)
- Lifetime: 24 hours (configurable per endpoint)
- HTTP Header Example
```http
POST /api/v1/audit-events HTTP/1.1
Host: api.atp.connectsoft.example
Authorization: Bearer eyJ0eXAi...
Content-Type: application/json
X-Tenant-Id: tenant-abc-123
Idempotency-Key: 01HQZXYZ123456789ABCDEFGHJ
X-Correlation-Id: corr-abc-123

{
  "timestamp": "2024-10-30T10:30:00Z",
  "actor": { ... },
  "action": "User.Login",
  "resource": { ... }
}
```
- Server-Side Idempotency Check
```csharp
public async Task<IActionResult> IngestAuditEvent(
    [FromBody] IngestAuditEventRequest request,
    [FromHeader(Name = "Idempotency-Key")] string idempotencyKey)
{
    // 1. Validate idempotency key
    if (string.IsNullOrEmpty(idempotencyKey))
    {
        return BadRequest(new ErrorResponse
        {
            Code = "IDEMPOTENCY_KEY_REQUIRED",
            Message = "Idempotency-Key header is required"
        });
    }

    // 2. Check if request already processed
    //    (IngestResponse must be a record type for the `with` expressions below)
    var cachedResult = await _idempotencyService
        .GetCachedResultAsync<IngestResponse>(idempotencyKey);

    if (cachedResult != null)
    {
        // 3. Return cached result (duplicate request)
        return Ok(cachedResult with { Idempotent = true });
    }

    // 4. Process new request
    var result = await _ingestionService.IngestAsync(request);

    // 5. Cache result for future duplicates
    await _idempotencyService.CacheResultAsync(
        idempotencyKey,
        result,
        ttl: TimeSpan.FromHours(24));

    // 6. Return result (first time)
    return Created($"/api/v1/audit-events/{result.EventId}",
        result with { Idempotent = false });
}
```
- Idempotent vs Non-Idempotent Response
- First Request (200/201): { ..., "idempotent": false }
- Duplicate Request (200): { ..., "idempotent": true }
- Client can detect if request was duplicate
- Idempotency Key Validation
- Format validation (ULID or UUID)
- Length validation (26 or 36 characters)
- Character set validation (base32 for ULID)
- Uniqueness check (not used in past 24 hours)
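The format, length, and character-set checks can be combined into one function. The sketch below is an assumption about how such a validator might look, not the ATP implementation: `IdempotencyKeyValidator` is a hypothetical name, ULIDs are checked against the Crockford base32 alphabet (no I, L, O, U, first character 0-7), and UUIDs are accepted only in the 36-character hyphenated form. The uniqueness check is left to the idempotency store.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IdempotencyKeyValidator
{
    // 26 characters of Crockford base32. The first character is limited to 0-7
    // because a ULID encodes 128 bits in 26 * 5 = 130 bits of base32.
    private static readonly Regex UlidPattern =
        new Regex("^[0-7][0-9A-HJKMNP-TV-Z]{25}$", RegexOptions.CultureInvariant);

    public static bool IsValid(string key)
    {
        if (string.IsNullOrEmpty(key))
        {
            return false;
        }

        return key.Length switch
        {
            26 => UlidPattern.IsMatch(key),             // ULID
            36 => Guid.TryParseExact(key, "D", out _),  // UUID, hyphenated form
            _ => false
        };
    }
}
```

Rejecting malformed keys before touching the store keeps junk values out of the cache and makes the error response a pure function of the header.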
- Idempotency Errors
- 400 Bad Request: Missing idempotency key
- 409 Conflict: Key used with different request payload
- 422 Unprocessable Entity: Key format invalid (not a ULID or UUID)
- Idempotency for Different HTTP Methods
| HTTP Method | Naturally Idempotent? | Idempotency-Key Required? |
|---|---|---|
| GET | ✅ Yes (read-only) | ❌ No |
| POST | ❌ No (creates resource) | ✅ Yes |
| PUT | ✅ Yes (replaces resource) | ⚠️ Recommended |
| PATCH | ⚠️ Depends (can be idempotent) | ⚠️ Recommended |
| DELETE | ✅ Yes (delete once, already deleted) | ⚠️ Recommended |
- ATP Idempotency Requirements
- POST /audit-events: Idempotency-Key REQUIRED
- POST /audit-events/batch: Idempotency-Key REQUIRED
- POST /exports: Idempotency-Key REQUIRED
- POST /policies: Idempotency-Key RECOMMENDED
- PUT /policies/{id}: Idempotency-Key RECOMMENDED (with ETag)
- DELETE /policies/{id}: Naturally idempotent (key optional)
Code Examples: - Complete idempotency middleware (C# ASP.NET Core) - Idempotency service implementation - Redis cache integration - Database storage implementation - Validation logic - Error responses
Diagrams: - Idempotency check flow - First vs duplicate request - Cache and database strategy - Error handling flow
Deliverables: - Idempotency-Key header specification - REST API idempotency implementation - Validation and error handling - Storage strategy guide
Topic 4: Idempotency Storage and Caching
What will be covered:
- Redis Cache Implementation
```csharp
using System.Text.Json;
using StackExchange.Redis;

public class RedisIdempotencyService : IIdempotencyService
{
    private readonly IDatabase _db;

    public RedisIdempotencyService(IConnectionMultiplexer redis)
    {
        _db = redis.GetDatabase();
    }

    public async Task<T?> GetCachedResultAsync<T>(string key)
    {
        var value = await _db.StringGetAsync($"idempotency:{key}");
        return value.HasValue
            ? JsonSerializer.Deserialize<T>(value.ToString())
            : default;
    }

    public async Task CacheResultAsync<T>(string key, T result, TimeSpan ttl)
    {
        var json = JsonSerializer.Serialize(result);
        // Only set if not exists (race protection): the first writer wins
        await _db.StringSetAsync(
            $"idempotency:{key}",
            json,
            ttl,
            When.NotExists);
    }

    public async Task<bool> IsProcessedAsync(string key)
    {
        return await _db.KeyExistsAsync($"idempotency:{key}");
    }
}
```
- Database Table Implementation
```sql
CREATE TABLE IdempotencyRecords (
    IdempotencyKey NVARCHAR(50) PRIMARY KEY,
    TenantId NVARCHAR(50) NOT NULL,
    RequestHash NVARCHAR(64) NOT NULL, -- SHA256 of request body
    ResponseStatus INT NOT NULL,
    ResponseBody NVARCHAR(MAX),
    ProcessedAt DATETIME2 NOT NULL,
    ExpiresAt DATETIME2 NOT NULL,
    INDEX IX_IdempotencyRecords_ExpiresAt (ExpiresAt),
    INDEX IX_IdempotencyRecords_TenantId_Key (TenantId, IdempotencyKey)
);
```
```csharp
public class DatabaseIdempotencyService : IIdempotencyService
{
    private readonly DbContext _context;

    public DatabaseIdempotencyService(DbContext context)
    {
        _context = context;
    }

    public async Task<IdempotencyRecord?> GetRecordAsync(string key)
    {
        // Set<T>() works on the base DbContext; a derived context
        // could expose a DbSet<IdempotencyRecord> property instead.
        return await _context.Set<IdempotencyRecord>()
            .Where(r => r.IdempotencyKey == key)
            .Where(r => r.ExpiresAt > DateTime.UtcNow)
            .FirstOrDefaultAsync();
    }

    public async Task SaveRecordAsync(
        string key,
        string tenantId,
        string requestHash,
        object result,
        TimeSpan ttl)
    {
        var record = new IdempotencyRecord
        {
            IdempotencyKey = key,
            TenantId = tenantId,
            RequestHash = requestHash,
            ResponseBody = JsonSerializer.Serialize(result),
            ProcessedAt = DateTime.UtcNow,
            ExpiresAt = DateTime.UtcNow.Add(ttl)
        };

        _context.Set<IdempotencyRecord>().Add(record);
        await _context.SaveChangesAsync();
    }
}
```
- Hybrid Strategy (Cache + Database)
```csharp
public class HybridIdempotencyService : IIdempotencyService
{
    private readonly RedisIdempotencyService _cache;
    private readonly DatabaseIdempotencyService _database;

    public HybridIdempotencyService(
        RedisIdempotencyService cache,
        DatabaseIdempotencyService database)
    {
        _cache = cache;
        _database = database;
    }

    public async Task<T?> GetCachedResultAsync<T>(string key)
    {
        // 1. Try cache first (fast path)
        var cached = await _cache.GetCachedResultAsync<T>(key);
        if (cached != null) return cached;

        // 2. Fallback to database (slower but durable)
        var dbRecord = await _database.GetRecordAsync(key);
        if (dbRecord == null) return default;

        var result = JsonSerializer.Deserialize<T>(dbRecord.ResponseBody);

        // 3. Warm cache for future requests, but never beyond the record's own expiry
        var remaining = dbRecord.ExpiresAt - DateTime.UtcNow;
        var warmTtl = remaining < TimeSpan.FromHours(1) ? remaining : TimeSpan.FromHours(1);
        await _cache.CacheResultAsync(key, result, warmTtl);

        return result;
    }
}
```
- Request Hash for Payload Validation
- Ensure duplicate requests have identical payload
- Hash request body (SHA256)
- Store hash with idempotency key
- Validate hash on duplicate request
- Error if hash mismatch (409 Conflict)
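The hash comparison might look like the sketch below. It is a minimal in-memory illustration with hypothetical names (`RequestHashGuard`, `KeyCheck`): a real implementation would persist the hash in the RequestHash column and would need to decide how to canonicalise the body before hashing, which is not addressed here.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public enum KeyCheck
{
    New,        // key not seen before: process the request
    Duplicate,  // same key, same payload: return the stored result
    Conflict    // same key, different payload: respond with 409 Conflict
}

public sealed class RequestHashGuard
{
    // Hypothetical in-memory stand-in for the stored (key, RequestHash) pairs.
    private readonly Dictionary<string, string> _hashByKey = new();

    // SHA-256 of the raw body as 64 hex characters (fits NVARCHAR(64)).
    public static string HashBody(string body)
    {
        using var sha = SHA256.Create();
        var digest = sha.ComputeHash(Encoding.UTF8.GetBytes(body));
        return Convert.ToHexString(digest); // available from .NET 5
    }

    public KeyCheck Check(string idempotencyKey, string body)
    {
        var hash = HashBody(body);
        if (!_hashByKey.TryGetValue(idempotencyKey, out var stored))
        {
            _hashByKey[idempotencyKey] = hash;
            return KeyCheck.New;
        }

        return string.Equals(stored, hash, StringComparison.Ordinal)
            ? KeyCheck.Duplicate
            : KeyCheck.Conflict;
    }
}
```

Without this check, a client that reuses a key for a different request would silently receive the result of the earlier one.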
- Cleanup and Expiration
- Redis: Automatic expiration with TTL
- Database: Background job removes expired records
- Cleanup frequency: Hourly
- Retention after expiration: 0 days (immediate deletion)
Code Examples: - Complete Redis implementation - Complete database implementation - Hybrid implementation - Request hash calculation - Cleanup job code
Diagrams: - Redis architecture - Database schema - Hybrid strategy flow - Cleanup process
Deliverables: - Redis implementation guide - Database implementation guide - Hybrid strategy specification - Cleanup automation
CYCLE 3: Message Idempotency (~2,500 lines)
Topic 5: Event Deduplication
What will be covered:
- Message Idempotency in Event-Driven Systems
- At-least-once delivery means duplicates can occur (delivery is guaranteed, uniqueness is not)
- Consumers must be idempotent
- Message ID for deduplication
- Idempotent handlers pattern
- Message ID as Idempotency Key
- Azure Service Bus: MessageId property
- Domain events: EventId in payload
- Integration events: EventId in envelope
- Globally unique identifiers (ULID)
- Idempotent Message Handler Pattern
```csharp
public class IdempotentMessageHandler<TMessage> : IMessageHandler<TMessage>
{
    private readonly IIdempotencyService _idempotency;
    private readonly IMessageHandler<TMessage> _innerHandler;
    private readonly ILogger<IdempotentMessageHandler<TMessage>> _logger;

    public IdempotentMessageHandler(
        IIdempotencyService idempotency,
        IMessageHandler<TMessage> innerHandler,
        ILogger<IdempotentMessageHandler<TMessage>> logger)
    {
        _idempotency = idempotency;
        _innerHandler = innerHandler;
        _logger = logger;
    }

    public async Task HandleAsync(TMessage message, MessageContext context)
    {
        var messageId = context.MessageId;

        // 1. Check if already processed
        if (await _idempotency.IsProcessedAsync(messageId))
        {
            _logger.LogInformation(
                "Duplicate message {MessageId} detected, skipping",
                messageId);
            return; // Skip processing (idempotent)
        }

        // 2. Process message
        await _innerHandler.HandleAsync(message, context);

        // 3. Mark as processed
        await _idempotency.MarkProcessedAsync(
            messageId,
            ttl: TimeSpan.FromHours(24));
    }
}
```
- Deduplication Strategies
1. Check-Then-Act (Simple)
- Check if processed, then act
- Race condition possible (two instances check simultaneously)
- Mitigation: Database unique constraint
2. Try-Catch-Ignore (Optimistic)
- Try to process (insert unique key)
- Catch duplicate key exception
- Ignore exception (already processed)
- Fast for first attempt, slower for duplicates
3. Distributed Lock (Pessimistic)
- Acquire distributed lock on message ID
- Process while holding lock
- Release lock after processing
- Slower but prevents race conditions
- ATP Approach: Try-Catch-Ignore (database unique constraint)
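The reason insert-first avoids the check-then-act race is that claiming the message ID is a single atomic step. The sketch below illustrates this with an in-process stand-in, not the ATP handler: `InsertFirstDeduplicator` is a hypothetical name, and `ConcurrentDictionary.TryAdd` takes the place of the database unique constraint, so a duplicate shows up as a `false` return rather than a caught duplicate-key exception.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class InsertFirstDeduplicator
{
    // Stand-in for a table with a unique constraint on the message ID.
    private readonly ConcurrentDictionary<string, byte> _processed = new();

    // Returns true if this call ran the handler, false if the message was a duplicate.
    public async Task<bool> HandleOnceAsync(string messageId, Func<Task> handler)
    {
        // Atomic claim: exactly one concurrent caller gets true for a given ID.
        // With a database this is the INSERT that either succeeds or violates the constraint.
        if (!_processed.TryAdd(messageId, 0))
        {
            return false; // already claimed: ignore the duplicate
        }

        try
        {
            await handler();
            return true;
        }
        catch
        {
            // Release the claim so a redelivery can retry. In a database the insert
            // and the handler's writes would share one transaction and roll back together.
            _processed.TryRemove(messageId, out _);
            throw;
        }
    }
}
```

Because the claim happens before the work, two instances receiving the same message cannot both pass the check, which is the failure mode of check-then-act.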
- Deduplication Window
- 24 hours for domain events
- 7 days for integration events
- 30 days for critical operations (exports, tenant onboarding)
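- Configurable per message type
- Azure Service Bus Duplicate Detection
- Built-in duplicate detection based on the MessageId property (sessions are not required)
- Duplicate detection window: 10 minutes by default, 7 days maximum
- Every message must carry a stable MessageId for detection to work
- ATP uses application-level deduplication (windows of up to 30 days exceed the broker maximum)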
Code Examples: - Idempotent handler decorator - Check-then-act implementation - Try-catch-ignore implementation - Distributed lock implementation (Redis) - Azure Service Bus duplicate detection config
Diagrams: - Message deduplication flow - Handler patterns comparison - Race condition scenarios - Distributed lock pattern
Deliverables: - Message idempotency specification - Handler pattern implementations - Deduplication strategies guide - Configuration templates
Topic 6: Idempotent Event Handlers
What will be covered:
- Idempotent Handler Design
- Handler checks for previous processing
- Handler operations are naturally idempotent (SET, UPSERT)
- Handler stores processing record
- Handler is safe to replay
- Projection Idempotency
- Projections must be idempotent (replayed often)
- Use UPSERT (INSERT OR UPDATE) not INSERT
- Use absolute values not relative (SET x = 5, not x = x + 1)
- Track last processed event sequence number
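Tracking the last processed sequence number is what makes a projection safe even when its update is relative. The sketch below is illustrative only: `AuditEventRecorded` and `EventCountProjection` are hypothetical types, and it assumes events reach the projection in sequence order with sequence numbers starting at 1.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event carrying a monotonically increasing stream sequence number.
public sealed record AuditEventRecorded(long Sequence, string TenantId);

public sealed class EventCountProjection
{
    private readonly Dictionary<string, long> _countByTenant = new();

    // Highest sequence number already applied (0 means nothing applied yet).
    public long LastProcessedSequence { get; private set; }

    // Applies the event once. Returns false when the event was already applied (replay).
    public bool Apply(AuditEventRecorded evt)
    {
        if (evt.Sequence <= LastProcessedSequence)
        {
            return false; // replayed event: skip, state unchanged
        }

        // A relative update (count + 1) is safe here only because of the guard above.
        _countByTenant[evt.TenantId] = CountFor(evt.TenantId) + 1;
        LastProcessedSequence = evt.Sequence;
        return true;
    }

    public long CountFor(string tenantId) =>
        _countByTenant.TryGetValue(tenantId, out var count) ? count : 0;
}
```

In a persistent projection the counter and the sequence number would be written in the same transaction, otherwise a crash between the two writes could still double-count.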
- Saga Idempotency
- Saga steps are idempotent
- Compensation actions are idempotent
- Saga state machine prevents duplicate transitions
- Outbox Pattern Idempotency
- Outbox processor is idempotent
- Events published at-least-once from outbox (a crash between publish and marking the record sent causes a republish, so consumers deduplicate)
- Message ID from outbox record
- Inbox Pattern Idempotency
- Inbox deduplicates incoming messages
- Inbox + handler in same transaction
- Message processed exactly-once
Code Examples: - Idempotent projection handler - UPSERT query (SQL) - Saga with idempotent steps - Outbox processor with idempotency - Inbox pattern implementation
Diagrams: - Idempotent handler pattern - Projection idempotency - Saga idempotency - Outbox/Inbox idempotency
Deliverables: - Handler design patterns - Projection idempotency guide - Saga implementation - Outbox/Inbox integration
CYCLE 4: Webhook & External System Idempotency (~2,000 lines)
Topic 7: Webhook Idempotency
What will be covered:
- Webhook Delivery Idempotency
- Webhooks delivered at-least-once
- Webhook ID as idempotency key
- Consumer must deduplicate
- ATP retries may cause duplicates
- Webhook ID Deduplication (Consumer Side)
```csharp
[HttpPost("/atp/webhooks")]
public async Task<IActionResult> ReceiveWebhook(
    [FromBody] WebhookPayload payload,
    [FromHeader(Name = "X-ATP-Webhook-Id")] string webhookId)
{
    // 1. Verify signature first
    if (!VerifySignature(payload, Request.Headers["X-ATP-Signature"]))
    {
        return Unauthorized();
    }

    // 2. Check if webhook already processed
    if (await _webhookRepo.IsProcessedAsync(webhookId))
    {
        _logger.LogInformation(
            "Duplicate webhook {WebhookId} detected",
            webhookId);
        return Ok(); // Return success (idempotent)
    }

    // 3. Process webhook
    await ProcessWebhookAsync(payload);

    // 4. Mark as processed
    await _webhookRepo.MarkProcessedAsync(webhookId);

    return Ok();
}
```
- Webhook Replay Protection
- Timestamp validation (reject > 5 minutes old)
- Webhook ID deduplication
- Signature verification
- Combined protection against replays
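The three layers can be combined in one decision function, sketched below. Everything specific here is an assumption made for illustration: the names `WebhookReplayGuard` and `WebhookDecision`, the signing scheme (hex HMAC-SHA256 over `"{timestamp}.{body}"` with a shared secret), and the in-memory set of seen IDs. The actual ATP signature format is defined in the webhook specification, not here.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public enum WebhookDecision
{
    Process,    // authentic, fresh, first delivery
    Duplicate,  // authentic and fresh, but this webhook ID was already seen: answer 200, skip work
    Reject      // bad signature or stale timestamp
}

public sealed class WebhookReplayGuard
{
    private static readonly TimeSpan MaxAge = TimeSpan.FromMinutes(5);
    private readonly byte[] _secret;
    private readonly HashSet<string> _seenIds = new();

    public WebhookReplayGuard(string sharedSecret)
    {
        _secret = Encoding.UTF8.GetBytes(sharedSecret);
    }

    // Hypothetical signing scheme: uppercase hex HMAC-SHA256 over "{unixSeconds}.{body}".
    public string Sign(long unixSeconds, string body)
    {
        using var hmac = new HMACSHA256(_secret);
        var mac = hmac.ComputeHash(Encoding.UTF8.GetBytes($"{unixSeconds}.{body}"));
        return Convert.ToHexString(mac);
    }

    public WebhookDecision Evaluate(
        string webhookId, long unixSeconds, string body, string signatureHex, DateTimeOffset now)
    {
        // 1. Signature: constant-time comparison against the expected value.
        var expected = Encoding.ASCII.GetBytes(Sign(unixSeconds, body));
        var actual = Encoding.ASCII.GetBytes((signatureHex ?? string.Empty).ToUpperInvariant());
        if (!CryptographicOperations.FixedTimeEquals(expected, actual))
        {
            return WebhookDecision.Reject;
        }

        // 2. Timestamp: reject anything more than 5 minutes away from the receiver's clock.
        var sentAt = DateTimeOffset.FromUnixTimeSeconds(unixSeconds);
        if ((now - sentAt).Duration() > MaxAge)
        {
            return WebhookDecision.Reject;
        }

        // 3. Webhook ID: HashSet.Add returns false when the ID was already recorded.
        return _seenIds.Add(webhookId) ? WebhookDecision.Process : WebhookDecision.Duplicate;
    }
}
```

The signature is checked first so that unauthenticated callers cannot fill the deduplication store with arbitrary IDs.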
- Consumer Idempotency Storage
- Store processed webhook IDs
- TTL: 48 hours (2x webhook retry window)
- Cleanup expired IDs
- Webhook Retry Idempotency
- Same webhook ID across retries
- Consumer deduplicates by ID
- First successful response stops retries
Code Examples: - Webhook receiver with idempotency - Webhook ID deduplication - Replay protection - Storage for processed webhooks
Diagrams: - Webhook idempotency flow - Deduplication mechanism - Replay protection layers
Deliverables: - Webhook idempotency guide - Consumer implementation patterns - Replay protection specification
Topic 8: External System Idempotency
What will be covered:
- Idempotency for External API Calls
- ATP calls external APIs (SIEM, webhooks, storage)
- External APIs may not be idempotent
- ATP must prevent duplicate external calls
- Idempotency token for external requests
- Idempotent Side Effects
- Emails: Don't send duplicate emails
- Payments: Don't charge twice
- Webhooks: Don't notify twice
- Storage: Don't upload duplicate files
- Idempotency Token for External Calls
```csharp
public async Task SendToSIEM(AuditEvent evt)
{
    var idempotencyToken = $"siem-{evt.EventId}";

    // Check if already sent
    if (await _idempotency.IsProcessedAsync(idempotencyToken))
    {
        return; // Skip (already sent)
    }

    // Call external SIEM API with idempotency header.
    // HttpClient.PostAsync has no headers argument, so the request is built explicitly.
    using var request = new HttpRequestMessage(
        HttpMethod.Post, "https://siem.example.com/events")
    {
        Content = JsonContent.Create(evt) // System.Net.Http.Json
    };
    request.Headers.Add("Idempotency-Key", idempotencyToken);

    using var response = await _httpClient.SendAsync(request);

    // Mark as sent (regardless of response, to prevent retry storm)
    await _idempotency.MarkProcessedAsync(idempotencyToken);
}
```
- External Idempotency Patterns
- Pass-through: Forward ATP idempotency key to external system
- Generate: Create idempotency token for external call
- Track: Store external call results
- Retry: Only retry if not yet sent
Code Examples: - External API call with idempotency - Side effect tracking - Idempotency token generation
Diagrams: - External call idempotency - Side effect tracking - Retry decision tree
Deliverables: - External idempotency patterns - Side effect handling guide - Implementation examples
CYCLE 5: Implementation & Testing (~2,500 lines)
Topic 9: Idempotency Implementation Patterns
What will be covered:
- Making Operations Idempotent
Non-Idempotent to Idempotent Conversion:
| Non-Idempotent | Idempotent Alternative |
|---|---|
| x = x + 1 | x = 5 (absolute) |
| INSERT INTO users | INSERT ... ON CONFLICT DO NOTHING (upsert) |
| append(item) | add_if_not_exists(item) (check first) |
| generateId() | Use provided ID or deterministic generation |
| getCurrentTime() | Use timestamp from request |
| random() | Use seed or provided value |
- Idempotent Database Operations
```sql
-- Non-Idempotent (creates duplicate)
INSERT INTO audit_events (event_id, tenant_id, ...)
VALUES ('event-123', 'tenant-abc', ...);

-- Idempotent (PostgreSQL)
INSERT INTO audit_events (event_id, tenant_id, ...)
VALUES ('event-123', 'tenant-abc', ...)
ON CONFLICT (event_id) DO NOTHING;

-- Idempotent (SQL Server)
IF NOT EXISTS (SELECT 1 FROM audit_events WHERE event_id = 'event-123')
BEGIN
    INSERT INTO audit_events (event_id, tenant_id, ...)
    VALUES ('event-123', 'tenant-abc', ...);
END

-- Idempotent (Update)
UPDATE audit_events
SET status = 'Sealed', sealed_at = '2024-10-30T10:30:00Z'
WHERE event_id = 'event-123';
-- Already sealed? No change = idempotent
```
- Idempotent State Machines
- State transitions are idempotent
- Transition validation prevents invalid state changes
- Duplicate transition requests ignored
```csharp
public class EventStream
{
    // Id and AddDomainEvent are assumed to come from the aggregate base type
    public EventStreamStatus Status { get; private set; }
    public DateTime? SealedAt { get; private set; }

    public void Seal()
    {
        // Idempotent: Can call multiple times safely
        if (Status == EventStreamStatus.Sealed)
        {
            // Already sealed, no-op (idempotent)
            return;
        }

        if (Status != EventStreamStatus.Open)
        {
            throw new InvalidOperationException(
                $"Cannot seal stream in status {Status}");
        }

        Status = EventStreamStatus.Sealed;
        SealedAt = DateTime.UtcNow;

        // Publish event (only on the first, state-changing call)
        AddDomainEvent(new EventStreamSealed(Id, SealedAt));
    }
}
```
- Idempotent API Composition
- Composing multiple idempotent operations
- Transaction boundaries
- Partial failure handling
- Compensation for non-idempotent side effects
- Idempotency in Event Sourcing
- Events naturally idempotent (append-only)
- Event ID ensures no duplicates
- Projections must be idempotent (replay)
- Snapshots include last processed event
Code Examples: - Idempotent database operations (SQL) - State machine with idempotency (C#) - API composition examples - Event sourcing idempotency
Diagrams: - Non-idempotent to idempotent conversion - State machine transitions - Event sourcing replay
Deliverables: - Implementation pattern library - Conversion guide - Database query templates - State machine patterns
Topic 10: Idempotency Testing and Monitoring
What will be covered:
- Idempotency Testing Strategies
1. Replay Tests
- Send same request twice
- Verify second response is identical or cached
- Verify no duplicate side effects
2. Concurrent Request Tests
- Send same request from multiple threads
- Verify only one execution
- Verify all requests get same result
3. Race Condition Tests
- Simulate simultaneous duplicate requests
- Verify database constraints prevent duplicates
- Verify distributed locks work correctly
4. Cache Failure Tests
- Simulate cache unavailable
- Verify fallback to database
- Verify eventual consistency
5. Expiration Tests
- Wait for idempotency window expiration
- Verify new execution after expiration
- Verify cleanup of expired records
- Idempotency Test Framework
```csharp
[Fact]
public async Task IngestAuditEvent_WithSameIdempotencyKey_ReturnsIdempotentResponse()
{
    // Arrange
    var request = new IngestAuditEventRequest { ... };
    var idempotencyKey = Ulid.NewUlid().ToString();

    // HttpClient has no per-call headers argument, so each request is built explicitly
    HttpRequestMessage BuildRequest()
    {
        var message = new HttpRequestMessage(HttpMethod.Post, "/api/v1/audit-events")
        {
            Content = JsonContent.Create(request) // System.Net.Http.Json
        };
        message.Headers.Add("Idempotency-Key", idempotencyKey);
        return message;
    }

    // Act: First request
    var response1 = await _client.SendAsync(BuildRequest());

    // Act: Duplicate request
    var response2 = await _client.SendAsync(BuildRequest());

    // Assert
    response1.StatusCode.Should().Be(HttpStatusCode.Created);
    response2.StatusCode.Should().Be(HttpStatusCode.OK);

    var result1 = await response1.Content.ReadFromJsonAsync<IngestResponse>();
    var result2 = await response2.Content.ReadFromJsonAsync<IngestResponse>();

    result1!.EventId.Should().Be(result2!.EventId);
    result1.Idempotent.Should().BeFalse();
    result2.Idempotent.Should().BeTrue();
}
```
- Idempotency Monitoring Metrics
- Duplicate Detection Rate: % of requests that are duplicates
- Cache Hit Rate: % of idempotency checks served from cache
- Idempotency Check Latency: P50, P95, P99 latency
- Cache Miss Latency: Database fallback latency
- Cleanup Job Duration: Time to clean expired records
- Idempotency Alerts
- High duplicate rate (> 10% suggests client issues)
- Low cache hit rate (< 90% suggests cache issues)
- High check latency (> 50ms suggests performance issues)
- Cleanup job failures
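The alert thresholds above reduce to a few ratio checks. The sketch below is a hedged illustration rather than the monitoring implementation: `IdempotencyStats` and `IdempotencyAlerts` are hypothetical names, the alert identifiers are made up, and the latency threshold is applied to P95 as an assumption because the list does not name a percentile.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical counters collected over one evaluation interval.
public sealed record IdempotencyStats(
    long Requests,
    long Duplicates,
    long CacheLookups,
    long CacheHits,
    double P95CheckLatencyMs);

public static class IdempotencyAlerts
{
    public static IReadOnlyList<string> Evaluate(IdempotencyStats stats)
    {
        var alerts = new List<string>();

        // Guard against division by zero on idle intervals.
        double duplicateRate = stats.Requests == 0
            ? 0.0
            : (double)stats.Duplicates / stats.Requests;
        double cacheHitRate = stats.CacheLookups == 0
            ? 1.0
            : (double)stats.CacheHits / stats.CacheLookups;

        if (duplicateRate > 0.10) alerts.Add("HighDuplicateRate");        // > 10% suggests client issues
        if (cacheHitRate < 0.90) alerts.Add("LowCacheHitRate");           // < 90% suggests cache issues
        if (stats.P95CheckLatencyMs > 50) alerts.Add("HighCheckLatency"); // > 50ms suggests performance issues

        return alerts;
    }
}
```

For example, 150 duplicates in 1,000 requests is a 15% duplicate rate and 800 hits in 1,000 lookups is an 80% hit rate, so both of those alerts fire while a 12 ms P95 latency does not.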
- Troubleshooting Idempotency Issues
- Missing idempotency key (400 error)
- Expired idempotency key (new execution)
- Conflicting payload (409 error)
- Cache unavailable (slower responses)
- Database contention (high latency)
Code Examples: - Complete test suite for idempotency - Metrics collection code - Monitoring queries (Prometheus, Azure Monitor) - Alert rules - Troubleshooting scripts
Diagrams: - Test strategy diagram - Monitoring dashboard mockup - Alert decision tree - Troubleshooting flowchart
Deliverables: - Idempotency test framework - Test suite with 10+ test cases - Monitoring setup guide - Alert configurations - Troubleshooting guide
Idempotency Quick Reference
Idempotency-Key Header
- Format: ULID (26 chars) or UUID (36 chars)
- Required: POST; recommended for PUT, PATCH, DELETE
- Optional: GET (not needed)
- Scope: Per tenant
- TTL: 24 hours
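Idempotency Response Indicator
- First Request (200/201): { ..., "idempotent": false }
- Duplicate Request (200): { ..., "idempotent": true }
Storage Comparison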
| Strategy | Latency | Durability | Scalability | ATP Use Case |
|---|---|---|---|---|
| Redis | <1ms | Low | High | Session operations |
| Database | 10-50ms | High | Medium | Critical operations |
| Hybrid | 1-50ms | High | High | Ingestion API ✅ |
Idempotent Operations
✅ Naturally Idempotent:
- GET (read-only)
- PUT (replace entire resource)
- DELETE (already deleted = ok)
- SET x = value (absolute)
❌ Requires Idempotency Pattern:
- POST (create new resource)
- Increment/decrement (x = x + 1)
- Append (add to collection)
- Generate ID or timestamp
Summary & Implementation Plan
Implementation Phases
Phase 1: Foundations (Cycle 1) - 1 week - Fundamentals and patterns
Phase 2: REST APIs (Cycle 2) - 1.5 weeks - Idempotency-Key header and storage
Phase 3: Messaging (Cycle 3) - 1.5 weeks - Event deduplication and handlers
Phase 4: External (Cycle 4) - 1 week - Webhooks and external systems
Phase 5: Quality (Cycle 5) - 1.5 weeks - Implementation patterns, testing, monitoring
Success Metrics
- Duplicate Prevention: 100% duplicates detected
- Performance: Idempotency check <10ms P95
- Cache Hit Rate: >90% served from cache
- Test Coverage: 100% idempotency tests pass
- Documentation: All patterns documented
Ownership & Maintenance
- Backend Developers: All cycles (implementation)
- Architects: Cycles 1-2 (design patterns)
- QA Engineers: Cycle 5 (testing)
- Operations: Monitoring and troubleshooting
Document Status: ✅ Plan Approved - Ready for Content Generation
Target Start Date: Q3 2025
Expected Completion: Q3 2025 (6.5 weeks)
Owner: Backend Engineering Team
Last Updated: 2024-10-30