Modern subscription businesses don't operate through a single system. Revenue-critical data flows through billing platforms, CRM systems, customer engagement tools, product analytics, and financial reporting systems. Keeping all of these systems synchronized — in real time, at scale, with guaranteed consistency — is one of the hardest infrastructure problems in subscription commerce.
The Orchestration Challenge
Consider a seemingly simple scenario: a customer upgrades their plan through a self-service portal. This single action triggers a cascade of downstream events that must be coordinated across systems:
- The billing system must update the subscription, calculate proration, and generate a new invoice.
- The CRM must reflect the new plan tier, update the account's ARR value, and potentially reassign the account to a different success tier.
- The product must provision new features and update usage limits.
- The engagement platform must trigger a welcome-to-your-new-plan email sequence.
- The revenue recognition system must adjust deferred revenue schedules.
- The analytics platform must record the expansion event for cohort analysis.
If any of these steps fails or executes out of order, the customer experience degrades and data integrity suffers.
Our Architecture: Event-Driven Orchestration
PeakCommerce's Revenue Engine uses an event-driven orchestration architecture built on three core principles:
Transactional Outbox Pattern
Every revenue-critical action is first committed to our transactional outbox within the same database transaction as the primary operation. Because the state change and its event are committed together or not at all, no event is lost once the primary operation commits, even if downstream systems are temporarily unavailable. The outbox processor then delivers events to our event bus with at-least-once delivery semantics, which means a consumer may occasionally receive the same event more than once.
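To make the pattern concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and the `publish` callback are illustrative assumptions and do not describe PeakCommerce's actual schema or event bus; the point is only that the subscription update and the outbox row share one transaction, and that a row is marked published only after delivery succeeds.

```python
import json
import sqlite3
import uuid


def init_schema(conn):
    # Hypothetical schema: one business table and one outbox table.
    conn.executescript("""
        CREATE TABLE subscriptions (
            customer_id TEXT PRIMARY KEY,
            plan TEXT NOT NULL
        );
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            event_id TEXT NOT NULL UNIQUE,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def upgrade_plan(conn, customer_id, new_plan):
    """Change the plan and record the event in a single transaction."""
    event_id = str(uuid.uuid4())
    payload = json.dumps({"customer_id": customer_id, "plan": new_plan})
    with conn:  # commits both statements, or rolls both back on error
        conn.execute(
            "UPDATE subscriptions SET plan = ? WHERE customer_id = ?",
            (new_plan, customer_id),
        )
        conn.execute(
            "INSERT INTO outbox (event_id, event_type, payload) VALUES (?, ?, ?)",
            (event_id, "subscription.upgraded", payload),
        )
    return event_id


def relay_outbox(conn, publish):
    """Deliver unpublished events in order; returns how many were delivered.

    A row is marked published only after publish() returns, so a crash or
    bus outage leaves it in place for the next pass (at-least-once delivery).
    """
    rows = conn.execute(
        "SELECT id, event_id, event_type, payload FROM outbox "
        "WHERE published = 0 ORDER BY id"
    ).fetchall()
    delivered = 0
    for row_id, event_id, event_type, payload in rows:
        event = {
            "event_id": event_id,
            "type": event_type,
            "payload": json.loads(payload),
        }
        try:
            publish(event)
        except Exception:
            break  # stop here to preserve ordering; retry on the next pass
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
        delivered += 1
    return delivered
```

In this sketch, a failure between `publish` returning and the row being marked published would cause the same event to be delivered again on the next pass. That is the duplicate case that at-least-once delivery implies.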
Saga-Based Workflow Coordination
Complex, multi-step revenue workflows are modeled as sagas — long-running transactions with compensating actions. When a subscription upgrade saga begins, the orchestrator coordinates each step sequentially or in parallel as appropriate. If the billing update succeeds but the CRM sync fails, the orchestrator retries the CRM step with exponential backoff before escalating. Compensating actions ensure the system can roll back to a consistent state if a critical step permanently fails.
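The retry-then-compensate behavior can be sketched in a few lines of Python. This is a simplified, sequential-only illustration, not the production orchestrator: `SagaStep`, `run_saga`, and `SagaFailed` are hypothetical names, parallel steps are omitted, and raising `SagaFailed` stands in for whatever escalation path a real system would use.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[dict], None]      # performs the step
    compensate: Callable[[dict], None]  # undoes the step if a later one fails
    max_attempts: int = 3


class SagaFailed(Exception):
    """Raised after a step exhausts its retries and earlier steps are undone."""


def run_saga(steps: List[SagaStep], context: dict,
             base_delay: float = 0.1, sleep=time.sleep) -> None:
    completed: List[SagaStep] = []
    for step in steps:
        for attempt in range(1, step.max_attempts + 1):
            try:
                step.action(context)
                completed.append(step)
                break
            except Exception as exc:
                if attempt == step.max_attempts:
                    # Permanent failure: undo completed steps, newest first.
                    for done in reversed(completed):
                        done.compensate(context)
                    raise SagaFailed(
                        f"{step.name} failed after {attempt} attempts"
                    ) from exc
                # Exponential backoff: base_delay, then 2x, then 4x, ...
                sleep(base_delay * 2 ** (attempt - 1))
```

With a billing step that succeeds and a CRM step that keeps failing, this sketch retries the CRM step with growing delays, then runs the billing step's compensating action and raises. The `sleep` parameter is injectable only so the backoff can be exercised without real waiting.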
Idempotent Event Processing
Every downstream event consumer is designed to be idempotent. Events carry a unique idempotency key, and consumers track processed keys to prevent duplicate processing. This means our orchestrator can safely retry failed deliveries without risking duplicate charges, duplicate CRM updates, or duplicate emails.
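A minimal sketch of the consumer side might look like the following. The `IdempotentConsumer` class and the `idempotency_key` field name are assumptions made for illustration, and the in-memory set stands in for a durable store; a real consumer would need to record the key atomically with the side effect it guards.

```python
from typing import Callable, Set


class IdempotentConsumer:
    """Wraps a handler so that redelivered events are processed only once."""

    def __init__(self, handler: Callable[[dict], None]):
        self._handler = handler
        # Assumption: in-memory only. A durable store is needed in practice.
        self._processed: Set[str] = set()

    def handle(self, event: dict) -> bool:
        """Return True if the event was processed, False if it was a duplicate."""
        key = event["idempotency_key"]
        if key in self._processed:
            return False
        self._handler(event)
        # Record the key only after the handler succeeds, so that a failed
        # attempt can still be retried later.
        self._processed.add(key)
        return True


# Usage: the second delivery of the same event is ignored.
emails_sent = []
consumer = IdempotentConsumer(lambda e: emails_sent.append(e["customer_id"]))
event = {"idempotency_key": "evt-123", "customer_id": "cust_1"}
consumer.handle(event)
consumer.handle(event)  # redelivery after a retry
```

Recording the key after the handler runs keeps failed attempts retryable, at the cost of a small window in which a crash could let the handler run twice. Closing that window requires committing the key and the side effect in one transaction.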
Performance at Scale
Our orchestration layer processes over 2 million revenue events per day across our customer base. Median end-to-end latency, measured from the initial action until every downstream system has been updated, is 230 milliseconds. The 99th percentile latency is under 2 seconds, even for complex multi-system workflows.
Lessons Learned
Building this system taught us several important lessons. First, eventual consistency is acceptable for most downstream systems, but billing and provisioning must be strongly consistent. Second, comprehensive observability — distributed tracing across every saga step — is essential for debugging production issues. Third, designing for failure from the start is far cheaper than retrofitting reliability into an existing system.
