Contract Testing for Event-Driven Systems: Catching Breaking Changes Before Production
Learn how to test event contracts between producers and consumers so schema drift, semantic changes, and versioning mistakes are caught before deployment.
Introduction
Event-driven systems are excellent at decoupling services, but that decoupling creates a quiet testing problem: the producer and consumer rarely fail in the same process. A producer can deploy a seemingly harmless event change on Monday, while a consumer discovers the break on Tuesday after a queue has filled with messages it no longer understands.
Traditional unit tests do not catch that class of failure because each service tests only its local behavior. End-to-end tests can catch it, but they are usually slower, more fragile, and too broad to tell you which contract changed. Contract testing fills the gap by turning the event shape, meaning, and compatibility rules into something both sides can verify independently.
This article walks through a practical way to design and test event contracts without needing a heavy platform. The examples use TypeScript and plain test-friendly patterns, but the same ideas apply to Kafka, RabbitMQ, SNS/SQS, EventBridge, NATS, and most pub/sub systems.
Why Event Contracts Break Differently
Request-response APIs fail loudly
With HTTP or RPC, the caller and server meet at a direct boundary. If the server removes a field, changes a status code, or rejects a request body, the caller usually fails immediately. That does not make API compatibility easy, but the feedback loop is short.
Events are different. Producers publish facts about something that already happened. Consumers process those facts later, often in separate deploy windows and sometimes from replayed streams. The failure can be delayed, partial, or hidden inside a dead-letter queue.
Events carry both structure and meaning
A useful event contract is not only a list of fields. It should explain what the event means and which fields are stable enough for consumers to trust.
For example, these two events may both pass a loose JSON parser:
{
  "type": "invoice.paid",
  "version": 1,
  "occurredAt": "2026-05-25T18:30:00.000Z",
  "data": {
    "invoiceId": "inv_123",
    "accountId": "acct_456",
    "amountCents": 4900,
    "currency": "EUR"
  }
}

{
  "type": "invoice.paid",
  "version": 1,
  "occurredAt": "2026-05-25T18:30:00.000Z",
  "data": {
    "invoiceId": "inv_123",
    "accountId": "acct_456",
    "amount": 49,
    "currency": "EUR"
  }
}
The second event looks reasonable to a human, but it breaks any consumer expecting minor units in amountCents: the field is renamed and the value switches from cents to whole currency units. Contract testing should catch this before the producer's change reaches production.
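To make the difference concrete, here is a minimal sketch of a check on the amount field alone. The helper name hasMinorUnitAmount is hypothetical, and the check deliberately ignores the rest of the contract. Both payloads parse as JSON, but only the first satisfies the assumption that the amount is a non-negative integer stored under amountCents.

```typescript
// Hypothetical helper: checks only the consumer's assumption about the amount field.
function hasMinorUnitAmount(raw: string): boolean {
  const parsed: unknown = JSON.parse(raw); // a loose parser accepts both payloads
  if (typeof parsed !== "object" || parsed === null) return false;
  const data = (parsed as { data?: unknown }).data;
  if (typeof data !== "object" || data === null) return false;
  const amountCents = (data as Record<string, unknown>).amountCents;
  return (
    typeof amountCents === "number" &&
    Number.isInteger(amountCents) &&
    amountCents >= 0
  );
}

const withCents = JSON.stringify({
  type: "invoice.paid",
  version: 1,
  data: { invoiceId: "inv_123", accountId: "acct_456", amountCents: 4900, currency: "EUR" },
});

const withMajorUnits = JSON.stringify({
  type: "invoice.paid",
  version: 1,
  data: { invoiceId: "inv_123", accountId: "acct_456", amount: 49, currency: "EUR" },
});

console.log(hasMinorUnitAmount(withCents)); // true
console.log(hasMinorUnitAmount(withMajorUnits)); // false
```

A check like this catches the rename, but not a producer that keeps the field name and silently changes the unit. That is why the contract also needs examples and written semantics.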
Compatibility is a product decision
Not every change is breaking. Adding an optional field is usually safe. Removing a field, renaming a field, changing units, changing enum values, or changing delivery semantics is usually risky. The team needs explicit compatibility rules so reviewers do not have to rediscover them on every pull request.
Define the Contract as an Artifact
Keep the contract close to the message
An event contract should be easy to find from both the producer and consumer repositories. In a monorepo, that might be a shared package. In a multi-repo organization, it might be a small contract repository, a schema registry, or an artifact published through CI.
A lightweight contract can start as a versioned JSON file:
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "invoice.paid.v1",
  "title": "InvoicePaid",
  "type": "object",
  "required": ["type", "version", "occurredAt", "data"],
  "additionalProperties": false,
  "properties": {
    "type": { "const": "invoice.paid" },
    "version": { "const": 1 },
    "occurredAt": { "type": "string", "format": "date-time" },
    "data": {
      "type": "object",
      "required": ["invoiceId", "accountId", "amountCents", "currency"],
      "additionalProperties": false,
      "properties": {
        "invoiceId": { "type": "string", "minLength": 1 },
        "accountId": { "type": "string", "minLength": 1 },
        "amountCents": { "type": "integer", "minimum": 0 },
        "currency": { "type": "string", "pattern": "^[A-Z]{3}$" }
      }
    }
  }
}
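JSON Schema is not the only valid format. Some teams prefer AsyncAPI for richer channel documentation, Avro or Protobuf for typed serialization, or TypeScript types for internal services. Whichever format you choose, check what your tooling actually enforces: in JSON Schema 2020-12, for example, format is an annotation by default, so date-time is only validated when the validator is configured to assert formats. The key is that the contract is treated as a versioned artifact, not an informal wiki page.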
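A production setup would normally hand the schema to a real JSON Schema validator. As a dependency-free sketch of what required and additionalProperties: false enforce, the following checks only those two keywords for a single object level. ObjectRule and checkObject are hypothetical names, and types, formats, and patterns are deliberately ignored.

```typescript
// Hypothetical, simplified view of one object level of the schema.
type ObjectRule = {
  required: string[];
  allowed: string[]; // the keys listed under "properties"
};

// Returns human-readable violations; an empty array means the object passes.
function checkObject(value: Record<string, unknown>, rule: ObjectRule): string[] {
  const violations: string[] = [];
  for (const key of rule.required) {
    if (!(key in value)) violations.push(`missing required field: ${key}`);
  }
  for (const key of Object.keys(value)) {
    if (!rule.allowed.includes(key)) violations.push(`unexpected field: ${key}`);
  }
  return violations;
}

// Mirrors the "data" object of the invoice.paid v1 schema.
const dataRule: ObjectRule = {
  required: ["invoiceId", "accountId", "amountCents", "currency"],
  allowed: ["invoiceId", "accountId", "amountCents", "currency"],
};

const validResult = checkObject(
  { invoiceId: "inv_123", accountId: "acct_456", amountCents: 4900, currency: "EUR" },
  dataRule
);
const driftedResult = checkObject(
  { invoiceId: "inv_123", accountId: "acct_456", amount: 49, currency: "EUR" },
  dataRule
);

console.log(validResult); // []
console.log(driftedResult);
// ["missing required field: amountCents", "unexpected field: amount"]
```

Note the trade-off in the second loop: rejecting unknown keys catches renames early, but it also means a consumer that validates this strictly will reject events that gain a new optional field.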
Add examples to remove ambiguity
Schemas describe structure. Examples describe intent. Keep one or two canonical event examples beside the schema and use them in tests:
{
  "type": "invoice.paid",
  "version": 1,
  "occurredAt": "2026-05-25T18:30:00.000Z",
  "data": {
    "invoiceId": "inv_123",
    "accountId": "acct_456",
    "amountCents": 4900,
    "currency": "EUR"
  }
}
Examples are especially useful for fields whose shape is technically valid but semantically easy to misuse, such as currency amounts, timestamps, regional identifiers, and lifecycle states.
Test the Consumer's Assumptions
Make assumptions executable
Consumer contract tests should prove that a consumer can process the events it claims to support. The goal is not to test the broker. The goal is to pin down the assumptions the consumer has about event shape and meaning.
Start by modelling the event type:
type InvoicePaidV1 = {
  type: "invoice.paid";
  version: 1;
  occurredAt: string;
  data: {
    invoiceId: string;
    accountId: string;
    amountCents: number;
    currency: string;
  };
};

type LedgerEntry = {
  invoiceId: string;
  accountId: string;
  amountCents: number;
  currency: string;
  bookedAt: Date;
};
Then write the consumer behavior around that contract:
export function toLedgerEntry(event: InvoicePaidV1): LedgerEntry {
  if (event.type !== "invoice.paid" || event.version !== 1) {
    throw new Error("Unsupported invoice event");
  }
  if (!Number.isInteger(event.data.amountCents)) {
    throw new Error("Invoice amount must be expressed in cents");
  }
  return {
    invoiceId: event.data.invoiceId,
    accountId: event.data.accountId,
    amountCents: event.data.amountCents,
    currency: event.data.currency,
    bookedAt: new Date(event.occurredAt)
  };
}
The contract test uses the canonical event, not a hand-written mock that drifts away from reality:
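import assert from "node:assert/strict";
import { test } from "node:test";
import { toLedgerEntry } from "./ledger-consumer";
// Import attributes use "with"; newer Node.js releases no longer accept the older "assert" keyword.
import invoicePaidExample from "../contracts/invoice.paid.v1.example.json" with { type: "json" };

// A JSON import infers type as string and version as number, so cast to the consumer's input type.
const example = invoicePaidExample as Parameters<typeof toLedgerEntry>[0];

test("ledger service accepts invoice.paid v1", () => {
  const entry = toLedgerEntry(example);
  assert.equal(entry.invoiceId, "inv_123");
  assert.equal(entry.amountCents, 4900);
  assert.equal(entry.currency, "EUR");
});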
This is intentionally narrow. It does not spin up Kafka. It does not call the invoice service. It tells the team, "If an invoice.paid event looks like the contract example, this consumer still knows what to do."
Test negative cases too
Consumer assumptions include what the consumer rejects. If the system promises amountCents, write a test that proves amount is not silently accepted:
test("ledger service rejects invoice.paid events without amountCents", () => {
  const invalidEvent = {
    ...invoicePaidExample,
    data: {
      invoiceId: "inv_123",
      accountId: "acct_456",
      amount: 49,
      currency: "EUR"
    }
  };
  assert.throws(() => toLedgerEntry(invalidEvent as never), {
    message: /amount/i
  });
});
Negative tests are useful because many event breaks are not obvious crashes. A consumer might default a missing value to zero, skip a branch, or write corrupted data. Contract tests should make those failures explicit.
Verify the Producer Before Publishing
Producers need contract tests too
The producer owns the event. It should verify that any event it publishes conforms to the contract. A simple pattern is to keep event construction in one module and test that module directly.
type Invoice = {
  id: string;
  accountId: string;
  totalCents: number;
  currency: string;
  paidAt: Date;
};

export function buildInvoicePaidEvent(invoice: Invoice): InvoicePaidV1 {
  return {
    type: "invoice.paid",
    version: 1,
    occurredAt: invoice.paidAt.toISOString(),
    data: {
      invoiceId: invoice.id,
      accountId: invoice.accountId,
      amountCents: invoice.totalCents,
      currency: invoice.currency
    }
  };
}
A producer contract test can then check the event against the canonical shape:
import assert from "node:assert/strict";
import { test } from "node:test";
import { buildInvoicePaidEvent } from "./invoice-events";
// Assumption: the invoice-events module also exports the InvoicePaidV1 type.
import type { InvoicePaidV1 } from "./invoice-events";

function assertInvoicePaidV1(event: unknown): asserts event is InvoicePaidV1 {
  const candidate = event as Partial<InvoicePaidV1>;
  assert.equal(candidate.type, "invoice.paid");
  assert.equal(candidate.version, 1);
  assert.equal(typeof candidate.occurredAt, "string");
  assert.equal(typeof candidate.data?.invoiceId, "string");
  assert.equal(typeof candidate.data?.accountId, "string");
  // The contract requires a non-negative integer, not just any number.
  const amountCents = candidate.data?.amountCents;
  assert.ok(
    typeof amountCents === "number" && Number.isInteger(amountCents) && amountCents >= 0,
    "amountCents must be a non-negative integer"
  );
  assert.equal(typeof candidate.data?.currency, "string");
}

test("invoice service publishes invoice.paid v1", () => {
  const event = buildInvoicePaidEvent({
    id: "inv_123",
    accountId: "acct_456",
    totalCents: 4900,
    currency: "EUR",
    paidAt: new Date("2026-05-25T18:30:00.000Z")
  });
  assertInvoicePaidV1(event);
  assert.deepEqual(event.data, {
    invoiceId: "inv_123",
    accountId: "acct_456",
    amountCents: 4900,
    currency: "EUR"
  });
});
In a production codebase, you would usually replace the small hand-written assertion with a schema validator, generated types, or registry-backed compatibility checks. The principle stays the same: event construction is tested before the broker ever sees the message.
Automate Compatibility in CI
Run producer and consumer checks independently
Good event contract testing lets teams move independently without guessing. A typical pipeline looks like this:
flowchart LR
  producer["Producer PR"] --> producerTest["Validate emitted event"]
  producerTest --> contract["Versioned event contract"]
  consumer["Consumer PR"] --> consumerTest["Replay contract examples"]
  contract --> consumerTest
  contract --> compatibility["Compatibility gate"]
  producerTest --> compatibility
  consumerTest --> compatibility
  compatibility --> deploy["Deploy when safe"]
The producer verifies it still emits valid events. The consumer verifies it still handles the contract examples. A compatibility gate checks whether contract changes follow the rules the team agreed on.
Classify changes before arguing about them
Event changes become much easier to review when compatibility categories are explicit:
| Change | Usually safe? | Notes |
|---|---|---|
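| Add an optional field | Yes | Consumers should ignore fields they do not need. A consumer that validates against a schema with additionalProperties set to false will reject the new field until its schema is updated. |
| Add a required field | No | Events already in the stream and producers still on the old version do not carry it, so consumers that require it fail on replays and during rollout. |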
| Remove a field | No | Any consumer reading it can break. |
| Rename a field | No | Treat as add plus remove unless both versions are supported. |
| Add an enum value | Maybe | Safe only if consumers have a default branch. |
| Change units | No | amountCents to amount is a semantic break. |
| Add a new event version | Yes | Safe when old versions remain supported during migration. |
These rules should live near the contract. Otherwise, every review turns into a debate about whether "small" means "safe."
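The structural rows of the table can be checked mechanically. The sketch below compares two hypothetical field summaries and flags removals and new required fields as breaking. FieldSet, classifyChange, and the paymentMethod field are illustrative names, not the API of any schema registry. It cannot detect semantic changes such as a change of units, and a rename shows up as a removal plus an addition.

```typescript
// Hypothetical summary of a contract: field names only, no types.
type FieldSet = { required: string[]; optional: string[] };
type Finding = { field: string; change: string; breaking: boolean };

function classifyChange(before: FieldSet, after: FieldSet): Finding[] {
  const findings: Finding[] = [];
  const beforeAll = [...before.required, ...before.optional];
  const afterAll = [...after.required, ...after.optional];

  // Removed fields break any consumer that reads them.
  for (const field of beforeAll) {
    if (!afterAll.includes(field)) {
      findings.push({ field, change: "removed", breaking: true });
    }
  }
  // New or newly required fields are missing from events already in the stream.
  for (const field of after.required) {
    if (!before.required.includes(field)) {
      const change = before.optional.includes(field) ? "made required" : "added as required";
      findings.push({ field, change, breaking: true });
    }
  }
  // New optional fields are safe for consumers that ignore unknown fields.
  for (const field of after.optional) {
    if (!beforeAll.includes(field)) {
      findings.push({ field, change: "added as optional", breaking: false });
    }
  }
  return findings;
}

const v1Fields: FieldSet = {
  required: ["invoiceId", "accountId", "amountCents", "currency"],
  optional: [],
};
const proposedFields: FieldSet = {
  required: ["invoiceId", "accountId", "amount", "currency"],
  optional: ["paymentMethod"], // hypothetical new field
};

const findings = classifyChange(v1Fields, proposedFields);
const isBreaking = findings.some((finding) => finding.breaking);

console.log(findings.map((f) => `${f.field}: ${f.change}`));
// ["amountCents: removed", "amount: added as required", "paymentMethod: added as optional"]
console.log(isBreaking); // true
```

A CI gate built on a check like this would fail the pull request when isBreaking is true unless the change ships as a new event version.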
Version events deliberately
For most systems, an event version belongs in the payload or metadata:
{
  "type": "invoice.paid",
  "version": 2,
  "occurredAt": "2026-05-25T18:30:00.000Z",
  "data": {
    "invoiceId": "inv_123",
    "accountId": "acct_456",
    "amount": {
      "minorUnits": 4900,
      "currency": "EUR"
    }
  }
}
Versioning is not a license to break consumers casually. It is a migration tool. A safe rollout often means producing both versions for a while, moving consumers one by one, and then retiring the old version after replay windows and retention policies are understood.
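One way to produce both versions during a migration is to derive the v2 payload from the v1 payload, so the two cannot drift apart. This is a sketch under assumptions: upgradeToV2 and buildBothVersions are hypothetical names, the types mirror the examples in this article, and how the two events are published (same topic or separate topics) is left open.

```typescript
type InvoicePaidV1 = {
  type: "invoice.paid";
  version: 1;
  occurredAt: string;
  data: { invoiceId: string; accountId: string; amountCents: number; currency: string };
};

type InvoicePaidV2 = {
  type: "invoice.paid";
  version: 2;
  occurredAt: string;
  data: {
    invoiceId: string;
    accountId: string;
    amount: { minorUnits: number; currency: string };
  };
};

// Pure mapping: v2 regroups amount and currency but keeps the same minor-unit value.
function upgradeToV2(event: InvoicePaidV1): InvoicePaidV2 {
  return {
    type: "invoice.paid",
    version: 2,
    occurredAt: event.occurredAt,
    data: {
      invoiceId: event.data.invoiceId,
      accountId: event.data.accountId,
      amount: { minorUnits: event.data.amountCents, currency: event.data.currency },
    },
  };
}

// During the migration window the producer emits both versions of the same fact.
function buildBothVersions(event: InvoicePaidV1): [InvoicePaidV1, InvoicePaidV2] {
  return [event, upgradeToV2(event)];
}

const v1Event: InvoicePaidV1 = {
  type: "invoice.paid",
  version: 1,
  occurredAt: "2026-05-25T18:30:00.000Z",
  data: { invoiceId: "inv_123", accountId: "acct_456", amountCents: 4900, currency: "EUR" },
};

const [legacyEvent, currentEvent] = buildBothVersions(v1Event);
console.log(legacyEvent.version); // 1
console.log(currentEvent.data.amount.minorUnits); // 4900
```

Because the mapping is a pure function, the producer contract test can assert on both outputs from a single input, and removing v1 later means deleting one element of the returned pair.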
Operational Pitfalls to Watch
The broker is not the contract
Kafka topics, queues, and event buses route messages. They do not automatically explain what a message means. Even platforms with schema enforcement need human-readable examples, ownership, and compatibility rules.
Keep these questions answered:
- Who owns each event type?
- Which consumers are known?
- Which fields are stable public contract fields?
- How long can old events be replayed?
- What happens when a consumer receives an unknown version?
Do not overfit to a single consumer
Consumer-driven contracts are powerful, but event streams often have multiple consumers. A producer should not optimize an event only for the first downstream service. The contract should describe the business fact, while each consumer test captures how that service uses the fact.
If a field is needed only by one consumer and has no domain meaning, consider whether it belongs in that event at all. Sometimes the right answer is a different event, a query back to the source service, or a projection owned by the consumer.
Validate at runtime for defense in depth
CI should catch contract breaks before deployment, but runtime validation still matters. Producers can log or reject malformed events before publishing. Consumers can quarantine unsupported events and emit clear metrics.
The goal is not to make runtime validation your first line of defense. The goal is to make failures explainable when bad data arrives from a manual replay, an old service version, or an unexpected integration.
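A consumer-side version of this idea can be sketched as a routing function that never throws on bad input and always explains its decision. The names routeEvent and Outcome are hypothetical, and quarantine is represented here as a return value; in a real system it would more likely be a dead-letter queue plus a metric.

```typescript
type Outcome =
  | { status: "processed"; invoiceId: string; amountCents: number }
  | { status: "quarantined"; reason: string };

// Hypothetical consumer entry point: validates before processing and
// reports why an event was set aside instead of failing silently.
function routeEvent(raw: unknown): Outcome {
  if (typeof raw !== "object" || raw === null) {
    return { status: "quarantined", reason: "event is not an object" };
  }
  const event = raw as { type?: unknown; version?: unknown; data?: unknown };
  if (event.type !== "invoice.paid") {
    return { status: "quarantined", reason: `unsupported type: ${String(event.type)}` };
  }
  if (event.version !== 1) {
    return { status: "quarantined", reason: `unsupported version: ${String(event.version)}` };
  }
  const data = (event.data ?? {}) as { invoiceId?: unknown; amountCents?: unknown };
  if (
    typeof data.invoiceId !== "string" ||
    typeof data.amountCents !== "number" ||
    !Number.isInteger(data.amountCents)
  ) {
    return { status: "quarantined", reason: "payload does not match invoice.paid v1" };
  }
  return { status: "processed", invoiceId: data.invoiceId, amountCents: data.amountCents };
}

const accepted = routeEvent({
  type: "invoice.paid",
  version: 1,
  data: { invoiceId: "inv_123", accountId: "acct_456", amountCents: 4900, currency: "EUR" },
});
const unknownVersion = routeEvent({ type: "invoice.paid", version: 3, data: {} });

console.log(accepted.status); // "processed"
console.log(unknownVersion);
// { status: "quarantined", reason: "unsupported version: 3" }
```

Returning a reason string alongside the status keeps the failure explainable: the same text can be attached to the quarantined message and used as a metric label.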
Conclusion and Next Steps
Contract testing gives event-driven systems a sharper feedback loop. Instead of waiting for a consumer to fail in production, teams can verify event shape, semantics, and compatibility during normal pull request checks.
Start small: choose one important event, write down its schema and canonical examples, add a consumer test that replays those examples, and add a producer test that verifies emitted events. Once that loop is working, add compatibility rules and versioning guidelines. The payoff is not only fewer broken deployments. It is a shared language for evolving event-driven systems without making every service move in lockstep.