On the Data Platform team at SeatGeek, our goal is to make producing and consuming data products a delightful experience. To achieve this, we transformed part of our data stack to be more real-time, performant, and developer-centric.
This post explores a modern approach to the transactional outbox pattern, highlighting key design decisions, challenges we addressed, and how Apache Kafka®, Postgres, Debezium, and schemas enabled seamless real-time data distribution.
Why real-time data matters to SeatGeek
Leveraging real-time data is crucial in the live events industry. From schedule announcements to last-minute ticket on-sales, things move quickly, and agility is key. Companies that react swiftly can deliver a superior fan experience. Real-time data enables SeatGeek to personalize the fan journey, provide live, actionable insights to stakeholders, and optimize operations by dynamically responding to demand signals.
A new approach to data distribution
Our vision for data distribution is to establish a continuous loop – from production to consumption, enrichment, and feedback to the source. This article focuses on one part of that loop: data production.
Goals
To guide our efforts and stay aligned with the overall vision, we set the following goals.
- Guarantee data integrity and consistency.
- Achieve scalable, predictable performance.
- Encourage domain-driven, reusable data that supports diverse uses.
- Foster shared ownership through domain-driven design and data contracts.
- Prioritize developer experience.
Existing approaches for producing data
Before introducing a new approach, we evaluated the existing approaches to producing data to ensure we weren’t duplicating work.
1. Polling Consumers
In this approach, consumers periodically check for updates in the data, through an API or direct database access. This pattern is common in ETL (Extract, Transform, Load) workflows. A notable example is Druzhba, our open-source tool for data warehouse ETL.
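To make the trade-offs concrete, here is a minimal polling sketch. It assumes a hypothetical orders table with an updated_at column, a psycopg2 connection, and a hypothetical handle_update() callback; none of these come from Druzhba or our actual pipelines.

```python
import time
from datetime import datetime, timezone

import psycopg2  # hypothetical driver choice for this sketch


def poll_orders(dsn: str, interval_seconds: int = 60) -> None:
    """Naive polling consumer: re-query the source table on a fixed interval."""
    last_seen = datetime(1970, 1, 1, tzinfo=timezone.utc)
    conn = psycopg2.connect(dsn)
    while True:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT id, status, updated_at FROM orders "
                "WHERE updated_at > %s ORDER BY updated_at",
                (last_seen,),
            )
            for order_id, status, updated_at in cur.fetchall():
                handle_update(order_id, status)  # hypothetical downstream handler
                last_seen = updated_at
        conn.commit()                            # close out the read transaction
        time.sleep(interval_seconds)             # latency floor = the polling interval
```

Every new consumer repeats this integration, and each poll puts load on the source regardless of whether anything actually changed.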
Key Limitations
- Introduces inherent latency, especially when updates are frequent.
- Places unpredictable load on source systems.
- Direct database access violates service boundaries.
- Reintegration overhead for each new consumer.
2. Direct Publish
The next strategy involves publishing data directly to consumers. A classic example of this is publishing user activity data, also known as clickstreams, to Kafka. In that case the publish step does not need to be transactional; in other cases it must be. For example, when an order is created and the inventory needs to be updated, ensuring that both actions occur together is crucial, so you want updates to be all-or-nothing. In such cases, partial failures can lead to inconsistencies between systems, undermining trust in the data.
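To make the dual-write problem concrete, here is a minimal sketch, assuming a psycopg2 connection and a confluent-kafka producer (the DSN and table names are placeholders): if the process crashes between the database commit and the Kafka publish, the two systems silently diverge.

```python
import json

import psycopg2
from confluent_kafka import Producer

conn = psycopg2.connect("dbname=shop")                        # hypothetical DSN
producer = Producer({"bootstrap.servers": "localhost:9092"})


def create_order(order: dict) -> None:
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO sales (order_id, total) VALUES (%s, %s)",
            (order["id"], order["total"]),
        )
        cur.execute(
            "UPDATE inventory SET qty = qty - %s WHERE sku = %s",
            (order["qty"], order["sku"]),
        )
    conn.commit()                                             # write #1: the database

    # If the process dies here, the order exists in Postgres but no event is
    # ever published -- the dual-write problem.
    producer.produce("orders", value=json.dumps(order))       # write #2: Kafka
    producer.flush()
```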
Key Limitations
- Dual-write problem: Lack of transactional guarantees introduces data inconsistencies.
- Ensuring consistency across systems is hard, and techniques like distributed transactions/event sourcing add unnecessary complexity.
- Increased operational complexity to address consistency between systems.
3. Change Data Capture
Lastly, Change Data Capture allows us to subscribe to all database updates and propagate those changes downstream. Because changes are automatically captured at the database level, we avoid inconsistencies that can arise from dual writes.
Key limitations
- Shifts complexity to consumers.
- Varying implementations across consumers lead to inconsistencies.
- Data format changes are not detected, increasing the likelihood of downstream breakages.
- This is a variation of direct database access and hence violates service boundaries.
In the end, we concluded that none of the existing approaches would help us achieve our goals, so we sought a new solution.
Picking a solution
We began by drafting an RFC (Request for Comments) document to outline the proposed approach and evaluate alternatives – a standard practice at SeatGeek that enables us to gather stakeholder feedback and make informed decisions. For our real-time data distribution needs, we focused on four key areas as shown in the image below.
After reviewing the alternatives for each area, we settled on the transactional outbox pattern for its simplicity and effectiveness in addressing the dual-write problem while ensuring data integrity. We opted to have applications drive the event publishing to take advantage of domain events, which are best defined at the source. For relaying messages, we chose transaction log-tailing with Debezium, an established tool that efficiently captures changes from the database. Finally, we selected Kafka as our message broker primarily for its reliability as a log store, which enables us to reuse data effectively. Additionally, since we were already using Kafka in our infrastructure, it made sense to leverage it. We also decided to enforce the usage of schemas to promote shared ownership of the data.
A modern twist on the transactional outbox pattern
Before diving into the twist, let’s briefly recap the traditional transactional outbox pattern with an example. Imagine an application needs to publish an event related to a customer order. This process might involve updates to the sales, inventory, and customers tables. We’ll also assume the use of Postgres, Debezium, and Kafka. The following steps occur, as illustrated in the diagram below; a minimal code sketch follows the list.
- Construct a domain event: The application creates a domain event that includes relevant information about the order, such as the number of items purchased, and customer details.
- Insert the event into the outbox table: The domain event is written to a dedicated “outbox” table, a temporary storage location for events. This step occurs within the same database transaction as the updates to the sales, inventory, and customers tables.
- Commit the transaction: Once the transaction is committed successfully, the changes are recorded in the Postgres write-ahead log (WAL).
- Relay and publish: Debezium captures all changes to the outbox table, so when it detects a new entry in the outbox table, it relays that event to Kafka for downstream consumers to process.
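As referenced above, here is a minimal sketch of steps 2 and 3, assuming psycopg2 and a simple outbox table with topic and payload columns; the table layout and names are illustrative, not a production schema.

```python
import json

import psycopg2

conn = psycopg2.connect("dbname=shop")  # hypothetical DSN


def place_order(order: dict, event: dict) -> None:
    with conn:  # one transaction: business writes and the outbox insert commit together
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO sales (order_id, total) VALUES (%s, %s)",
                (order["id"], order["total"]),
            )
            cur.execute(
                "UPDATE inventory SET qty = qty - %s WHERE sku = %s",
                (order["qty"], order["sku"]),
            )
            cur.execute(
                "UPDATE customers SET last_order_id = %s WHERE id = %s",
                (order["id"], order["customer_id"]),
            )
            # The domain event is recorded atomically with the business writes;
            # Debezium relays rows from this table to Kafka (step 4).
            cur.execute(
                "INSERT INTO outbox (topic, payload) VALUES (%s, %s)",
                ("orders", json.dumps(event)),
            )
```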
Challenges with the single outbox table
Early during the RFC process, we identified potential challenges with using a single outbox table, prompting us to explore alternatives to improve performance and scalability.
Performance impact
- A single outbox table for an entire database can become the bottleneck, especially with high write throughput.
- Lock contention is also a significant risk, as multiple concurrent writes compete for access to the table.
Complexity with table clean-up
- Managing the size of the outbox table over time is crucial and requires a separate, external cleanup process.
- Having an aggressive cleanup process risks lock contention between inserts and deletes.
- Conservative cleanup could lead to an ever-growing outbox table, which increases the likelihood of performance degradation.
Bypassing the outbox table
Fortunately, we discovered that Postgres has a really neat feature: you can skip the outbox table and write to the write-ahead log (WAL) directly! The WAL is central to Postgres’s durability and performance. It logs all changes before they are applied to data files, ensuring that transactions are committed reliably. WAL entries are sequential and optimized for high write throughput.
Writing directly to the WAL is made possible through logical decoding, a mechanism that allows Postgres to stream SQL-level changes to external consumers and is also what Postgres’s logical replication uses to propagate changes from a primary to replicas.
Postgres provides a built-in function, pg_logical_emit_message() (documented here), for writing custom data to the WAL as part of a transaction. We will later see how we leverage this functionality to emit domain events, along with metadata, from applications to Kafka.
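As a minimal illustration (using SQLAlchemy Core, with a placeholder DSN, table, prefix, and payload), the same kind of transaction can emit the event straight to the WAL instead of inserting into an outbox table:

```python
import json

from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg2://localhost/shop")  # hypothetical DSN

with engine.begin() as conn:  # single transaction
    conn.execute(
        text("INSERT INTO sales (order_id, total) VALUES (:id, :total)"),
        {"id": 42, "total": 1999},
    )
    # transactional=true: the message is written to the WAL only if this
    # transaction commits; the prefix and content are free-form.
    conn.execute(
        text("SELECT pg_logical_emit_message(true, :prefix, :content)"),
        {"prefix": "orders-topic", "content": json.dumps({"order_id": 42})},
    )
```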
Implementation: From Skateboard to Car
We adopted an incremental approach, starting with a simple, low-risk solution and gradually building toward a full-fledged, production-ready system. Following this “Skateboard to Car” philosophy let us experiment and validate our assumptions early, move faster, incorporate feedback from users, and minimize risk.
Proof of concept (aka the “Skateboard”)
Before rolling out a full-scale implementation, we wanted to uncover the unknowns and build confidence in our approach. To do this, we:
- Enabled logical replication on one of our most active databases.
- Set up Debezium Server.
- Integrated pg_logical_emit_message() in a high-traffic, critical request path.
- Simulated an on-sale scenario to load-test this setup at a much higher scale than normal.
Results
The initial results were promising:
- Excellent performance: Debezium delivered exceptional throughput under heavy load.
- Minimal database impact: Writing to the WAL directly introduced negligible overhead, even during high write throughput.
Challenges
However, we also uncovered some key challenges:
- Debezium’s Outbox Event Router
  - Debezium has an Outbox Event Router that routes events from the database to specific Kafka topics.
  - It assumes the existence of an outbox table, which we were bypassing with our direct write-to-WAL approach.
  - It does not integrate with the schema registry, necessitating a custom solution to handle message routing.
- Debezium Server JSON Schema support
  - Debezium Server did not support the JSON format with schemas enabled.
  - In its absence, the default behavior was to base64-encode entire records, which exposed internal structures and added unnecessary complexity for consumers.
  - This limitation made Debezium Server unsuitable for our use case, leading us to use Kafka Connect® instead.
Rolling out a full-scale implementation (aka the “Bicycle”)
The implementation involves 4 core components:
- The Kafka Connect cluster(s) for Debezium
- Data contracts and schemas
- A library for applications to produce data
- A custom Single Message Transformation (SMT) to route messages to their appropriate destination
1. Kafka Connect and Debezium
We run Debezium as a source connector on Kafka Connect rather than using Debezium Server, because of the previously noted limitations with schemas. The Kafka Connect clusters run distributed workers on Kubernetes, to ensure high availability and scalability. Additionally, the clusters for transactional outbox are completely isolated from the rest of our Kafka Connect clusters.
2. Data contracts and schemas
Schemas play a foundational role in our data infrastructure, serving as contracts between producers and consumers. They define the structure of the data being exchanged, ensuring data integrity, compatibility, and decoupling.
We use Confluent Schema Registry for managing Kafka schemas across the company. It validates data at the source, ensuring only well-structured data is published. Schema management tools in our CI/CD workflows automate the detection, validation, and migration of schema changes. This eliminates the risk of breaking downstream consumers while maintaining seamless integration across teams.
Schema ownership has cultivated a forward-thinking culture at SeatGeek, where developers consider the evolution of data contracts when designing applications. This shared responsibility enables us to scale our systems confidently while maintaining data quality. While developers maintain responsibility for schema compatibility, the Data Platform team supports this with tools and guidance to simplify the process.
Schemas are a requirement for using the transactional outbox workflow, and are tightly integrated into the library. Messages are validated using the schema registry before writing to the WAL. Downstream consumers use the same schema for deserialization, ensuring consistency throughout the pipeline.
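As an illustrative sketch of what validation at the source can look like (this is not our internal library; the schema, registry URL, and helper name are placeholders), confluent-kafka’s schema registry client can serialize and validate a message before it ever reaches the WAL:

```python
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.json_schema import JSONSerializer
from confluent_kafka.serialization import MessageField, SerializationContext

ORDER_SCHEMA = """
{
  "type": "object",
  "properties": {"order_id": {"type": "integer"}},
  "required": ["order_id"],
  "additionalProperties": false
}
"""

registry = SchemaRegistryClient({"url": "http://schema-registry:8081"})  # placeholder URL
serializer = JSONSerializer(ORDER_SCHEMA, registry)


def serialize_for_topic(topic: str, message: dict) -> bytes:
    # Raises if the message does not conform to the schema, so malformed
    # data is rejected before it is written to the WAL.
    return serializer(message, SerializationContext(topic, MessageField.VALUE))
```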
3. Producing data from applications
Our approach here was to create a library that applications could integrate. We started with a library for Python – the most popular language for services at SeatGeek. Since then, we’ve added support for C#, and plan to support Golang as well. The key features of the library are:
- Schema validation and serialization using the Confluent Schema Registry.
- Writing directly to the WAL using pg_logical_emit_message().
- Custom metadata injection, such as tracing context and message headers, as part of the message prefix.
The centralized nature of the library ensures that schema validation and WAL writes are standardized across the organization. Developers cannot bypass the schema registry, ensuring consistency and reliability.
Examples and code snippets
Figure: pg_logical_emit_message() components
Example: Using the outbox library in an application
```python
from transactional_outbox import Outbox

outbox = Outbox()


@app.post("/some/api/route")
async def handler(req):
    # -- Handle the request --
    # - open a database connection
    # - execute business logic

    # -- Publish data --
    # - write to the WAL
    await outbox.write_to_outbox(
        session=session,           # the database connection
        kafka_topic="...",         # the Kafka topic to write to
        kafka_key="...",           # an optional key for the message
        message={ ... },           # the message to write to the WAL
        kafka_headers={ ... },     # custom Kafka headers
    )

    # -- Flush the data --
    session.commit()
    # -- End --
```
Snippet: Write to outbox function
```python
# Method on the Outbox class; relies on sqlalchemy's select/func and typing's
# Union/Optional/Any imports.
async def write_to_outbox(
    self,
    session: Union[Session, AsyncSession],
    message: dict[str, Any],
    kafka_topic: str,
    kafka_key: Optional[str] = None,
    kafka_headers: Optional[dict[str, str]] = None,
    span_context: Optional[dict[str, str]] = None,
    commit: bool = False,
) -> None:
    # Validate that the DB session is active...

    # Serialize the message using the schema registry
    serialized_message: bytes = serialize_msg_schema_registry(
        topic_name=kafka_topic,
        message=message,
        sr_client=self.registry_client,
    )

    # Build the prefix metadata (Kafka topic, key, headers, tracing context, etc.)
    prefix_obj = MessagePrefix(
        kafka_topic=kafka_topic,
        kafka_key=kafka_key,
        kafka_headers=kafka_headers or {},
        span_context=span_context or {},  # tracing or other metadata
    )
    serialized_prefix: str = prefix_obj.serialize()

    # Use pg_logical_emit_message() to write to the WAL.
    # The first argument (True) makes the message transactional: it only
    # appears in the WAL if the surrounding transaction commits.
    statement = select(
        func.pg_logical_emit_message(
            True,
            serialized_prefix,
            serialized_message,
        )
    )
    await self.execute_session(
        session=session,
        prepared_statement=statement,
    )
```
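MessagePrefix and serialize_msg_schema_registry are internal helpers; for context, here is one plausible shape of the prefix object, purely an assumption about what it might look like:

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MessagePrefix:
    """A hypothetical sketch of the prefix metadata carried with each WAL message."""

    kafka_topic: str
    kafka_key: Optional[str] = None
    kafka_headers: dict = field(default_factory=dict)
    span_context: dict = field(default_factory=dict)

    def serialize(self) -> str:
        # The prefix travels in pg_logical_emit_message()'s text prefix argument;
        # the SMT parses it to determine routing, headers, and tracing context.
        return json.dumps(
            {
                "kafka_topic": self.kafka_topic,
                "kafka_key": self.kafka_key,
                "kafka_headers": self.kafka_headers,
                "span_context": self.span_context,
            },
            separators=(",", ":"),
        )
```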
4. Routing messages using Single Message Transforms (SMTs)
SMTs are a feature of Kafka Connect that enables users to modify or transform messages as they pass through the pipeline. We built an SMT to replace Debezium’s built-in outbox event router, with support for the schema registry.
How it works
- The SMT distinguishes between heartbeat records and outbox records based on the Connect schema name (io.debezium.connector.common.Heartbeat and io.debezium.connector.postgresql.MessageValue, respectively).
- Heartbeat records are passed along with no modification.
- For outbox records, the prefix field embedded in each message contains metadata like which topic the message should be routed to.
- The span context and headers from the metadata are moved into the output record’s headers.
- Additional telemetry data such as end-to-end latency is emitted by the SMT based on source metadata that Debezium includes by default.
- The content of the outbox record is preserved and emitted as raw bytes. (Note: This requires the use of Kafka Connect’s ByteArrayConverter.)
Building a robust system (aka the “Car”)
The next step in our journey was to make the whole system fault-tolerant and easy to manage.
Adding heartbeat events
As Debezium processes logical decoding messages, it reports the last successfully consumed position back to Postgres, which tracks it in the pg_replication_slots view. WAL segments containing unprocessed messages are retained on disk, so Debezium must remain active to prevent disk bloat.
Additionally, the number of retained WAL segments depends on overall database activity, not just the volume of outbox messages. If database activity is significantly higher, PostgreSQL retains more WAL segments, causing unnecessary disk usage.
To address this, we set up Debezium to periodically emit heartbeat events by updating a dedicated heartbeat table and consuming the resulting changes. These events are also published to a separate Kafka topic, allowing us to monitor Debezium’s progress and connectivity.
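For illustration, the heartbeat behavior is driven by standard Debezium connector options. The sketch below registers a connector via the Kafka Connect REST API; the names, credentials, and heartbeat table are placeholders, not our exact configuration.

```python
import requests  # used to register the connector via the Connect REST API

connector = {
    "name": "outbox-source",  # placeholder connector name
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",
        "slot.name": "outbox_slot",
        "topic.prefix": "outbox",
        "database.hostname": "db.internal",   # placeholder connection details
        "database.dbname": "orders",
        "database.user": "debezium",
        "database.password": "...",
        # Emit a heartbeat every 10s so the replication slot keeps advancing
        # even when outbox traffic is low relative to overall database activity.
        "heartbeat.interval.ms": "10000",
        # Touch a dedicated heartbeat table so the heartbeat shows up in the WAL.
        "heartbeat.action.query": (
            "INSERT INTO debezium_heartbeat (id, ts) VALUES (1, now()) "
            "ON CONFLICT (id) DO UPDATE SET ts = EXCLUDED.ts"
        ),
    },
}

requests.post("http://connect:8083/connectors", json=connector, timeout=10)
```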
Heartbeat events serve two purposes:
1. Advancing the replication slot position for WAL cleanup:
- Heartbeats are periodically committed to the WAL alongside regular database updates.
- When Debezium processes these heartbeats, it advances its replication slot position to reflect the latest WAL segment it has consumed.
- This update signals Postgres that all WAL segments before this point are no longer needed, marking them for safe removal from disk.
2. Monitoring Debezium connectivity:
- Debezium is configured to periodically execute a query on the source database, ensuring it remains active.
- Debezium’s health can be monitored by checking the recency of heartbeat messages in the dedicated Kafka heartbeats topic.
Automatic connector restarts
One thing we noticed was that connectors occasionally failed, either due to a network partition or expired database credentials, and had to be restarted.
To reduce the need for human intervention, we wrote a script that uses the Connect API to check the status of connectors on the pod. The script is executed as a livenessProbe on Kubernetes, and anytime the probe fails, Kubernetes restarts the container, which also restarts the connector. A sketch of such a check is shown below.
Note: This had to be a livenessProbe and not a readinessProbe, because the Connect API doesn’t become available until the readinessProbe succeeds.
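Here is what such a check can look like, using the standard Kafka Connect REST API status endpoints; the worker URL and exit-code convention are illustrative, not our exact script.

```python
#!/usr/bin/env python3
"""Liveness check: exit non-zero if any connector or task on this worker has FAILED."""
import sys

import requests

CONNECT_URL = "http://localhost:8083"  # the worker's REST API inside the pod


def main() -> int:
    connectors = requests.get(f"{CONNECT_URL}/connectors", timeout=5).json()
    for name in connectors:
        status = requests.get(f"{CONNECT_URL}/connectors/{name}/status", timeout=5).json()
        states = [status["connector"]["state"]] + [t["state"] for t in status.get("tasks", [])]
        if "FAILED" in states:
            print(f"connector {name} has a FAILED state")
            return 1  # probe failure -> Kubernetes restarts the container
    return 0


if __name__ == "__main__":
    sys.exit(main())
```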
Observability
Adding observability through distributed tracing was a key part of empowering users to visualize and inspect the flow of data throughout the system. Each stage emits telemetry data that is all tied together within Datadog, the observability platform we use at SeatGeek.
At the application level, the library adds a new span to the current trace context. This context is injected as metadata into the prefix object of the WAL message. The SMT then relocates that data to the Kafka message headers, and any consumers of the topic inherit the trace context. When this is all put together in Datadog, we’re able to visualize the flow of data from its origin (for example, an HTTP request) all the way down to its final destination (for example, a Flink job).
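As a generic sketch of the idea (our library ties into Datadog tracing; this example uses OpenTelemetry’s propagation API purely for illustration), the trace context can be captured as a plain dict and handed to the outbox library as the prefix’s span context:

```python
from opentelemetry import propagate


def current_span_context() -> dict:
    """Serialize the active trace context into a plain dict.

    The dict travels in the WAL message prefix; the SMT copies it into Kafka
    record headers, so consumers can continue the same trace end to end.
    """
    carrier: dict = {}
    propagate.inject(carrier)  # e.g. adds a "traceparent" entry for the current span
    return carrier
```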
Learnings and Looking Ahead
As we reflect on this journey, here are some key insights we’ve gathered and areas for future improvement.
Key Learnings
- Development Effort: This new approach requires time and effort to develop, set up, and maintain.
- Dependence on Observability: We’ve also learned that we heavily rely on observability tools to help us troubleshoot and ensure everything works correctly.
- Ease of Adoption: This approach is easier to use in new projects. Retrofitting it into existing systems is much more difficult, especially when working with snapshots.
- Cross-Team Collaboration: Close collaboration with application teams is crucial to success.
Future Areas of Focus
- Snapshot Handling: Enhancing support for snapshots (or backfills) will be a priority, as they are essential for data recovery and completeness.
- Schema registry developer experience: We aim to simplify the creation and maintenance of schemas.
- Consumption Experience: Improving the data consumption side is critical. We’re focusing on stream processing with Flink to unlock advanced use cases and make working with real-time data more seamless.
- Automation and Tooling: Investing in automation and developer tools to simplify setup and maintenance will help reduce friction and increase adoption.