This content originally appeared on DEV Community and was authored by Alex Aslam
Hey, let's talk about a problem I know you've all wrestled with. You're building a sleek, modern Rails application. You've embraced microservices, event-driven architecture, or just need to keep an Elasticsearch index in sync. Your tool of choice is probably Kafka or RabbitMQ.
The code seems straightforward:
# app/services/order_creation_service.rb
class OrderCreationService
  def call(user, product)
    Order.transaction do
      order = Order.create!(user: user, product: product)
      # ... maybe some other logic ...
      # Uh oh. The dangerous part.
      EventPublisher.publish('order_created', order.to_event_payload)
    end
  end
end
You push to production, and for a while, it's glorious. Then, it happens.
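A midnight PagerDuty alert. The EventPublisher raised a ConnectionTimeout error. If the publish call runs after the database transaction commits, the order is saved but the 'order_created' event is never published. The shipping service is oblivious. The analytics dashboard is missing data. The customer gets an email saying "Your order is confirmed!" but nothing happens. Cue the support tickets. With the publish call inside the transaction block, as above, the mirror image can happen too: the broker accepts the event, the transaction then rolls back, and consumers react to an order that does not exist.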
You've just been bitten by the dual-write problem. We're trying to write to two different systems—the database and the message broker—within one logical transaction, but there's no atomicity across them. It's a distributed systems fallacy to think this could ever be 100% reliable.
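To make the failure concrete, here is a minimal in-memory sketch of a dual write. `FakeDatabase`, `FakeBroker`, and `create_order_with_dual_write` are hypothetical stand-ins, not ActiveRecord or a real broker client. The only behavior modeled is that a transaction discards its writes when the block raises, while a published message cannot be taken back.

```ruby
# Hypothetical stand-ins for illustration only: not ActiveRecord, not a real broker client.
class FakeDatabase
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Stages writes and keeps them only if the block finishes without raising.
  def transaction
    staged = []
    yield staged
    @rows.concat(staged)
  end
end

class FakeBroker
  attr_reader :messages

  def initialize
    @messages = []
  end

  # Once a message is published it cannot be rolled back.
  def publish(event)
    @messages << event
  end
end

# Dual write: the publish happens inside the transaction block, and a later
# step in the same block fails.
def create_order_with_dual_write(db, broker)
  db.transaction do |staged|
    staged << { id: 1, product: "book" }
    broker.publish({ name: "order_created", order_id: 1 })
    raise "a later step failed" # rolls back the order, not the event
  end
rescue RuntimeError
  nil
end

db = FakeDatabase.new
broker = FakeBroker.new
create_order_with_dual_write(db, broker)

puts "orders saved: #{db.rows.size}, events published: #{broker.messages.size}"
# prints "orders saved: 0, events published: 1"
```

The two systems now disagree, and neither one knows it. Moving the publish after the commit only swaps this for the opposite failure: a saved order with no event.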
So what are our options?
- Ignore it. (Spoiler: Not an option).
- Two-Phase Commit (2PC)? A complex and often slow protocol that's rarely the right answer for high-throughput applications.
- The Outbox Pattern. This is the way.
The Core Idea: A Transactional Outbox
The Outbox Pattern elegantly sidesteps the dual-write problem by reducing it to a single write within the database transaction. Instead of publishing directly to Kafka, we write the event to a special table in the same database as part of the transaction.
A separate process then reads from this table and reliably publishes the messages to the message broker.
Here’s the new, bulletproof flow:
# app/services/order_creation_service.rb
class OrderCreationService
  def call(user, product)
    Order.transaction do
      order = Order.create!(user: user, product: product)
      # 1. Write the event to the outbox table INSIDE the transaction.
      #    This create! is part of the same DB transaction as the order.
      OutboxMessage.create!(
        topic: 'orders',
        event_name: 'order_created',
        payload: order.to_event_payload # Serialized JSON, usually
      )
    end
    # 2. The transaction commits. Both the order and the outbox message are persisted.
    # 3. A separate publisher process fetches the new outbox message and publishes it to Kafka.
  end
end
If the transaction fails, the OutboxMessage is rolled back. If it succeeds, we have a permanent, durable record that an event needs to be published. The fate of the event is now tied to the fate of the database transaction, which is exactly what we want.
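The flow above calls `order.to_event_payload`, which the article never defines. One possible shape, as a plain-Ruby sketch: the `Order` struct below is a hypothetical stand-in for the ActiveRecord model, and the `event_id` and `payload_version` fields are assumptions added for illustration, not something the original code requires.

```ruby
require "json"
require "securerandom"
require "time"

# Hypothetical plain-Ruby stand-in for the Order model; in the Rails app this
# method would live on the ActiveRecord model instead.
Order = Struct.new(:id, :user_id, :product_id, :created_at, keyword_init: true) do
  # Builds a JSON-serializable hash with string keys.
  def to_event_payload
    {
      "event_id" => SecureRandom.uuid, # assumed field: lets consumers deduplicate
      "payload_version" => 1,          # assumed field: lets the schema evolve
      "id" => id,
      "user_id" => user_id,
      "product_id" => product_id,
      "created_at" => created_at.getutc.iso8601
    }
  end
end

order = Order.new(id: 42, user_id: 7, product_id: 3, created_at: Time.now)
puts JSON.generate(order.to_event_payload)
```

Because the payload is written to the outbox row once, the generated `event_id` stays the same no matter how many times the publisher retries that row.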
Building the Outbox in Rails: A Senior Engineer's Blueprint
Let's move beyond theory and build a production-ready implementation. We need three components:
1. The Migration and Model
# db/migrate/20231020120000_create_outbox_messages.rb
class CreateOutboxMessages < ActiveRecord::Migration[7.0]
  def change
    create_table :outbox_messages do |t|
      t.string :topic, null: false      # e.g., 'orders', 'users'
      t.string :event_name, null: false # e.g., 'order_created', 'order_updated'
      t.jsonb :payload, null: false     # jsonb assumes PostgreSQL
      t.datetime :published_at, index: true # Null means not yet published
      t.text :error_message
      t.timestamps
      # Index for the publisher to find unpublished messages efficiently.
      t.index [:topic, :published_at, :id]
    end
  end
end
# app/models/outbox_message.rb
class OutboxMessage < ApplicationRecord
  # Basic validations
  validates :topic, :event_name, :payload, presence: true

  # Scopes for the publisher to use
  scope :unpublished, -> { where(published_at: nil) }
  scope :for_topic, ->(topic) { where(topic: topic) }

  # Mark this message as successfully published
  def mark_as_published!
    update!(published_at: Time.current)
  end

  # Mark this message as failed (for logging and retry logic)
  def mark_as_failed!(error)
    update!(error_message: "#{error.class}: #{error.message}")
  end
end
2. The Reliable Publisher Process
This is the crucial piece. The publisher runs in a separate process (e.g., a Rails runner, a Sidekiq job, or a dedicated binary). Its job is to poll the outbox_messages table and publish messages.
Important: This process must be idempotent and designed for at-least-once delivery.
# lib/outbox_publisher.rb
class OutboxPublisher
  BATCH_SIZE = 100
  POLL_INTERVAL = 5.seconds

  def run
    loop do
      unpublished_messages = OutboxMessage.unpublished.order(:id).limit(BATCH_SIZE).to_a
      if unpublished_messages.any?
        publish_batch(unpublished_messages)
      else
        sleep POLL_INTERVAL
      end
    end
  end

  private

  def publish_batch(messages)
    messages.each do |message|
      # Wrap each publish in its own transaction so a failure on one
      # doesn't block the entire batch.
      OutboxMessage.transaction do
        # `lock!` reloads the row with SELECT ... FOR UPDATE, so a second
        # publisher process working on the same message waits here until
        # this transaction finishes instead of publishing it concurrently.
        message.lock!
        # Skip if another process already published it while we were waiting.
        # (`next` leaves only the transaction block; the loop moves on.)
        next if message.published_at?
        begin
          # This is where you'd use your Kafka/RabbitMQ client,
          # e.g., `produce_sync(...)` on a WaterDrop producer.
          publish_to_broker(message.topic, message.event_name, message.payload)
          message.mark_as_published!
        rescue => e
          message.mark_as_failed!(e)
          Rails.logger.error("Failed to publish OutboxMessage #{message.id}: #{e.message}")
          # Don't re-raise. Continue with the next message in the batch.
        end
      end
    end
  end

  def publish_to_broker(topic, event_name, payload)
    # Your actual broker-specific implementation goes here.
    # This version assumes `$kafka_producer` is an `rdkafka-ruby` producer;
    # adapt it for `bunny` or another client.
    handle = $kafka_producer.produce(
      topic: topic,
      payload: { event: event_name, data: payload }.to_json,
      key: payload['id'].to_s # Often a good idea for Kafka partitioning
    )
    handle.wait # Block until the broker acknowledges (synchronous delivery)
  end
end
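You'd run this as a separate process: bin/rails runner -e production OutboxPublisher.new.run. Rails does not autoload lib/ by default, so either add it to the autoload paths or require the file first.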
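Why at-least-once rather than exactly-once? Publishing to the broker and marking the row as published are still two separate writes, so a crash between them leaves a row that looks unpublished even though the broker already has the message. The sketch below simulates that with plain in-memory objects. `Row`, `publish_pending`, and the `crash_after_publish` flag are all hypothetical names used for illustration, not part of the publisher above.

```ruby
# In-memory simulation; none of these names come from a real library.
Row = Struct.new(:id, :published_at)

# Publishes every pending row to `broker` (an Array standing in for the topic).
# When `crash_after_publish` is true, the process "dies" after the broker call
# but before the row is marked as published.
def publish_pending(rows, broker, crash_after_publish: false)
  rows.select { |row| row.published_at.nil? }.each do |row|
    broker << row.id
    raise "process killed" if crash_after_publish
    row.published_at = Time.now
  end
end

rows = [Row.new(1, nil)]
broker = []

begin
  publish_pending(rows, broker, crash_after_publish: true)
rescue RuntimeError
  # In production a supervisor would restart the publisher process here.
end

publish_pending(rows, broker) # the retry sends row 1 again

puts broker.inspect
# prints "[1, 1]"
```

The duplicate is the price of never losing an event, which is why deduplication has to happen on the consumer side.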
3. (Optional) The Sidekiq Scheduler
For better scalability than a simple poller, you can use a scheduled Sidekiq job.
# app/jobs/outbox_publish_job.rb
class OutboxPublishJob < ApplicationJob
  queue_as :low

  def perform
    unpublished_messages = OutboxMessage.unpublished.order(:id).limit(100).to_a
    # ... same logic as `publish_batch` above ...
    # Schedule itself to run again in 5 seconds if there might be more work.
    # Active Job uses `set(wait:)`; `perform_in` belongs to native Sidekiq workers.
    # Once the outbox is empty the chain stops, so a scheduler must enqueue the job again.
    self.class.set(wait: 5.seconds).perform_later if OutboxMessage.unpublished.exists?
  end
end
Leveling Up: Considerations for the Senior Dev
This is a solid foundation, but a true production system needs more.
- Idempotent Consumers: The Outbox Pattern guarantees at-least-once delivery. Your consumers must be idempotent. Use a unique ID (e.g., the outbox_messages.id or an event ID in the payload) to deduplicate processing on the consumer side.
- Monitoring and Alerting: Monitor the lag of your publisher. How many unpublished messages are there? Alert if the number grows continuously, indicating the publisher is failing or can't keep up.
- Batching & Performance: The example uses a batch size of 100. Tune this. For Kafka, you can use async producers and flush the producer after each batch for much higher throughput.
- Schema Evolution: The payload is JSON. Use a schema format backed by a schema registry (e.g., Apache Avro) or at least include a payload_version field to manage backward-compatible changes to your event structures.
- Database Load: The publisher adds read load to your primary database. In extreme cases, you might need to read from a replica, but be mindful of replication lag causing delayed event publishing.
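The first point is the one most often skipped, so here is a minimal sketch of consumer-side deduplication. `IdempotentConsumer` is a hypothetical wrapper, the `event_id` field is an assumed part of the message, and the in-memory `Set` stands in for a durable store such as a table with a unique index on the event ID.

```ruby
require "set"

# Hypothetical consumer-side wrapper. The Set stands in for a durable store,
# such as a table with a unique index on the event ID.
class IdempotentConsumer
  def initialize(&handler)
    @handler = handler
    @seen_event_ids = Set.new
  end

  # Runs the handler once per event ID. Returns true if the event was handled,
  # false if it was a duplicate and skipped.
  def consume(event)
    event_id = event.fetch("event_id")
    return false if @seen_event_ids.include?(event_id)

    @handler.call(event)
    @seen_event_ids.add(event_id) # recorded only after the handler succeeds
    true
  end
end

shipments = []
consumer = IdempotentConsumer.new { |event| shipments << event["data"]["id"] }

event = { "event_id" => "evt-1", "event" => "order_created", "data" => { "id" => 42 } }
consumer.consume(event) # handled
consumer.consume(event) # redelivered by the broker, skipped

puts shipments.inspect
# prints "[42]"
```

In a real consumer, the handler's side effect and the record of the event ID should be committed in the same database transaction. Otherwise the consumer reintroduces the dual-write problem on its own side.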
Conclusion: Why Bother?
The Outbox Pattern requires more setup than a naive EventPublisher.publish call. So, is it worth it?
Absolutely. It trades a bit of initial complexity for a massive gain in data consistency and system reliability. It moves the problem of message delivery from the critical path of your web request to a background process that can be monitored, retried, and scaled independently. It's the difference between hoping your events get through and knowing they will.
It's a pattern that speaks to the maturity of your engineering team and the robustness of your architecture.
Alex Aslam | Sciencx (2025-08-26T19:03:36+00:00) Taming the Beast: The Outbox Pattern for Reliable Event Publishing. Retrieved from https://www.scien.cx/2025/08/26/taming-the-beast-the-outbox-pattern-for-reliable-event-publishing/