Announcing the Debezium-Compatible Kafka Target for Supermetal
Drop-in Debezium replacement for Kafka CDC. Same messages, same schemas.
We've tracked change data capture since Bottled Water was first introduced, following the development of Debezium on Kafka Connect. Debezium has become synonymous with CDC and powers critical real-time data infrastructure at thousands of companies. It's also become the core engine in other open source projects like Apache Flink CDC.
Debezium's extensibility has made it the foundation for CDC everywhere: it runs on Kafka Connect, as Debezium Server, or as an embedded engine. But we've also seen the operational reality firsthand.
Production Reality
Running Debezium in production requires ongoing attention. The infrastructure stack demands specialized expertise: Kafka brokers, Connect workers, and schema registries all need careful management. High-volume pipelines need continuous tuning to keep up with change event spikes.
Teams operating Debezium at scale often dedicate multiple engineers solely to maintaining these pipelines. Initial snapshots of large tables can take hours or days. Schema changes require careful coordination. When connectors fail, recovery typically means manual intervention.
This is the operational reality of CDC at scale.
Supermetal as an Alternative
Supermetal now outputs to Kafka as a drop-in replacement for Debezium. Your existing consumers work unchanged. Same messages, same schemas, minimal to zero migration effort.
Two message formats:
- Debezium: Fully compatible envelope format with before/after images and source metadata
- Supermetal: Compact upsert format with minimal post-processing
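To make the difference between the two formats concrete, here is a minimal consumer-side sketch that applies each kind of message to an in-memory table. It uses the field names from the sample messages in this post. The function names, the `key` parameter, and the reading of `_sm_deleted` as a delete marker are assumptions for illustration, not a documented Supermetal API.

```python
def apply_debezium(table, event, key="id"):
    """Apply one Debezium envelope to a dict keyed by primary key."""
    payload = event["payload"]
    op = payload["op"]
    if op in ("c", "u", "r"):  # create, update, snapshot read
        row = payload["after"]
        table[row[key]] = row
    elif op == "d":  # delete: assumes the before image carries the key
        table.pop(payload["before"][key], None)
    # Other op codes (for example truncate) are ignored in this sketch.


def apply_supermetal(table, record, key="id"):
    """Apply one compact record; assumes _sm_deleted=True means delete."""
    if record["_sm_deleted"]:
        table.pop(record[key], None)
    else:
        table[record[key]] = record


debezium_table, supermetal_table = {}, {}
apply_debezium(
    debezium_table,
    {"payload": {"op": "c", "before": None,
                 "after": {"id": 1, "first_name": "Anne"}}},
)
apply_supermetal(
    supermetal_table,
    {"id": 1, "first_name": "Anne",
     "_sm_version": 1559033904863841257, "_sm_deleted": False},
)
```

The envelope consumer has to branch on `op` and pick the right row image, while the compact record is already in upsert shape.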
Capabilities:
- JSON and Avro serialization with Confluent Schema Registry support
- Support for Debezium configuration options including tombstones, headers, decimal/binary/time precision handling, and transaction metadata
- Optional transactional writes aligned to source database transaction boundaries
Debezium Compatible Format
{
  "payload": {
    "before": null,
    "after": {
      "id": 1,
      "first_name": "Anne",
      "last_name": "Kretchmar",
      "email": "[email protected]"
    },
    "source": {
      "version": "3.4.1.Final",
      "connector": "postgresql",
      "name": "PostgreSQL_server",
      "ts_ms": 1559033904863,
      "ts_us": 1559033904863123,
      "ts_ns": 1559033904863123000,
      "snapshot": true,
      "db": "postgres",
      "sequence": "[\"24023119\",\"24023128\"]",
      "schema": "public",
      "table": "customers",
      "txId": 555,
      "lsn": 24023128,
      "xmin": null
    },
    "op": "c",
    "ts_ms": 1559033904863,
    "ts_us": 1559033904863841,
    "ts_ns": 1559033904863841257
  }
}
Supermetal Format
{
  "id": 1,
  "first_name": "Anne",
  "last_name": "Kretchmar",
  "email": "[email protected]",
  "_sm_version": 1559033904863841257,
  "_sm_deleted": false
}
Correctness
Supermetal prioritizes correctness above all. Our testing includes end-to-end type fuzzing from source to target database and sqllogictest-style test suites. We will publish more details soon.
We run source databases through Supermetal into a target database, then assert the replicated data matches exactly. For Kafka, we run Debezium alongside Supermetal and compare the output. The results are often byte-for-byte identical and always semantically equivalent, accounting for minor serialization differences.
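The distinction between byte-for-byte identical and semantically equivalent can be illustrated with a small check. This is only a sketch of the idea, not Supermetal's actual test harness. The helper name `semantically_equal` and the shortened sample messages are made up for the example.

```python
import json


def semantically_equal(a: bytes, b: bytes) -> bool:
    """True when two JSON messages decode to the same value.

    Key order and whitespace are serialization details and are ignored.
    """
    return json.loads(a) == json.loads(b)


# Two encodings of the same change event: different bytes, same content.
first = b'{"op": "c", "after": {"id": 1, "first_name": "Anne"}}'
second = b'{"after":{"first_name":"Anne","id":1},"op":"c"}'

byte_identical = first == second
equivalent = semantically_equal(first, second)
print(byte_identical, equivalent)  # False True
```

For Avro output, the same approach would compare decoded records rather than raw bytes.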
Performance
Supermetal separates compute from IO. Per-message serialization to JSON or Avro is CPU-intensive. By moving serialization batches to a work-stealing pool, we use all available CPU cores for compute while IO proceeds independently.
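The shape of that split can be sketched in a few lines. This is not Supermetal's implementation: Python's `ThreadPoolExecutor` is a shared-queue pool rather than a work-stealing one, and CPython's GIL limits how much pure-Python serialization actually runs in parallel. The function names and the `write` callback are hypothetical. The point is only the structure, where batches are serialized on a pool and a separate step performs the writes in order.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def serialize_batch(rows):
    """Compute step: encode every row in the batch to compact JSON bytes."""
    return [json.dumps(row, separators=(",", ":")).encode("utf-8")
            for row in rows]


def run_pipeline(batches, write, workers=4):
    """Serialize batches on a pool while the calling thread performs IO.

    Futures are consumed in submission order, so output order matches
    input order even if a later batch finishes serializing first.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(serialize_batch, batch) for batch in batches]
        for future in futures:
            for message in future.result():
                write(message)  # IO step, e.g. handing off to a producer


sent = []
run_pipeline([[{"id": 1}, {"id": 2}], [{"id": 3}]], sent.append)
```

Here `sent.append` stands in for whatever performs the real network write.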
A single Rust binary with predictable memory usage. No JVM, no GC pauses, no snapshot locks, no connector state in Kafka topics. Initial syncs that take hours or days with Debezium complete in a fraction of the time.
Latency stays sub-second up to 15K rows/sec. Above 20K rows/sec, single-threaded Postgres logical decoding saturates: the Breakdown view shows Read Latency is 98% of total at peak load.
Getting Started
curl -fsSL https://trial.supermetal.io/install.sh | sh
iwr -useb https://trial.supermetal.io/install.ps1 | iex