Announcing the Debezium-Compatible Kafka Target for Supermetal

Drop-in Debezium replacement for Kafka CDC. Same messages, same schemas.

Wasif Aleem, Co-Founder

We've tracked change data capture since Bottled Water was first introduced, following the development of Debezium on Kafka Connect. Debezium has become synonymous with CDC and powers critical real-time data infrastructure at thousands of companies. It's also become the core engine in other open source projects like Apache Flink CDC.

Debezium's extensibility has made it the foundation for CDC everywhere: it runs on Kafka Connect, as Debezium Server, or as an embedded engine. But we've also seen the operational reality firsthand.

Production Reality

Running Debezium in production requires ongoing attention, and the infrastructure stack demands specialized expertise: Kafka brokers, Connect workers, and schema registries all need careful management. High-volume pipelines need continuous tuning to keep up with change event spikes.

Teams operating Debezium at scale often dedicate multiple engineers solely to maintaining these pipelines. Initial snapshots of large tables can take hours or days. Schema changes require careful coordination. When connectors fail, recovery typically means manual intervention.

This is the operational reality of CDC at scale.

Supermetal as an Alternative

Supermetal now outputs to Kafka as a drop-in replacement for Debezium. Your existing consumers work unchanged. Same messages, same schemas, minimal to zero migration effort.

Two message formats:

  • Debezium: Fully compatible envelope format with before/after images and source metadata
  • Supermetal: Compact upsert format with minimal post-processing

Debezium Compatible Format

{
    "payload": {
        "before": null,
        "after": {
            "id": 1,
            "first_name": "Anne",
            "last_name": "Kretchmar",
            "email": "[email protected]"
        },
        "source": {
            "version": "3.4.1.Final",
            "connector": "postgresql",
            "name": "PostgreSQL_server",
            "ts_ms": 1559033904863,
            "ts_us": 1559033904863123,
            "ts_ns": 1559033904863123000,
            "snapshot": true,
            "db": "postgres",
            "sequence": "[\"24023119\",\"24023128\"]",
            "schema": "public",
            "table": "customers",
            "txId": 555,
            "lsn": 24023128,
            "xmin": null
        },
        "op": "c",
        "ts_ms": 1559033904863,
        "ts_us": 1559033904863841,
        "ts_ns": 1559033904863841257
    }
}
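
For readers less familiar with the envelope, here is a minimal consumer-side sketch in Python of how a message like the one above is typically applied. It assumes the key column is `id` and that messages arrive as JSON strings with a `payload` wrapper, as in the sample. The function names `apply_envelope` and `capture_lag_ms` are hypothetical and are not part of Debezium or Supermetal.

```python
import json

# Debezium operation codes: c = create, u = update, r = snapshot read, d = delete.
UPSERT_OPS = {"c", "u", "r"}


def apply_envelope(table: dict, message, key_field: str = "id") -> None:
    """Apply one Debezium-style change event to an in-memory table.

    `table` maps key values to the latest row image. `key_field` is an
    assumption for this sketch; real consumers usually take the key from
    the Kafka message key.
    """
    if message is None:
        # A null value is a tombstone that follows a delete; nothing to apply.
        return
    payload = json.loads(message)["payload"]
    op = payload["op"]
    if op in UPSERT_OPS:
        row = payload["after"]
        table[row[key_field]] = row
    elif op == "d":
        row = payload["before"]
        table.pop(row[key_field], None)
    # Other operation codes (for example truncate) are ignored in this sketch.


def capture_lag_ms(message: str) -> int:
    """Milliseconds between the source change and the time the event was processed."""
    payload = json.loads(message)["payload"]
    return payload["ts_ms"] - payload["source"]["ts_ms"]
```

Because the envelope is the same, a handler written like this for Debezium output should not need changes when the producer is swapped.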

Supermetal Format

{
    "id": 1,
    "first_name": "Anne",
    "last_name": "Kretchmar",
    "email": "[email protected]",
    "_sm_version": 1559033904863841257,
    "_sm_deleted": false
}
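
To illustrate what "minimal post-processing" can look like, here is a hedged Python sketch of a last-write-wins merge over rows in the compact format. It assumes that `_sm_version` increases with every change to a row and that `id` is the key; both are assumptions made for illustration, and the helper names `apply_upsert` and `live_rows` are hypothetical.

```python
def apply_upsert(table: dict, row: dict, key_field: str = "id") -> None:
    """Keep only the newest version of each row, including delete markers.

    Assumes a higher `_sm_version` always means a newer change, so stale or
    duplicate deliveries can be dropped safely.
    """
    current = table.get(row[key_field])
    if current is not None and current["_sm_version"] >= row["_sm_version"]:
        return  # older or duplicate message, ignore it
    table[row[key_field]] = row


def live_rows(table: dict) -> list:
    """Rows that have not been deleted at the source."""
    return [r for r in table.values() if not r["_sm_deleted"]]
```

Delete markers stay in `table` so that a late, older version of the same row cannot resurrect it; `live_rows` filters them out when reading.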

Correctness

Supermetal prioritizes correctness above all. Our testing includes end-to-end type fuzzing from source to target database and sqllogictest-style test suites. More details will be published soon.

We run source databases through Supermetal into a target database, then assert the replicated data matches exactly. For Kafka, we run Debezium alongside Supermetal and compare the output. The results are often byte-for-byte identical and always semantically equivalent, accounting for minor serialization differences.
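
The distinction between byte-for-byte identical and semantically equivalent can be sketched as follows. This is an illustration only, not the actual test harness: it treats two JSON messages as equivalent when they parse to the same value, so differences in key order or whitespace do not count. The function name `compare_messages` is hypothetical.

```python
import json


def compare_messages(a: bytes, b: bytes) -> str:
    """Classify two serialized JSON messages.

    Returns "identical" when the bytes match, "equivalent" when only the
    serialization differs (key order, whitespace), and "different" otherwise.
    """
    if a == b:
        return "identical"
    if json.loads(a) == json.loads(b):
        return "equivalent"
    return "different"
```

For example, `{"id": 1, "op": "c"}` and `{"op":"c","id":1}` differ as bytes but describe the same event, so they are classified as equivalent.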

Performance

Supermetal separates compute from IO. Per-message serialization to JSON or Avro is CPU-intensive. By moving serialization batches to a work-stealing pool, we use all available CPU cores for compute while IO proceeds independently.
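
The general shape of that split can be sketched in Python. This is not Supermetal's implementation: it substitutes the standard library's `ThreadPoolExecutor`, which is a plain thread pool rather than a work-stealing one, and Python's GIL limits how much CPU parallelism it actually delivers. The names `serialize_batch` and `run_pipeline` are hypothetical; the point is only that batches are serialized in a pool while a separate stage writes completed batches in order.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def serialize_batch(rows: list) -> list:
    """Compute stage: encode every row of a batch as compact JSON bytes."""
    return [json.dumps(row, separators=(",", ":")).encode() for row in rows]


def run_pipeline(batches: list, write, workers: int = 4) -> None:
    """Serialize batches in a pool and hand results to `write` in order.

    Executor.map submits every batch up front and yields results in
    submission order, so later batches are being serialized while the
    caller is still writing earlier ones, and event order is preserved.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for encoded in pool.map(serialize_batch, batches):
            write(encoded)  # IO stage, for example a Kafka producer send
```

Here `write` stands in for whatever performs the IO; passing `out.append` for a list `out` is enough to try the sketch locally.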

A single Rust binary with predictable memory usage. No JVM, no GC pauses, no snapshot locks, no connector state in Kafka topics. Initial syncs that take hours or days with Debezium complete in a fraction of the time.

Snapshot Throughput
10 tables × 10M rows each · Peak 1.9M rows/sec · Duration 1:50 · Rows 108M

CDC Latency Under Load
Row size 436B · p100 latency <1s up to 15K rows/sec

[Chart: Target Load, WAL Throughput, and p100 Latency]

Sub-second latency up to 15K rows/sec. Above 20K/sec, single-threaded Postgres logical decoding saturates. The Breakdown view shows Read Latency is 98% of total at peak load.

Getting Started

With a POSIX shell:

curl -fsSL https://trial.supermetal.io/install.sh | sh

With PowerShell:

iwr -useb https://trial.supermetal.io/install.ps1 | iex

Questions? Check out our docs or reach out to us.