Debezium from MySQL to Postgres with JDBC Sink - change of transforms.route.replacement gives a SinkRecordField error

最后都变了 - Submitted 2020-01-06 08:05:23

Question


I am using this debezium-examples

source.json

{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "tasks.max": "1",
    "database.hostname": "mysql",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.server.id": "184054",
    "database.server.name": "dbserver1",
    "database.whitelist": "inventory",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory",
    "transforms": "route",
    "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.route.regex": "([^.]+)\\.([^.]+)\\.([^.]+)",
    "transforms.route.replacement": "$3"
  }
}
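As a sanity check, the route transform can be traced by hand. Debezium names topics `<server>.<database>.<table>`, and the Java replacement `$3` corresponds to `\3` in Python (a quick sketch, not part of the original setup; topic names are illustrative):

```python
import re

# Debezium's default topic name: <server>.<database>.<table>
pattern = r"([^.]+)\.([^.]+)\.([^.]+)"

# Java replacement "$3" == Python r"\3": keep only the table name,
# so each source table still gets its own topic.
for topic in ["dbserver1.inventory.customers", "dbserver1.inventory.addresses"]:
    print(re.sub(pattern, r"\3", topic))  # prints "customers", then "addresses"
```

With this replacement, topics stay one-per-table, which is what the JDBC sink expects.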

jdbc-sink.json

{
  "name": "jdbc-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "1",
    "topics": "customers",
    "connection.url": "jdbc:postgresql://postgres:5432/inventory?user=postgresuser&password=postgrespw",
    "transforms": "unwrap",
    "transforms.unwrap.type": "io.debezium.transforms.UnwrapFromEnvelope",
    "auto.create": "true",
    "insert.mode": "upsert",
    "pk.fields": "id",
    "pk.mode": "record_value"
  }
}

It works fine. But when I make the changes described in the following scenario, it gives me a 'SinkRecordField' error.

Scenario

I changed this property in the source config:

    "transforms.route.replacement": "my-$2"

Now it creates the topic in Kafka as follows:

my-inventory
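Tracing the new replacement by hand (a quick sketch in Python; `$2` in Java regex replacement is `\2` in Python) shows why this change is different in kind: `my-$2` discards the table name, so every captured table in the database collapses onto the single topic my-inventory:

```python
import re

pattern = r"([^.]+)\.([^.]+)\.([^.]+)"

# "my-$2" keeps only the database name: every table maps to the same topic
for topic in ["dbserver1.inventory.customers", "dbserver1.inventory.addresses"]:
    print(re.sub(pattern, r"my-\2", topic))  # prints "my-inventory" both times
```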

When I specify topics=my-inventory in jdbc-sink, it gives me the following exception from [io.confluent.connect.jdbc.sink.DbStructure]:

connect_1    | 2019-01-29 10:34:32,218 INFO   ||  Unable to find fields [SinkRecordField{schema=Schema{STRING}, name='email', isPrimaryKey=false}, SinkRecordField{schema=Schema{STRING}, name='first_name', isPrimaryKey=false}, SinkRecordField{schema=Schema{STRING}, name='last_name', isPrimaryKey=false}] among column names [street, customer_id, city, state, id, type, zip]   [io.confluent.connect.jdbc.sink.DbStructure]
connect_1    | 2019-01-29 10:34:32,220 ERROR  ||  WorkerSinkTask{id=jdbc-sink-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted.   [org.apache.kafka.connect.runtime.WorkerSinkTask]
connect_1    | org.apache.kafka.connect.errors.ConnectException: Cannot ALTER to add missing field SinkRecordField{schema=Schema{STRING}, name='email', isPrimaryKey=false}, as it is not optional and does not have a default value
connect_1    |  at io.confluent.connect.jdbc.sink.DbStructure.amendIfNecessary(DbStructure.java:133)

Note: In the DB it creates a table named 'my-inventory'.


Answer 1:


The JDBC sink expects one table per topic, with a single schema (column names × types) per topic as well.

Your regex routing on the Debezium/source side is effectively dumping every table in the inventory database (possibly including some system tables, although I don't recall that being the default in the config) into the "my-inventory" topic.

Therefore, as soon as more than one table is captured in that topic, you run into trouble. That is exactly what the log shows: the target table was auto-created from one table's columns (street, city, state, zip, ... look like addresses), and when records from customers arrive on the same topic, the sink tries to ALTER the table to add email, first_name and last_name, which fails because those columns are non-optional with no default.
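If the goal is just a common prefix rather than merging tables, one option is to keep `$3` in the replacement so each table still maps to its own topic (a sketch, not from the original example; the sink's topics list must then be updated to match). In source.json:

    "transforms.route.replacement": "my-$3"

This routes dbserver1.inventory.customers to my-customers, dbserver1.inventory.addresses to my-addresses, and so on, and the JDBC sink subscribes per table, e.g. in jdbc-sink.json:

    "topics": "my-customers"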



Source: https://stackoverflow.com/questions/54420019/debezium-from-mysql-to-postgres-with-jdbc-sink-change-of-transforms-route-repl
