Why Kafka jdbc connect insert data as BLOB instead of varchar

雨燕双飞 提交于 2020-01-06 07:01:18

问题


I am using a Java producer to insert data top my Kafka topic. Then I use Kafka jdbc connect to insert data into my Oracle table. Below is my producer code.

package producer.serialized.avro;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;


public class Sender4 {

    public static void main(String[] args) {

        String flightSchema = "{\"type\":\"record\"," + "\"name\":\"Flight\","

                + "\"fields\":[{\"name\":\"flight_id\",\"type\":\"string\"},{\"name\":\"flight_to\",\"type\":\"string\"},{\"name\":\"flight_from\",\"type\":\"string\"}]}";                

        Properties props = new Properties();

        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.0.1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,io.confluent.kafka.serializers.KafkaAvroSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,io.confluent.kafka.serializers.KafkaAvroSerializer.class);    
        props.put("schema.registry.url", "http://192.168.0.1:8081");            

        KafkaProducer producer = new KafkaProducer(props);    

        Schema.Parser parser = new Schema.Parser();

        Schema schema = parser.parse(flightSchema);            

        GenericRecord avroRecord = new GenericData.Record(schema);

        avroRecord.put("flight_id", "myflight");
        avroRecord.put("flight_to", "QWE");
        avroRecord.put("flight_from", "RTY");    

        ProducerRecord<String, GenericRecord> record = new ProducerRecord<>("topic9",avroRecord);

        producer.send(record);
    }
}

Below is my Kafka connect properties

name=test-sink-6
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=topic9
connection.url=jdbc:oracle:thin:@192.168.0.1:1521:usera
connection.user=usera
connection.password=usera
auto.create=true
table.name.format=FLIGHTS4
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://192.168.0.1:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://192.168.0.1:8081

From my schema, I am expecting the values inserted to my Oracle table to be varchar2. I have created a table having 3 varchar2 columns. When i started my connector, nothing got inserted. Then i deleted the table and ran the connector with table auto create mode on. That time, the table got created and values got inserted. But the problem is, the column data type is CLOB. I want it to be varchar2 since it use less data.

Why is this happening and how can i fix this? Thank you.


回答1:


Looks like Kafka's String is mapped to Oracle's NCLOB:

<table border="1">
<tr>
<th>Schema Type</th><th>MySQL</th><th>Oracle</th><th>PostgreSQL</th><th>SQLite</th>
</tr>
<tr>
<td>INT8</td><td>TINYINT</td><td>NUMBER(3,0)</td><td>SMALLINT</td><td>NUMERIC</td>
</tr>
<tr>
<td>INT16</td><td>SMALLINT</td><td>NUMBER(5,0)</td><td>SMALLINT</td><td>NUMERIC</td>
</tr>
<tr>
<td>INT32</td><td>INT</td><td>NUMBER(10,0)</td><td>INT</td><td>NUMERIC</td>
</tr>
<tr>
<td>INT64</td><td>BIGINT</td><td>NUMBER(19,0)</td><td>BIGINT</td><td>NUMERIC</td>
</tr>
<tr>
<td>FLOAT32</td><td>FLOAT</td><td>BINARY_FLOAT</td><td>REAL</td><td>REAL</td>
</tr>
<tr>
<td>FLOAT64</td><td>DOUBLE</td><td>BINARY_DOUBLE</td><td>DOUBLE PRECISION</td><td>REAL</td>
</tr>
<tr>
<td>BOOLEAN</td><td>TINYINT</td><td>NUMBER(1,0)</td><td>BOOLEAN</td><td>NUMERIC</td>
</tr>
<tr>
<td>STRING</td><td>VARCHAR(256)</td><td>NCLOB</td><td>TEXT</td><td>TEXT</td>
</tr>
<tr>
<td>BYTES</td><td>VARBINARY(1024)</td><td>BLOB</td><td>BYTEA</td><td>BLOB</td>
</tr>
<tr>
<td>'Decimal'</td><td>DECIMAL(65,s)</td><td>NUMBER(*,s)</td><td>DECIMAL</td><td>NUMERIC</td>
</tr>
<tr>
<td>'Date'</td><td>DATE</td><td>DATE</td><td>DATE</td><td>NUMERIC</td>
</tr>
<tr>
<td>'Time'</td><td>TIME(3)</td><td>DATE</td><td>TIME</td><td>NUMERIC</td>
</tr>
<tr>
<td>'Timestamp'</td><td>TIMESTAMP(3)</td><td>TIMESTAMP</td><td>TIMESTAMP</td><td>NUMERIC</td>
</tr>
</table>

Source: https://www.ibm.com/support/knowledgecenter/en/SSPT3X_4.2.5/com.ibm.swg.im.infosphere.biginsights.admin.doc/doc/admin_kafka_jdbc_sink.html

https://docs.confluent.io/current/connect/connect-jdbc/docs/sink_connector.html

UPDATE

OracleDialect class (https://github.com/confluentinc/kafka-connect-jdbc/blob/master/src/main/java/io/confluent/connect/jdbc/sink/dialect/OracleDialect.java) has hardcoded CLOB value and simply extend it with your own class and change that mapping will not help as type of dialect is defined in static method in JdbcSinkTask (https://github.com/confluentinc/kafka-connect-jdbc/blob/master/src/main/java/io/confluent/connect/jdbc/sink/JdbcSinkTask.java)

final DbDialect dbDialect = DbDialect.fromConnectionString(config.connectionUrl);


来源:https://stackoverflow.com/questions/48224386/why-kafka-jdbc-connect-insert-data-as-blob-instead-of-varchar

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!