文章目录
启动Pulsar
- 从官网下载tar.gz安装包
下载地址:https://archive.apache.org/dist/pulsar/pulsar-2.4.0/apache-pulsar-2.4.0-bin.tar.gz
$ wget https://archive.apache.org/dist/pulsar/pulsar-2.4.0/apache-pulsar-2.4.0-bin.tar.gz
- 解压
$ tar zxvf apache-pulsar-2.4.0-bin.tar.gz
- 创建connectors文件夹
$ cd apache-pulsar-2.4.0
$ mkdir connectors
- 启动pulsar
$ bin/pulsar standalone
OR
$ bin/pulsar standalone -nss
创建pulsar-io-debezium-sqlserver-1.0.nar文件
创建maven项目
添加依赖
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.pulsar</groupId>
<artifactId>pulsar-io</artifactId>
<version>2.4.0</version>
</parent>
<artifactId>pulsar-sqlserver</artifactId>
<name>Pulsar</name>
<dependencies>
<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>pulsar-io-debezium-core</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>io.debezium</groupId>
<artifactId>debezium-connector-sqlserver</artifactId>
<version>${debezium.version}</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-nar-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
编写DebeziumSqlServerSource类
public class DebeziumSqlServerSource extends DebeziumSource {
static private final String DEFAULT_TASK = "io.debezium.connector.sqlserver.SqlServerConnectorTask";
public void setDbConnectorTask(Map<String, Object> config) throws Exception {
throwExceptionIfConfigNotMatch(config, TaskConfig.TASK_CLASS_CONFIG, DEFAULT_TASK);
}
}
编写debezium-sqlserver-source-config.yaml文件
- 文件内容
tenant: "test"
namespace: "test-namespace"
name: "debezium-sqlserver-source"
topicName: "debezium-sqlserver-topic"
archive: "connectors/pulsar-io-debezium-sqlserver-1.0-SNAPSHOT.nar"
parallelism: 1
configs:
database.hostname: "localhost"
database.port: "1433"
database.user: "sa"
database.password: "Password!"
database.dbname: "testDB"
"database.server.name": "fullfillment"
table.whitelist: "dbo.customers"
pulsar.service.url: "pulsar://127.0.0.1:6650"
database.history.pulsar.topic: "sqlserver-history-topic"
offset.storage.topic: "sqlserver-offset-topic"
- 文件存放路径
编写pulsar-io.yaml文件
- 文件内容
name: debezium-sqlserver
description: Debezium SqlServer Source
sourceClass: com.pulsar.io.debezium.sqlserver.DebeziumSqlServerSource
- 文件存放路径
创建pulsar-io-debezium-sqlserver-1.0.nar文件
- 输入maven打包命令:
mvn clean install
文件打包完成后存放在target目录下。
上传文件至服务器
使用远程连接工具(Xftp或WinSCP)将打包好的nar文件上传至Apache Pulsar安装目录下的connectors目录下。
启动Debezium Connector
编写debezium-sqlserver-source-config.yaml文件
- 文件内容
tenant: "public"
namespace: "default"
name: "debezium-sqlserver-source"
topicName: "debezium-sqlserver-topic"
archive: "connectors/pulsar-io-debezium-sqlserver-1.0-SNAPSHOT.nar" # nar文件路径
parallelism: 1
configs:
database.hostname: "127.0.0.1" # 数据库地址
database.port: "1433" # 数据库端口
database.user: "username" # 用户名
database.password: "password" # 密码
database.server.name: "servername" # 服务名称,可以任意命名
database.dbname: "dbname" # 需要捕获的数据库名城
table.whitelist: "dbo.cm_logs" # 需要捕获的表名称,schemaName.tableName
database.history: "org.apache.pulsar.io.debezium.PulsarDatabaseHistory"
database.history.pulsar.topic: "mssqlserver6-history-topic"
database.history.pulsar.service.url: "pulsar://127.0.0.1:6650"
## KEY_CONVERTER_CLASS_CONFIG, VALUE_CONVERTER_CLASS_CONFIG
key.converter: "org.apache.kafka.connect.json.JsonConverter"
value.converter: "org.apache.kafka.connect.json.JsonConverter"
## PULSAR_SERVICE_URL_CONFIG
pulsar.service.url: "pulsar://127.0.0.1:6650"
## OFFSET_STORAGE_TOPIC_CONFIG
offset.storage.topic: "offset-topic"
configs详细配置信息请参考:Debezium Connector for SQL Server
- 文件存放路劲
启动SQLServer Source 并订阅Topic
启动SQLServer Source
在服务器中输入:
$ bin/pulsar-admin source localrun --source-config-file debezium-mysql-source-config.yaml
订阅Topic
PulsarClient client = PulsarClient.builder().serviceUrl("pulsar://127.0.0.1:6650").build();
Consumer consumer = client.newConsumer(Schema.KeyValue(Schema.BYTES, Schema.BYTES))
.topic("public/default/bigdata.dbo.cm_logs")
.subscriptionName("sub-products")
.subscribe();
while (true) {
try {
Message msg = consumer.receive();
KeyValue<byte[], byte[]> keyValues = Schema.KeyValue(Schema.BYTES, Schema.BYTES).decode(msg.getData());
JSONSchema<Map> jsonSchema = JSONSchema.of(SchemaDefinition.<Map>builder().withPojo(Map.class).build());
Map keyResult = jsonSchema.decode(keyValues.getKey());
Map valueResult = jsonSchema.decode(keyValues.getValue());
// TODO("get data")
System.out.println(keyResult);
System.out.println(valueResult);
consumer.acknowledge(msg);
} catch (Exception e) {
consumer.close();
e.printStackTrace();
}
}
问题回顾
- 问题一
解决方法
开启SQLServer的CDC。(2008及之后的版本需要企业版)
SQL代码如下:
USE [databaseName]
GO
EXEC sys.sp_cdc_enable_db
GO
-- 以下操作为需要捕获哪张表的数据就将表添加到CDC的作业任务中
USE [databaseName]
GO
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'tableName',
@role_name = N'roleName'
GO
详细配置信息请参考:sys.sp_cdc_enable_table (Transact-SQL)
参考文档
Apache Pulsar:http://pulsar.apache.org/docs/en/io-cdc-debezium/
Apache Pulsar Git: https://github.com/apache/pulsar/tree/master/pulsar-io/debezium/mysql
debezium:https://debezium.io/docs/connectors/sqlserver/#setting-up-sqlserver
Microsoft:https://docs.microsoft.com/zh-cn/sql/relational-databases/system-stored-procedures/sys-sp-cdc-enable-table-transact-sql?view=sql-server-2017
来源:CSDN
作者:xiangziO_O
链接:https://blog.csdn.net/xiangziO_O/article/details/96348068