社区投稿 | DBLE和Mycat跨分片查询结果不一致案例分析

落爺英雄遲暮 提交于 2019-12-02 00:40:04

1.背景

某一零售业后端使用了分布式中间件+MySQL数据库作为后端存储。但是因为历史问题存在两种分布式中间件,分别是Mycat和DBLE,共用一组后端MySQL实例。分片规则以及后端数据完全一致。最近碰到了一个比较有意思的场景,财务结算单来往明细和业务来往单据的关联查询。一条跨节点join查询在DBLE、Mycat的查询得到的结果不一致。究竟谁对谁错?

2.DBLE项目介绍

DBLE官方网站: https://opensource.actionsky.com 可以详细了解DBLE的背景和应用场景,本文不涉及到的细节都可在官方文档获得更细节都信息;对于刚了解到同学,可以以本文为快速入门基础

DBLE 官方项目: https://github.com/actiontech/dble 如对源码有兴趣或者需要定制的功能的可以通过源码编译安装

DBLE 下载地址: https://github.com/actiontech/dble/releases

DBLE 官方社区交流:669663113

3.环境准备

在虚拟机搭建类似架构,模拟场景,比较Mycat-DBLE在跨节点join上的异同点。

3.1测试架构

测试环境架构比较简单,DBLE与Mycat共用数据库。

3.2测试软件版本
软件名称 软件版本 端口 管理端口
DBLE DBLE-2.18.10.1-cb392c3-20181106093917 3309 3310
Mycat Mycat-1.6-RELEASE-20161028204710 8066 9066
MySQL 5.7.21-log 3306
3.3表结构

结算单来往明细表

CREATE TABLE `t_bl_detail` (   `unit_num` int(11) DEFAULT NULL,   `tenantid` int(11) DEFAULT NULL,   `detail_num` int(11) DEFAULT NULL,   `balance_date` datetime DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8; 

业务来往单据表

CREATE TABLE `t_bl_super_detail` (   `unit_num` int(11) DEFAULT NULL,   `sup_id` int(11) DEFAULT NULL,   `tenantid` int(11) DEFAULT NULL,   `bl_unit_num` int(11) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8; 
3.4分片规则配置

配置表t_bl_detail、t_bl_super_detail,使用取模算法,数据分布在db1-db4四个database中。

3.41schema配置

DBLE:

   <schema name="testdb" sqlMaxLimit="100" dataNode="dn01">         <table name="t_bl_detail" rule="mod4Series" dataNode="dn01,dn02,dn03,dn04"></table>         <table name="t_bl_super_detail" rule="mod4Series" dataNode="dn01,dn02,dn03,dn04"></table>     </schema>     <dataNode name="dn01" dataHost="group1" database="db1"></dataNode>     <dataNode name="dn02" dataHost="group2" database="db2"></dataNode>     <dataNode name="dn03" dataHost="group1" database="db3"></dataNode>     <dataNode name="dn04" dataHost="group2" database="db4"></dataNode>     <dataHost name="group1" maxCon="100" minCon="5" balance="0" switchType="-1">         <heartbeat>select user()</heartbeat>         <writeHost host="mysql-w9zhkr" url="1.1.1.26:3306" user="ushard" password="VRDkwsUlq++O3HeamsDcuW/2K22si3RIhcTpAdjalbkiinNAeUUDWk11ttls1Z3PXY5TxGJToeK4LsJE3+k0Wg==" id="mysql-w9zhkr" usingDecrypt="1"></writeHost>     </dataHost>     <dataHost name="group2" maxCon="100" minCon="5" balance="0" switchType="-1">         <heartbeat>select user()</heartbeat>         <writeHost host="mysql-ci2va0" url="1.1.1.27:3306" user="ushard" password="O/iCi0F+9ts5YR78ZcQnKzpb5GqG7xSUzJcO44yZ3BhNkT5E7aiZakmz1tbKIQylnqh20vu5Z9egXBUC3GOKow==" id="mysql-ci2va0" usingDecrypt="1"></writeHost>     </dataHost> 

Mycat:

    <schema name="testdb" sqlMaxLimit="100" dataNode="dn01">         <table name="t1" rule="mod4Series" dataNode="dn01,dn02,dn03,dn04"></table>         <table name="t2" rule="mod4Series" dataNode="dn01,dn02,dn03,dn04"></table>     </schema>     <dataNode name="dn01" dataHost="group1" database="db1"></dataNode>     <dataNode name="dn02" dataHost="group2" database="db2"></dataNode>     <dataNode name="dn03" dataHost="group1" database="db3"></dataNode>     <dataNode name="dn04" dataHost="group2" database="db4"></dataNode>     <dataHost name="group1" maxCon="100" minCon="5" balance="0" switchType="-1" dbType="mysql" dbDriver="native">         <heartbeat>select user()</heartbeat>         <writeHost host="mysql-w9zhkr" url="1.1.1.26:3306" user="ushard" password="ushard"></writeHost>     </dataHost>     <dataHost name="group2" maxCon="100" minCon="5" balance="0" switchType="-1" dbType="mysql" dbDriver="native">         <heartbeat>select user()</heartbeat>         <writeHost host="mysql-ci2va0" url="1.1.1.27:3306" user="ushard" password="ushard"></writeHost>     </dataHost> 
3.42rule配置

DBLE:

    <tableRule name="mod4Series">         <rule>             <columns>unit_num</columns>             <algorithm>mod4DB</algorithm>         </rule>     </tableRule>     <function name="mod4DB" class="Hash">         <property name="partitionCount">4</property>         <property name="partitionLength">1</property>     </function> 

Mycat:

    <tableRule name="mod4Series">         <rule>             <columns>id</columns>             <algorithm>mod4DB</algorithm>         </rule>     </tableRule>     <function name="mod4DB" class="io.mycat.route.function.PartitionByMod">         <property name="count">4</property>     </function> 

4.比对开始

4.1准备测试数据

随便登录哪一台中间件写入测试数据

insert into t_bl_detail values(1,3,123443,'2019-01-01 00:00:00'); insert into t_bl_detail values(2,3,3423524,'2019-01-01 00:00:00'); insert into t_bl_detail values(3,3,245245,'2019-01-01 00:00:00'); insert into t_bl_detail values(4,4,356356,'2019-01-01 00:00:00');   insert into t_bl_super_detail values(1,10342,3,2); insert into t_bl_super_detail values(2,12355,3,2); insert into t_bl_super_detail values(3,62542,3,3); insert into t_bl_super_detail values(4,74235,4,1); 

4.2执行跨节点join查询
select  m.unit_num,  n.sup_id,  n.tenantid,  m.detail_num,  n.bl_unit_num,  m.balance_date from  t_bl_detail m join t_bl_super_detail n on  m.tenantid = n.tenantid where  n.tenantid = 3  and m.unit_num = n.bl_unit_num; 

在通过中间件之前,现在MySQL中执行一遍看下结果,作为预期结果供后续案例使用。

分别通过DBLE、mycat执行跨节点join语句

可看到相同的查询语句,DBLE执行结果符合预期,Mycat执行结果缺失。数据差异在于DBLE查询结果相较于Mycat多了跨节点的结果。虽然Mycat执行跨节点join不报错,但是查询结果却和预期不一致。

4.3执行计划

只从结果上判断并没有办法知道是什么原因导致了Mycat结果缺失,查看查询计划,比较两者差异。

4.31DBLE

通过DBLE的执行计划可看出,DBLE内部分别对结算明细表、业务单据表做了各自的数据查询,将查询结果在中间层做了merge。最后获得跨节点join的结果

  • DBLE的执行计划
mysql> explain select m.unit_num,n.sup_id,n.tenantid,m.detail_num,n.bl_unit_num,m.balance_date from t_bl_detail m join t_bl_super_detail n on m.tenantid=n.tenantid where n.tenantid = 3 and m.unit_num=n.bl_unit_num; +-----------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | DATA_NODE       | TYPE          | SQL/REF                                                                                                                                                              | +-----------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | dn01_0          | BASE SQL      | select `m`.`unit_num`,`m`.`detail_num`,`m`.`balance_date`,`m`.`tenantid` from  `t_bl_detail` `m` where m.tenantid = 3 ORDER BY `m`.`tenantid` ASC,`m`.`unit_num` ASC | | dn02_0          | BASE SQL      | select `m`.`unit_num`,`m`.`detail_num`,`m`.`balance_date`,`m`.`tenantid` from  `t_bl_detail` `m` where m.tenantid = 3 ORDER BY `m`.`tenantid` ASC,`m`.`unit_num` ASC | | dn03_0          | BASE SQL      | select `m`.`unit_num`,`m`.`detail_num`,`m`.`balance_date`,`m`.`tenantid` from  `t_bl_detail` `m` where m.tenantid = 3 ORDER BY `m`.`tenantid` ASC,`m`.`unit_num` ASC | | dn04_0          | BASE SQL      | select `m`.`unit_num`,`m`.`detail_num`,`m`.`balance_date`,`m`.`tenantid` from  `t_bl_detail` `m` where m.tenantid = 3 ORDER BY `m`.`tenantid` ASC,`m`.`unit_num` ASC | | merge_1         | MERGE         | dn01_0; dn02_0; dn03_0; dn04_0                                                                                                                                       | | shuffle_field_1 | SHUFFLE_FIELD | merge_1                                                                                                                                                              | | dn01_1          | BASE SQL      | select `n`.`sup_id`,`n`.`tenantid`,`n`.`bl_unit_num` from  `t_bl_super_detail` `n` where n.tenantid = 3 ORDER BY `n`.`tenantid` ASC,`n`.`bl_unit_num` ASC            | | dn02_1          | BASE SQL      | select `n`.`sup_id`,`n`.`tenantid`,`n`.`bl_unit_num` from  `t_bl_super_detail` `n` where n.tenantid = 3 ORDER BY `n`.`tenantid` ASC,`n`.`bl_unit_num` ASC            | | dn03_1          | BASE SQL      | select `n`.`sup_id`,`n`.`tenantid`,`n`.`bl_unit_num` from  `t_bl_super_detail` `n` where n.tenantid = 3 ORDER BY `n`.`tenantid` ASC,`n`.`bl_unit_num` ASC            | | dn04_1          | BASE SQL      | select `n`.`sup_id`,`n`.`tenantid`,`n`.`bl_unit_num` from  `t_bl_super_detail` `n` where n.tenantid = 3 ORDER BY `n`.`tenantid` ASC,`n`.`bl_unit_num` ASC            | | merge_2         | MERGE         | dn01_1; dn02_1; dn03_1; dn04_1                                                                                                                                       | | shuffle_field_3 | SHUFFLE_FIELD | merge_2                                                                                                                                                              | | join_1          | JOIN          | shuffle_field_1; shuffle_field_3                                                                                                                                     | | shuffle_field_2 | SHUFFLE_FIELD | join_1                                                                                                                                                               | +-----------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 14 rows in set (0.00 sec) 
4.32Mycat

Mycat对于跨节点join的处理则相对暴力,直接将查询语句下发到各个节点,最后将结果进行汇总,如果表连接涉及到跨节点。则跨节点的数据无法进行join。

  • Mycat的执行计划
mysql> explain select m.unit_num,n.sup_id,n.tenantid,m.detail_num,n.bl_unit_num,m.balance_date from t_bl_detail m join t_bl_super_detail n on m.tenantid=n.tenantid where n.tenantid = 3 and m.unit_num=n.bl_unit_num; +-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | DATA_NODE | SQL                                                                                                                                                                                                    | +-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | dn01      | select m.unit_num,n.sup_id,n.tenantid,m.detail_num,n.bl_unit_num,m.balance_date from t_bl_detail m join t_bl_super_detail n on m.tenantid=n.tenantid where n.tenantid = 3 and m.unit_num=n.bl_unit_num | | dn02      | select m.unit_num,n.sup_id,n.tenantid,m.detail_num,n.bl_unit_num,m.balance_date from t_bl_detail m join t_bl_super_detail n on m.tenantid=n.tenantid where n.tenantid = 3 and m.unit_num=n.bl_unit_num | | dn03      | select m.unit_num,n.sup_id,n.tenantid,m.detail_num,n.bl_unit_num,m.balance_date from t_bl_detail m join t_bl_super_detail n on m.tenantid=n.tenantid where n.tenantid = 3 and m.unit_num=n.bl_unit_num | | dn04      | select m.unit_num,n.sup_id,n.tenantid,m.detail_num,n.bl_unit_num,m.balance_date from t_bl_detail m join t_bl_super_detail n on m.tenantid=n.tenantid where n.tenantid = 3 and m.unit_num=n.bl_unit_num | +-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 4 rows in set (0.00 sec) 

5.总结

Mycat是一款非常优秀的分布式中间件,但是在某些细节方面处理的不尽人意。在跨节点关联查询场景下,Mycat采取的策略是直接将语句透传到各个节点上,将获取到的结果整合后返回,得到的结果集和预期结果有出入,缺失了跨节点关联的数据。

DBLE处理跨节点的关联查询是先获取到关联需要的数据,提取到中间件进行融合,得到关联查询的结果并返回。得到的结果集符合预期,与MySQL执行结果一致。可见DBLE在跨节点关联查询方面做了优化,能够提供准确的查询结果。

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!