Elasticsearch SQL ORM Operations with bboss

Posted by 你。 on 2020-03-14 18:14:41

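1.Preface

bboss ES SQL is an alternative to es jdbc.

bboss provides a set of sql and fetchQuery APIs that can replace the official es jdbc module. Using bboss gives you its client-side node auto-discovery and failover, its compatibility with es, jdk and spring boot versions, and all the functionality of es jdbc. It also removes the hard dependency on a specific es version, and the compatibility problems that come with it, that pulling es jdbc into a project would introduce.

The official ES-SQL feature is only available in Elasticsearch 6.3 and later, whereas the Elasticsearch-SQL plugin runs on other Elasticsearch versions as well, so choose whichever fits your situation.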

2.pom.xml

  <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
            <exclusions>
                <exclusion>
                    <groupId>org.junit.vintage</groupId>
                    <artifactId>junit-vintage-engine</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
        </dependency>
        <!-- start of the db-elasticsearch data sync dependencies -->
        <dependency>
            <groupId>com.bbossgroups.plugins</groupId>
            <artifactId>bboss-elasticsearch-rest-jdbc</artifactId>
            <version>6.0.2</version>
        </dependency>
        <dependency>
            <groupId>com.bbossgroups.plugins</groupId>
            <artifactId>bboss-elasticsearch-spring-boot-starter</artifactId>
            <version>6.0.2</version>
        </dependency>
        <dependency>
            <groupId>org.xerial</groupId>
            <artifactId>sqlite-jdbc</artifactId>
            <version>3.23.1</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.40</version>
        </dependency>
    </dependencies>

3.application.properties

##ES cluster configuration, supports x-pack and searchguard
#spring.elasticsearch.bboss.elasticUser=elastic
#spring.elasticsearch.bboss.elasticPassword=changeme


spring.elasticsearch.bboss.elasticsearch.rest.hostNames=192.168.1.224:9200
#spring.elasticsearch.bboss.elasticsearch.rest.hostNames=10.180.211.27:9280,10.180.211.27:9281,10.180.211.27:9282
##https configuration: add the https:// scheme prefix
#spring.elasticsearch.bboss.default.elasticsearch.rest.hostNames=https://10.180.211.27:9280,https://10.180.211.27:9281,https://10.180.211.27:9282
spring.elasticsearch.bboss.elasticsearch.dateFormat=yyyy.MM.dd
spring.elasticsearch.bboss.elasticsearch.timeZone=Asia/Shanghai
#showTemplate toggles printing of the dsl script to the console for debugging: false off, true on; the log4j level must be at least info
spring.elasticsearch.bboss.elasticsearch.showTemplate=true
spring.elasticsearch.bboss.elasticsearch.discoverHost=false
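# Scan interval for hot reloading of dsl configuration files, in milliseconds; the default is one scan every 5 seconds, and scanning is disabled when the value is <= 0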
spring.elasticsearch.bboss.dslfile.refreshInterval = -1


#Thread count and wait queue size for slice scroll queries
spring.elasticsearch.bboss.elasticsearch.sliceScrollThreadCount=100
spring.elasticsearch.bboss.elasticsearch.sliceScrollThreadQueue=100
spring.elasticsearch.bboss.elasticsearch.sliceScrollBlockedWaitTimeout=0

#Thread count and wait queue size for scroll queries
spring.elasticsearch.bboss.elasticsearch.scrollThreadCount=200
spring.elasticsearch.bboss.elasticsearch.scrollThreadQueue=200
spring.elasticsearch.bboss.elasticsearch.scrollBlockedWaitTimeout=0

##es client http connection pool configuration
spring.elasticsearch.bboss.http.timeoutConnection = 5000
spring.elasticsearch.bboss.http.timeoutSocket = 5000
spring.elasticsearch.bboss.http.connectionRequestTimeout=5000
spring.elasticsearch.bboss.http.retryTime = 1
spring.elasticsearch.bboss.http.retryInterval = 1000
spring.elasticsearch.bboss.http.maxLineLength = -1
spring.elasticsearch.bboss.http.maxHeaderCount = 200
spring.elasticsearch.bboss.http.maxTotal = 400
spring.elasticsearch.bboss.http.defaultMaxPerRoute = 200
spring.elasticsearch.bboss.http.soReuseAddress = false
spring.elasticsearch.bboss.http.soKeepAlive = false
spring.elasticsearch.bboss.http.timeToLive = 3600000
spring.elasticsearch.bboss.http.keepAlive = 3600000
spring.elasticsearch.bboss.http.keystore =
spring.elasticsearch.bboss.http.keyPassword =
# ssl hostname verification: whether to use the default configuration.
# If set to default, DefaultHostnameVerifier is used; otherwise SSLConnectionSocketFactory.ALLOW_ALL_HOSTNAME_VERIFIER is used
spring.elasticsearch.bboss.http.hostnameVerifier =

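#Interval in milliseconds after which idle connections are validated; invalid connections are released automatically
# -1 or 0 disables the check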
spring.elasticsearch.bboss.http.validateAfterInactivity=2000
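# Validate the connection every time one is acquired: true validates, false does not. This has a performance cost, so using
# validateAfterInactivity to control connection validity is recommended instead
# Default: false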
spring.elasticsearch.bboss.http.staleConnectionCheckEnabled=false
#* Custom retry control interface; the interface method must be implemented
#* public interface CustomHttpRequestRetryHandler  {
#* 	public boolean retryRequest(IOException exception, int executionCount, HttpContext context,ClientConfiguration configuration);
#* }
#* Return true to retry, false to not retry
spring.elasticsearch.bboss.http.customHttpRequestRetryHandler=org.frameworkset.spi.remote.http.ConnectionResetHttpRequestRetryHandler

4.Entity class

import com.frameworkset.orm.annotation.Column;
import com.frameworkset.orm.annotation.ESId;
import lombok.Data;

import java.util.Date;

/**
 * The Column annotation maps an index document field to an object property.
 * It can also specify the date format and time zone, for example:
 * @Column(name="docInfo.author",dataformat = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",timezone = "Etc/UTC",locale = "zh")
 */
@Data
public class Poems {
    // Marks the document id field
    @ESId(readSet = true,persistent = false)
    private Integer id;
    private String title;
    private String author;
    private String dynasty;
    private Integer words;
    private String content;
    @Column(name="creat_time")
    private Date creatTime;
    @Column(name="update_time")
    private Date updateTime;
}

5.Running searches

5.1 ORM queries with the official xpack-sql

5.1.1 queryAll


import com.htkj.bboss.model.Poems;
import org.frameworkset.elasticsearch.ElasticSearchHelper;
import org.frameworkset.elasticsearch.client.ClientInterface;
import org.frameworkset.elasticsearch.entity.ESDatas;
import org.springframework.stereotype.Service;

import java.util.HashMap;
import java.util.List;
import java.util.Map;

@Service
public class SearchService {
    public void queryAll(){
        ClientInterface clientUtil = ElasticSearchHelper.getRestClientUtil();
        List<Poems> list = clientUtil.sql(Poems.class,"{\"query\": \"SELECT * FROM dbdemo  order by id asc \"}");
        System.out.println(list);
    }

}

Output: (screenshot not preserved)

5.1.2 queryAuthor

  public void queryAuthor(){
        ClientInterface clientUtil = ElasticSearchHelper.getRestClientUtil();
        List<Poems> list = clientUtil.sql(Poems.class,"{\"query\": \"SELECT * FROM dbdemo where author like '%李%' order by id asc \"}");
        System.out.println(list);
    }

Output: (screenshot not preserved)

5.1.3 queryContent

    public void queryContent(){
        ClientInterface clientUtil = ElasticSearchHelper.getRestClientUtil();
        List<Poems> list = clientUtil.sql(Poems.class,"{\"query\": \" select * from dbdemo where content like '%月%' or (content like '%酒%' and content like '%雨%') order by id asc \"}");
        System.out.println(list);
    }

Output: (screenshot not preserved)

5.1.4 statistics

    public void  statistics(){
        ClientInterface clientUtil = ElasticSearchHelper.getRestClientUtil();
        List<Map> list = clientUtil.sql(Map.class,"{\"query\": \" select author, count(*) as author_count from dbdemo group by author \"}");
        System.out.println(list);
    }

Output: (screenshot not preserved)
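The queries in section 5.1 all write the JSON request body by hand, with the quotes escaped inline. As a minimal sketch, assuming nothing beyond the `{"query": ...}` body shape and the `fetch_size` option that appear in this article, a small helper can build that body and escape the SQL text. The class `SqlRequestBody` and its methods are hypothetical names for illustration and are not part of bboss.

```java
/** Hypothetical helper (not part of bboss): builds the JSON body for a SQL request. */
final class SqlRequestBody {

    /** Escapes characters that must not appear raw inside a JSON string. */
    public static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // remaining control characters use the \\uXXXX form
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds {"query": "<sql>"}. */
    public static String of(String sql) {
        return "{\"query\": \"" + escapeJson(sql) + "\"}";
    }

    /** Builds {"query": "<sql>", "fetch_size": n}. */
    public static String of(String sql, int fetchSize) {
        return "{\"query\": \"" + escapeJson(sql) + "\", \"fetch_size\": " + fetchSize + "}";
    }

    public static void main(String[] args) {
        // prints: {"query": "SELECT * FROM dbdemo order by id asc"}
        System.out.println(of("SELECT * FROM dbdemo order by id asc"));
    }
}
```

The resulting string could then be passed as the second argument of `clientUtil.sql(Poems.class, ...)` in place of the hand-escaped literal.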

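5.2 Queries with the third-party Elasticsearch-sql plugin

With the official xpack-sql, pagination is only possible through fetch_size, and the results are far from ideal, so this section uses the third-party Elasticsearch-sql plugin for paginated queries.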

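5.2.1 Installing Elasticsearch-sql

https://github.com/NLPchina/elasticsearch-sql

Follow the instructions at the address above to install the es-sql plugin.

Alternatively, download the plugin and unzip it directly into the plugin directory: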

 

 https://github.com/NLPchina/elasticsearch-sql/releases/download/6.8.3.0/elasticsearch-sql-6.8.3.0.zip

Queries through the third-party Elasticsearch-sql plugin use the same bboss query API as before; only the REST endpoint changes to /_sql.

5.2.2 Paginated query: queryByPage

    public void queryByPage(){
        ClientInterface clientUtil = ElasticSearchHelper.getRestClientUtil();
        ESDatas<Poems> esDatas =  // ESDatas holds the records returned by this search; the limit clause in the sql caps how many (2 here)
                clientUtil.searchList("/_sql",// sql request endpoint
                        "select * from dbdemo  order by id asc limit 0,2", // a sql statement supported by elasticsearch-sql
                        Poems.class);// type the returned documents are mapped to
        // Get the list of result objects
        List<Poems> demos = esDatas.getDatas();
        System.out.println(demos);
        // Get the total number of records
        long totalSize = esDatas.getTotalSize();
        System.out.println(totalSize);
        String dsl =  // translate the sql to dsl
                clientUtil.executeHttp("/_sql/_explain",// sql-to-dsl request endpoint
                        "select * from dbdemo where content like '%月%' or (content like '%酒%' and content like '%雨%')  order by id asc limit 0,2 ",ClientInterface.HTTP_POST);// returns the translated dsl
        System.out.println(dsl);
    }

Output: (screenshot not preserved)

The dsl shows that paginating with limit is, underneath, pagination with from and size.
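The arithmetic behind that mapping can be sketched as follows. This is only an illustration with a 1-based page number; the class `SqlPaging` and its methods are hypothetical names and not part of bboss or the plugin. The offset in `limit offset,count` plays the role of `from`, and the count plays the role of `size`.

```java
/** Hypothetical helper: maps a 1-based page number to limit and from/size values. */
final class SqlPaging {

    /** Zero-based offset of the first row on the given 1-based page. */
    public static long offset(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        return (long) (page - 1) * pageSize;
    }

    /** The "limit offset,count" clause for the page. */
    public static String limitClause(int page, int pageSize) {
        return "limit " + offset(page, pageSize) + "," + pageSize;
    }

    /** The equivalent from/size fragment of a search dsl. */
    public static String fromSize(int page, int pageSize) {
        return "{\"from\": " + offset(page, pageSize) + ", \"size\": " + pageSize + "}";
    }

    public static void main(String[] args) {
        // page 2 with 3 rows per page starts at row offset (2-1)*3 = 3
        System.out.println(limitClause(2, 3)); // prints: limit 3,3
        System.out.println(fromSize(2, 3));    // prints: {"from": 3, "size": 3}
    }
}
```

Validating the inputs first keeps a page number of 0 or a negative page size from producing a negative offset in the generated statement.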

5.2.3 Paginated query: queryAllByPage

This method takes two parameters: page, the current page number, and pageSize, the number of records shown per page.

    public void queryAllByPage(int page,int pageSize){
        ClientInterface clientUtil = ElasticSearchHelper.getRestClientUtil();
        ESDatas<Poems> esDatas = clientUtil.searchList("/_sql", "select * from dbdemo  order by id asc limit "+(page-1)*pageSize+","+pageSize+"", Poems.class);
        List<Poems> list = esDatas.getDatas();
        long totalSize = esDatas.getTotalSize();
        System.out.println(list);
        System.out.println(totalSize);
    }

Taking page=2 and pageSize=3 as an example:

Output: (screenshot not preserved)

5.3 matchQuery

 

Sometimes a match query is needed for tokenized fuzzy search.

Below are the two ways to write it, one for xpack-sql and one for es-sql.

    public void matchQuery(){
        ClientInterface clientUtil = ElasticSearchHelper.getRestClientUtil();
        System.out.println("xpack-sql");
        List<Poems> list = clientUtil.sql(Poems.class,"{\"query\": \"SELECT * FROM dbdemo where match(content,'三千')\"}");
        System.out.println(list);
        String json = clientUtil.executeHttp("/_xpack/sql/translate", "{\"query\": \"SELECT * FROM dbdemo where match(content,'三千') \"}", ClientInterface.HTTP_POST);
        System.out.println(json);
        System.out.println("es-sql");
        ESDatas<Poems> esDatas = clientUtil.searchList("/_sql", "SELECT * FROM dbdemo where content=matchQuery('三千')", Poems.class);
        List<Poems> demos = esDatas.getDatas();
        System.out.println(demos);
        String dsl = clientUtil.executeHttp("/_sql/_explain", "SELECT * FROM dbdemo where content=matchQuery('三千') ",ClientInterface.HTTP_POST);
        System.out.println(dsl);
    }

Output: (screenshot not preserved)

5.4 Managing sql statements in a configuration file

5.4.1 xpack-sql

    public void queryContentByMapper(){
        ClientInterface clientUtil = ElasticSearchHelper.getConfigRestClientUtil("esmapper/sql.xml");// create an es client interface that loads the sql configuration file
        // Set the parameters of the sql query
        Map params = new HashMap();
        params.put("contentField1","月");
        params.put("contentField2","酒");
        params.put("contentField3","雨");
        List<Poems> json = clientUtil.sql(Poems.class,"sqlQuery",params);
        System.out.println(json);
    }

Output: (screenshot not preserved)

5.4.2 sql.xml

<properties>
    <property name="sqlQuery">
        <![CDATA[
         {"query":#"""
         select * from dbdemo where content like
         '%#[contentField1,quoted=false]%'
         or
            (content like
            '%#[contentField2,quoted=false]%'
            and content like
             '%#[contentField3,quoted=false]%')
         order by id asc
         """
         }
        ]]>
    </property>
</properties>

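Just place esmapper/sql.xml under the resources directory.

On the sql configuration, the official documentation says the following:

When a #[xxx] variable is used to pass a sql parameter, a string value is automatically wrapped in double quotes. In a sql statement, however, string values are delimited by single quotes ('), so quoted=false tells the parsing engine not to add the double quotes, and the single quotes are then written by hand around the variable: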

'#[channelId,quoted=false]'
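To make the effect of quoted=false concrete, here is a toy illustration of the behavior described above. It is not bboss code and does not use the bboss template engine; `QuotedDemo.render` is a hypothetical stand-in for variable substitution, and the value `news` is an invented sample for channelId.

```java
/** Toy stand-in for #[name] substitution; it only mimics the quoting rule described in the text. */
final class QuotedDemo {

    /** Returns the value wrapped in double quotes unless quoted is false. */
    public static String render(String value, boolean quoted) {
        return quoted ? "\"" + value + "\"" : value;
    }

    /** Places the rendered value inside hand-written single quotes, as in '#[channelId,quoted=false]'. */
    public static String condition(String value, boolean quoted) {
        return "channelId='" + render(value, quoted) + "'";
    }

    public static void main(String[] args) {
        // default quoting leaves stray double quotes inside the sql literal
        System.out.println(condition("news", true));  // prints: channelId='"news"'
        // quoted=false yields the intended single-quoted literal
        System.out.println(condition("news", false)); // prints: channelId='news'
    }
}
```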

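A long sql statement may need to span several lines, but es does not currently support executing multi-line sql statements. bboss therefore provides the syntax below for wrapping multi-line sql; the first time the sql parsing engine parses the statement, it converts the multi-line sql into a single line: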

#"""

...

...

"""

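For example:

{
## Start marker: carriage returns and line feeds in the sql statement below will be removed. Note that dsl comments must not be placed inside the sql statement: because the line breaks are removed, the sql that follows a comment ends up on the same line as the comment,
##  and that part of the sql is then dropped when the dsl template is parsed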
   "query": #"""
           SELECT * FROM dbclobdemo



               where channelId=#[channelId]
    """,
    ## End marker for the sql statement whose line breaks are removed
   "fetch_size": #[fetchSize]
}
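The effect of the #""" ... """ wrapper can be pictured with a small stand-alone sketch. This is only an approximation of the idea (line breaks and the whitespace around them are folded into single spaces) and not the actual bboss parsing code; `MultiLineSql.collapse` is a hypothetical name, and the literal value 1 stands in for the #[channelId] variable.

```java
/** Illustration only: folds a multi-line sql string into one line. */
final class MultiLineSql {

    /** Replaces each run of line breaks, plus the whitespace around it, with a single space. */
    public static String collapse(String sql) {
        return sql.replaceAll("\\s*\\R\\s*", " ").trim();
    }

    public static void main(String[] args) {
        String multiLine = "SELECT * FROM dbclobdemo\n\n\n"
                + "    where channelId=1\n";
        // prints: SELECT * FROM dbclobdemo where channelId=1
        System.out.println(collapse(multiLine));
    }
}
```

This also shows why a ## comment inside the sql is a problem: once the line breaks are gone, whatever followed the comment sits on the comment's own line.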

5.4.3 es-sql

    public void queryContentByMapper2(){
        ClientInterface clientUtil = ElasticSearchHelper.getConfigRestClientUtil("esmapper/sql.xml");
        Map params = new HashMap();
        params.put("contentField1","月");
        params.put("contentField2","酒");
        params.put("contentField3","雨");
        ESDatas<Poems> esDatas = clientUtil.searchList("/_sql", "testESSQL", params, Poems.class);
        List<Poems> list = esDatas.getDatas();
        System.out.println(list);
        long totalSize = esDatas.getTotalSize();
        System.out.println(totalSize);
    }

Output: (screenshot not preserved)

5.4.4 sql.xml

    <property name="testESSQL">
        <![CDATA[
         select * from dbdemo where content like
         '%#[contentField1,quoted=false]%'
         or
            (content like
            '%#[contentField2,quoted=false]%'
            and content like
             '%#[contentField3,quoted=false]%')
         order by id asc
    ]]>
    </property>

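As you can see, an xml configuration written for the es-sql plugin does not need the special wrapping syntax that xpack-sql requires.

6.Some issues

Both es-sql and xpack-sql can run these queries, but the screenshots above show that xpack-sql returns the update_time and create_time fields in full, while es-sql does not.

My guess is that this comes from the dependency: its @Column annotation seems to be applied only for xpack-sql, so es-sql cannot map the update_time and create_time fields.

The simplest fix is to name the properties exactly as they are named in the index.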
