Importing multi-valued field into Solr from mySQL using Solr Data Import Handler

Posted by 对着背影说爱祢 on 2019-12-18 13:53:18

Question


We have the following two tables in our MySQL database:

mysql> describe comment;
+--------------+--------------+------+-----+---------+-------+
| Field        | Type         | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| id           | int(11)      | YES  |     | NULL    |       |
| blogpost_id  | int(11)      | YES  |     | NULL    |       |
| comment_text | varchar(256) | YES  |     | NULL    |       |
+--------------+--------------+------+-----+---------+-------+

mysql> describe comment_tags;
+------------+-------------+------+-----+---------+-------+
| Field      | Type        | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| comment_id | int(11)     | YES  |     | NULL    |       |
| tag        | varchar(80) | YES  |     | NULL    |       |
+------------+-------------+------+-----+---------+-------+

Each comment can have multiple tags. We can already import the comments themselves into Solr using the Data Import Handler. However, I am not sure how to import the tags for each comment into a multivalued field, defined in schema.xml, on each comment document.

Please advise. Thanks


Answer 1:


Try something like this:

<dataConfig>
    <!-- dataSource is just an example. Included just for completeness. -->
    <dataSource batchSize="500" type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/my-database" user="root" password="somethinglong1283"/>
    <document>
        <entity name="comment" pk="id" query="SELECT * FROM comment">
            <field column="blogpost_id" name="blogpost_id"/>
            <field column="comment_text" name="comment_text" />
            <!-- Runs once per comment; each returned row adds one value to the multivalued "tag" field. -->
            <entity name="comment_tags" pk="comment_id" query="SELECT * FROM comment_tags WHERE comment_id='${comment.id}'">
                <field column="tag" name="tag" />
            </entity>
        </entity>
    </document>
</dataConfig>
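
The nested entity above issues one SQL query per comment. If that becomes slow, one option, in Solr versions whose Data Import Handler supports entity caching, is to load all tags with a single query and look them up in memory. The following is only a sketch under that assumption, not a tested configuration; it would take the place of the outer and inner entities shown above:

```xml
<entity name="comment" pk="id" query="SELECT * FROM comment">
    <field column="blogpost_id" name="blogpost_id"/>
    <field column="comment_text" name="comment_text"/>
    <!-- One query for all tags; rows are cached by comment_id and looked up per comment. -->
    <entity name="comment_tags" processor="SqlEntityProcessor"
            cacheImpl="SortedMapBackedCache"
            cacheKey="comment_id" cacheLookup="comment.id"
            query="SELECT comment_id, tag FROM comment_tags">
        <field column="tag" name="tag"/>
    </entity>
</entity>
```

This trades memory for fewer round trips, since the whole comment_tags table is held in the cache during the import.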




Answer 2:


You can also use GROUP_CONCAT with a separator (e.g. ",") and split the result with the RegexTransformer, something like this:

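<dataConfig>
  <!-- dataSource is just an example. Included just for completeness. -->
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db" user="root" password="root"/>
  <document>
    <!-- Join the tags and collapse them into one comma-separated column per comment. -->
    <entity name="comment" pk="id" transformer="RegexTransformer"
            query="SELECT c.id, c.blogpost_id, c.comment_text, GROUP_CONCAT(t.tag) AS comment_tags
                   FROM comment c LEFT JOIN comment_tags t ON t.comment_id = c.id
                   GROUP BY c.id, c.blogpost_id, c.comment_text">
      <field column="blogpost_id" name="blogpost_id"/>
      <field column="comment_text" name="comment_text"/>
      <!-- Split the comment_tags column on "," into the multivalued Solr field "tag". -->
      <field column="tag" sourceColName="comment_tags" splitBy=","/>
    </entity>
  </document>
</dataConfig>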

This can improve performance, because it avoids running a separate query for each comment.
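
One caveat worth checking with this approach: MySQL truncates GROUP_CONCAT results at group_concat_max_len (1024 by default), so a comment with many tags could silently lose some of them. A possible way to raise the limit, assuming MySQL Connector/J and its sessionVariables connection property, is to set it in the JDBC URL. The value 100000 below is purely illustrative:

```xml
<!-- Assumption: Connector/J applies sessionVariables as SET statements when it connects. -->
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost/db?sessionVariables=group_concat_max_len=100000"
            user="root" password="root"/>
```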




Answer 3:


If the other solutions do not work, try this one.

<dataConfig>
  <!-- dataSource is just an example. Included just for completeness. -->
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db" user="root" password="root"/>
  <document>
    <!-- The concatenated tags are aliased as "tag", the same name as the Solr field. -->
    <entity name="comment" pk="id" transformer="RegexTransformer"
            query="SELECT c.id, c.blogpost_id, c.comment_text, GROUP_CONCAT(t.tag) AS tag
                   FROM comment c LEFT JOIN comment_tags t ON t.comment_id = c.id
                   GROUP BY c.id, c.blogpost_id, c.comment_text">
      <field column="blogpost_id" name="blogpost_id"/>
      <field column="comment_text" name="comment_text"/>
      <field column="tag" splitBy="," sourceColName="tag"/>
    </entity>
  </document>
</dataConfig>

Add the field to schema.xml:

<field name="tag" type="string" indexed="true" stored="true" multiValued="true"/>

If you want to use a custom separator in MySQL, use the following and set splitBy to the same string:

GROUP_CONCAT(tag SEPARATOR '~,~') AS tag

If you want only distinct tags in the concatenated value, use:

GROUP_CONCAT(DISTINCT tag SEPARATOR '~,~') AS tag
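
As a rough sketch of how the custom separator fits into the entity (assuming the comment and comment_tags tables from the question and a multivalued Solr field named tag), the splitBy value has to match the SQL separator. splitBy is treated as a regular expression, so a separator containing regex metacharacters such as | or . would need escaping; ~,~ does not:

```xml
<entity name="comment" pk="id" transformer="RegexTransformer"
        query="SELECT c.id, c.blogpost_id, c.comment_text,
                      GROUP_CONCAT(DISTINCT t.tag SEPARATOR '~,~') AS tag
               FROM comment c LEFT JOIN comment_tags t ON t.comment_id = c.id
               GROUP BY c.id, c.blogpost_id, c.comment_text">
  <field column="blogpost_id" name="blogpost_id"/>
  <field column="comment_text" name="comment_text"/>
  <!-- splitBy must match the SEPARATOR used in GROUP_CONCAT above. -->
  <field column="tag" splitBy="~,~"/>
</entity>
```

An unusual separator like this is mainly useful when the tags themselves may contain commas.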


Source: https://stackoverflow.com/questions/20233837/importing-multi-valued-field-into-solr-from-mysql-using-solr-data-import-handler
