dataimporthandler

solr DIH: RegExTransformer

拜拜、爱过 submitted on 2021-01-28 05:12:19
Question: Currently, I need to apply a transformation on the third column below:
ACAC | 0 | 01
ACAC | 0 | 0101
ACAC | 0 | 0102
ACAC | 0 | 010201
I need to transform "010201" to "01/02/01". So first I need to: trim all trailing 0 characters, then split every 2 digits and add a "/" character. The context of this transformation is Solr Data Import Handler transformers, which use the Java regex library internally. Is there any way to do that? I've tried using this regex:
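
A possible way to express both steps with DIH's RegexTransformer is sketched below. It assumes the third column is called `code` and the table `codes` (both hypothetical names, not taken from the question), and that the fields of an entity are transformed in declaration order, so the second field can read the output of the first. Trailing zeros are stripped literally, as the question describes.

```xml
<entity name="codes" transformer="RegexTransformer"
        query="SELECT id, code FROM codes">
  <!-- Step 1: drop every trailing 0 from the source column -->
  <field column="code_trimmed" sourceColName="code"
         regex="0+$" replaceWith=""/>
  <!-- Step 2: append "/" to each digit pair that is followed by another digit -->
  <field column="code_path" sourceColName="code_trimmed"
         regex="(\d{2})(?=\d)" replaceWith="$1/"/>
</entity>
```

Under Java's `replaceAll` semantics, the second pattern should turn "010201" into "01/02/01" and leave a two-digit value such as "01" unchanged, because the lookahead prevents a trailing "/".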

Static field for document in Data Import Handler for Solr

穿精又带淫゛_ submitted on 2020-03-16 07:43:26
Question: I'm building an index in Solr from a database in the following way: <document name="Index"> <entity name="c" query="SELECT * FROM C"> <field column="Name" name="name"/> </entity> <entity name="p" query="SELECT * FROM P"> <field column="Name" name="name"/> </entity> </document> Is it possible to have a static field, set for each row, that signifies what type is returned to the client, so that one can make a call to the right database table based on that information from the JSON result? That is a field
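
One commonly used option is TemplateTransformer with a constant template, sketched here. The field name `type` and the values `c` and `p` are illustrative assumptions, and a matching `type` field would also have to exist in the Solr schema.

```xml
<document name="Index">
  <entity name="c" query="SELECT * FROM C" transformer="TemplateTransformer">
    <field column="Name" name="name"/>
    <!-- constant value added to every row produced by this entity -->
    <field column="type" template="c"/>
  </entity>
  <entity name="p" query="SELECT * FROM P" transformer="TemplateTransformer">
    <field column="Name" name="name"/>
    <field column="type" template="p"/>
  </entity>
</document>
```

An alternative with the same effect would be to select a literal in the SQL itself, for example `SELECT 'c' AS type, ...`, and map that column to a field.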

How to index and search two different tables which are in same datasource using single solr instance Or Solr Template fields not working properly

穿精又带淫゛_ submitted on 2020-01-28 09:20:26
Question: I want to index and search two different entities. File name: db-data-config.xml <dataConfig> <dataSource name="myindex" driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://test-pc:1433;DatabaseName=SampleDB" user="username" password="password" /> <document> <entity name="Employees" query="select * from employee" transformer="TemplateTransformer" dataSource="myindex"> <field column="id" name="singlekey" /> <field column="eId" name="eid" /> <field column="eName" name=

Solr DataImportHandler doesn't work with XML Files

本小妞迷上赌 submitted on 2020-01-15 07:24:06
Question: I'm very new to Solr. I succeeded in indexing data from my sql database via DIH. Now I want to import XML files and index them via DIH as well, but it just won't work! My data-config.xml looks like this: <dataConfig> <dataSource type="FileDataSource" encoding="UTF-8" /> <document> <entity name="dir" processor="FileListEntityProcessor" baseDir="/bla/test2" fileName=".*xml" stream="true" recursive="false" rootEntity="false"> <entity name="PubmedArticle" processor="XPathEntityProcessor" transformer=

Solr DataImportHandler: Can I get a dynamic field name from xml attribute with XPathEntityProcessor?

自古美人都是妖i submitted on 2020-01-14 13:44:13
Question: I have some XML to ingest into Solr, which sounds like a use case that the DataImportHandler is intended to solve. What I want to do is pull the column name from one XML attribute and the value from another attribute. Here is an example of what I mean: <document> <data ref="reference.foo"> <value>bar</value> </data> </document> From this XML snippet, I want to add a field with name reference.foo and value bar. The DataImportHandler includes an XPathEntityProcessor for processing XML

SOLR delta-import timestamp issue

半世苍凉 submitted on 2020-01-13 06:09:14
Question: I'm new to SOLR and was doing some research on this technology. I now have a question regarding the delta-import function, so I looked on SO and found this: Solr DataImportHandler delta import. The answer mentions a field [date_update], which seems to be a timestamp of the record. My question is: is [date_update] a timestamp stored in the table on record creation? If so, can't this create an issue if the date of the database server is not exactly in sync with the server on which

DataImportHandler and partial updates

為{幸葍}努か submitted on 2020-01-05 01:41:53
Question: Is it possible to use DataImportHandler with partial updates in Solr 4? Should I be able to use a data-config.xml like the one below, import both entities at separate times, and get full documents with both sets of data? <document name="item"> <entity name="pricing" query="select * from prc"> <field column="ID" name="itemId" /> <field column="NM" name="itemName" /> <field column="default" name="defaultPrice" /> <field column="sale" name="salesPrice" /> </entity> <entity name="tag" query=

Timestamp compatibility while performing delta import in solr

旧时模样 submitted on 2020-01-04 06:35:26
Question: I'm new to Solr. I have successfully indexed an Oracle 10g XE database, and I'm trying to perform a delta import on it. The delta query requires comparing a last_modified column of the table with ${dih.last_index_time}. However, my application has no such column, and I cannot add one. Therefore I used 'scn_to_timestamp(ora_rowscn)' to supply the required timestamps. This query returns a value of type timestamp in the following format 24-JUL-13 12.42.32
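
A sketch of how the comparison might be written, assuming `${dih.last_index_time}` is rendered in DIH's default `yyyy-MM-dd HH:mm:ss` format and is converted explicitly with Oracle's `to_timestamp` rather than relying on the session's default date format. The table `my_table` and key column `ID` are hypothetical; the key is written in upper case on the assumption that Oracle returns unquoted column names that way.

```xml
<entity name="item" pk="ID"
        query="SELECT * FROM my_table"
        deltaQuery="SELECT ID FROM my_table
                    WHERE scn_to_timestamp(ora_rowscn) &gt;
                          to_timestamp('${dih.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')"
        deltaImportQuery="SELECT * FROM my_table WHERE ID = '${dih.delta.ID}'"/>
```

Both sides of the comparison are then Oracle timestamps, so the display format of `scn_to_timestamp` should no longer matter.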

Solr: DIH for multilingual index & multiValued field?

久未见 submitted on 2020-01-01 07:27:28
Question: I have a MySQL table: CREATE TABLE documents ( id INT NOT NULL AUTO_INCREMENT, language_code CHAR(2), tags CHAR(30), text TEXT, PRIMARY KEY (id) ); I have 2 questions about Solr DIH: 1) The language_code field indicates what language the text field is in. Depending on the language, I want to index text to different Solr fields. # pseudo code if language_code == "en": index "text" to Solr field "text_en" elif language_code == "fr": index "text" to Solr field "text_fr" elif language_code ==

What is the difference between a Join Query and Embedded Entities in Solr DIH?

蹲街弑〆低调 submitted on 2019-12-24 10:21:43
Question: I am trying to index data across multiple tables using Solr's Data Import Handler. The official wiki on the DIH suggests using embedded entities to link multiple tables like so: <document> <entity name="item" pk="id" query="SELECT * FROM item"> <entity name="member" pk="memberid" query="SELECT * FROM member WHERE memberid='${item.memberid}'"> </entity> </entity> </document> Another way that works is: <document> <entity name="item" pk="id" query="SELECT * FROM item INNER JOIN member ON item