dataimporthandler

Solr: How to distinguish between multiple entities imported through DIH

家住魔仙堡 submitted on 2019-12-24 07:06:04
Question: When using DataImportHandler with SqlEntityProcessor, I want several entity definitions, each with a different query, to go into the same schema. How can I search both types of entities while still distinguishing their source? Example: <document> <entity name="entity1" query="query1"> <field column="column1" name="column1" /> <field column="column2" name="column2" /> </entity> <entity name="entity2" query="query2"> <field column="column1" name="column1" /> <field column="column2" name=

How to store file path in Solr when using TikaEntityProcessor

狂风中的少年 submitted on 2019-12-24 06:42:57
Question: I am using DIH to index a local file system, but the file path, size, and lastmodified fields are not stored. In schema.xml I defined: <fields> <field name="title" type="string" indexed="true" stored="true"/> <field name="author" type="string" indexed="true" stored="true" /> <!--<field name="text" type="text" indexed="true" stored="true" /> liang added--> <field name="path" type="string" indexed="true" stored="true" /> <field name="size" type="long" indexed="true" stored="true" /> <field

Is it possible to get Solr's DataImportHandler to ignore fields with empty strings?

China☆狼群 submitted on 2019-12-24 05:08:26
Question: I am using Solr's DataImportHandler to import data from a database. Some records contain an empty string when a column has no value. My current configuration produces Solr documents like this: { "x": "value", "y": "", "z": 2 } However, I would like to ignore all fields that have no value, so that documents like this are created instead: { "x": "value", "z": 2 } Is there something I can define in the DataImportHandler configuration file that will give me the desired result?
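
To make the desired transformation concrete, here is a minimal Python sketch of it. This is not a DataImportHandler setting: it assumes the documents are available as plain dictionaries before they are sent to Solr, and the function name `drop_blank_fields` is made up for illustration. On the Solr side, the `RemoveBlankFieldUpdateProcessorFactory` update processor targets the same problem and may be worth checking against your Solr version.

```python
from typing import Any, Dict


def drop_blank_fields(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc without fields whose value is None or "".

    Hypothetical helper: values such as 0 or False are kept,
    because only missing and empty-string values count as blank here.
    """
    return {
        name: value
        for name, value in doc.items()
        if value is not None and value != ""
    }


if __name__ == "__main__":
    # The example document from the question.
    print(drop_blank_fields({"x": "value", "y": "", "z": 2}))
```

Filtering on `None` and `""` only, rather than on truthiness, avoids silently dropping legitimate values such as `0`.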

Indexing failed. Rolled back all changes. (Solr DataImport)

岁酱吖の submitted on 2019-12-24 03:39:11
Question: When I try to run domain.com:8080/solr/dataimport?command=full-import , I get the error "Indexing failed. Rolled back all changes." There is no additional error message to tell me what went wrong. Any suggestions? data-config.xml <dataConfig> <dataSource name="mysql" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/databasename" user="myusername" password="mypassword" /> <document> <entity name="posts" datasource="mysql" query="select id, title, description from posts" deltaQuery=

Solr 4.6.0 DataImportHandler speed up performance

断了今生、忘了曾经 submitted on 2019-12-24 00:04:11
Question: I am using Solr 4.6.0, indexing about 10,000 documents at a time, and import performance is poor: importing those 10,000 documents takes about 10 minutes. I know this depends heavily on the server hardware, but I would still like to know which optimizations are possible and which of them are actually useful in real-world situations (joins etc.). I would also appreciate precise examples rather than just links to the official documentation. Here is

Solr Indexing MySQL Timestamp or Date Time field

╄→尐↘猪︶ㄣ submitted on 2019-12-23 01:28:40
Question: To index a date in Solr, the date should be in ISO format. Can we index a MySQL Timestamp or Date Time field without modifying the SQL SELECT statement? I have used <fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/> <field name="CreatedDate" type="tdate" indexed="true" stored="true" /> CreatedDate is of type Date Time in MySQL. I am getting the following exception: 11:23:39,117 WARN [org.apache.solr.handler.dataimport.DateFormatTransformer]
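
For reference, Solr date fields expect values such as `1995-12-31T23:59:59Z`. The Python sketch below shows the conversion from a MySQL-style `YYYY-MM-DD HH:MM:SS` string. It rests on two assumptions that the question does not confirm: the stored value is already in UTC, and it has no fractional seconds. The function name `mysql_datetime_to_solr` is made up for illustration, and the sketch does not answer whether DIH can perform the conversion itself.

```python
from datetime import datetime, timezone

# Assumed textual form of a MySQL DATETIME/TIMESTAMP value.
MYSQL_FORMAT = "%Y-%m-%d %H:%M:%S"
# ISO 8601 form with a literal "Z" suffix, as used by Solr date fields.
SOLR_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def mysql_datetime_to_solr(value: str) -> str:
    """Convert 'YYYY-MM-DD HH:MM:SS' (assumed UTC) to Solr's date format."""
    parsed = datetime.strptime(value, MYSQL_FORMAT).replace(tzinfo=timezone.utc)
    return parsed.strftime(SOLR_FORMAT)


if __name__ == "__main__":
    print(mysql_datetime_to_solr("2019-12-23 01:28:40"))
```

If the database stores local time instead of UTC, the value would need to be shifted to UTC before formatting.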

Does Solr data import handler support custom variables?

被刻印的时光 ゝ submitted on 2019-12-22 17:41:03
Question: I currently have an issue with my data import handler where ${dataimporter.last_index_time} is not granular enough to distinguish two events that happen within a second of each other, so a record in my database is occasionally skipped. I am thinking of replacing last_index_time with a simple atomically incrementing value instead of a datetime, but to do that I need to be able to set and read custom variables through Solr that can be referenced in my data-config.xml file.
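
The skipped-record problem described above can be reproduced outside Solr. The Python sketch below is a simplified model, not DIH code: `Row`, `delta_by_time`, and `delta_by_seq` are hypothetical names, timestamps are assumed to be stored with one-second precision, and `seq` stands in for the atomically incrementing value the question proposes.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Row:
    id: int
    modified: datetime  # modification time, stored with second precision
    seq: int            # hypothetical monotonically increasing change counter


def delta_by_time(rows: List[Row], last_index_time: datetime) -> List[int]:
    """Ids of rows strictly newer than a second-precision watermark."""
    return [row.id for row in rows if row.modified > last_index_time]


def delta_by_seq(rows: List[Row], last_seq: int) -> List[int]:
    """Ids of rows whose change counter is above the last imported one."""
    return [row.id for row in rows if row.seq > last_seq]


if __name__ == "__main__":
    same_second = datetime(2019, 12, 22, 17, 41, 3)
    rows = [
        Row(id=1, modified=same_second, seq=1),  # seen by the previous import
        Row(id=2, modified=same_second, seq=2),  # written later in that second
    ]
    # The previous import recorded the watermark 17:41:03 and seq 1.
    print(delta_by_time(rows, last_index_time=same_second))
    print(delta_by_seq(rows, last_seq=1))
```

In this model the time-based delta misses row 2 because its timestamp equals the watermark, while the counter-based delta still selects it.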

Solr - How to get search result in specific format

放肆的年华 submitted on 2019-12-21 20:55:33
Question: While exploring the example for indexing Wikipedia data in Solr, how can we get the result in the expected form (i.e. the same structure as the imported data)? Is there a way to achieve this through configuration rather than through a group query? My data has lots of inner tags. I explored XSLT result transformation, but I am looking for a JSON response. Imported doc: <page> <title>AccessibleComputing</title> <ns>0</ns> <id>10</id> <redirect title="Computer accessibility" /> <revision> <id>381202555</id>

Importing multi-valued field into Solr from mySQL using Solr Data Import Handler

对着背影说爱祢 submitted on 2019-12-18 13:53:18
Question: We have the following two tables in our MySQL database:
mysql> describe comment;
+--------------+--------------+------+-----+---------+-------+
| Field        | Type         | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| id           | int(11)      | YES  |     | NULL    |       |
| blogpost_id  | int(11)      | YES  |     | NULL    |       |
| comment_text | varchar(256) | YES  |     | NULL    |       |
+--------------+--------------+------+-----+---------+-------+
mysql> describe comment_tags;
+------------+-------------+----

Solr - How can I receive notifications of failed imports from my DataImportHandler?

自闭症网瘾萝莉.ら submitted on 2019-12-18 08:30:40
Question: Our Solr indexes are refreshed on a schedule, and also on demand, by means of a DataImportHandler full import. On several occasions the import has failed for various reasons. How can I receive a notification (preferably by email) that an error has occurred while performing an import with a DataImportHandler? Answer 1: There is no easy configuration-only solution, but there is an alternative that may take a little work. You could register EventListener with DIH in data-config to