dataimporthandler

Solr 4.1 DataImportHandler ClassNotFoundException

孤街醉人 submitted on 2019-12-18 04:02:46
Question: I have been trying to set up the Data Import Handler (Solr 4.1) following this tutorial. I tried the solutions suggested in previous posts, such as "Configure DIH in multicore solr", and added the dataimport jar to the classpath, but the error still persists. Is there any way to solve this? Here is the entire exception stack trace: SEVERE: Unable to create core: collection1 org.apache.solr.common.SolrException: RequestHandler init failure at org.apache.solr.core.SolrCore.<init>(SolrCore.java:794) at org.apache

SolrEntityProcessor is called only once for sub-entities

穿精又带淫゛_ submitted on 2019-12-17 21:11:37
Question: I'm using Solr 4.2, and I am trying to call SolrEntityProcessor as a sub-entity. So far, only one call is made to Solr and a single document is indexed, while all others are ignored. This should be possible, but it doesn't seem to work. Any ideas? Code snippet: <document> <entity dataSource="psql" name="user" query="SELECT * FROM users";> <field column="id" name="user_id" /> <entity name="liked_items" processor="SolrEntityProcessor" url="http://localhost:8983/solr/items" query="user_liking

Solr Numeric Overflow

老子叫甜甜 submitted on 2019-12-14 02:06:10
Question: I am having problems with Solr using DataImportHandler. I am connecting to an Oracle 10g database and need to import 160 million records, but when Solr reaches around 60 million, it throws an exception and aborts the import: java.sql.SQLException: Overflow Numérico at oracle.jdbc.driver.NumberCommonAccessor.throwOverflow(NumberCommonAccessor.java:4381) at oracle.jdbc.driver.NumberCommonAccessor.getBigDecimal(NumberCommonAccessor.java:2509) at oracle.jdbc.driver.NumberCommonAccessor

Solr DataImportHandler configuration

这一生的挚爱 submitted on 2019-12-13 12:37:23
Question: I want to get data from a MySQL database with the help of DataImportHandler so I can create indexes. I've configured my Solr instance so that it works on Tomcat (the example admin page), but if I try to change the solrconfig.xml file I get an error message. I'm working with Solr 3.6. My configuration is as follows. In solrconfig.xml I added: <dataDir>${solr.data.dir:/usr/share/tomcat7/solr2}</dataDir> to specify my working directory, and then <requestHandler name="/dataimport" class="org.apache

Solr: FileListEntityProcessor is executing sub entities multiple times

烈酒焚心 submitted on 2019-12-13 05:22:09
Question: I have configured a dih-import.xml as shown below. The FileListEntityProcessor walks through some folders and then executes an XPathEntity and a DB entity for each file. When I ran a full import for ~30,000 files, the import took almost 3 hours. The DIH debug console showed that 2 DB calls were made for the first file found, 4 for the second, then 6, 8, and so on. Google didn't turn up anything on this subject, so I am hoping you can help. Thanks in advance. <?xml version="1.0"

How to configure solr dataimport handler to parse wikipedia xml document?

房东的猫 submitted on 2019-12-13 04:32:21
Question: This is what I have done so far. I have added a request handler in solrconfig.xml as follows: <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> <lst name="defaults"> <str name="config">wiki-data-config.xml</str> </lst> </requestHandler> In the same configuration directory I have created a file, wiki-data-config.xml, which contains the following: <dataConfig> <dataSource type="FileDataSource" encoding="UTF-8" /> <document> <entity name="page" pk=

Unauthorized dataimport-scheduler calls

泪湿孤枕 submitted on 2019-12-12 02:15:41
Question: I'm trying to set up a dataimport-scheduler for Solr. Everything is working and the delta-import URL is called every 30 minutes. The only problem is that I'm using Jetty and have enabled authentication in jetty.xml, so the dataimport_scheduler gets: <index update process> Response message Unauthorized (seen in the log file). How can I solve this?
Answer 1: The DataImportScheduler needs to have access to your solr/dataimport URL via HTTP. The error you see in the log file is because of the authentication you added
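For illustration only, here is a minimal Python sketch of what an authenticated call to the delta-import URL could look like, assuming Jetty is configured for HTTP Basic authentication. It is not the scheduler's own configuration or code; the function name, host, port and credentials are hypothetical, and the `/dataimport` path is taken from the answer above.

```python
import base64
import urllib.request


def build_delta_import_request(base_url, user, password):
    """Build a GET request for DIH's delta-import with an HTTP Basic auth header.

    Hypothetical helper: base_url, user and password are placeholders.
    """
    url = base_url.rstrip("/") + "/dataimport?command=delta-import"
    # HTTP Basic auth sends base64("user:password") in the Authorization header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


if __name__ == "__main__":
    # Assumed host, port and credentials; replace with your own values.
    req = build_delta_import_request("http://localhost:8983/solr", "solr_user", "secret")
    print(req.full_url)
    # urllib.request.urlopen(req) would actually send the request to Solr.
```

A request sent without the `Authorization` header would be expected to come back as 401 Unauthorized, which matches the response message reported in the question.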

Error: “Missing required field” using embedded entities in Solr's DIH Configuration File

╄→гoц情女王★ submitted on 2019-12-12 00:56:17
Question: I am trying to import multiple tables from a MySQL database using Solr's Data Import Handler (DIH). The DIH does not import data from the second table, 'detail'. My database configuration file is: <document> <entity name="item" pk="ListingId" query="SELECT * FROM item as item where listingid=360245270"> <entity name="detail" pk="ListingId" query="SELECT Body FROM detail where listingid='${item.listingid}'"> <field column="Body" name="Body" /> </entity> </entity> </document> I monitored the

Data import in solr from multiple entity

末鹿安然 submitted on 2019-12-11 18:17:32
Question: I am trying the Data Import Handler with a SQL Server database. I added the DIH handler in solrconfig.xml, created a data-config.xml according to my database schema, and also added a field in schema.xml which was different. After I connect and run dataimport?command=full-import, I am not getting the XML tag (data) properly. In my data-config.xml: <document name="Product"> <entity dataSource="ds-1" name="Item" pk="Item_ID" query="select item.Item

delta-import for using multiple table in solr

耗尽温柔 submitted on 2019-12-11 14:38:40
Question: I am a new Solr user. When I run the full-import command for multiple tables it works fine, and the updated dates are written to the dataimport.properties file. When I run a delta import, it fails with "Exception occured while initilizing context..". The deltaImportQuery and deltaQuery in data-config.xml are as follows: <dataConfig> <dataSource name="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/hellodb" user="root" password="root" batchSize="-1" /> <document>