Sphinx indexing of legacy data generating some errors

可紊 submitted on 2019-12-11 09:59:04

Question


A Rails 3.2.18 application is being created, migrating data from a Rails 2.3.10 application. The data was exported via pg_dump and loaded with the psql command, without any errors.

One of the 13 models indexed via thinking_sphinx is producing errors. Overall, only about 1 in 8.5 documents is being indexed.

indexing index 'norm_core'...
ERROR: index 'norm_core': sql_range_query: ERROR:  integer out of range
 (DSN=pgsql://jerdvo:***@localhost:5432/fna_development).
total 1019 docs, 234688 bytes

The index file is

ThinkingSphinx::Index.define :norm, :with => :active_record do
    indexes data
    indexes titolo
    indexes massima
    indexes numero
    indexes norm_fulltext
    indexes region.name, :as => :region
    indexes normtype.name, :as => :normtype

    has region_id
    has normtype_id
    has data, :as => :data_timestamp
end

I'm unsure about the syntax of the last element with data_timestamp, as it could be legacy syntax. It applies to a date field, as shown in the schema:

    t.date     "data"

Other models have the same indexing scenario on a date. But none have generated the error.
[Assuming that line has to change, should one first run rake ts:configure before ts:index or ts:rebuild?]


Answer 1:


Two tips for debugging this:

  • Comment out all of the attributes (the has calls), run the ts:index task, and confirm it works. Then reintroduce each attribute one at a time to see which one is causing the error.
  • Check the maximum values of any attribute columns that don't work (e.g. SELECT MAX(data) FROM norms) and see whether that data is valid and within the range of a 32-bit unsigned integer.
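If the date column turns out to be the suspect, the overflow mechanics can be sketched in plain Ruby. This is an illustration only: the fits_32bit? helper and the sample dates are hypothetical, not from the question. Sphinx stores timestamp attributes as integers, so any date whose Unix timestamp exceeds the 32-bit range will trigger "integer out of range":

```ruby
require "time"

INT32_MAX = 2**31 - 1

# Returns true when the date's Unix timestamp fits in a signed
# 32-bit integer, i.e. when Sphinx can store it as a plain
# timestamp attribute without overflowing.
def fits_32bit?(date_str)
  Time.parse("#{date_str} 00:00:00 UTC").to_i <= INT32_MAX
end

fits_32bit?("2014-08-14")  # => true
fits_32bit?("2100-01-01")  # => false: a stray far-future date in the
                           #    legacy data would overflow the attribute
```

Legacy data is a common source of such stray values, since a typo like the year 2104 instead of 2014 passes date validation but overflows a 32-bit timestamp.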

If it's one of the foreign keys that's ventured into 64-bit int territory, then you can specify that as the data type:

has normtype_id, :type => :bigint
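For context, here is how that option would slot into the question's index definition. This is a sketch only; it assumes normtype_id is the overflowing column:

```ruby
ThinkingSphinx::Index.define :norm, :with => :active_record do
    indexes data
    indexes titolo
    indexes massima
    indexes numero
    indexes norm_fulltext
    indexes region.name, :as => :region
    indexes normtype.name, :as => :normtype

    has region_id
    has normtype_id, :type => :bigint  # store as 64-bit to avoid overflow
    has data, :as => :data_timestamp
end
```

After changing the index definition, run rake ts:rebuild, which regenerates the Sphinx configuration and reindexes, so the new attribute type takes effect.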

If it's the date column, then you'll need to inform Thinking Sphinx to translate date/time values to be 64-bit integer timestamps by adding the following to each necessary environment in config/thinking_sphinx.yml:

development:
  64bit_timestamps: true

A third possible source of the issue, I guess, is the primary key being bigger than a 32-bit integer, but TS should detect bigint columns and handle document ids appropriately. Of course, Sphinx also needs to be compiled to handle 64-bit document ids, but I would expect this to be the default (the compile flag, for reference's sake, is --enable-id64).

And if none of that helps... then, well, I'm at a loss to what the cause may be.



Source: https://stackoverflow.com/questions/25304951/indexing-for-sphinx-of-legacy-data-generating-some-errors
