fulltext index returning no results from pdf filestream

冷眼眸甩不掉的悲伤 submitted on 2019-12-12 09:47:07

Question


I have a filestream table running on SQL Server 2012 on a Windows 8.1 x64 machine, which already has a few PDF and TXT files stored, so I decided to create a fulltext index to search through these files by using the following commands:

CREATE FULLTEXT CATALOG FileStreamFTSCatalog AS DEFAULT;

CREATE FULLTEXT INDEX ON storage
(FileName Language 1046, File TYPE COLUMN FileExtension Language 1046)
KEY INDEX PK__storage__3214EC077DADCE3C
ON FileStreamFTSCatalog
WITH CHANGE_TRACKING AUTO;

Then I ran these commands after reading about other people having the same problem:

EXEC sp_fulltext_service @action='load_os_resources', @value=1;
EXEC sp_fulltext_service 'verify_signature', 0;
EXEC sp_fulltext_service 'update_languages';
Exec sp_fulltext_service 'ft_timeout', 600000;
Exec sp_fulltext_service 'ism_size',@value=16;
EXEC sp_fulltext_service 'restart_all_fdhosts';
EXEC sp_help_fulltext_system_components 'filter';
reconfigure with override

I can see the PDF IFilter configured

filter  .pdf    E8978DA6-047F-4E3D-9C78-CDBE46041603    C:\Program Files\Adobe\Adobe PDF iFilter 11 for 64-bit platforms\bin\PDFFilter.dll  11.0.1.36   Adobe Systems, Inc.

and I can even do a

select * from storage
where contains(*, 'data')

but it's returning only the indexed TXT files, so I'm wondering: is there anything else I need to do to start indexing my PDFs? Or is it necessary to create another table and reinsert all the PDFs I already had stored, even though the TXT files are getting indexed just fine?


UPDATE 1:

Opening the SQLFTXXX.LOG I get this message (for the FileTable):

2014-08-20 06:32:09.48 spid29s     Warning: No appropriate filter was found during full-text index population for table or indexed view '[text_storage].[dbo].[storage_table]' (table or indexed view ID '355584405', database ID '7'), full-text key value '篰磧'. Some columns of the row were not indexed.

And this one (for the FileStream table):

2014-08-19 22:14:50.58 spid20s     Warning: No appropriate filter was found during full-text index population for table or indexed view '[text_storage].[dbo].[storage]' (table or indexed view ID '674101442', database ID '7'), full-text key value '1797'. Some columns of the row were not indexed.
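
The warnings above can be cross-checked from T-SQL. The following is only a sketch, assuming the table is `dbo.storage` in the current database and the type column is `FileExtension` as in the CREATE FULLTEXT INDEX statement above. It reports how many rows the population processed and failed, lists the extensions stored in the type column, and shows which filter the instance has loaded for PDF and TXT documents.

```sql
-- Rows processed and rows that could not be indexed by the last population
SELECT
    OBJECTPROPERTYEX(OBJECT_ID('dbo.storage'), 'TableFulltextDocsProcessed') AS docs_processed,
    OBJECTPROPERTYEX(OBJECT_ID('dbo.storage'), 'TableFulltextFailCount')     AS docs_failed;

-- Extensions stored in the type column, with the number of files for each
SELECT FileExtension, COUNT(*) AS files
FROM dbo.storage
GROUP BY FileExtension;

-- Filters the Full-Text Engine has registered for these document types
SELECT document_type, path, version, manufacturer
FROM sys.fulltext_document_types
WHERE document_type IN ('.pdf', '.txt');
```

If `docs_failed` roughly matches the number of PDF rows, the filter is registered but is not being applied to those rows, which is consistent with the log messages.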

Answer 1:


I ran into the same problem. I have a filestream table on SQL Server 2012 Standard populated with PDFs. I downloaded Adobe's iFilter 11 and created a full-text index on the PDFs. I was not able to make it work in production: the filestream table was populated, but the full-text index was not, and this error appeared in the log (SQL Server Log folder, SQLFTxxxxx.LOG): Warning: No appropriate filter was found during full-text index population for table or indexed view

It turned out that the archive bit on the files was set to on. When I turned it off, the full text search populated and searches started to work.
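
If the files live in a FileTable, the archive attribute is exposed as the `is_archive` column, so it should be possible to clear it from T-SQL. This is a sketch under assumptions, not a confirmed fix: the table name `dbo.storage_table` is borrowed from the log message in the question, and the `file_type` filter assumes the FileTable's computed extension column holds `pdf` without a leading dot.

```sql
-- Clear the archive attribute on the PDF rows of the FileTable
-- (dbo.storage_table is an assumed name taken from the question's log)
UPDATE dbo.storage_table
SET is_archive = 0
WHERE file_type = 'pdf'
  AND is_archive = 1;

-- Repopulate the full-text index so the PDFs are processed again
ALTER FULLTEXT INDEX ON dbo.storage_table START FULL POPULATION;
```

For a plain FILESTREAM table there is no such column, so this sketch does not apply there.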

Hope this helps someone else. Also, if you have insight into why it works this way, please let us know. From researching the archive bit, it appears that it indicates that the file is new or changed and in need of a backup. Thanks!




Answer 2:


There is another possible fix to this problem; installing some versions of Acrobat or Reader can break the PDF iFilter. Adobe posts this workaround:

https://helpx.adobe.com/acrobat/kb/pdf-search-breaks-110-install.html

Solution

Do one of the following:

Update to Acrobat/Reader 11.0.4 or higher. The issue is fixed in version 11.0.4.

PDF iFilter 9 is not supported on Windows 8; update to PDF iFilter 11.

If you cannot update your Acrobat/Reader or PDF iFilter, use the workaround below.

Workaround: Restore the registry entry to the Windows 8 native entry as follows:

  1. Go to HKEY_CLASSES_ROOT\.pdf\PersistentHandler. Create the key if it does not exist.

  2. Verify that the value is 1AA9BF05-9A97-48c1-BA28-D9DCE795E93C. If the Acrobat or Reader install overwrote the entry with F6594A6D-D57F-4EFD-B2C3-DCD9779E382E, return it to its original value.

  3. If you have any third-party PDF iFilters installed, reinstall them.

  4. Restart the Windows Search service:

    1. Go to Task Manager > Services.
    2. Select WSearch.
    3. Right-click, and then choose Restart.



Answer 3:


I had to use Adobe iFilter 9 for SQL Server 2014 and 2017.

ftp://ftp.adobe.com/pub/adobe/acrobat/win/9.x/PDFiFilter64installer.zip




Answer 4:


I've finally found a solution. After trying both the Adobe and Foxit iFilters and getting the same error message, I found another iFilter called "PDFlib". I downloaded it, followed its instructions to make it available to SQL Server, and rebuilt the index. Now my PDFs are indexed and can be searched.

I believe that if I follow the same instructions for the other iFilters, they will work as well. I'll try that after I'm done with my tests and update with the results.
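
For reference, the "make it available to SQL Server and rebuild" part might look roughly like the following. This is a sketch built from the commands in the question plus a catalog rebuild, not the PDFlib instructions themselves; the catalog name `FileStreamFTSCatalog` comes from the question.

```sql
-- Allow the Full-Text Engine to load filters registered with the operating system
EXEC sp_fulltext_service @action = 'load_os_resources', @value = 1;

-- Do not require the filter DLL to be signed
EXEC sp_fulltext_service @action = 'verify_signature', @value = 0;

-- Restart the filter daemon hosts so a newly installed filter is picked up
EXEC sp_fulltext_service @action = 'restart_all_fdhosts';

-- Confirm which DLL is now registered for .pdf
EXEC sp_help_fulltext_system_components 'filter';

-- Rebuild the catalog so existing rows are run through the filter again
ALTER FULLTEXT CATALOG FileStreamFTSCatalog REBUILD;
```

The rebuild matters because rows that failed with the old filter are not retried until they are repopulated.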



Source: https://stackoverflow.com/questions/25388250/fulltext-index-returning-no-results-from-pdf-filestream
