Question
I'm getting my feet wet with handling a load of Office and PDF documents with SQL Server 2012's FILETABLE
feature, and using fulltext search on top of that.
I've configured my SQL Server to support fulltext search and filestream, and I've created a FILETABLE, dumped 800+ documents of all sorts into the folder, and that all works nicely.
In order to be able to fulltext index MS Office documents, I've installed the MS Filter Pack 2.0, and to handle the PDF files, I've downloaded Adobe's iFilter for PDF and installed them all.
Now I've created a full text catalog:
    CREATE FULLTEXT CATALOG DocumentCatalog
        WITH ACCENT_SENSITIVITY = OFF
and then a full text index on the FILETABLE table:
    CREATE FULLTEXT INDEX
        ON dbo.Documents(name, file_type, file_stream)
        KEY INDEX [PK_Document]
        ON DocumentCatalog
and that all seemed to work just fine. After a while, once the 800+ documents I have had been populated, I could start doing searches:
    SELECT
        stream_id, name, file_type, cached_file_size,
        file_stream.GetFileNamespacePath(1)
    FROM
        dbo.Documents
    WHERE
        CONTAINS(*, 'Silverlight')
and stuff that is contained in MS Office documents (*.doc, *.docx, *.ppt, *.pptx, *.xls, *.xlsx) is found quite nicely - and quickly.
Unfortunately, none of the text in the PDF files seems to be found :-(
Any ideas why? I had no errors during setup, and all seems fine - I can see the .pdf file type in the Filters in SQL Server:
SELECT *
FROM sys.fulltext_document_types
returns:
    document_type: .pdf
    class_id:      E8978DA6-047F-4E3D-9C78-CDBE46041603
    path:          C:\Program Files\Adobe\Adobe PDF iFilter 11 for 64-bit platforms\bin\PDFFilter.dll
    version:       11.0.1.36
    manufacturer:  Adobe Systems, Inc.
but somehow, those PDFs don't seem to be indexed. Can I somehow find out which files were in fact indexed, and whether or not there was an error during population? Where would I find this information?
Answer 1:
I had to use Adobe iFilter 9 not 11.
ftp://ftp.adobe.com/pub/adobe/acrobat/win/9.x/PDFiFilter64installer.zip
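After swapping the iFilter, SQL Server typically has to reload the operating system's filters and re-crawl the documents before the PDFs become searchable. The following is only a rough sketch of how that might look, and of how the population counters asked about in the question could be inspected. It assumes the dbo.Documents table from the question, and it assumes that restarting the filter daemon hosts is enough (restarting the SQL Server service is the other common route). The search term 'Silverlight' is simply reused from the question.

```sql
-- Sketch only: run in the database that contains dbo.Documents.

-- 1. Let SQL Server use iFilters registered with the operating system.
EXEC sp_fulltext_service @action = 'load_os_resources', @value = 1;

-- 2. Restart the filter daemon hosts so the replaced filter gets loaded.
EXEC sp_fulltext_service @action = 'restart_all_fdhosts';

-- 3. Check which filter DLL is now registered for .pdf.
SELECT document_type, path, version, manufacturer
FROM sys.fulltext_document_types
WHERE document_type = '.pdf';

-- 4. Re-crawl all documents with the new filter.
ALTER FULLTEXT INDEX ON dbo.Documents START FULL POPULATION;

-- 5. Inspect population progress and failures for the table.
--    populate_status = 0 means the population is idle (finished).
SELECT
    OBJECTPROPERTYEX(OBJECT_ID('dbo.Documents'), 'TableFulltextPopulateStatus') AS populate_status,
    OBJECTPROPERTYEX(OBJECT_ID('dbo.Documents'), 'TableFulltextDocsProcessed')  AS docs_processed,
    OBJECTPROPERTYEX(OBJECT_ID('dbo.Documents'), 'TableFulltextFailCount')      AS docs_failed;

-- 6. Once the population is idle, search the PDF content only.
SELECT name
FROM dbo.Documents
WHERE name LIKE '%.pdf'
  AND CONTAINS(file_stream, 'Silverlight');
```

If docs_failed is non-zero, the per-document errors should be in the full-text crawl log, which normally sits in the instance's LOG folder under a name starting with SQLFT.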
Source: https://stackoverflow.com/questions/34993405/sql-server-2012-fulltext-search-on-top-of-a-filetable-pdf-not-being-searched