Full text indexing on large files (more than 32k)

风流意气都作罢 提交于 2019-12-06 12:39:01

问题


Is it possible to use Azure Search on blobs over 32kB size? I have around 500GB of text files stored as blobs on Azure. Average blob size is around 1MB. I was so exited to try Azure Search to have full text search on files. However, it looks like index field Edm.String cannot be more than 32kB. I couldn't find this exact limit anywhere, I extracted this information from error message in the portal.

Is there any out of the box solution on Azure that I can use to add full text search functionality on Blobs? Does Azure team plan to remove 32kB field size?


回答1:


Two different limits are potentially relevant here:

  1. Azure Search has a limit on how many characters it will extract from a blob, depending on the pricing tier. For free tier, that limit is 32*1024 characters. For the Standard S1 and S2 pricing tiers, it's 4 million characters.

  2. Separately, there's a limit on the size of a single term in the search index - it also happens to be 32KB. If the content field in your search index is marked as filterable, facetable or sortable then you'll hit this limit (regardless of whether the field is marked as searchable or not). Typically for large searchable content you want to enable searchable and sometimes retrievable but not the rest. That way you won't hit limits on content length from the index side.

We realize that the first limit especially isn't documented now; we'll reflect this in our Quotas and Limits page soon.



来源:https://stackoverflow.com/questions/37149382/full-text-indexing-on-large-files-more-than-32k

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!