Question
I am using Microsoft SharePoint Search (MOSS) to search all pages on a website.
My problem is that a word in the header, footer, menu or tag cloud section of the website appears on every page. When you search for such a word, the search server returns every page on the website as a result.
Ideally I want to tell the search server to ignore certain HTML sections in its search index.
This website seems to describe my problem, and a guy says "why not hide those sections of your website if the User Agent is the search server?"
The problem with that approach is that most of the sections I would hide contain links to other pages (menus and tag clouds), so the crawler will hit a dead end and won't crawl very far.
Anyone got any suggestions on how to solve this problem?
Answer 1:
I'm not sure if I'm reading this correctly. You DON'T want Search to include parts of your site in the index, but you DO want it to go into those sections and follow any links in them?
I think the best way is indeed to exclude those sections based on user agent (i.e. add them to a user control, and if the user agent is MS Search you don't render the section).
Seeing as these sections would be the same on every page, it's okay to exclude them when the search crawler comes by.
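To make the idea concrete, here is a minimal sketch of the user agent check, written in Python rather than as an actual SharePoint user control. The token "MS Search" and the names `is_search_crawler` and `render_page` are assumptions for illustration only; check your server logs for the exact User-Agent string your crawler sends.

```python
# Assumption: the crawler's User-Agent header contains this token.
# Verify against your own request logs before relying on it.
CRAWLER_TOKEN = "MS Search"


def is_search_crawler(user_agent):
    """Return True if the User-Agent looks like the search crawler."""
    # A missing header is treated as a normal visitor.
    return CRAWLER_TOKEN.lower() in (user_agent or "").lower()


def render_page(body_html, chrome_html, user_agent):
    """Return the page HTML, leaving out the shared sections for the crawler.

    chrome_html stands for the header / menu / tag cloud markup that is
    identical on every page (hypothetical parameter name).
    """
    if is_search_crawler(user_agent):
        return body_html
    return chrome_html + body_html
```

A normal browser gets the full page, while a request whose User-Agent contains the token gets only the page-specific body, so the repeated words never reach the index.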
Just create ONE page (i.e. a sitemap :-D) that does include all the links a normal user would see in the footer / header / etc. The crawler can then use that page to follow links deeper into your site. This would be a performance boost as well, seeing as the crawler only encounters the links once instead of on every page.
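The single link page could be generated along these lines. This is only a sketch in Python; `build_sitemap_html` and the `(title, url)` input format are hypothetical, and in a real site the links would come from whatever feeds your menu and tag cloud.

```python
from html import escape


def build_sitemap_html(links):
    """Build an HTML list from (title, url) pairs, listing each URL once."""
    seen = set()
    items = []
    for title, url in links:
        if url in seen:
            # The same target often appears in both the menu and the tag cloud.
            continue
        seen.add(url)
        items.append(
            '<li><a href="{}">{}</a></li>'.format(escape(url, quote=True), escape(title))
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

The crawler still needs a way to reach this page once the menus are hidden from it, for example by adding the page as a start address for the crawl or by leaving one link to it on every page.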
Source: https://stackoverflow.com/questions/1763239/microsoft-sharepoint-search-ignore-sections-of-the-page