Keyword analysis in PHP

走了就别回头了 2021-01-30 03:47

For a web application I'm building, I need to analyze a website, retrieve and rank its most important keywords, and display them.

Getting all words, their density and

5 answers
  •  挽巷
    挽巷 (OP)
    2021-01-30 04:13

    This is probably a small contribution, but I'll mention it nonetheless.

    Context scoring

    To a certain extent you're already looking at the context of a word by using the position in which it's placed. You could add another factor by ranking words that appear in a heading (H1, H2, etc.) higher than words inside a paragraph, which in turn rank higher than, say, words in a bulleted list.
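
A minimal sketch of that kind of context scoring, assuming the page is parsed with PHP's DOM extension. The function name `scoreWordsByContext` and the numbers in `TAG_WEIGHTS` are made up for illustration and would need tuning against real pages:

```php
<?php
// Hypothetical weights per tag: headings count more than paragraphs,
// paragraphs more than list items. These values are assumptions.
const TAG_WEIGHTS = ['h1' => 5.0, 'h2' => 4.0, 'h3' => 3.0, 'p' => 1.0, 'li' => 0.5];

/**
 * Score each word by summing the weight of every element it appears in.
 * Caveat: text in nested elements (e.g. a <p> inside an <li>) is counted
 * once per matching ancestor, so it picks up both weights.
 */
function scoreWordsByContext(string $html, array $weights = TAG_WEIGHTS): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them;
    // real-world markup is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    // Note: loadHTML() assumes ISO-8859-1 unless the markup declares a charset.
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $scores = [];
    foreach ($weights as $tag => $weight) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $text = mb_strtolower($node->textContent);
            // Split on anything that is not a letter or a digit.
            $words = preg_split('/[^\p{L}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
            foreach ($words as $word) {
                $scores[$word] = ($scores[$word] ?? 0.0) + $weight;
            }
        }
    }
    arsort($scores); // highest score first, keys preserved
    return $scores;
}

$html = '<h1>PHP keywords</h1><p>Ranking keywords in PHP is fun</p>';
$scores = scoreWordsByContext($html);
print_r($scores);
```

In this sample, "php" and "keywords" each end up with 6.0 (5.0 from the heading plus 1.0 from the paragraph), while words that only occur in the paragraph score 1.0. These scores could then be multiplied with whatever density or position score you already have.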

    Frequency sanitization

    Detecting stop words per language might work, but you could also consider using a bell curve to determine which word frequencies / densities are too extreme (e.g. strip everything below the 5th percentile and above the 95th). Then apply the scoring to the remaining words. Not only does this filter out stop words, it also counters keyword stuffing, at least in theory :)
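
One possible reading of that idea, sketched under assumptions: rank the words by raw count and cut the same share off both ends of the ranking. The function name `trimFrequencyTails` and the 5% default are hypothetical, and this is a rank-based trim rather than a fitted bell curve:

```php
<?php
/**
 * Drop the least and most frequent words from a word => count map.
 *
 * @param array $counts      word => number of occurrences
 * @param int   $tailPercent share (in percent) to cut from EACH end
 */
function trimFrequencyTails(array $counts, int $tailPercent = 5): array
{
    asort($counts); // ascending by count, keys preserved
    $total = count($counts);
    // Integer arithmetic avoids floating-point rounding surprises.
    $cut = intdiv($total * $tailPercent, 100);
    // preserve_keys = true so purely numeric "words" keep their keys too.
    return array_slice($counts, $cut, $total - 2 * $cut, true);
}

// Toy data: w1 occurs once, w2 twice, ... w20 twenty times.
$counts = [];
for ($i = 1; $i <= 20; $i++) {
    $counts["w$i"] = $i;
}
$kept = trimFrequencyTails($counts);
print_r(array_keys($kept));
```

With 20 distinct words and a 5% tail, one word is removed from each end, so `w1` and `w20` are dropped and 18 words remain. Words that tie on count are cut in sort order, and on short pages the cut rounds down to zero, so a threshold on the counts themselves may suit real data better.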
