How does Google find relevant content when it's parsing the web?
Let's say, for instance, Google uses the PHP native DOM Library to parse content. What methods would t
I'd just grab the first 'paragraph' of text. The way most people write stories/problems/whatever is that they first state the most important thing, and then elaborate. Look at any random text and you'll see that this holds most of the time.
For example, you do it yourself in your original question. If you take the first three sentences of your original question, you have a pretty good summary of what you are trying to do.
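To make the "first three sentences" idea concrete, here is a minimal sketch in Python. The function name `first_sentences` and the splitting rule are my own assumptions for illustration. The rule is naive (it breaks after `.`, `!` or `?` followed by whitespace), so abbreviations like "e.g." will trip it up.

```python
import re


def first_sentences(text, count=3):
    """Return the first `count` sentences of `text` as one string.

    Hypothetical helper: sentence boundaries are guessed with a simple
    regex, which is only a rough heuristic.
    """
    # Split wherever sentence-ending punctuation is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:count])
```

For anything beyond a quick summary you would probably want a real sentence tokenizer, but for picking a teaser out of a question or article this is often enough.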
And, I just did it myself too: the gist of my comment is summarized in the first paragraph. The rest is just examples and elaborations. If you're not convinced, take a look at a few recent articles I semi-randomly picked from Google News. Ok, that last one was not semi-random, I admit ;)
Anyway, I think that this is a really simple approach that works most of the time. You can always look at meta descriptions, titles and keywords first, but if they aren't there, the first paragraph is a reasonable fallback.
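As a rough sketch of that fallback order, the snippet below uses Python's standard `html.parser` to prefer the meta description and otherwise take the first non-empty `<p>`. The names `SummaryParser` and `summarize` are made up for this example, and it only handles the description (not titles or keywords). This is not how Google does it, just one way to implement the idea, and the same logic should carry over to a DOM-based parser in PHP.

```python
from html.parser import HTMLParser


class SummaryParser(HTMLParser):
    """Collects the meta description and the first non-empty <p> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta_description = None
        self.first_paragraph = None
        self._in_paragraph = False
        self._buffer = []

    def _finish_paragraph(self):
        # Join the collected text pieces and collapse runs of whitespace.
        text = " ".join("".join(self._buffer).split())
        self._in_paragraph = False
        self._buffer = []
        if text and self.first_paragraph is None:
            self.first_paragraph = text

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "description":
            content = (attr.get("content") or "").strip()
            if content and self.meta_description is None:
                self.meta_description = content
        elif tag == "p":
            # A new <p> implicitly closes an unclosed one.
            if self._in_paragraph:
                self._finish_paragraph()
            if self.first_paragraph is None:
                self._in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            self._finish_paragraph()

    def handle_data(self, data):
        if self._in_paragraph:
            self._buffer.append(data)


def summarize(html):
    """Return the meta description if present, else the first paragraph, else None."""
    parser = SummaryParser()
    parser.feed(html)
    parser.close()
    if parser._in_paragraph:  # document ended inside an unclosed <p>
        parser._finish_paragraph()
    return parser.meta_description or parser.first_paragraph
```

Calling `summarize(page_html)` on a page without a description tag simply returns the text of its first paragraph, with inline tags such as `<b>` stripped.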
Hope this helps.