Is there an (unobtrusive, to the user) way to get all the text in a page with JavaScript? I could get the HTML, parse it, remove all tags, etc., but I'm wondering if there'
As an improvement to Greg W's answer, you could also get rid of the stray 'undefined' at the start of the collected text and strip out any numbers, since they are not words.
function countWords() {
    // Start with an empty string; an uninitialized variable would
    // put the text "undefined" in front of the first element's text.
    var collectedText = "";
    $('p,h1,h2,h3,h4,h5').each(function (index, element) {
        collectedText += element.innerText + " ";
    });
    // Remove numbers, they're not words
    collectedText = collectedText.replace(/[0-9]/g, '');
    // Split on runs of whitespace and drop empty entries before counting
    var words = collectedText.split(/\s+/).filter(Boolean);
    console.log("You have " + words.length + " words in your document.");
    return collectedText;
}
The returned text can then be split into an array of words, reduced to a word count, or processed however you like.
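As a rough sketch of that last step, something like the following would turn the collected text into an array of words and a count. The helper name `splitIntoWords` is my own invention, not part of the answer above, and it assumes that words are separated by whitespace; punctuation stays attached to the words.

```javascript
// Hypothetical helper (the name is an assumption, not from the original answer):
// turns a block of collected text into an array of words.
function splitIntoWords(text) {
    return text
        .replace(/[0-9]/g, '')   // drop digits, as countWords() does
        .split(/\s+/)            // split on runs of whitespace
        .filter(Boolean);        // discard empty strings left by leading/trailing spaces
}

var words = splitIntoWords("Hello world this is 1 test ");
console.log(words);        // the array of words
console.log(words.length); // 5
```

In the browser you would pass it the string returned by `countWords()` instead of a literal.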