Count number of words in string using JavaScript

Submitted by 自闭症网瘾萝莉.ら on 2019-11-30 19:40:37

You can make clever use of the replace() method even though you are not actually replacing anything.

var str = "the very long text you have...";

var counter = 0;

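// Loop through the string and count the words.
// Note: \b+ is not a valid pattern in JavaScript (an assertion cannot be repeated),
// so match runs of word characters instead.
str.replace(/\w+/g, function () {
   // For each word found, increase the counter by 1
   counter++;
});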

alert(counter);

The regex can be improved further, for example to exclude HTML tags.
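
As a rough sketch of the HTML remark above: one option is to strip anything that looks like a tag before counting. The function name countWordsIgnoringTags is made up for this illustration, and the tag pattern is deliberately naive (it is not an HTML parser and will misbehave on things like a > inside an attribute value).

```javascript
// Hypothetical helper, not part of the original answer.
// Assumption: a "tag" is anything between < and the next >.
function countWordsIgnoringTags(html) {
  // Replace each tag with a space so that words on either side stay separate.
  var text = html.replace(/<[^>]*>/g, " ");
  // match() returns null when there is no match, so guard against that.
  var matches = text.match(/\w+/g);
  return matches ? matches.length : 0;
}

console.log(countWordsIgnoringTags("<p>Hello <b>big</b> world</p>")); // 3
console.log(countWordsIgnoringTags("<br>")); // 0
```

For real markup, extracting the text with a proper parser first is likely the safer route.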

You can use split and add a wordcounter to the String prototype:

String.prototype.countWords = function(){
  return this.split(/\s+/).length;
}

'this string has five words'.countWords(); //=> 5

If you want to exclude things like ... or - in a sentence:

String.prototype.countWords = function(){
  return this.split(/\s+\b/).length;
}

'this string has seven ... words  - and counting'.countWords(); //=> 7
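
One caveat with the split-based versions: leading or trailing whitespace produces empty strings in the result, and those get counted. A minimal sketch of a workaround, using a standalone function (countWordsBySplit is a name invented here) rather than extending String.prototype:

```javascript
// Hypothetical helper: trim first, and treat an empty result as zero words.
function countWordsBySplit(str) {
  var trimmed = str.trim();
  if (trimmed === "") {
    return 0;
  }
  return trimmed.split(/\s+/).length;
}

// Plain split counts the empty strings at both ends:
console.log("  two words ".split(/\s+/).length); // 4
// The trimmed version does not:
console.log(countWordsBySplit("  two words ")); // 2
console.log(countWordsBySplit("   ")); // 0
```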

I would prefer a RegEx only solution:

var str = "your long string with many words.";
var wordCount = str.match(/(\w+)/g).length;
alert(wordCount); //6

The regex is

\w+    between one and unlimited word characters
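/g     global - don't stop after the first match

The parentheses create a capturing group, but it is not needed here: with the g flag, match() returns an array of all full matches, so the length of that array is the word count. Note that match() returns null when nothing matches, so .length will throw for a string with no words.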
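
To illustrate what the g flag changes, here is a small sketch comparing match() with and without it. The variable names are chosen for this example only.

```javascript
var sample = "your long string with many words.";

// Without g: only the first match is returned (with extra properties such as index).
var first = sample.match(/\w+/);
console.log(first[0]); // "your"

// With g: an array of every match, so its length is the word count.
var all = sample.match(/\w+/g);
console.log(all.length); // 6

// No match at all: match() returns null, so check before reading .length.
var none = "...".match(/\w+/g);
console.log(none); // null
```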

// Count words in a string, or what appears to be words :-)

        function countWordsString(string) {

            // Trim the ends and collapse each run of whitespace into one space
            string = string.trim().replace(/\s+/g, ' ');

            // An empty or all-whitespace string contains no words
            if (string === '') {
                return 0;
            }

            var counter = 1;

            // Each remaining space separates two words, so count the spaces
            string.replace(/ /g, function () {
               counter++;
            });

            return counter;
        }


        var numberWords = countWordsString('count the words in this string'); //=> 6

This is the best solution I've found:

function wordCount(str) {
  var m = str.match(/[^\s]+/g);
  return m ? m.length : 0;
}

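This matches runs of non-whitespace characters rather than word characters. That is better than \w+, because \w only matches ASCII letters, digits, and _ (see http://www.ecma-international.org/ecma-262/5.1/#sec-15.10.2.6), so words containing any other character get split or missed.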
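
A short sketch of the difference on text with accented letters. The sample string is invented for this illustration, and the third pattern assumes an engine with ES2018 Unicode property escapes (the u flag with \p{...}), which older browsers do not support.

```javascript
// "naïve café", written with escapes so the file encoding does not matter.
var accented = "na\u00EFve caf\u00E9";

// \w stops at the accented letters, so the words get split or cut short.
var asciiWords = accented.match(/\w+/g);
console.log(asciiWords); // ["na", "ve", "caf"]

// Non-whitespace runs keep each word whole.
var nonSpaceWords = accented.match(/[^\s]+/g);
console.log(nonSpaceWords.length); // 2

// Assumption: ES2018 support. Letters and digits from any script, plus _.
var unicodeWords = accented.match(/[\p{L}\p{N}_]+/gu);
console.log(unicodeWords.length); // 2
```

The non-whitespace pattern also counts standalone punctuation as a word, while the Unicode pattern does not. Which one fits depends on what you want to call a word.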

If you're not careful with whitespace matching, you'll count empty strings, leading and trailing whitespace, and all-whitespace strings as words. This solution handles strings like ' ' and ' a\t\t!\r\n#$%() d ' correctly (if you define 'correct' as 0 and 4).
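
The edge cases mentioned above can be checked with a quick comparison against the plain split(/\s+/) approach. This is only a sketch: wordCount is repeated here so the snippet runs on its own, and the inputs array is made up for the check.

```javascript
function wordCount(str) {
  var m = str.match(/[^\s]+/g);
  return m ? m.length : 0;
}

var inputs = ["", " ", " a\t\t!\r\n#$%() d "];

// Count via split for comparison; it includes the empty strings at the edges.
var splitCounts = inputs.map(function (s) {
  return s.split(/\s+/).length;
});
var matchCounts = inputs.map(wordCount);

console.log(splitCounts); // [1, 2, 6]
console.log(matchCounts); // [0, 0, 4]
```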
