Word frequency in JavaScript

自闭症患者 2020-12-05 16:57

How can I implement a JavaScript function to calculate the frequency of each word in

6 answers
  • 2020-12-05 17:16

    I feel you have over-complicated things by having multiple arrays, strings, and engaging in frequent (and hard to follow) context-switching between loops, and nested loops.

    Below is the approach I would encourage you to consider taking. I've inlined comments to explain each step along the way. If any of this is unclear, please let me know in the comments and I'll revisit to improve clarity.

    (function () {
    
        /* Below is a regular expression that matches runs of word characters
           (letters, digits, and underscores)
           Next is a string that could easily be replaced with a reference to a form control
           Lastly, we have an array that will hold any words matching our pattern
           (`match` returns null when nothing matches, so we fall back to an empty array) */
        var pattern = /\w+/g,
            string = "I I am am am yes yes.",
            matchedWords = string.match( pattern ) || [];
    
        /* The Array.prototype.reduce method assists us in producing a single value from an
           array. In this case, we're going to use it to output an object with results. */
        var counts = matchedWords.reduce(function ( stats, word ) {
    
            /* `stats` is the object that we'll be building up over time.
               `word` is each individual entry in the `matchedWords` array */
            if ( stats.hasOwnProperty( word ) ) {
                /* `stats` already has an entry for the current `word`.
                   As a result, let's increment the count for that `word`. */
                stats[ word ] = stats[ word ] + 1;
            } else {
                /* `stats` does not yet have an entry for the current `word`.
                   As a result, let's add a new entry, and set count to 1. */
                stats[ word ] = 1;
            }
    
            /* Because we are building up `stats` over numerous iterations,
               we need to return it for the next pass to modify it. */
            return stats;
    
        }, {} );
    
        /* Now that `counts` has our object, we can log it. */
        console.log( counts );
    
    }());
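
As a rough sketch, the same match-and-reduce idea can be wrapped in a reusable function. The name `countWords` is hypothetical and not part of the answer above. The example also shows one consequence of the `\w+` pattern: an apostrophe is not a word character, so a contraction such as "I'm" is counted as two separate words.

```javascript
// Hypothetical helper (not from the original answer): match-and-reduce as a function.
function countWords(text) {
    // match() with a global regex returns null when nothing matches,
    // so fall back to an empty array before reducing.
    var words = text.match(/\w+/g) || [];

    return words.reduce(function (stats, word) {
        // Calling hasOwnProperty via Object.prototype keeps the check working
        // even if the text contains the word "hasOwnProperty" itself.
        if (Object.prototype.hasOwnProperty.call(stats, word)) {
            stats[word] += 1;
        } else {
            stats[word] = 1;
        }
        return stats;
    }, {});
}

console.log(countWords("I I am am am yes yes.")); // { I: 2, am: 3, yes: 2 }
console.log(countWords("I'm here"));              // { I: 1, m: 1, here: 1 }
console.log(countWords(""));                      // {}
```

If contractions should stay intact, a pattern that tolerates apostrophes would be needed instead of `\w+`.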
    
  • 2020-12-05 17:16

    I'd go with Sampson's match-reduce method for slightly better efficiency. Here's a modified version of it that is more production-ready. It's not perfect, but it should cover the vast majority of scenarios (i.e., "good enough").

    function calcWordFreq(s) {
      // Normalize
      s = s.toLowerCase();
      // Strip quotes and brackets
      s = s.replace(/["“”(\[{}\])]|\B['‘]([^'’]+)['’]/g, '$1');
      // Strip dashes and ellipses
      s = s.replace(/[‒–—―…]|--|\.\.\./g, ' ');
      // Strip punctuation marks
      s = s.replace(/[!?;:.,]\B/g, '');
      // match() returns null when there are no words, so fall back to an empty array
      return (s.match(/\S+/g) || []).reduce(function(oFreq, sWord) {
        if (oFreq.hasOwnProperty(sWord)) ++oFreq[sWord];
        else oFreq[sWord] = 1;
        return oFreq;
      }, {});
    }
    

    calcWordFreq('A ‘bad’, “BAD” wolf-man...a good ol\' spook -- I\'m frightened!') returns

    {
      "a": 2,
      "bad": 2,
      "frightened": 1,
      "good": 1,
      "i'm": 1,
      "ol'": 1,
      "spook": 1,
      "wolf-man": 1
    }
    
  • 2020-12-05 17:17

    Here is a JavaScript function to get the frequency of each word in a sentence:

    function wordFreq(string) {
        var words = string.replace(/[.]/g, '').split(/\s/);
        var freqMap = {};
        words.forEach(function(w) {
            if (!freqMap[w]) {
                freqMap[w] = 0;
            }
            freqMap[w] += 1;
        });
    
        return freqMap;
    }
    

    It will return a hash of word to word count. So for example, if we run it like so:

    console.log(wordFreq("I am the big the big bull."));
    > Object {I: 1, am: 1, the: 2, big: 2, bull: 1}
    

    You can iterate over the words in sorted order with Object.keys(result).sort().forEach(function(word) {...}). So we could hook that up like so:

    var freq = wordFreq("I am the big the big bull.");
    Object.keys(freq).sort().forEach(function(word) {
        console.log("count of " + word + " is " + freq[word]);
    });
    

    Which would output:

    count of I is 1
    count of am is 1
    count of big is 2
    count of bull is 1
    count of the is 2
    

    JSFiddle: http://jsfiddle.net/ah6wsbs6/

    And here is the wordFreq function in ES6:

    function wordFreq(string) {
      return string.replace(/[.]/g, '')
        .split(/\s/)
        .reduce((map, word) =>
          Object.assign(map, {
            [word]: (map[word])
              ? map[word] + 1
              : 1,
          }),
          {}
        );
    }
    

    JSFiddle: http://jsfiddle.net/r1Lo79us/

  • 2020-12-05 17:23

    Here is an updated version of your own code...

    <!DOCTYPE html>
    <html>
    <head>
    <title>string frequency</title>
    <style type="text/css">
    #txt{
        width:250px;
    }
    </style>
    </head>
    
    <body >
    
    <textarea id="txt" cols="25" rows="3" placeholder="add your text here"></textarea><br>
    <button type="button" onclick="search()">search</button>
    
        <script >
    
            function search()
            {
                var data=document.getElementById('txt').value;
                var temp=data;
                var words=new Array();
                words=temp.split(" ");
    
                var unique = {};
    
    
                for (var i = 0; i < words.length; i++) {
                    var word = words[i];
                    console.log(word);
    
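                    // Own-property check: `in` would also match inherited names such as "constructor"
                    if (Object.prototype.hasOwnProperty.call(unique, word))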
                    {
                        console.log("word found");
                        var count  = unique[word];
                        count ++;
                        unique[word]=count;
                    }
                    else
                    {
                        console.log("word NOT found");
                        unique[word]=1;
                    }
                }
                console.log(unique);
            }
    
        </script>
    
    </body>
    </html>
    

    I think your loop was overly complicated. Also, trying to produce the final count while still doing your first pass over the array of words is bound to fail because you can't test for uniqueness until you have checked each word in the array.

    Instead of all your counters, I've used a JavaScript object as an associative array, so we can store each unique word along with the count of how many times it occurs.

    Then, once we exit the loop, we can see the final result.

    Also, this solution uses no regex ;)

    I'll also add that it's very hard to count words just based on spaces. In this code, "one, two, one" will result in "one," and "one" being treated as different, unique words.
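
To illustrate that last point, here is a minimal sketch of one possible workaround. Unlike the code above, it does use regular expressions, and the function name `countWordsLoosely` and the punctuation set are assumptions made for this example, not part of the answer. It trims common punctuation from both ends of each token before counting.

```javascript
// Splitting on a single space keeps punctuation attached to the words.
console.log("one, two, one".split(" ")); // [ 'one,', 'two,', 'one' ]

// Hypothetical helper: trims a small, assumed set of punctuation marks before counting.
function countWordsLoosely(text) {
    var counts = {};
    // Split on any run of whitespace so repeated spaces or newlines
    // do not produce extra tokens in the middle of the text.
    var tokens = text.split(/\s+/);

    for (var i = 0; i < tokens.length; i++) {
        // Strip punctuation from the start and end only: "one," becomes "one".
        var word = tokens[i].replace(/^[.,!?;:"()]+|[.,!?;:"()]+$/g, "");
        if (word === "") continue; // skip empty tokens (e.g. from leading whitespace)

        if (Object.prototype.hasOwnProperty.call(counts, word)) {
            counts[word] += 1;
        } else {
            counts[word] = 1;
        }
    }
    return counts;
}

console.log(countWordsLoosely("one, two, one")); // { one: 2, two: 1 }
```

This still treats "One" and "one" as different words; lowercasing the text first would merge them.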

  • 2020-12-05 17:26

    While both of the answers here are correct, and maybe even better, neither of them addresses the OP's question (what is wrong with his code).

    The problem with OP's code is here:

    if(f==0){
        count[i]=1;
        uniqueWords[i]=words[i];
    }
    

    For every new (unique) word, the code stores it in uniqueWords at the same index the word had in words, rather than at the next free position. Hence there are gaps in the uniqueWords array, which is the reason for some of the undefined values.

    Try printing uniqueWords. It should give something like:

    ["this", "is", "anil", 4: "kum", 5: "the"]

    Note there is no element at index 3.
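
The gap can be reproduced in isolation with a small sketch. The input sentence is an assumption, chosen only so that the word at index 3 is a repeat, and the function name `collectAtOriginalIndex` is hypothetical.

```javascript
// Hypothetical reproduction of the gap: each new word is stored at the
// index it had in `words`, not at the next free slot in `uniqueWords`.
function collectAtOriginalIndex(words) {
    var uniqueWords = [];
    for (var i = 0; i < words.length; i++) {
        if (uniqueWords.indexOf(words[i]) === -1) {
            uniqueWords[i] = words[i];
        }
    }
    return uniqueWords;
}

// Assumed input: "is" appears again at index 3, so nothing is stored there.
var gappy = collectAtOriginalIndex("this is anil is kum the".split(" "));

console.log(gappy.length); // 6
console.log(3 in gappy);   // false (index 3 is a hole)
console.log(gappy[3]);     // undefined
```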

    Also, the final counts should be printed only after all the words in the words array have been processed.

    Here's the corrected version:

    function search()
    {
        var data=document.getElementById('txt').value;
        var temp=data;
        var words=new Array();
        words=temp.split(" ");
        var uniqueWords=new Array();
        var count=new Array();
    
    
        for (var i = 0; i < words.length; i++) {
            //var count=0;
            var f=0;
            for(var j=0;j<uniqueWords.length;j++){
                if(words[i]==uniqueWords[j]){
                    count[j]=count[j]+1;
                    //uniqueWords[j]=words[i];
                    f=1;
                }
            }
            if(f==0){
                count[i]=1;
                uniqueWords[i]=words[i];
            }
        }
        for ( i = 0; i < uniqueWords.length; i++) {
            if (typeof uniqueWords[i] !== 'undefined')
                console.log("count of "+uniqueWords[i]+" - "+count[i]);       
        }
    }
    

    I have just moved the printing of the counts out of the processing loop into a new loop, and added an if check that skips undefined entries.

    Fiddle: https://jsfiddle.net/cdLgaq3a/
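
An alternative sketch, which is not the code from the fiddle, avoids the gaps altogether by appending each new word with `push`, so that `uniqueWords` and `count` stay dense and aligned. The function name `countWithParallelArrays` and the sample sentence are assumptions made for illustration.

```javascript
// Hypothetical variant: keep the two parallel arrays, but append instead of
// writing at index i, so no undefined check is needed when printing.
function countWithParallelArrays(text) {
    var words = text.split(" ");
    var uniqueWords = [];
    var count = [];

    for (var i = 0; i < words.length; i++) {
        var j = uniqueWords.indexOf(words[i]);
        if (j === -1) {
            // First time we see this word: add it and start its count at 1.
            uniqueWords.push(words[i]);
            count.push(1);
        } else {
            // Seen before: bump the count stored at the same position.
            count[j] += 1;
        }
    }
    return { uniqueWords: uniqueWords, count: count };
}

var result = countWithParallelArrays("this is anil is kum the");
for (var k = 0; k < result.uniqueWords.length; k++) {
    console.log("count of " + result.uniqueWords[k] + " - " + result.count[k]);
}
```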

  • 2020-12-05 17:29

    const sentence = 'Hi my friend how are you my friend';
    
    const countWords = (sentence) => {
        const convertToObject = sentence.split(" ").map( (i, k) => {
            return {
              element: {
                  word: i,
                  nr: sentence.split(" ").filter(j => j === i).length + ' occurrence',
              }
    
          }
      });
        return Array.from(new Set(convertToObject.map(JSON.stringify))).map(JSON.parse)
    };
    
    console.log(countWords(sentence));
