Counting bi-gram frequencies


感情败类 2021-02-06 17:10

I've written a piece of code that essentially counts word frequencies and inserts them into an ARFF file for use with weka. I'd like to alter it so that it can count bi-gram f

4 answers

借酒劲吻你 (original poster)

2021-02-06 18:15
This should get you started:
```
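def bigrams(words):
    """Yield (previous_word, word) pairs for an iterable of words."""
    wprev = None  # no previous word yet, so the first pair is (None, w1)
    for w in words:
        yield (wprev, w)
        wprev = w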
```
Note that the first bigram is `(None, w1)`, where `w1` is the first word, so you get a special bigram that marks start-of-text. If you also want an end-of-text bigram, add `yield (wprev, None)` after the loop.
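To go from the generator to actual frequencies, one option is to feed it into `collections.Counter`. The following is only a minimal sketch: the helper name `bigram_counts` and the lowercase, whitespace-based tokenization are assumptions made for illustration, not part of the original code, and writing the counts to an ARFF file is left out.

```python
from collections import Counter


def bigrams(words):
    """Yield (previous_word, word) pairs; the first pair is (None, w1)."""
    wprev = None
    for w in words:
        yield (wprev, w)
        wprev = w


def bigram_counts(text):
    """Count bigram frequencies in a string (hypothetical helper)."""
    # Assumption: lowercasing and splitting on whitespace is good enough;
    # swap in your own tokenizer if you need punctuation handling.
    words = text.lower().split()
    return Counter(bigrams(words))


counts = bigram_counts("the cat sat on the cat")
for pair, n in counts.most_common(3):
    print(pair, n)
```

Each key in `counts` is a `(previous_word, word)` tuple, so the start-of-text marker shows up as a key whose first element is `None`. If you would rather not count it, skip pairs where either element is `None`.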