POS-Tagger is incredibly slow

忘掉有多难 2020-12-10 16:21

I am using nltk to generate n-grams from sentences, after first removing the given stop words. However, nltk.pos_tag() is extremely slow, taking up to 0.6 s

3 answers
  •  醉梦人生
    2020-12-10 16:59

    Use pos_tag_sents for tagging multiple sentences:

    >>> import time
    >>> from nltk.corpus import brown
    >>> from nltk import pos_tag, pos_tag_sents
    >>> sents = brown.sents()[:10]  # first 10 tokenized sentences
    >>> # Tag a single sentence.
    >>> start = time.time(); tagged = pos_tag(sents[0]); print(time.time() - start)
    0.934092998505
    >>> # Tag 10 sentences with one pos_tag call each.
    >>> start = time.time(); tagged = [pos_tag(s) for s in sents]; print(time.time() - start)
    9.5061340332
    >>> # Tag the same 10 sentences with a single pos_tag_sents call.
    >>> start = time.time(); tagged = pos_tag_sents(sents); print(time.time() - start)
    0.939551115036
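
The timings above are consistent with a fixed setup cost being paid on every pos_tag call and only once per pos_tag_sents call. A likely explanation, at least for NLTK releases of that era, is that the tagger model is loaded inside each call. The sketch below illustrates that pattern with the standard library only. Everything in it (load_tagger, tag_one, tag_many, the toy tagger that labels every token NN, the 0.05 s sleep) is a hypothetical stand-in, not NLTK's actual code.

```python
import time

LOAD_COUNT = 0  # how many times the simulated model has been loaded


def load_tagger():
    """Hypothetical stand-in for loading a tagger model from disk."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    time.sleep(0.05)  # simulate an expensive load (assumed cost)
    # Toy tagger: labels every token "NN"; a real model would predict tags.
    return lambda tokens: [(tok, "NN") for tok in tokens]


def tag_one(tokens):
    """Mimics a per-call API: pays the load cost on every call."""
    return load_tagger()(tokens)


def tag_many(sentences):
    """Mimics a batch API: loads once, then tags every sentence."""
    tagger = load_tagger()
    return [tagger(tokens) for tokens in sentences]


sents = [["The", "dog", "barks"], ["Cats", "sleep"]] * 5  # 10 toy sentences

LOAD_COUNT = 0
per_call = [tag_one(s) for s in sents]
loads_per_call = LOAD_COUNT

LOAD_COUNT = 0
batched = tag_many(sents)
loads_batched = LOAD_COUNT

print(loads_per_call, loads_batched)  # 10 1
```

Both paths return the same tags; only the number of loads differs. In NLTK itself the practical equivalent is to call pos_tag_sents once per batch, or, in releases that ship nltk.tag.PerceptronTagger, to construct the tagger object once and reuse its tag() method.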
    
