optimizing byte-pair encoding

广开言路 2020-12-30 10:45

Noticing that byte-pair encoding (BPE) is sorely lacking from the large text compression benchmark, I very quickly made a trivial literal implementation of

9 answers
  •  囚心锁ツ
    2020-12-30 11:41

    There is an O(n) version of byte-pair encoding, which I describe here. I am getting a compression speed of ~200 kB/s in Java.
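
    For comparison, here is a minimal sketch of the naive approach that a "trivial literal implementation" usually means: rescan the whole sequence for the most frequent adjacent pair, replace it with a fresh symbol, and repeat. This is not the O(n) algorithm mentioned above. Each merge costs a full pass, so the total work grows with the number of merges. The names `bpe_compress`, `bpe_decompress`, and `max_merges` are hypothetical, and new symbols are assumed to be integers from 256 upward rather than unused byte values.

```python
from collections import Counter


def bpe_compress(data: bytes, max_merges: int = 256):
    """Naive BPE: one full rescan of the sequence per merge."""
    seq = list(data)      # symbols 0..255 are literal bytes
    rules = {}            # new symbol -> (left, right) pair it stands for
    next_symbol = 256     # assumption: merged symbols are ints >= 256
    for _ in range(max_merges):
        # Count adjacent pairs. Overlapping runs such as "aaa" are
        # over-counted, which can waste a merge but never breaks round-trips.
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break         # no pair repeats, so nothing left to gain
        out, i = [], 0
        while i < len(seq):
            # Replace non-overlapping occurrences, scanning left to right.
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        rules[next_symbol] = pair
        next_symbol += 1
    return seq, rules


def bpe_decompress(seq, rules) -> bytes:
    """Expand merged symbols back into bytes using an explicit stack."""
    out = bytearray()
    stack = list(reversed(seq))
    while stack:
        symbol = stack.pop()
        if symbol in rules:
            left, right = rules[symbol]
            stack.append(right)   # pushed first so left is expanded first
            stack.append(left)
        else:
            out.append(symbol)
    return bytes(out)
```

    Usage: `seq, rules = bpe_compress(b"abababab")` followed by `bpe_decompress(seq, rules)` gives back the original bytes. A faster version would avoid the per-merge rescan by updating pair counts incrementally, which this sketch does not attempt.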
