How to add punctuation marks to sentences?

試著忘記壹切 submitted on 2021-02-16 15:39:06


How should I approach the problem of building a punctuation predictor?

The working demo for the question can be found in this link.

The input text is as follows:

"its been a little while Kirk tells me its actually been
three weeks now that Ive been using this device right here
that is of course the Galaxy S ten I mean Ive just been
living with this phone this has been my phone has the SIM
card in it I took photos I lived live I sent tweets whatsapp
slack email whatever other app this was my smart phone"


Predicting punctuation for text (in particular for speech transcriptions) is a well-known problem.

You could try using Punctuator2, either with the provided models or by training new models for text from your domain. Look at the bottom of the README for pointers to some related projects.

Grammarly developed a simpler approach for only inserting periods between run-on sentences, described here:

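They ran some nice experiments comparing real and artificial training data. The artificial route is useful because it's easy to generate training data from texts that you know have reliable punctuation at sentence boundaries, like newspaper text: strip the punctuation to get the model input, and keep the original marks as the labels to predict.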
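As a rough illustration of generating artificial training data from well-punctuated text, here is a minimal Python sketch. It is not the code used by Punctuator2 or Grammarly; the function names, the label names (`PERIOD`, `COMMA`, `QUESTION`, `O`), and the choice to lowercase every token are assumptions made for this example.

```python
import re

# Punctuation marks kept as labels (an assumption for this sketch).
# Any other non-word character is simply dropped.
LABELS = {".": "PERIOD", ",": "COMMA", "?": "QUESTION"}


def make_training_pairs(text):
    """Turn punctuated text into (token, label) pairs.

    The label records which mark, if any, followed the token in the
    original text; "O" means no mark followed it.
    """
    pairs = []
    # Split into words (apostrophes allowed) and single punctuation marks.
    for piece in re.findall(r"[A-Za-z0-9']+|[.,?]", text):
        if piece in LABELS:
            # Attach the mark to the preceding token, if it has no label yet.
            if pairs and pairs[-1][1] == "O":
                pairs[-1] = (pairs[-1][0], LABELS[piece])
        else:
            # Lowercase so the model cannot rely on capitalization cues.
            pairs.append((piece.lower(), "O"))
    return pairs


def strip_punctuation(pairs):
    """Rebuild the unpunctuated input string a model would see."""
    return " ".join(token for token, _ in pairs)
```

Calling `make_training_pairs("It's been a while. Kirk tells me, it's been three weeks.")` yields pairs such as `("while", "PERIOD")` and `("me", "COMMA")`, and `strip_punctuation` on the result gives the unpunctuated string to use as model input. A sequence labeller trained on such pairs predicts a label per token, and the predicted labels are then turned back into punctuation marks.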

