This is a problem I'm running into, and I'm not quite sure how to approach it.
Say I have a paragraph:
"This is a test paragraph. I love cats. Please a
First of all, be prepared to accept a certain level of inaccuracy. This may seem simple on the surface, but trying to parse natural languages is an exercise in madness. Let us assume, then, that every sentence ends with a period (.), a question mark (?), or an exclamation mark (!). We can forget about interrobangs and so forth for the moment. Let's also ignore quoted punctuation like "!", which doesn't end the sentence.
Also, let's try to grab a quotation mark that follows the punctuation, so that "Foo?" ends up as "Foo?" (closing quote included) and not as "Foo? with the closing quote cut off.
Finally, for simplicity, let's assume that there are no nested tags inside the paragraph. This is not really a safe assumption, but it will simplify the code, and dealing with nested tags is a separate issue.
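// Wrap each sentence of every paragraph in a <span class="sentence">.
$('p').each(function() {
  var sentences = $(this)
    .text()
    // $1 is the sentence (plus an optional closing quote); $2 is the trailing whitespace.
    .replace(/([^.!?]*[^.!?\s][.!?]['"]?)(\s|$)/g,
      '<span class="sentence">$1</span>$2');
  $(this).html(sentences);
});
// Log a sentence's text when it is clicked.
$('.sentence').on('click', function() {
  console.log($(this).text());
});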
It's not perfect (for example, quoted punctuation will break it, and so will abbreviations such as "Dr."), but it should handle most ordinary prose.
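To see what the pattern does without a page to click on, here is a minimal sketch in plain JavaScript (no jQuery) that applies the same regular expression to a string. The helper names splitSentences, escapeHtml, and wrapSentences are made up for this illustration and are not part of any library. The sketch also escapes &, <, and > before inserting the spans, on the assumption that the text came from .text(), which returns those characters unescaped. It relies on String.prototype.matchAll, so it assumes a reasonably modern browser or Node.

```javascript
// Same pattern as in the jQuery version: a run of non-terminators, one
// terminator (. ! ?), an optional closing quote, then whitespace or end of input.
const SENTENCE_RE = /([^.!?]*[^.!?\s][.!?]['"]?)(\s|$)/g;

// Hypothetical helper: list the sentences the pattern finds in plain text.
function splitSentences(text) {
  const sentences = [];
  for (const match of text.matchAll(SENTENCE_RE)) {
    // Group 1 may start with extra whitespace if sentences are
    // separated by more than one space, so trim it.
    sentences.push(match[1].trim());
  }
  return sentences;
}

// Hypothetical helper: escape the characters that matter inside element content.
// Quotes are left alone so the optional ['"] in the pattern still matches.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: build the HTML string that would be passed to .html().
function wrapSentences(text) {
  return escapeHtml(text).replace(
    SENTENCE_RE,
    '<span class="sentence">$1</span>$2'
  );
}

// A closing quote right after the terminator stays with its sentence.
console.log(splitSentences('This is a test. I love cats! She asked, "Why?" Done.'));
// → ["This is a test.", "I love cats!", "She asked, \"Why?\"", "Done."]

// Known limitations: abbreviations and mid-sentence quoted punctuation split early.
console.log(splitSentences('Dr. Smith arrived.'));
// → ["Dr.", "Smith arrived."]
console.log(splitSentences('He yelled "Stop!" and ran.'));
// → ["He yelled \"Stop!\"", "and ran."]

console.log(wrapSentences('I love cats. Me too!'));
// → <span class="sentence">I love cats.</span> <span class="sentence">Me too!</span>
```

Text with no terminating punctuation at all is left unwrapped, because the pattern only matches runs that end in a period, question mark, or exclamation mark.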