Question
I have a large text string and I am trying to split it into sentences at ". ? !". My regex is not working, though. Can somebody help me find the error?
String str = "When my friend said he likes deep dish pizza one day, I immediately set a time to come back to Little Star. Arguably, the best deep dish pizza in SF...though...I don't believe there are many places that do deep dish pizza. That being said...its not the BEST ever, just the best for the area. They use cornmeal in the crust, or on the baking surface, so there's a bit of extra crunch to it. That being said...I'm not sure how much I like the cornmeal texture to my pizza. I kind of want just a GOOD CRUST, you know? No extra stuff to try to make it more crunchy.";
String[] sentences = str.split("/(?<=[.?!])\\S+(?=[a-z])/i");
But it does not split the string into sentences. What is wrong with it?
Answer 1:
Your regex is wrong. Java does not understand a regex written with PCRE-style slash delimiters and a trailing flag, like this:
/(?<=[.?!])\\S+(?=[a-z])/i
Use this:
String[] sentences = str.split("(?i)(?<=[.?!])\\S+(?=[a-z])");
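To see why the slash-wrapped form fails, here is a small sketch (the class and method names are made up for illustration). It shows that java.util.regex treats the slashes and the trailing "i" as literal characters, and that case-insensitivity has to be requested either with the inline "(?i)" flag or with Pattern.CASE_INSENSITIVE:

```java
import java.util.regex.Pattern;

public class SlashDelimiterDemo {
    // Returns true if the regex finds a match anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Same check, but with case-insensitivity passed as a compile flag.
    static boolean foundIgnoreCase(String regex, String input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input).find();
    }

    public static void main(String[] args) {
        // The slashes and the trailing "i" are matched literally.
        System.out.println(found("/pizza/i", "PIZZA"));        // false
        System.out.println(found("/pizza/i", "a /pizza/i"));   // true
        // An inline flag or a compile flag gives case-insensitive matching.
        System.out.println(found("(?i)pizza", "PIZZA"));       // true
        System.out.println(foundIgnoreCase("pizza", "PIZZA")); // true
    }
}
```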
Answer 2:
Here's a little tip:
Slashes have nothing whatsoever to do with regex.
Slashes are an application-language artefact of *some* languages. Java is not one of them.
Try removing the slashes and replacing the trailing "/i" with a leading "(?i)":
String[] sentences = str.split("(?i)(?<=[.?!])\\S+(?=[a-z])");
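One more thing to check when trying this out: "\\S+" matches non-whitespace, so it cannot match the space between "Star." and "Arguably". The sketch below assumes that lowercase "\\s+" (whitespace) was intended. That is my reading, not something stated in the question, and the class and method names are only for illustration:

```java
public class SentenceSplitDemo {
    // Assumption: a sentence boundary is whitespace that follows '.', '?' or '!'
    // and is followed by a letter. "\\s+" is used here instead of "\\S+".
    static String[] splitSentences(String text) {
        return text.split("(?i)(?<=[.?!])\\s+(?=[a-z])");
    }

    public static void main(String[] args) {
        String text = "I kind of want just a GOOD CRUST, you know? "
                + "No extra stuff to try to make it more crunchy. "
                + "That being said...its not the BEST ever!";
        // Prints three sentences, one per line; "said...its" is not split
        // because there is no whitespace after the dots.
        for (String sentence : splitSentences(text)) {
            System.out.println(sentence);
        }
    }
}
```

This is still a rough heuristic: it also splits after abbreviations such as "e.g. this", and it does not split before a sentence that starts with a digit or a quotation mark.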
Source: https://stackoverflow.com/questions/17654738/regex-split-text-document-into-sentences