I have a string, for example:
String src = "How are things today /* this is comment *\\*/ and is your code /*\\* this is another comment */ working?";
Try this one:
(//[^\n]*$|/(?!\\)\*[\s\S]*?\*(?!\\)/)
If you want to exclude the parts enclosed in " ", then use:
(\"[^\"]*\"(?!\\))|(//[^\n]*$|/(?!\\)\*[\s\S]*?\*(?!\\)/)
The first capturing group matches every " " part, and the second capturing group matches the comments (both single-line and multi-line).
Paste the regular expression into regex101 if you want a detailed explanation.
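As a rough illustration of how the two-group regex above might be applied in Java, here is a minimal sketch. The class and method names (QuoteAwareCommentStripper, strip) are made up for this example. It assumes Pattern.MULTILINE is wanted so that $ matches at the end of each line, and it puts group 1 (the quoted part) back while dropping group 2 (the comment):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteAwareCommentStripper {
    // Group 1: a double-quoted part to keep. Group 2: a // or /* */ comment to drop.
    // MULTILINE lets $ match at the end of each line instead of only at the end of input.
    private static final Pattern TOKEN = Pattern.compile(
            "(\"[^\"]*\"(?!\\\\))|(//[^\\n]*$|/(?!\\\\)\\*[\\s\\S]*?\\*(?!\\\\)/)",
            Pattern.MULTILINE);

    static String strip(String input) {
        Matcher m = TOKEN.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Put quoted parts back unchanged; replace comments with nothing.
            String keep = (m.group(1) != null) ? m.group(1) : "";
            m.appendReplacement(out, Matcher.quoteReplacement(keep));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String code = "String s = \"/* not a comment */\"; // trailing note\n"
                + "int x = 1; /* gone */";
        System.out.println(strip(code));
    }
}
```

Note that the quoted-part group is deliberately simple: it does not understand escaped quotes inside a string, so treat this as a starting point rather than a full lexer.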
System.out.println(src.replaceAll("\\/\\*.*?\\*\\/ ?", ""));
You have to use the non-greedy quantifier ? to get the regex working. I also added ' ?' at the end of the regex to remove one trailing space after each comment.
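To see why the non-greedy quantifier matters, the following sketch compares a greedy and a lazy version of the same pattern on the sample string. The class and method names (LazyVsGreedy, stripGreedy, stripLazy) are only illustrative:

```java
class LazyVsGreedy {
    // Greedy: .* runs to the LAST */ on the line, swallowing the text between comments.
    static String stripGreedy(String s) {
        return s.replaceAll("/\\*.*\\*/ ?", "");
    }

    // Lazy: .*? stops at the FIRST */ it can reach, so each comment is removed separately.
    static String stripLazy(String s) {
        return s.replaceAll("/\\*.*?\\*/ ?", "");
    }

    public static void main(String[] args) {
        String src = "How are things today /* this is comment */ and is your code "
                + "/* this is another comment */ working?";
        System.out.println(stripGreedy(src)); // How are things today working?
        System.out.println(stripLazy(src));   // How are things today and is your code working?
    }
}
```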
This could be the best approach for multi-line comments:
System.out.println(text.replaceAll("\\/\\*[\\s\\S]*?\\*\\/", ""));
Try using this regex (Single line comments only):
String src = "How are things today /* this is comment */ and is your code /* this is another comment */ working?";
String result = src.replaceAll("/\\*.*?\\*/", ""); // single line comments
System.out.println(result);
REGEX explained:
Match the character "/" literally
Match the character "*" literally
"." Match any single character that is not a line terminator
"*?" Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
Match the character "*" literally
Match the character "/" literally
Alternatively, here is a regex that handles comments on a single line as well as comments spanning multiple lines, by adding (?s):
// note the added \n, which won't work with the previous regex
String src = "How are things today /* this\n is comment */ and is your code /* this is another comment */ working?";
String result = src.replaceAll("(?s)/\\*.*?\\*/", "");
System.out.println(result);
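The inline (?s) flag has the same effect as compiling the pattern with Pattern.DOTALL. The sketch below, with illustrative names (DotAllDemo, stripSingleLine, stripMultiLine), shows the difference on a comment that contains a line break:

```java
import java.util.regex.Pattern;

class DotAllDemo {
    // Without DOTALL, "." does not match line terminators.
    private static final Pattern SINGLE_LINE = Pattern.compile("/\\*.*?\\*/");
    // With DOTALL (same as a leading (?s)), "." matches line terminators too.
    private static final Pattern MULTI_LINE = Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    static String stripSingleLine(String s) {
        return SINGLE_LINE.matcher(s).replaceAll("");
    }

    static String stripMultiLine(String s) {
        return MULTI_LINE.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String src = "How are things today /* this\n is comment */ and is your code "
                + "/* this is another comment */ working?";
        // The comment containing \n survives here; only the second comment is removed.
        System.out.println(stripSingleLine(src));
        // Both comments are removed here.
        System.out.println(stripMultiLine(src));
    }
}
```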
Try this which worked for me:
System.out.println(src.replaceAll("(/\\*.*?\\*/)+", ""));
The best multiline comment regex is an unrolled version of (?s)/\*.*?\*/ that looks like this:
String pat = "/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/";
See the regex demo and explanation at regex101.com.
In short,
/\* - match the comment start /*
[^*]*\*+ - match 0+ characters other than *, followed by 1+ literal *
(?:[^/*][^*]*\*+)* - 0+ sequences of [^/*][^*]*\*+: a character that is not a / or * (matched with [^/*]), followed by 0+ non-asterisk characters ([^*]*), followed by 1+ asterisks (\*+)
/ - closing /
David's regex needs 26 steps to find the match in my example string, and my regex needs just 12 steps. With huge inputs, David's regex is likely to fail with a stack overflow issue or something similar, because the .*? lazy dot matching is inefficient: the regex engine has to expand the lazy pattern at each location, while my pattern matches linear chunks of text in one go.
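For completeness, here is a small sketch of how the unrolled pattern could be precompiled and reused. The names UnrolledCommentStripper and strip are hypothetical, and the example only checks behaviour on a few short inputs; it does not measure step counts or performance:

```java
import java.util.regex.Pattern;

class UnrolledCommentStripper {
    // Unrolled form of (?s)/\*.*?\*/ ; no DOTALL flag is needed because the
    // negated character classes [^*] and [^/*] already match line breaks.
    private static final Pattern BLOCK_COMMENT =
            Pattern.compile("/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/");

    static String strip(String s) {
        return BLOCK_COMMENT.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // Comments with inner asterisks and the degenerate /***/ form.
        System.out.println(strip("a /* x * y ** z */ b /***/ c"));
        // A comment that spans two lines.
        System.out.println(strip("x /* line1\nline2 */ y"));
        // An unterminated comment is left untouched.
        System.out.println(strip("/* never closed"));
    }
}
```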