Regular expression to remove comment

轮回少年 2021-02-07 14:49

I am trying to write a regular expression that finds all the comments in a text, for example everything between /* and */. Example:

/* Hello */

5 answers
  •  佛祖请我去吃肉
    2021-02-07 15:17

    The right answer: it is impossible. You cannot write a regular expression that correctly finds all comments, or even all comments of one type, whether single-line or multi-line.

    Regular expressions can only provide a partial match, one that would cover perhaps 90% of all cases, but that's it.
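
    As a rough illustration of such a partial match, here is a minimal sketch, assuming the text being cleaned is JavaScript source (the thread does not say so explicitly). The name `stripBlockComments` and the pattern are illustrative assumptions, not something given in the question. The pattern handles the example from the question, but it also deletes the contents of a string literal that merely looks like a comment.

```javascript
// Naive pattern (an assumption for illustration): "/*", then any characters
// matched lazily, then the first "*/" that follows.
const BLOCK_COMMENT = /\/\*[\s\S]*?\*\//g;

// Hypothetical helper: removes everything the naive pattern matches.
function stripBlockComments(src) {
  return src.replace(BLOCK_COMMENT, '');
}

// The simple case from the question works as intended.
console.log(JSON.stringify(stripBlockComments('/* Hello */ var a = 1;')));
// prints " var a = 1;"

// A string literal that only looks like a comment is emptied as well.
console.log(JSON.stringify(stripBlockComments('var s = "/* not a comment */";')));
// prints "var s = \"\";"
```

    The second call changes the meaning of the code, which is the kind of case a pure regex cannot rule out.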

    The syntax of regular expression literals is so complex that identifying them correctly in 100% of cases requires full expression evaluation, which in turn depends on tokenizing the code. Tokenizing is a huge task, one that AST parsers already implement. See AST Explorer.

    Only a properly written AST parser can tell you precisely where all regular expressions are located in your code. You would then have to build your comment removal on top of such a parser.
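
    To make the tokenizing point concrete, here is a minimal sketch of a hand-written scanner, again assuming the input is JavaScript source. The function name `stripCommentsSkippingStrings` is hypothetical. It copies quoted strings verbatim, so comment-like text inside a string survives, but it has no notion of regex literals (or template literals), so it still breaks on them.

```javascript
// Hypothetical scanner: removes /* */ and // comments, copying '...' and "..."
// string literals untouched. It does NOT recognize regex or template literals.
function stripCommentsSkippingStrings(src) {
  let out = '';
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    const next = src[i + 1];
    if (ch === '"' || ch === "'") {
      // Find the closing quote, stepping over backslash-escaped characters.
      let j = i + 1;
      while (j < src.length && src[j] !== ch) {
        if (src[j] === '\\') j++;
        j++;
      }
      out += src.slice(i, j + 1);
      i = j + 1;
    } else if (ch === '/' && next === '*') {
      // Block comment: skip to just past the closing "*/" (or to end of input).
      const end = src.indexOf('*/', i + 2);
      i = end === -1 ? src.length : end + 2;
    } else if (ch === '/' && next === '/') {
      // Line comment: skip to the end of the line, keeping the newline itself.
      const end = src.indexOf('\n', i + 2);
      i = end === -1 ? src.length : end;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

// The string literal is preserved; only the real comment is dropped.
console.log(stripCommentsSkippingStrings('var s = "/* keep */"; /* drop */'));

// The regex literal /\// is still mistaken for the start of a line comment.
console.log(stripCommentsSkippingStrings('var re = /\\//; x'));
```

    Deciding whether a `/` starts a regex literal or is a division operator depends on the surrounding expression, which is why this sketch stops short of handling it.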

    Or, you could use one of the existing libraries that already do all that, like decomment.


    Examples of regular expressions on which any head-on approach will stumble, because it cannot tell a regular expression from a comment:

    • /\// - a head-on approach will think this regex starts a single-line comment
    • /\/*/ - a head-on approach will think this regex opens a multi-line comment
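
    The two cases above can be reproduced with a short sketch. The helper `naiveStrip` and its two patterns are assumptions made for illustration (a typical head-on attempt), not a recommended implementation.

```javascript
// Hypothetical head-on stripper: one pattern per comment style.
function naiveStrip(src) {
  return src
    .replace(/\/\*[\s\S]*?\*\//g, '') // block comments
    .replace(/\/\/.*$/gm, '');        // line comments
}

// String.raw keeps the backslashes exactly as they would appear in a source file.
const lineCase = String.raw`var a = /\//; var b = 1;`;
const blockCase = String.raw`var c = /\/*/; var d = 2; /* note */`;

// The tail of the first regex literal is treated as "//", so the rest of the line is lost.
console.log(naiveStrip(lineCase));  // prints: var a = /\

// The "/*" inside the second regex literal is treated as a comment opener,
// so everything up to the next "*/" is lost.
console.log(naiveStrip(blockCase)); // prints: var c = /\
```

    In both cases valid code after the regex literal is deleted along with the supposed comment.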
