Python regex to remove comments

野趣味 2020-12-06 15:27

How would I write a regex that removes all comments that start with # and run to the end of the line, while excluding the first two lines, which say

3 Answers
  • 2020-12-06 15:55
    sed -e '1,2b' -e '/^\s*#/d' infile
    

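    Then wrap this in a subprocess.Popen call. Note that it only deletes whole-line comments; a comment that follows code on the same line is left in place.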

    However, this is no substitute for a real parser! Why does that matter? Well, consider this Python script:

    output = """
    This is
    #1 of 100"""
    

    Boom, any non-parsing solution instantly breaks your script: the line starting with #1 is part of a triple-quoted string, not a comment.
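
To make the failure concrete, here is a minimal sketch in Python that mimics a line-based filter like the sed command above (leave the first two lines alone, drop every later line that starts with #) and checks whether the result still compiles. The names naive_strip and is_valid_python are made up for this illustration, and the print call at the end of SCRIPT is an addition to the script shown above.

```python
import re

# Matches a line whose first non-blank character is '#'.
COMMENT_LINE = re.compile(r"^\s*#")

# The script from the answer above, plus a print call (added for illustration).
SCRIPT = 'output = """\nThis is\n#1 of 100"""\nprint(output)\n'


def naive_strip(source, keep_first=2):
    """Drop comment-looking lines, leaving the first keep_first lines alone."""
    kept = []
    for number, line in enumerate(source.splitlines(keepends=True), start=1):
        if number > keep_first and COMMENT_LINE.match(line):
            continue  # looks like a comment, but may sit inside a string
        kept.append(line)
    return "".join(kept)


def is_valid_python(source):
    """Return True if source compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<snippet>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_python(SCRIPT))               # the original script
    print(is_valid_python(naive_strip(SCRIPT)))  # after line-based stripping
```

The filter removes the line that closes the triple-quoted string, so the stripped script no longer parses.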

  • 2020-12-06 16:03

    You can remove comments by parsing the Python code with tokenize.generate_tokens. The following is a slightly modified version of an example from the tokenize documentation:

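    import io
    import sys
    import tokenize

    # generate_tokens needs a readline callable that returns str:
    # text on Python 3, byte strings on Python 2.
    if sys.version_info[0] == 3:
        StringIO = io.StringIO
    else:
        StringIO = io.BytesIO

    def nocomment(s):
        """Return the source s with all comment tokens removed."""
        result = []
        g = tokenize.generate_tokens(StringIO(s).readline)
        for toknum, tokval, _, _, _ in g:
            if toknum != tokenize.COMMENT:
                # Keep only (type, string) pairs; untokenize then
                # regenerates the spacing between tokens itself.
                result.append((toknum, tokval))
        return tokenize.untokenize(result)

    with open('script.py', 'r') as f:
        content = f.read()

    print(nocomment(content))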
    

    For example, if script.py contains

    def foo(): # Remove this comment
        ''' But do not remove this #1 docstring 
        '''
        # Another comment
        pass
    

    then the output of nocomment is

    def foo ():
        ''' But do not remove this #1 docstring 
        '''
    
        pass 
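
The question also asks to leave the first two lines alone, and the output above shows that untokenize rewrites spacing ("def foo ():"). A possible variant, sketched here under assumptions, uses the tokenizer only to locate comments and then cuts them out of the original lines, skipping any comment on the first keep_first lines. The function name strip_comments, the keep_first parameter, and the two header lines in the usage example are hypothetical; the sketch targets Python 3 and drops lines that contained nothing but a comment.

```python
import io
import tokenize


def strip_comments(source, keep_first=2):
    """Remove # comments, leaving the first keep_first lines untouched."""
    # Split exactly the way the tokenizer's readline will see the text.
    lines = io.StringIO(source).readlines()
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type != tokenize.COMMENT:
            continue
        row, col = tok.start  # row is 1-based, col is 0-based
        if row <= keep_first:
            continue  # protected header line
        line = lines[row - 1]
        ending = "\n" if line.endswith("\n") else ""
        code = line[:col].rstrip()
        # A line that held only a comment is dropped entirely.
        lines[row - 1] = code + ending if code else ""
    return "".join(lines)


if __name__ == "__main__":
    example = (
        "# header line one\n"
        "# header line two\n"
        "def foo(): # Remove this comment\n"
        "    ''' But do not remove this #1 docstring\n"
        "    '''\n"
        "    # Another comment\n"
        "    pass\n"
    )
    print(strip_comments(example))
```

Because only real COMMENT tokens are touched, a # inside a string or docstring is left as it is, and the rest of the file keeps its original layout.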
    
  • 2020-12-06 16:04

    I don't think this can be done purely with a regex, as you'd need to count quotes to ensure that an instance of # isn't inside a string.

    I'd look into Python's built-in code parsing modules for help with something like this.
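
One such module is ast. As a rough sketch, assuming Python 3.9 or later (where ast.unparse is available), round-tripping the source through the syntax tree discards every comment, because comments are never stored in the tree. The function name strip_via_ast is made up for this illustration. Note that this also normalizes formatting and removes comments on the first two lines, so those would have to be re-attached separately.

```python
import ast


def strip_via_ast(source):
    """Round-trip source through the AST; comments are not part of the tree."""
    return ast.unparse(ast.parse(source))


if __name__ == "__main__":
    example = (
        "def foo(): # Remove this comment\n"
        "    '''Keep this #1 docstring'''\n"
        "    # Another comment\n"
        "    pass\n"
    )
    print(strip_via_ast(example))
```

Docstrings survive because they are ordinary string expressions in the tree, not comments.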
