Check string indentation?

一向 2020-12-18 09:00

I'm building an analyzer for a series of strings. I need to check how much each line is indented (either by tabs or by spaces).

Each line is just a string in a text

4 Answers
  • 2020-12-18 09:47

    len() counts each tab (\t) as a single character, which in some cases is not what you expect. So my approach is to use re.sub to keep only the leading whitespace and then count the spaces in it.

    import re
    indent_count = re.sub(r'^(\s*).*$', r'\g<1>', line).count(' ')
    
  • 2020-12-18 09:51
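    def count_indentation(line):
        # Count leading tab characters only; leading spaces are not counted.
        count = 0
        # The bounds check replaces the bare except used to stop at end of string.
        while count < len(line) and line[count] == "\t":
            count += 1
        return count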
    
  • 2020-12-18 09:56

    To count the number of spaces at the beginning of a string you could do a comparison between the left stripped (whitespace removed) string and the original:

    a = "    indented string"
    leading_spaces = len(a) - len(a.lstrip())
    print(leading_spaces) 
    # >>> 4
    

    Tab indent is context specific... it changes based on the settings of whatever program is displaying the tab characters. This approach will only tell you the total number of whitespace characters (each tab will be considered one character).

    Or to demonstrate:

    a = "\t\tindented string"
    leading_spaces = len(a) - len(a.lstrip())
    print(leading_spaces)
    # >>> 2
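
Since the question asks about indentation by tabs or by spaces, it may help to count the two separately instead of as a single total. A minimal sketch, assuming only tabs and spaces count as indentation; the helper name `leading_whitespace_counts` is hypothetical and not part of the answer above:

```python
def leading_whitespace_counts(line):
    """Return (tabs, spaces) found in the leading whitespace of line."""
    # Assumption: only tabs and spaces are treated as indentation characters.
    stripped = line.lstrip(" \t")
    prefix = line[:len(line) - len(stripped)]
    return prefix.count("\t"), prefix.count(" ")


print(leading_whitespace_counts("\t\tindented string"))  # (2, 0)
print(leading_whitespace_counts("  \t  indented string"))  # (1, 4)
```

Keeping the two counts apart leaves the question of how wide a tab is to the caller.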
    

    EDIT:

    If you want to do this for a whole file, you might want to try:

    with open("myfile.txt") as afile:
        line_lengths = [len(line) - len(line.lstrip()) for line in afile]
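
One caveat with the comprehension above: lines read from a file keep their trailing newline, and `lstrip()` removes newlines too, so a blank or whitespace-only line is reported with a count that includes the newline. A hedged sketch of one way to handle this, skipping such lines; the function name `indentation_per_line` and the use of `io.StringIO` as a stand-in for a real file are assumptions for illustration:

```python
import io


def indentation_per_line(lines):
    """Return (line_number, leading_whitespace_count) pairs, skipping blank lines."""
    result = []
    for number, line in enumerate(lines, start=1):
        body = line.rstrip("\r\n")  # drop the line ending before measuring
        if not body.strip():
            continue  # whitespace-only line: no meaningful indentation
        result.append((number, len(body) - len(body.lstrip())))
    return result


# io.StringIO stands in for an open text file here.
sample = io.StringIO("def f():\n    x = 1\n\n\treturn x\n")
print(indentation_per_line(sample))  # [(1, 0), (2, 4), (4, 1)]
```

As in the answer, each tab still counts as one character here.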
    
  • 2020-12-18 10:01

    I think Gizmo's basic idea is good, and it's relatively easy to extend it to handle any mixture of leading tabs and spaces by using a string object's expandtabs() method:

    def indentation(s, tabsize=4):
        sx = s.expandtabs(tabsize)
        return 0 if sx.isspace() else len(sx) - len(sx.lstrip())
    
    print(indentation("  \tindented string"))
    print(indentation("\t\tindented string"))
    print(indentation("  \t  \tindented string"))
    

    The last two print statements will output the same value.

    Edit: I modified it to check and return 0 if a line of all tabs and spaces is encountered.
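
The reason the last two calls agree is that expandtabs() advances each tab to the next tab stop instead of adding a fixed number of spaces, so two spaces followed by a tab end at the same column as a single tab when tabsize is larger than 2. The sketch below repeats the function from this answer and tries a few tab sizes to show that the result depends on that choice; the variable names `mixed` and `tabs_only` are only for illustration:

```python
def indentation(s, tabsize=4):
    # Same function as in the answer above.
    sx = s.expandtabs(tabsize)
    return 0 if sx.isspace() else len(sx) - len(sx.lstrip())


mixed = "  \t  \tindented string"
tabs_only = "\t\tindented string"

for tabsize in (2, 4, 8):
    print(tabsize, indentation(mixed, tabsize), indentation(tabs_only, tabsize))
# 2 8 4
# 4 8 8
# 8 16 16
```

With a tab size of 2 the two strings no longer match, so the tabsize argument should reflect whatever convention the analyzed text uses.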
