Find the number of characters in a file using Python

春和景丽 2021-02-07 00:34

Here is the question:

I have a file with these words:

hey how are you
I am fine and you
Yes I am fine

And it is asked to find the numbe

16 answers
  • 2021-02-07 01:04

    A more Pythonic solution than the others:

    with open('foo.txt') as f:
      text = f.read().splitlines() # list of lines
    
    lines = len(text) # length of the list = number of lines
    words = sum(len(line.split()) for line in text) # split each line on whitespace, sum up the number of words per line
    characters = sum(len(line) for line in text) # sum up the length of each line (spaces included, line endings excluded)
    
    print(lines)
    print(words)
    print(characters)
    

    The other answers here are manually doing what str.splitlines() does. There's no reason to reinvent the wheel.
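As a minimal sketch of what str.splitlines() does compared with a manual split, the snippet below uses the question's three lines as an in-memory string instead of a file. The variable names are chosen for illustration only. It also shows that this answer's character count includes the spaces inside each line.

```python
# Sample text from the question, with a trailing newline as most text files have.
text = "hey how are you\nI am fine and you\nYes I am fine\n"

# A manual split on "\n" leaves an empty string after the final newline.
manual = text.split("\n")

# str.splitlines() removes the line endings without producing that empty item.
lines = text.splitlines()

words = sum(len(line.split()) for line in lines)
characters = sum(len(line) for line in lines)  # spaces are counted here

print(len(manual))  # 4
print(len(lines))   # 3
print(words)        # 13
print(characters)   # 45
```

With this approach the result is 45 rather than 35, because the ten spaces between words are part of each line's length.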

  • 2021-02-07 01:05

    I found this solution very simple and readable:

    with open("filename", 'r') as file:
        text = file.read().strip().split()
        len_chars = sum(len(word) for word in text)
        print(len_chars)
    
  • 2021-02-07 01:05

    How's this? It uses a regular expression to match all non-whitespace characters and returns the number of matches within a string.

    import re
    
    DATA="""
    hey how are you
    I am fine and you
    Yes I am fine
    """
    
    def get_char_count(s):
        return len(re.findall(r'\S', s))
    
    if __name__ == '__main__':
        print(get_char_count(DATA))
    

    Output

    35
    

    The same pattern was tested on RegExr.
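For comparison, here is a sketch of the same count without a regular expression. The function names are hypothetical, and it assumes that `str.isspace()` and the `\S` pattern agree on what counts as whitespace, which holds for plain text such as the sample.

```python
import re

DATA = "hey how are you\nI am fine and you\nYes I am fine\n"

def count_with_regex(s):
    # One match per non-whitespace character.
    return len(re.findall(r"\S", s))

def count_with_isspace(s):
    # Assumption: isspace() classifies the sample's characters the same way as \S.
    return sum(1 for ch in s if not ch.isspace())

regex_count = count_with_regex(DATA)
isspace_count = count_with_isspace(DATA)
print(regex_count, isspace_count)  # 35 35
```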

  • 2021-02-07 01:08

    Sum up the length of all words in a line:

    characters += sum(len(word) for word in wordslist)
    

    The whole program:

    with open('my_words.txt') as infile:
        lines=0
        words=0
        characters=0
        for line in infile:
            wordslist=line.split()
            lines=lines+1
            words=words+len(wordslist)
            characters += sum(len(word) for word in wordslist)
    print(lines)
    print(words)
    print(characters)
    

    Output:

    3
    13
    35
    

    This:

    (len(word) for word in wordslist)
    

    is a generator expression. It is essentially a loop in one line that produces the length of each word. We feed these lengths directly to sum:

    sum(len(word) for word in wordslist)
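To make the equivalence concrete, this sketch writes the generator expression out as an explicit loop. It uses the second line of the sample file as input; the name `total` is chosen for illustration.

```python
wordslist = "I am fine and you".split()

# Explicit loop: add up the length of each word one at a time.
total = 0
for word in wordslist:
    total += len(word)

# The generator expression produces the same lengths and sum() adds them.
compact = sum(len(word) for word in wordslist)

print(total, compact)  # 13 13
```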
    

    Improved version

    This version uses enumerate, which saves two lines of code while remaining readable:

    with open('my_words.txt') as infile:
        words = 0
        characters = 0
        for lineno, line in enumerate(infile, 1):
            wordslist = line.split()
            words += len(wordslist)
            characters += sum(len(word) for word in wordslist)
    
    print(lineno)
    print(words)
    print(characters)
    

    This line:

    with open('my_words.txt') as infile:
    

    opens the file with the promise to close it as soon as you leave the indented block. It is always good practice to close a file after you are done using it.
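A small sketch of that behaviour, assuming a temporary file created just for the demonstration (the file and variable names are not from the original answer): the file object reports itself as open inside the block and closed right after it.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("hey how are you\n")

with open(path) as infile:
    first_line = infile.readline()
    inside = infile.closed   # still open while inside the block

after = infile.closed        # closed once the block has ended

os.remove(path)
print(inside, after)  # False True
```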

  • 2021-02-07 01:09

    Here is the code:

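    # Python 3: open in text mode so the bytes are decoded as UTF-8 while reading
    with open(fname, encoding='utf8') as fp:
        chars = fp.read()
    # counts every character, including spaces and newlines
    print(len(chars))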
    

    Check the output. I just tested it.
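The decoding step matters because bytes and characters are not the same thing for non-ASCII text. A short sketch with a made-up sample word (not from the question's file):

```python
# "café" written with an escape so the accented letter is a single code point.
word = "caf\u00e9"
data = word.encode("utf8")

byte_count = len(data)                 # the accented letter takes two bytes in UTF-8
char_count = len(data.decode("utf8"))  # after decoding it is one character

print(byte_count, char_count)  # 5 4
```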

  • 2021-02-07 01:10

    It is probably counting newline characters. Subtract (lines+1) from the character count.
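One way to check this, sketched here on the question's sample text held in a string, is to count the newline characters directly instead of relying on a fixed offset. This swaps in `str.count` for the subtraction suggested above, and the variable names are illustrative.

```python
text = "hey how are you\nI am fine and you\nYes I am fine\n"

total = len(text)                    # every character, newlines included
newlines = text.count("\n")          # one per line in this sample
without_newlines = total - newlines  # spaces still counted
without_whitespace = sum(len(w) for w in text.split())  # letters only

print(total, newlines, without_newlines, without_whitespace)  # 48 3 45 35
```

The exact number to subtract depends on whether the file ends with a newline, which is why counting them is the safer option.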
