Capitalization of each sentence in a string in Python 3

-上瘾入骨i 2021-01-22 10:25

This should be easy but somehow I'm not quite getting it.

My assignment is:

Write a function sentenceCapitalizer that has one parameter of type

5 answers
  • 2021-01-22 10:28

    Just because I couldn't find this solution here.

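    You can use the sent_tokenize function from nltk.

    import nltk

    # May need NLTK's Punkt data first: nltk.download('punkt') ('punkt_tab' on newer releases).
    string = "hello. my name is Joe. what is your name?"
    sentences = nltk.sent_tokenize(string)
    # Uppercase only the first character of each sentence.
    print(' '.join(s[0].upper() + s[1:] for s in sentences))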
    

    And the output

    Hello. My name is Joe. What is your name?
    
  • 2021-01-22 10:29

    This does the job. Since it extracts all sentences including their trailing whitespace, this also works if you have multiple paragraphs, where there are line breaks between sentences.

    import re
    
    def sentence_case(text):
        # Split into sentences. Therefore, find all text that ends
        # with punctuation followed by white space or end of string.
        sentences = re.findall(r'[^.!?]+[.!?](?:\s|\Z)', text)
    
        # Capitalize the first letter of each sentence
        sentences = [x[0].upper() + x[1:] for x in sentences]
    
        # Combine sentences
        return ''.join(sentences)
    

    Here is a working example.
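
As a rough illustration of the function above, the sketch below runs it on a short two-line text. The sample strings are made up for this demonstration, and the function body follows the answer with the pattern written as a raw string. It also shows one limitation of the pattern: text after the last sentence-ending punctuation mark is not matched and is therefore dropped.

```python
import re

def sentence_case(text):
    # Each match is one sentence plus the single whitespace character after it.
    sentences = re.findall(r'[^.!?]+[.!?](?:\s|\Z)', text)
    # Uppercase the first character and leave the rest untouched.
    return ''.join(s[0].upper() + s[1:] for s in sentences)

# Sentences separated by a space and by a line break.
sample = "hello. my name is Joe.\nwhat is your name?"
result = sentence_case(sample)
print(result)

# Limitation: a trailing fragment without '.', '!' or '?' is not matched.
truncated = sentence_case("hello. no ending")
print(repr(truncated))
```

If the input may end without punctuation, the unmatched tail would have to be appended separately.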

  • 2021-01-22 10:45

    I did not use split, just a while loop. Here is my code.

    my_string = input('Enter a string: ')
    new_string = ''
    capitalize_next = True  # the first character of the text starts a sentence
    i = 0
    
    while i < len(my_string):
        ch = my_string[i]
        if capitalize_next and not ch.isspace():
            # First non-space character of a sentence: uppercase it.
            ch = ch.upper()
            capitalize_next = False
        if ch in '.?!':
            # Sentence-ending punctuation: the next sentence starts after it.
            capitalize_next = True
        new_string += ch
        i += 1
    
    print(new_string)
    

    Here is how it works:

    Enter a string: hello. my name is Joe. what is your name?
    Hello. My name is Joe. What is your name?
    
  • 2021-01-22 10:51

    You are trying to use a string method on the wrong object; words is a list object containing strings. Use the method on each individual element instead:

    words2 = [word.capitalize() for word in words]
    

    But this would be applying the wrong transformation; you don't want to capitalise the whole sentence, but just the first letter. str.capitalize() would lowercase everything else, including the J in Joe:

    >>> 'my name is Joe'.capitalize()
    'My name is joe'    
    

    Limit yourself to the first letter only, and then add back the rest of the string unchanged:

    words2 = [word[0].capitalize() + word[1:] for word in words]
    

    Next, a list object has no .join() method either; that too is a string method:

    string2 = '. '.join(words2)
    

    This'll join the strings in words2 with the '. ' (full stop and space) joiner.

    You'll probably want to use better variable names here; your strings are sentences, not words, so your code would do better to reflect that.

    Together that makes your function:

    def sentenceCapitalizer (string1: str):
        sentences = string1.split(". ")
        sentences2 = [sentence[0].capitalize() + sentence[1:] for sentence in sentences]
        string2 = '. '.join(sentences2)
        return string2
    

    Demo:

    >>> def sentenceCapitalizer (string1: str):
    ...     sentences = string1.split(". ")
    ...     sentences2 = [sentence[0].capitalize() + sentence[1:] for sentence in sentences]
    ...     string2 = '. '.join(sentences2)
    ...     return string2
    ... 
    >>> print (sentenceCapitalizer("hello. my name is Joe. what is your name?"))
    Hello. My name is Joe. What is your name?
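
The function above assumes that every piece produced by split(". ") is non-empty. As a hedged sketch (the name capitalize_split_sentences and the sample inputs are made up here, not taken from the answer), slicing with sentence[:1] instead of indexing with sentence[0] avoids an IndexError when the text ends in ". " and split leaves an empty last piece:

```python
def capitalize_split_sentences(text: str) -> str:
    # Hypothetical variant of sentenceCapitalizer; not from the answer.
    sentences = text.split(". ")
    # sentence[:1] is '' for an empty piece, whereas sentence[0] would raise.
    fixed = [sentence[:1].upper() + sentence[1:] for sentence in sentences]
    return ". ".join(fixed)

plain = capitalize_split_sentences("hello. my name is Joe. what is your name?")
trailing = capitalize_split_sentences("hello. my name is Joe. ")
print(plain)
print(repr(trailing))
```

Like the original, this only splits on a full stop followed by exactly one space; sentences ending in "?" or "!" are not treated as boundaries.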
    
  • 2021-01-22 10:51

    To allow arbitrary whitespace after the dot, or to capitalize whole words (which can make a difference for Unicode text), you could use regular expressions via the re module:

    #!/usr/bin/env python3
    import re
    
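    def sentenceCapitalizer(text):
        # A sentence starts at the beginning of the text or after '.', '!' or '?'
        # followed by whitespace; capitalize the first word found there.
        return re.sub(r"([.!?]\s+|^)(\w+)",
                      lambda m: m.group(1) + m.group(2).capitalize(),
                      text)
    
    s = "hEllo. my name is Joe. what is your name?"
    print(sentenceCapitalizer(s))
    # -> Hello. My name is Joe. What is your name?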
    

    Note: PEP 8 recommends lowercase function names with words separated by underscores, e.g., capitalize_sentence() instead of sentenceCapitalizer().
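
A variation on the regular-expression approach, offered as a sketch rather than the answer's own code: matching only the first word character and calling upper() on it leaves the rest of the word as written, which matters when a sentence begins with an acronym. The function name capitalize_sentences and the sample text are assumptions made for this example, and the pattern treats "!" and "?" as sentence ends as well as ".".

```python
import re

def capitalize_sentences(text):
    # Match the start of the text, or '.', '!' or '?' plus whitespace,
    # followed by a single word character (the sentence's first character).
    pattern = r"(^|[.!?]\s+)(\w)"
    # Uppercase only that character; the rest of the word stays as written.
    return re.sub(pattern, lambda m: m.group(1) + m.group(2).upper(), text)

sample = "hello.   my name is Joe!\nwhat is NASA? NASA is an agency."
result = capitalize_sentences(sample)
print(result)
```

With str.capitalize() on the whole word, the second "NASA" would come out as "Nasa"; here it is left unchanged.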

    To accept a larger variety of texts, you could use the nltk package:

    # $ pip install nltk
    from nltk.tokenize import sent_tokenize, word_tokenize
    
    def sent_capitalize(sentence):
        """Capitalize the first word in the *sentence*."""
        words = word_tokenize(sentence)
        if words:
            words[0] = words[0].capitalize()
        # Join the words with spaces and attach the final token (the punctuation).
        return " ".join(words[:-1]) + "".join(words[-1:])
    
    text = "hEllo. my name is Joe. what is your name?"
    # split the text into a list of sentences
    sentences = sent_tokenize(text)
    print(" ".join(map(sent_capitalize, sentences)))
    # -> Hello. My name is Joe. What is your name?
    