This should be easy but somehow I'm not quite getting it.
My assignment is:
Write a function sentenceCapitalizer that has one parameter of type
Adding this because I couldn't find this solution here: you can use the 'sent_tokenize' method from nltk.
import nltk
string = "hello. my name is Joe. what is your name?"
sentences = nltk.sent_tokenize(string)
print (' '.join([s.replace(s[0],s[0].capitalize(),1) for s in sentences]) )
And the output
Hello. My name is Joe. What is your name?
This does the job. Since it extracts all sentences including their trailing whitespace, this also works if you have multiple paragraphs, where there are line breaks between sentences.
import re
def sentence_case(text):
    # Split into sentences. Therefore, find all text that ends
    # with punctuation followed by white space or end of string.
    sentences = re.findall(r'[^.!?]+[.!?](?:\s|\Z)', text)
    # Capitalize the first letter of each sentence
    sentences = [x[0].upper() + x[1:] for x in sentences]
    # Combine sentences
    return ''.join(sentences)
Here is a working example.
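As a quick check of the claim about line breaks, here is a small sketch (not the original linked demo) that runs the function on a two-line string. The sample text is made up for illustration. One consequence of the pattern is also shown: trailing text with no closing punctuation is not matched, so it is dropped from the result.

```python
import re

def sentence_case(text):
    # Same pattern as above: a run of non-terminators, one terminator,
    # then a single whitespace character or the end of the string.
    sentences = re.findall(r'[^.!?]+[.!?](?:\s|\Z)', text)
    return ''.join(s[0].upper() + s[1:] for s in sentences)

sample = "hello world. how are you?\nfine, thanks!"
print(sentence_case(sample))

# Text without a final terminator is silently dropped:
print(repr(sentence_case("hello. bye")))  # 'Hello. '
```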
I did not use 'split'; I used a while loop instead. Here is my code.
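my_string = input('Enter a string: ')
new_string = ''
# Capitalize the very first character
new_string += my_string[0].upper()
i = 1
while i < len(my_string)-2:
    new_string += my_string[i]
    if my_string[i] == '.' or my_string[i] == '?' or my_string[i] == '!':
        # Assumes exactly one space after the punctuation mark:
        # emit a space, capitalize the letter after it, and skip past both
        new_string += ' '
        new_string += my_string[i+2].upper()
        i = i+3
    else:
        if i == len(my_string)-3:
            # Copy the last two characters unchanged
            new_string += my_string[len(my_string)-2:len(my_string)]
        i = i+1
print(new_string)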
Here is how it works:
Enter a string: hello. my name is Joe. what is your name?
Hello. My name is Joe. What is your name?
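The index arithmetic above depends on exactly one space following each punctuation mark. A minimal sketch of the same character-by-character idea using a flag instead, so that any amount of whitespace is tolerated; capitalize_with_flag is a hypothetical name, and the sketch still treats every '.', '?' or '!' as a sentence end, so abbreviations such as "e.g." would be capitalized too.

```python
def capitalize_with_flag(text):
    result = []
    capitalize_next = True  # the first character starts a sentence
    for ch in text:
        if ch in ".?!":
            # A terminator arms the flag for the next non-space character
            capitalize_next = True
            result.append(ch)
        elif capitalize_next and not ch.isspace():
            result.append(ch.upper())
            capitalize_next = False
        else:
            result.append(ch)
    return "".join(result)

print(capitalize_with_flag("hello.  my name is Joe. what is your name?"))
# Hello.  My name is Joe. What is your name?
```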
You are trying to use a string method on the wrong object; words is a list object containing strings. Use the method on each individual element instead:
words2 = [word.capitalize() for word in words]
But this would be applying the wrong transformation; you don't want to capitalise the whole sentence, but just the first letter. str.capitalize() would lowercase everything else, including the J in Joe:
>>> 'my name is Joe'.capitalize()
'My name is joe'
Limit yourself to the first letter only, and then add back the rest of the string unchanged:
words2 = [word[0].capitalize() + word[1:] for word in words]
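One edge case worth noting: indexing with word[0] raises IndexError on an empty string, which can happen if the input is empty or ends with the separator. A minimal sketch that uses a slice instead; capitalize_first is a hypothetical helper name, not part of the question's code.

```python
def capitalize_first(text):
    # Slicing never raises, so this also works for the empty string.
    return text[:1].upper() + text[1:]

print(capitalize_first("my name is Joe"))  # My name is Joe
print(repr(capitalize_first("")))          # ''
```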
Next, a list object has no .join() method either; that too is a string method:
string2 = '. '.join(words2)
This'll join the strings in words2 with the '. ' (full stop and space) joiner.
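To illustrate, here is a small sketch of the split/join round trip, using the sample sentence from this thread as the assumed input. Joining with the same separator that was used for splitting gives back the original string.

```python
text = "hello. my name is Joe. what is your name?"

# str.split is called on the string and returns a list
parts = text.split(". ")
print(parts)  # ['hello', 'my name is Joe', 'what is your name?']

# str.join is called on the separator and takes the list
rejoined = ". ".join(parts)
print(rejoined == text)  # True
```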
You'll probably want to use better variable names here; your strings are sentences, not words, so your code could do better reflecting that.
Together that makes your function:
def sentenceCapitalizer (string1: str):
    sentences = string1.split(". ")
    sentences2 = [sentence[0].capitalize() + sentence[1:] for sentence in sentences]
    string2 = '. '.join(sentences2)
    return string2
Demo:
>>> def sentenceCapitalizer (string1: str):
...     sentences = string1.split(". ")
...     sentences2 = [sentence[0].capitalize() + sentence[1:] for sentence in sentences]
...     string2 = '. '.join(sentences2)
...     return string2
...
>>> print (sentenceCapitalizer("hello. my name is Joe. what is your name?"))
Hello. My name is Joe. What is your name?
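Note that splitting on ". " treats only a period followed by a single space as a sentence boundary, so sentences that end in ? or ! do not start a new capitalized sentence. A quick sketch of that limitation, with a made-up input:

```python
def sentenceCapitalizer(string1: str):
    sentences = string1.split(". ")
    sentences2 = [s[0].capitalize() + s[1:] for s in sentences]
    return ". ".join(sentences2)

# Only the text after ". " is recognized as a new sentence
print(sentenceCapitalizer("hi! how are you? fine. thanks"))
# Hi! how are you? fine. Thanks
```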
To allow arbitrary whitespace after the dot, or to capitalize whole words (which can make a difference for Unicode text), you could use regular expressions via the re module:
#!/usr/bin/env python3
import re
def sentenceCapitalizer(text):
    return re.sub(r"(\.\s+|^)(\w+)",
                  lambda m: m.group(1) + m.group(2).capitalize(),
                  text)
s = "hEllo. my name is Joe. what is your name?"
print(sentenceCapitalizer(s))
# -> 'Hello. My name is Joe. What is your name?'
Note: PEP 8 recommends lowercase names for functions, e.g., capitalize_sentence() instead of sentenceCapitalizer().
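The pattern above only treats a period as the end of a sentence, and it lowercases the rest of the first word. A variant sketch, assuming you also want '?' and '!' to count and want the rest of the word left intact; capitalize_sentences is a hypothetical name, and the input strings are made up.

```python
import re

def capitalize_sentences(text):
    # Group 1: start of the text, or a terminator followed by whitespace.
    # Group 2: the first word character of the sentence.
    return re.sub(r"(^|[.!?]\s+)(\w)",
                  lambda m: m.group(1) + m.group(2).upper(),
                  text)

print(capitalize_sentences("hello!  my name is McDuck? what is your name."))
# Hello!  My name is McDuck? What is your name.
```

Uppercasing a single character rather than calling capitalize() on the word keeps mixed-case names such as "McDuck" unchanged when they start a sentence.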
To accept a larger variety of texts, you could use the nltk package:
# $ pip install nltk
from nltk.tokenize import sent_tokenize, word_tokenize
def sent_capitalize(sentence):
    """Capitalize the first word in the *sentence*."""
    words = word_tokenize(sentence)
    if words:
        words[0] = words[0].capitalize()
    return " ".join(words[:-1]) + "".join(words[-1:]) # dot
text = "hEllo. my name is Joe. what is your name?"
# split the text into a list of sentences
sentences = sent_tokenize(text)
print(" ".join(map(sent_capitalize, sentences)))
# -> Hello. My name is Joe. What is your name?
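One thing to watch for: joining word tokens with spaces can change the spacing around punctuation inside a sentence, since word_tokenize (as far as I know) returns commas and similar marks as separate tokens. A minimal sketch that capitalizes the first word in place with re instead; capitalize_first_word is a hypothetical name, and the list of sentences is a hand-written stand-in for the output of sent_tokenize so the sketch runs without nltk.

```python
import re

def capitalize_first_word(sentence):
    # Capitalize only the first run of word characters; everything else,
    # including spacing around punctuation, is left untouched.
    return re.sub(r"\w+", lambda m: m.group(0).capitalize(), sentence, count=1)

# Stand-in for the output of sent_tokenize(text)
sentences = ["hEllo, world.", "my name is Joe."]
print(" ".join(map(capitalize_first_word, sentences)))
# Hello, world. My name is Joe.
```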