I was given this formula called FRES (Flesch reading-ease test) that is used to measure the readability of a document:

FRES = 206.835 - 1.015 * (total words / total sentences) - 84.6 * (total syllables / total words)
My task is to write a python function tha
BTW, there's the textstat library.
from textstat.textstat import textstat
from nltk.corpus import gutenberg

for filename in gutenberg.fileids():
    # Score the raw text of each file, not the filename string itself.
    print(filename, textstat.flesch_reading_ease(gutenberg.raw(filename)))
If you're bent on coding up your own, first you've to
If punctuation counts as a word and syllables are counted by the regex in your question, then:
import re
from itertools import chain
from nltk.corpus import gutenberg

def num_syllables_per_word(word):
    # One syllable per run of vowels followed by a run of non-vowels
    # (lowercase vowels only, as in the question's regex).
    return len(re.findall(r'[aeiou]+[^aeiou]+', word))

for filename in gutenberg.fileids():
    sents = gutenberg.sents(filename)
    words = gutenberg.words(filename)  # i.e. list(chain(*sents))
    num_sents = len(sents)
    num_words = len(words)
    num_syllables = sum(num_syllables_per_word(w) for w in words)
    # Python 3 true division; punctuation tokens are counted as words here.
    score = 206.835 - 1.015 * (num_words / num_sents) - 84.6 * (num_syllables / num_words)
    print(filename, score)
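If you want a single function you can call on any string, without the NLTK corpus, here is a minimal sketch. The assumptions are mine and not from your question: sentences are split naively on `.`, `!` and `?`, only alphabetic tokens count as words (so punctuation is not a word here, unlike the loop above), and words are lowercased before the same syllable regex is applied. The names `count_syllables` and `flesch_reading_ease` are just illustrative.

```python
import re

# Same vowel-run pattern as above; an approximation, not a real syllabifier.
VOWEL_GROUP = re.compile(r'[aeiou]+[^aeiou]+')

def count_syllables(word):
    # Lowercase first so capitalised vowels are counted too (my assumption).
    return len(VOWEL_GROUP.findall(word.lower()))

def flesch_reading_ease(text):
    # Naive sentence split on ., ! and ? (assumption: no abbreviations).
    sentences = [s for s in re.split(r'[.!?]+', text) if s.strip()]
    # Assumption: a word is a run of letters or apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

if __name__ == "__main__":
    print(flesch_reading_ease("The cat sat on the mat. It was happy."))
```

Note that this regex gives zero syllables for words ending in a vowel run, such as "the", so scores will come out higher than with a proper syllable counter.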