Extract words from a file

野的像风 2021-01-13 23:03

I open a file in Python to check whether the words in a predefined set are present in it. I put the predefined words in a list and opened the file.

3 answers
  • 2021-01-13 23:34

    This code shows which of the words are present in the file, provided that each word matches exactly: it must be in the same case and must not be preceded or followed by punctuation or other characters. With some minor adjustments, the code could be made more forgiving.

    words = {'hello', 'world', 'testing'}
    # Text mode, so the tokens are str and compare equal to the words above
    with open('testfile.txt') as f:
        data = set(f.read().split())
    print(words.intersection(data))
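
    One way the matching could be made more forgiving is sketched below. It is only an illustration: the helper name find_words and the sample text are assumptions, not part of the answer above. Lower-casing both sides removes the case restriction, and extracting runs of word characters with a regular expression removes the punctuation restriction.

```python
import re

def find_words(text, wanted):
    """Return the members of `wanted` that occur in `text`, ignoring case and punctuation."""
    # \w+ extracts runs of letters, digits and underscores, so "Hello," yields "hello".
    found = set(re.findall(r"\w+", text.lower()))
    return {w.lower() for w in wanted} & found

# Hypothetical sample text standing in for the contents of the file.
sample = "Hello, World! This is only a test."
print(sorted(find_words(sample, ["hello", "world", "testing"])))  # ['hello', 'world']
```

    Note that "testing" is not reported, because only whole words are compared and the sample contains "test".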
    
  • 2021-01-13 23:41
    import re

    def get_words_from_string(s):
        # Lower-case the text and collect each run of word characters
        return set(re.findall(r'\w+', s.lower()))

    def get_words_from_file(fname):
        with open(fname) as inf:
            return get_words_from_string(inf.read())

    def all_words(needle, haystack):
        # True if every word in needle also appears in haystack
        return set(needle).issubset(set(haystack))

    def any_words(needle, haystack):
        # The set of words common to both (empty, hence falsy, if none match)
        return set(needle).intersection(set(haystack))

    search_words = get_words_from_string("This is my test")
    find_in = get_words_from_string("If this were my test, I is passing")

    print(any_words(search_words, find_in))

    print(all_words(search_words, find_in))


    prints (the order of the set elements may vary)

    {'this', 'test', 'is', 'my'}
    True
    
  • 2021-01-13 23:45

    There are a couple of options:

    • Call file.readlines() and split each line on your desired delimiter, if the file isn't large
    • Call read() with a size argument and process the file a chunk at a time

    Check out the Python documentation for file objects: http://docs.python.org/release/2.5.2/lib/bltin-file-objects.html
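
    The chunked approach could look roughly like the sketch below. The function name words_in_stream, the chunk size and the demo text are all assumptions made for illustration. The detail that needs care is a word that straddles two chunks, so any trailing run of word characters is held back and prepended to the next chunk.

```python
import io
import re

def words_in_stream(stream, wanted, chunk_size=4096):
    """Return the members of `wanted` found in a text stream, reading it chunk by chunk."""
    remaining = {w.lower() for w in wanted}
    found = set()
    tail = ""  # possibly incomplete word carried over from the previous chunk
    while remaining:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of input
        text = tail + chunk.lower()
        # Hold back a trailing run of word characters: it may continue in the next chunk.
        match = re.search(r"\w+\Z", text)
        tail = match.group(0) if match else ""
        complete = text[:len(text) - len(tail)]
        hits = remaining & set(re.findall(r"\w+", complete))
        found |= hits
        remaining -= hits
    # At end of input, whatever is still held back is a complete word.
    if tail in remaining:
        found.add(tail)
    return found

# io.StringIO stands in for a file opened in text mode; a tiny chunk size
# forces words to be split across chunk boundaries.
demo = io.StringIO("Hello, World! This is only a test.")
print(sorted(words_in_stream(demo, ["hello", "world", "testing"], chunk_size=5)))  # ['hello', 'world']
```

    With a real file, the same function could be called on the object returned by open(path) in text mode. The loop stops early once every wanted word has been found, so the rest of a large file need not be read.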
