Find Unique Characters in a File

耶瑟儿~ 2021-02-04 03:30

I have a file with 450,000+ rows of entries. Each entry is about 7 characters in length. What I want to know is the unique characters of this file.

For instance, if my f

22 answers
  •  情书的邮戳
    2021-02-04 04:09

    Well, my friend, I think this is what you had in mind. At least, this is the Python version!

    # Read the whole file into memory and lowercase it (Python 3)
    with open("location.txt", "r") as f:
        ll = sorted(f.read().lower())  # sorted() splits the string into characters and sorts them

    # Eliminate duplicates: keep a character only if it differs from the one before it
    ll = [val for idx, val in enumerate(ll) if idx == 0 or val != ll[idx - 1]]

    # The newline is one of the characters, so the output will contain a line break
    print("Unique Characters: {%s}" % "".join(ll))
    

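    It avoids an explicit character-by-character for loop (the sort and a list comprehension do the work), and it is relatively short as well. Because it reads the entire file into memory, you wouldn't want to open a 500 MB file with it (depending on your RAM), but for smaller files it is fun :)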
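
    For files too big to read in one go, a minimal sketch of a chunked variant might look like the following. It is written for Python 3, and the function name unique_chars and the chunk_size and encoding parameters are assumptions made for illustration. It swaps the sort-and-compare step for a set, and it still lowercases everything and counts the newline as a character.

```python
def unique_chars(path, chunk_size=1 << 20, encoding="utf-8"):
    """Return the sorted unique characters of a text file, lowercased.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    seen = set()
    with open(path, "r", encoding=encoding) as f:
        while True:
            # Read at most chunk_size characters so memory use stays bounded
            chunk = f.read(chunk_size)
            if not chunk:
                break  # an empty string means end of file
            # A set keeps one copy of each character, so no sorting is needed here
            seen.update(chunk.lower())
    # Sort only the (small) set of distinct characters for stable output
    return sorted(seen)
```

    Calling unique_chars("location.txt") returns a list of characters that can be passed to "".join() for printing. Only the set of distinct characters is held in memory, which should stay small even for a file with 450,000+ rows.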

    I also have to add my final attack! Admittedly, I eliminated two lines by using standard input instead of a file, and I also reduced the active code from 3 lines to 2. If I replaced ll in the print line with the expression from the line above it, I could have had 1 line of active code and 1 line of imports. Anyway, now we are having fun :)

    import itertools, sys

    # Read standard input into memory, sort the characters, and keep one key per run of equal characters
    ll = [key for key, _group in itertools.groupby(sorted(sys.stdin.read().lower()))]
    # The newline is one of the characters, so the output will contain a line break
    print("Unique Characters: {%s}" % "".join(ll))
    
