Determining Letter Frequency Of Cipher Text

我寻月下人不归 2020-12-01 22:38

I am trying to make a tool that finds the frequencies of letters in some type of cipher text. Let's suppose it is all lowercase a-z, no numbers. The encoded message is in a t

3 Answers
  • 2020-12-01 23:15
    import collections
    
    d = collections.defaultdict(int)
    for c in 'test':
        d[c] += 1
    
    print(d)  # Python 3: defaultdict(<class 'int'>, {'t': 2, 'e': 1, 's': 1})
    

    From a file:

    # d is the defaultdict(int) created above
    with open('test.txt') as myfile:  # the file is closed when the block ends
        for line in myfile:
            line = line.rstrip('\n')
            for c in line:
                d[c] += 1
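
The file loop above counts every character it meets, including spaces and punctuation. Here is a minimal sketch, assuming Python 3 and assuming only a-z should be tallied (as the question states). The helper name `count_letters_in_file` and the throwaway demo file are hypothetical, not part of the original answer:

```python
import collections
import os
import tempfile
from string import ascii_lowercase


def count_letters_in_file(path):
    """Tally lowercase a-z characters in a text file (hypothetical helper)."""
    counts = collections.defaultdict(int)
    with open(path) as handle:  # the with-block closes the file
        for line in handle:
            for c in line:
                if c in ascii_lowercase:  # skip newlines, spaces, digits
                    counts[c] += 1
    return counts


# Demo on a temporary file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("test\nte st\n")
demo_counts = count_letters_in_file(tmp.name)
os.remove(tmp.name)
print(dict(demo_counts))
```

Filtering inside the loop keeps the tally limited to the cipher alphabet, so later frequency calculations are not skewed by whitespace.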
    

    For the genius that is the defaultdict container, we must give thanks and praise. Otherwise we'd all be doing something silly like this:

    s = "andnowforsomethingcompletelydifferent"
    d = {}
    for letter in s:
        if letter not in d:
            d[letter] = 1
        else:
            d[letter] += 1
    
  • 2020-12-01 23:18

    If you want to know the relative frequency of a letter c, divide the number of occurrences of c by the length of the input.

    For instance, taking Adam's example:

    s = "andnowforsomethingcompletelydifferent"
    n = len(s) # n = 37
    

    and storing the absolute frequency of each letter in

    dict[letter]
    

    we obtain the relative frequencies by:

    from string import ascii_lowercase # this is "a...z"
    for c in ascii_lowercase:
        print(c, dict[c]/float(n))
    

    Putting it all together, we get something like this:

    # get input
    s = "andnowforsomethingcompletelydifferent"
    n = len(s) # n = 37
    
    # get absolute frequencies of letters
    import collections
    dict = collections.defaultdict(int)
    for c in s:
        dict[c] += 1
    
    # print relative frequencies
    from string import ascii_lowercase # this is "a...z"
    for c in ascii_lowercase:
        print(c, dict[c]/float(n))
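
For cipher work it is often useful to see the letters ranked from most to least frequent. The following is a sketch, assuming Python 3; `relative_frequencies` is a hypothetical helper name, and dividing by `len(text)` assumes the text contains only a-z, as in the question:

```python
import collections
from string import ascii_lowercase


def relative_frequencies(text):
    """Return {letter: share of text} for every letter a-z (hypothetical helper)."""
    counts = collections.defaultdict(int)
    for c in text:
        counts[c] += 1
    n = len(text)
    if n == 0:  # avoid dividing by zero on empty input
        return {c: 0.0 for c in ascii_lowercase}
    # .get avoids inserting zero entries for letters that never occur
    return {c: counts.get(c, 0) / n for c in ascii_lowercase}


s = "andnowforsomethingcompletelydifferent"
freqs = relative_frequencies(s)

# Rank letters by descending share and show the top five.
ranked = sorted(freqs.items(), key=lambda item: item[1], reverse=True)
for letter, share in ranked[:5]:
    print(letter, round(share, 3))
```

Returning a plain dictionary covering all 26 letters makes it straightforward to compare the result against a reference table of letter frequencies for the suspected language.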
    
  • 2020-12-01 23:20

    The modern way:

    from collections import Counter
    
    string = "ihavesometextbutidontmindsharing"
    Counter(string)
    #>>> Counter({'i': 4, 't': 4, 'e': 3, 'n': 3, 's': 2, 'h': 2, 'm': 2, 'o': 2, 'a': 2, 'd': 2, 'x': 1, 'r': 1, 'u': 1, 'b': 1, 'v': 1, 'g': 1})
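
`Counter` also provides `most_common` for ranking and `update` for adding more text to an existing tally. A short sketch, assuming Python 3; the extra string passed to `update` is made up for illustration:

```python
from collections import Counter

counts = Counter("ihavesometextbutidontmindsharing")

# The three most frequent letters, as (letter, count) pairs.
top_three = counts.most_common(3)
print(top_three)

# update() adds the counts from further text to the same tally,
# which is handy when a message is read piece by piece.
counts.update("moretext")  # made-up extra input
print(counts["t"], counts["e"])
```

Because a `Counter` returns 0 for keys it has never seen, looking up a letter that does not occur in the text is safe and does not raise `KeyError`.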
    