I want to calculate the frequency of occurrence of each letter in every column. For example, I have these three sequences:
seq1=AATC
seq2=GCCT
seq3=ATCA
Here:
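sequences = ['AATC',
'GCCT',
'ATCA']
# Transpose: each tuple from zip holds the letters of one column
f = zip(*sequences)
# For every column, map each letter to how often it occurs in that column
counts = [{letter: column.count(letter) for letter in column} for column in f]
print(counts)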
Output (reformatted):
[{'A': 2, 'G': 1},
{'A': 1, 'C': 1, 'T': 1},
{'C': 2, 'T': 1},
{'A': 1, 'C': 1, 'T': 1}]
Salient features:
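Rather than naming the sequences seq1, seq2, etc., we put them into a list.
We unpack that list into zip with the * operator, which groups the letters by column.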
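If you want relative frequencies rather than raw counts, one possible variation on the code above uses collections.Counter and divides each count by the number of sequences. This is only a sketch: the function name column_frequencies is made up for illustration, and it assumes all sequences have the same length (zip silently stops at the shortest one).

```python
from collections import Counter

def column_frequencies(sequences):
    """Return one {letter: fraction} dict per column (hypothetical helper)."""
    n = len(sequences)
    # zip(*sequences) yields one tuple of letters per column;
    # Counter tallies the letters in that tuple in a single pass.
    return [
        {letter: count / n for letter, count in Counter(column).items()}
        for column in zip(*sequences)
    ]

sequences = ['AATC', 'GCCT', 'ATCA']
freqs = column_frequencies(sequences)
print(freqs)
```

For the first column this gives A as two thirds and G as one third. Counter also avoids calling column.count once per letter, which may matter if you have many sequences.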