I am reading a file that contains amino acid sequence data for approximately 600,000 proteins. For anyone who is interested, here is the source
I am using