fastq

Read list of files on unix and run command

别说谁变了你拦得住时间么 submitted on 2019-11-29 10:35:07
I am pretty new at shell scripting and I have been struggling all day to figure out how to write a "for" loop. Essentially, what I am trying to do is the following: I have a list.txt file with a bunch of names:

name1
name2
name3

For every name in the list, there are two different files, each with a different ending to the name. Ex: name1_R1, name1_R2. The program I am trying to run is called sickle. Basically, it takes two files (that correspond to each other) and runs an analysis on them, hence requiring me to have this naming scheme. The sickle command is as follows: sickle pe -f input
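A minimal sketch of one way to loop over the names, assuming the paired files are named name_R1.fastq / name_R2.fastq and that the sickle pe options shown (-f/-r for the two inputs, -o/-p/-s for outputs, -t for the quality encoding) match the installed version; the file suffixes, output names, and "sanger" quality type are illustrative assumptions, not the asker's exact command:

#!/bin/bash
# For each name listed in list.txt, run sickle pe on the name_R1/name_R2 pair.
# Naming scheme and -t sanger are assumptions for illustration.
while read -r name; do
    sickle pe \
        -f "${name}_R1.fastq" \
        -r "${name}_R2.fastq" \
        -t sanger \
        -o "trimmed_${name}_R1.fastq" \
        -p "trimmed_${name}_R2.fastq" \
        -s "singles_${name}.fastq"
done < list.txt

A while read loop is used here rather than for name in $(cat list.txt) because it reads one line at a time and avoids word-splitting surprises if a name ever contains spaces.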

faster membership testing in python than set()

本秂侑毒 submitted on 2019-11-29 09:10:11
I have to check for the presence of millions of elements (20-30 letter strings) against a list containing 10-100k of those elements. Is there a faster way of doing that in Python than set()?

import sys

# load ids
ids = set(x.strip() for x in open(idfile))
for line in sys.stdin:
    id = line.strip()
    if id in ids:
        # print fastq
        print id
        # update ids
        ids.remove(id)

Answer 1: set is as fast as it gets. However, if you rewrite your code to create the set once, and not change it, you can use the frozenset built-in type. It's exactly the same except immutable. If you're still having speed problems, you need to speed your program
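A rough sketch of the answer's suggestion, assuming an ids file named "ids.txt" (a stand-in for the question's idfile) and that the goal is to print each matching id only once; since a frozenset is immutable and cannot support the original ids.remove() step, a separate seen set takes over that role:

import sys

# Build the lookup structure once; frozenset cannot be modified afterwards,
# so there is no equivalent of the original ids.remove() call.
with open("ids.txt") as fh:   # "ids.txt" is a hypothetical stand-in for idfile
    ids = frozenset(line.strip() for line in fh)

seen = set()                  # tracks ids already printed once
for line in sys.stdin:
    record_id = line.strip()
    if record_id in ids and record_id not in seen:
        print(record_id)
        seen.add(record_id)

As the answer notes, lookups in a frozenset are no faster than in a regular set; the point is simply that if the collection is built once and never changed, frozenset expresses that intent and set membership is already about as fast as Python gets.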
