How to edit a 300 GB text file (genomics data)?

旧巷少年郎 2020-12-18 06:34

I have a 300 GB text file that contains genomics data with over 250k records. There are some records with bad data and our genomics program 'Popoolution' allows us to comm

4 answers
  •  醉梦人生
    2020-12-18 07:25

    The simplest solution is to use a stream-oriented editor such as sed, which processes the file line by line and never loads it into memory. All you need is one or more regular expressions that identify all, and only, the bad records. Since you haven't said how to identify the bad records, I can't be more specific than that.
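
For readers who would rather script this than write a sed one-liner, here is a minimal sketch of the same streaming idea in Python (not sed itself). It assumes one record per line, which the question does not confirm. The function name `stream_edit`, the `marker` parameter, and the example pattern are all hypothetical, and what to do with a matching line (drop it, or prefix it with a marker) is likewise an assumption, since the question does not say how bad records should be handled.

```python
import re
from typing import BinaryIO, Optional


def stream_edit(
    src: BinaryIO,
    dst: BinaryIO,
    bad: "re.Pattern[bytes]",
    marker: Optional[bytes] = None,
) -> int:
    """Copy src to dst one line at a time and return the number of bad lines.

    A line is "bad" if the (hypothetical) pattern `bad` matches anywhere in it.
    Bad lines are dropped when `marker` is None, otherwise they are written
    with `marker` prepended. Only one line is held in memory at a time.
    """
    hits = 0
    for line in src:  # iterating a binary file yields one line per step
        if bad.search(line):
            hits += 1
            if marker is None:
                continue  # drop the bad record
            line = marker + line  # keep it, but flagged
        dst.write(line)
    return hits
```

With real files, open both in binary mode so no time is spent decoding text, for example `open("in.txt", "rb")` and `open("out.txt", "wb")` (file names are placeholders). The output goes to a new file, so you need free disk space of roughly the input size, and the original stays untouched until you have checked the result.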
