Most efficient way to store a big DNA sequence?

滥情空心 2021-02-04 11:58

I want to bundle a giant DNA sequence (about 3,000,000,000 base pairs) with an iOS app. Each base pair can have the value A, C, T or G.

7 Answers
  •  悲&欢浪女
    2021-02-04 12:39

    Use a diff from a reference genome. From the size you posted (3 Gbp), it looks like you want to include a full human sequence. Since sequences don't differ much from person to person, you should be able to compress massively by storing only a diff.

    That could help a lot, unless your goal is to store the reference sequence itself. Then you're stuck.
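
    To make the diff idea concrete, here is a minimal sketch in Python (chosen for brevity; an iOS app would do the same thing in Swift). It assumes the sample and the reference have the same length and differ only by single-base substitutions; real genomes also have insertions and deletions, which this does not handle. The names make_diff and apply_diff are hypothetical, not from any library.

```python
def make_diff(reference: str, sample: str) -> list[tuple[int, str]]:
    """Record (position, base) wherever the sample differs from the reference."""
    # Assumption: substitutions only, so both sequences must be the same length.
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def apply_diff(reference: str, diff: list[tuple[int, str]]) -> str:
    """Rebuild the sample from the reference plus the stored substitutions."""
    bases = list(reference)
    for position, base in diff:
        bases[position] = base
    return "".join(bases)


# Toy data, purely illustrative.
reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"

diff = make_diff(reference, sample)
print(diff)  # [(4, 'T'), (9, 'A')]
print(apply_diff(reference, diff) == sample)  # True
```

    Only the diff needs to ship for each individual sequence, but the reference still has to be available somewhere to rebuild it.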
