Faster way to split a string and count characters using R?

太阳男子 2021-02-01 08:51

I'm looking for a faster way to calculate GC content for DNA strings read in from a FASTA file. This boils down to taking a string and counting the number of times that the let

6 answers

囚心锁ツ (OP)

2021-02-01 09:04
Better not to split at all; just count the matches:
```
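# Count G/C bases (either case) in positions st..sp of line, without splitting.
gcCount2 <- function(line, st, sp) {
  # gregexpr() returns the match positions, or -1 when nothing matches,
  # so counting the positions > 0 gives the number of matches.
  sum(gregexpr('[GCgc]', substr(line, st, sp))[[1]] > 0)
}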
```
That's an order of magnitude faster than splitting the string first.

A small C function that just iterates over the characters would be yet another order of magnitude faster.
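As a rough illustration of that idea, here is a minimal sketch of such a C function. The names `gc_count` and `gc_count_r` are hypothetical, not from the original answer, and the code assumes plain ASCII sequence data with the same 1-based, inclusive `st`/`sp` positions that `substr()` uses. No timings were measured for it here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count G/C characters (either case) in the 1-based,
 * inclusive range [st, sp] of line, mirroring substr(line, st, sp)
 * for ASCII input. */
int gc_count(const char *line, int st, int sp)
{
    size_t len = strlen(line);
    int count = 0;

    if (sp < 1) return 0;                 /* nothing to scan */
    if (st < 1) st = 1;                   /* clamp the start, as substr() does */
    if ((size_t)sp > len) sp = (int)len;  /* clamp the end to the string length */

    /* Single pass over the requested range; no splitting, no allocation. */
    for (int i = st - 1; i < sp; i++) {
        switch (line[i]) {
        case 'G': case 'C': case 'g': case 'c':
            count++;
            break;
        default:
            break;
        }
    }
    return count;
}

/* Hypothetical wrapper shaped for R's .C() interface, which passes
 * character vectors as char ** and integer vectors as int *. */
void gc_count_r(char **line, int *st, int *sp, int *result)
{
    *result = gc_count(line[0], *st, *sp);
}
```

Assuming the file is compiled with `R CMD SHLIB` and loaded with `dyn.load()`, the wrapper could be called from R along the lines of `.C("gc_count_r", line, as.integer(st), as.integer(sp), result = integer(1))$result`.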