grep | 易学教程

keep fasta records which have 2 matches of OX values

Question: I have a file that looks as follows:

    >sp|rin-1 ghsfdhjkuesl OX=10116 GN=Cdh1 PE=1 SV=1|sp|P10287|ghsfdjdeosd gdhkhs OX=10090 GN=Cdh3 PE=1 SV=2
    WRDTANWLEINPETGVISTRAEMDREDSEHVKNSTYTALIIATDDGSPIATGTGTLLLVLSDVNDNAPIPEPRNMQFCQRNPKPHVITILDPDLPP
    >sp|erin-1 ghsfdshkd OX=10116 GN=Cdh1 PE=1 SV=1|sp|P22223|CADH3_HUMAN Cadherin-3 OX=9606 GN=CDH3 PE=1 SV=2
    ESYPTYTLVVQAADLQGEGLSTTAKAVITVKDINDNAPIFNPSTYLQCAASEPCRAVFREAEVTLEAGGAEQEPGQALGKVFMGCPGQEPALFSTD
    >sp|n-1 ghsfd OX=10116 GN=Cdh1 PE=1 SV=1|tr|F1LMI3
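The excerpt is cut off before it says what "2 matches" means, so the following Python sketch rests on an assumed reading: a record is kept when its header carries exactly two OX= fields and, if a set of allowed values is supplied, both values belong to that set. The names parse_fasta, keep_two_ox, and allowed are hypothetical and do not come from the original question.

```python
import re

# Matches every "OX=<digits>" field in a FASTA header line.
OX_RE = re.compile(r"\bOX=(\d+)")


def parse_fasta(lines):
    """Yield (header, sequence_lines) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, seq
            header, seq = line, []
        elif header is not None and line:
            seq.append(line)
    if header is not None:
        yield header, seq


def keep_two_ox(lines, allowed=None):
    """Keep records with exactly two OX= fields (assumed meaning of "2 matches").

    If `allowed` is given, both OX values must also be members of it.
    """
    for header, seq in parse_fasta(lines):
        ox_values = OX_RE.findall(header)
        if len(ox_values) != 2:
            continue
        if allowed is not None and not set(ox_values) <= set(allowed):
            continue
        yield header, seq
```

Called as keep_two_ox(open_file, allowed={"10116", "10090"}), this would keep only records whose two OX values are both 10116 or 10090. Without the allowed argument it filters purely on the number of OX= fields.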

Search for unicode values in character string

Question: I am trying to identify unique Unicode values in a data frame composed of character strings. I have tried using the grep function, but I encounter the following error:

    Error: '\U' used without hex digits in character string starting ""\U"

An example data frame:

       time                sender message
    1  2012-12-04 13:40:00 1      Hello handsome!
    2  2012-12-04 13:40:08 1      \U0001f618
    3  2012-12-04 14:39:24 1      \U0001f603
    4  2012-12-04 16:04:25 2      <image omitted>
    73 2012-12-05 06:02:17 1      Haha not white and blue... White with
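The error appears because "\U" inside a string literal starts a Unicode escape that must be followed by hex digits, so the backslash has to be escaped in the pattern. The sketch below shows the same idea in Python rather than R (Python applies a similar rule to non-raw string literals). It assumes the messages hold the literal text \U followed by eight hex digits, as in the sample rows. If the column actually stores the emoji characters themselves, a different pattern would be needed. The function name unique_unicode_escapes is hypothetical.

```python
import re

# A raw string keeps the backslash literal; "\\U" in the regex matches
# the two characters backslash + U, followed by eight hex digits.
ESCAPE_RE = re.compile(r"\\U[0-9a-fA-F]{8}")


def unique_unicode_escapes(messages):
    """Return the sorted, de-duplicated \\UXXXXXXXX escapes found in messages."""
    found = set()
    for message in messages:
        found.update(ESCAPE_RE.findall(message))
    return sorted(found)


# Sample values taken from the example data frame (raw strings, so the
# escapes stay as literal text instead of being decoded by Python).
messages = [
    "Hello handsome!",
    r"\U0001f618",
    r"\U0001f603",
    "<image omitted>",
]
print(unique_unicode_escapes(messages))
```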

Using grep in python

Question: There is a file (query.txt) containing keywords/phrases that are to be matched against other files using grep. The last three lines of the following code work perfectly, but when the same command is used inside the while loop it goes into an infinite loop or something (i.e. it doesn't respond).

    import os
    f=open('query.txt','r')
    b=f.readline()
    while b:
        cmd='grep %s my2.txt'%b #my2 is the file in which we are looking for b
        os.system(cmd)
        b=f.readline()
    f.close()
    a='He is'
    cmd='grep %s my2.txt
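One likely cause of the hang, offered here as an assumption since the excerpt shows no answer: readline() keeps the trailing newline, so the shell sees the file name on a separate line, and a grep that receives a pattern but no file waits for input on stdin. A minimal sketch of one way around this is to strip each line and pass the arguments as a list to subprocess.run, which bypasses the shell. The function name grep_queries is hypothetical.

```python
import subprocess


def grep_queries(query_path, target_path):
    """Run grep once per non-empty line of query_path.

    Returns a dict mapping each query to the list of matching lines.
    """
    results = {}
    with open(query_path, encoding="utf-8") as queries:
        for raw in queries:
            query = raw.strip()  # drop the newline that file iteration keeps
            if not query:
                continue
            # An argument list bypasses the shell, so spaces in the query are
            # safe; "-e" marks the next argument as the pattern even if it
            # starts with "-".
            proc = subprocess.run(
                ["grep", "-e", query, target_path],
                capture_output=True,
                text=True,
            )
            if proc.returncode > 1:  # 0 = match, 1 = no match, >1 = error
                raise RuntimeError(proc.stderr.strip())
            results[query] = proc.stdout.splitlines()
    return results
```

With the file names from the question, grep_queries("query.txt", "my2.txt") would return the matches for every phrase in query.txt, including multi-word phrases such as "He is".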

What's the difference between [:space:] and [:blank:]?

Question: From A Brief Introduction to Regular Expressions:

    [:blank:] matches a space or a tab.
    [:space:] matches whitespace characters (space and horizontal tab).

To me both definitions are the same, and I was wondering whether they are really duplicates. If they are different, what are the differences?

Answer 1: For the GNU tools, the following from grep.info applies:

    [:blank:] Blank characters: space and tab.
    [:space:] Space characters: in the 'C' locale, this is tab, newline, vertical tab, form feed,
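The difference can be illustrated with plain character sets. Python's re module has no POSIX bracket classes, so this sketch only models the two classes as sets. It assumes the 'C' locale, where the GNU grep documentation lists [:space:] as tab, newline, vertical tab, form feed, carriage return, and space. Python's string.whitespace happens to contain the same six characters. The names BLANK and SPACE are illustrative only.

```python
import string

# [:blank:] : horizontal whitespace only.
BLANK = set(" \t")

# [:space:] in the 'C' locale: space, tab, newline, carriage return,
# vertical tab, form feed (the same characters as string.whitespace).
SPACE = set(string.whitespace)

# Every blank character is also a space character, but not the reverse.
print(BLANK < SPACE)

# Characters matched by [:space:] but not by [:blank:].
print(sorted(SPACE - BLANK))
```

So under this model [:blank:] is a strict subset of [:space:]: the extra members are the line-breaking and vertical characters.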

How to search for non-ASCII characters with bash tools?

Question: I have a large text file that contains a few Unicode characters that make LaTeX crash. How can I find non-ASCII characters in a file with sed and similar tools in bash on Linux?

Answer 1: Try:

    nonascii() { LANG=C grep --color=always '[^ -~]\+'; }

Which can be used like:

    printf 'ŨTF8\n' | nonascii

Within [], ^ means "not". So [^ -~] means characters not between space and ~. So, excluding control chars, this matches non-ASCII characters, and is a more portable though slightly less accurate version of [^
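For comparison, here is a small Python sketch of the same search that also reports where each character sits. It is not a translation of the grep pattern: it flags only code points above 127, so control characters such as tab, which [^ -~] would also match, are left alone. The function name find_non_ascii and the file name in the usage comment are hypothetical.

```python
def find_non_ascii(lines):
    """Yield (line_number, column, character) for each non-ASCII character.

    Line and column numbers are 1-based; columns count characters, not bytes.
    """
    for lineno, line in enumerate(lines, start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                yield lineno, col, ch


# Hypothetical usage on a file named "paper.tex":
# with open("paper.tex", encoding="utf-8") as fh:
#     for lineno, col, ch in find_non_ascii(fh):
#         print(f"{lineno}:{col}: {ch!r} (U+{ord(ch):04X})")
```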

How to pick multiple fasta sequences from a genes list

Question: I have two files. The gene list file looks like this:

    LOC_Os06g12230.1
    Pavir.Ab03005
    Pavir.J14065
    ChrUn.fgenesh
    Sevir.1G325700
    LOC_Os02g51280.1
    Bradi3g59320
    Brast04G017400

The FASTA sequence file looks like this:

    >LOC_Os03g57190.1 pacid=33130570 polypeptide=LOC_Os03g57190.1 locus=LOC_Os03g57190 ID=LOC_Os03g57190.1.MSUv7.0 annot-version=v7.0
    ATGGAGGCGGCGGTGGGGGACGGGGAAGGCGGTGGCGGCGGCGGCGGGCGGGGGAAGCGTGGGCGGGGAGGAGGAGGAGG
    GGAGATGGTGGAGGCGGTGTGGGGGCAGACGGGGAGTACGGCGTCGCGGATCTACAGGGTGAGGGCGACGGGGGGGAAGG
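A minimal Python sketch of one way to do the selection, assuming that a record's identifier is the first whitespace-delimited token after ">" and that it must equal an entry of the gene list exactly. The excerpt is cut off before it states the matching rule, so that is an assumption, and the names read_ids and pick_records are hypothetical.

```python
def read_ids(lines):
    """Collect gene IDs from the list file (one or more IDs per line)."""
    return {token for line in lines for token in line.split()}


def pick_records(fasta_lines, wanted_ids):
    """Yield the lines of every FASTA record whose ID is in wanted_ids.

    The ID is taken to be the first token after '>' on the header line.
    """
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            keep = bool(fields) and fields[0] in wanted_ids
        if keep:
            yield line
```

Both functions accept open file objects, so the selected records can be streamed straight to an output file with writelines(pick_records(fasta_file, read_ids(list_file))) without loading the whole FASTA file into memory.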
