pandas extract regex allowing mismatches
Question

Pandas has a fast and convenient string method, extract(). It works perfectly with a strict regex such as this one:

    strict_pattern = r"^(?P<pre_spacer>ACGAG)(?P<UMI>.{9,13})(?P<post_spacer>TGGAGTCT)"

test_df

                                    R1
    21  ACGAGTTTTCGTATTTTTGGAGTCTTGTGG
    22  ACGAGTAGGGAGGGGGGTGGAGTCTCAGCG
    23  ACGAGGGGGGGGAGGCTGGAGTCTCCGGGT
    24  ACGAGAATAACGTTTGGTGGAGTCTACCAC
    25  ACGAGGGGAATAAATATTGGAGTCTCCTCC
    26  ACGAGATTGGGTATGCTGGAGTCTCTGTTC
    27  ACGAGGTACCCGCGCCATGGAGTCTCTCTG
    28  ACGAGTGGTTTTTGTCGTGGAGTCTCACCA
    29