Identifying substrings based on complex rules

Assume I have text strings that look something like this:

A-B-C-I1-I2-D-E-F-I1-I3-D-D-D-D-I1-I1-I2-I1-I1-I3-I3

Here I want to identify sequ

3 Answers

闹比i

2021-01-20 02:25

Try the following expression: `(.*?)(?:I[0-9]-)*I3(?:-I[0-9])*`. Capture group 1 holds the text that precedes each run of `I` tokens containing `I3`. See the match groups: https://regex101.com/r/yA6aV9/1
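As a rough sketch of how this expression could be applied in R (the answer itself only links to regex101), the snippet below collects every full match with `gregexpr()` and `regmatches()`, then pulls out capture group 1 with `sub()`. The variable names `pat`, `full` and `lead` are illustrative, and the final hyphen trimming is an added assumption about the desired output, not part of the answer.

```r
x <- "A-B-C-I1-I2-D-E-F-I1-I3-D-D-D-D-I1-I1-I2-I1-I1-I3-I3"

# Expression from the answer; perl = TRUE is used for the lazy quantifier
# and the non-capturing groups.
pat <- "(.*?)(?:I[0-9]-)*I3(?:-I[0-9])*"

# Every full match: leading text plus the run of I tokens that contains I3.
full <- regmatches(x, gregexpr(pat, x, perl = TRUE))[[1]]

# Keep only capture group 1 (the text before each run).
lead <- sub(pat, "\\1", full, perl = TRUE)

# Assumption: separator hyphens left at either end are not wanted.
lead <- gsub("^-|-$", "", lead)

print(lead)
```

Because group 1 only captures text that is followed by a run containing `I3`, anything after the last such run would not be returned by this approach.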


故里飘歌

2021-01-20 02:26

Use `strsplit`, splitting on every run of `I` tokens that contains `I3`:

> x <- "A-B-C-I1-I2-D-E-F-I1-I3-D-D-D-D-I1-I1-I2-I1-I1-I3-I3"
> strsplit(x, "(?:-?I\\d+)*-?\\bI3-?(?:I\\d+-?)*")
[[1]]
[1] "A-B-C-I1-I2-D-E-F" "D-D-D-D"

> strsplit("A-B-I3-C-I3", "(?:-?I\\d+)*-?\\bI3\\b-?(?:I\\d+-?)*")
[[1]]
[1] "A-B" "C"

> strsplit("A-B-I3-C-I3", "(?:-?I\\d+)*-?\\bI3\\b-?(?:I3-?)*")
[[1]]
[1] "A-B" "C"
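If the same split is needed for several strings, it could be wrapped in a small helper. The following is only a sketch: the function name `split_on_i3_runs` is hypothetical, it reuses the `\\bI3\\b` variant of the pattern above, and it passes `perl = TRUE`, which the calls above do not. It also drops the empty leading piece that `strsplit` returns when a string begins with a matching run.

```r
# Hypothetical helper (not part of the answer): split each string on runs
# of I<n> tokens that contain I3, discarding empty pieces.
split_on_i3_runs <- function(x) {
  pat <- "(?:-?I\\d+)*-?\\bI3\\b-?(?:I\\d+-?)*"
  pieces <- strsplit(x, pat, perl = TRUE)
  # strsplit() yields "" as the first piece when the match is at the start.
  lapply(pieces, function(p) p[nzchar(p)])
}

res <- split_on_i3_runs(c(
  "A-B-C-I1-I2-D-E-F-I1-I3-D-D-D-D-I1-I1-I2-I1-I1-I3-I3",
  "I3-A-B-I3"
))
print(res)
```

The result is a list with one character vector per input string, so it can be combined with `sapply()` or `unlist()` as needed.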

梦毁少年i

2021-01-20 02:27
You can identify the sequences that contain I3 with the following regex (the backslashes are doubled, as they would be inside an R string literal):
```
(?:I\\d-?)*I3(?:-?I\\d)*
```
You can then split your text on this regex to get the desired result.

See demo https://regex101.com/r/bJ3iA3/4
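A minimal sketch of that split in R, assuming `strsplit()` with `perl = TRUE` (the answer does not show the call). This pattern does not consume the hyphens on either side of a run, so the pieces appear to keep a leading or trailing `-`; the `gsub()` step that trims them is an addition, not part of the answer.

```r
x <- "A-B-C-I1-I2-D-E-F-I1-I3-D-D-D-D-I1-I1-I2-I1-I1-I3-I3"

# Regex from the answer, written as an R string literal.
pat <- "(?:I\\d-?)*I3(?:-?I\\d)*"

# Split on every run of I tokens that contains I3.
pieces <- strsplit(x, pat, perl = TRUE)[[1]]

# Assumption: trim the separator hyphens the pattern leaves behind
# and drop any empty pieces.
pieces <- gsub("^-+|-+$", "", pieces)
pieces <- pieces[nzchar(pieces)]

print(pieces)
```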