regex101.com VS myserver - different results

Submitted by 微笑、不失礼 on 2020-04-30 07:30:29

Question


I get different results. Could anyone tell me why?

RegExp:

[0-9]+(?:\s){0,10}(?:\r?\n?)([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3}) --> ([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3})(?:\s){0,10}(?:\r\n|\n|\r){1}(.*\r?\n?.*\r?\n?.*)(?:\n|\r)(?:\n|\r)

On regex101 I use the 'gm' modifiers.

In PHP I use:

preg_match_all($this->Pattern, $txt, $matches, PREG_SET_ORDER);

Regex101 result (look at match 4, which is correct: the pattern captures only the empty line, without any "time line" text):

MATCH 1
1.  [2-4]   `00`
2.  [5-7]   `00`
3.  [8-10]  `01`
4.  [11-14] `163`
5.  [19-21] `00`
6.  [22-24] `00`
7.  [25-27] `05`
8.  [28-31] `150`
9.  [32-39] `aaaaaaa`
MATCH 2
1.  [43-45] `00`
2.  [46-48] `00`
3.  [49-51] `05`
4.  [52-55] `556`
5.  [60-62] `00`
6.  [63-65] `00`
7.  [66-68] `05`
8.  [69-72] `921`
9.  [73-82] `bbbb
bbbb`
MATCH 3
1.  [86-88] `00`
2.  [89-91] `00`
3.  [92-94] `07`
4.  [95-98] `753`
5.  [103-105]   `00`
6.  [106-108]   `00`
7.  [109-111]   `08`
8.  [112-115]   `168`
9.  [116-130]   `cccccccccccccc`
MATCH 4
1.  [134-136]   `00`
2.  [137-139]   `00`
3.  [140-142]   `22`
4.  [143-146]   `854`
5.  [151-153]   `00`
6.  [154-156]   `00`
7.  [157-159]   `28`
8.  [160-163]   `721`
9.  [164-164]   ``
MATCH 5
1.  [168-170]   `00`
2.  [171-173]   `00`
3.  [174-176]   `23`
4.  [177-180]   `336`
5.  [185-187]   `00`
6.  [188-190]   `00`
7.  [191-193]   `31`
8.  [194-197]   `558`
9.  [198-228]   `dddddddddddddd
dddddddddddddd
`
MATCH 6
1.  [232-234]   `00`
2.  [235-237]   `00`
3.  [238-240]   `34`
4.  [241-244]   `228`
5.  [249-251]   `00`
6.  [252-254]   `00`
7.  [255-257]   `36`
8.  [258-261]   `296`
9.  [262-276]   `eeeeeeeeeeeeee`
MATCH 7
1.  [280-282]   `00`
2.  [283-285]   `00`
3.  [286-288]   `35`
4.  [289-292]   `165`
5.  [297-299]   `00`
6.  [300-302]   `00`
7.  [303-305]   `39`
8.  [306-309]   `785`
9.  [310-320]   `fffff
ffff`

My server's result (look at "[3] => Array": the match spans two "time lines"):

(
    [0] => Array
        (
            [0] => 1
00:00:01,163 --> 00:00:05,150
aaaaaaa

2

            [1] => 00
            [2] => 00
            [3] => 01
            [4] => 163
            [5] => 00
            [6] => 00
            [7] => 05
            [8] => 150
            [9] => aaaaaaa

2
        )

    [1] => Array
        (
            [0] => 00:00:05,556 --> 00:00:05,921
bbbb
bbbb


            [1] => 0
            [2] => 00
            [3] => 05
            [4] => 556
            [5] => 00
            [6] => 00
            [7] => 05
            [8] => 921
            [9] => bbbb
bbbb

        )

    [2] => Array
        (
            [0] => 3
00:00:07,753 --> 00:00:08,168
cccccccccccccc

4

            [1] => 00
            [2] => 00
            [3] => 07
            [4] => 753
            [5] => 00
            [6] => 00
            [7] => 08
            [8] => 168
            [9] => cccccccccccccc

4
        )

    [3] => Array
        (
            [0] => 00:00:22,854 --> 00:00:28,721


5
00:00:23,336 --> 00:00:31,558
dddddddddddddd

            [1] => 0
            [2] => 00
            [3] => 22
            [4] => 854
            [5] => 00
            [6] => 00
            [7] => 28
            [8] => 721
            [9] => 5
00:00:23,336 --> 00:00:31,558
dddddddddddddd
        )

    [4] => Array
        (
            [0] => 6
00:00:34,228 --> 00:00:36,296
eeeeeeeeeeeeee

7

            [1] => 00
            [2] => 00
            [3] => 34
            [4] => 228
            [5] => 00
            [6] => 00
            [7] => 36
            [8] => 296
            [9] => eeeeeeeeeeeeee

7
        )

    [5] => Array
        (
            [0] => 00:00:35,165 --> 00:00:39,785
fffff
ffff


            [1] => 0
            [2] => 00
            [3] => 35
            [4] => 165
            [5] => 00
            [6] => 00
            [7] => 39
            [8] => 785
            [9] => fffff
ffff

        )

)

Test String:

1
00:00:01,163 --> 00:00:05,150
aaaaaaa

2
00:00:05,556 --> 00:00:05,921
bbbb
bbbb

3
00:00:07,753 --> 00:00:08,168
cccccccccccccc

4
00:00:22,854 --> 00:00:28,721


5
00:00:23,336 --> 00:00:31,558
dddddddddddddd
dddddddddddddd


6
00:00:34,228 --> 00:00:36,296
eeeeeeeeeeeeee

7
00:00:35,165 --> 00:00:39,785
fffff
ffff

Answer 1:


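This happens because the line break styles differ: regex101 uses \n, while your input uses \r\n. With \r\n input, the trailing (?:\n|\r)(?:\n|\r) is already satisfied by the two characters of a single line break, so it no longer requires an empty line.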
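
The effect can be reproduced in isolation. The following is a minimal sketch, not part of the original answer, that runs only the trailing part of the question's pattern against short made-up strings (the variable names are illustrative):

```php
<?php
// Trailing part of the question's pattern, intended to require an empty line.
$tail = '~(?:\n|\r)(?:\n|\r)~';

// LF input: one line break is not enough, two in a row are needed.
$lfSingle = preg_match($tail, "a\nb");     // 0
$lfBlank  = preg_match($tail, "a\n\nb");   // 1

// CRLF input: the two characters of one \r\n already satisfy the pattern.
$crlfSingle = preg_match($tail, "a\r\nb"); // 1

var_dump($lfSingle, $lfBlank, $crlfSingle);
```

The last call returning 1 is the mismatch: on CRLF input the "empty line" check passes after an ordinary line break.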

You can easily solve this by using \R, a unified pattern that matches any kind of line break.
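
As a small illustration of what \R does (a sketch with a made-up sample string, not taken from the answer), splitting on \R handles all three common line break styles and treats \r\n as one break:

```php
<?php
// Made-up sample mixing LF, CRLF and CR line breaks.
$mixed = "a\nb\r\nc\rd";

// \R consumes \r\n as a single line break, so no empty pieces appear.
$parts = preg_split('~\R~', $mixed);

print_r($parts); // four elements: a, b, c, d
```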

Note that I did not optimize your pattern; I am only showing how to solve the problem stated in the question:

'~[0-9]+\s{0,10}\R?([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3}) --> ([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3})\s{0,10}\R(.*\R?.*\R?.*)\R{2}~'

See the PHP demo
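
For a local check, here is a hedged sketch that applies the answer's pattern to cues 4 and 5 of the test string. It assumes the server input is joined with \r\n, and the variable names ($txt, $texts) are illustrative. Cue 4 should come back with empty text instead of swallowing cue 5. The captured group may still carry stray line break characters, so the sketch trims it:

```php
<?php
// Pattern from the answer, unchanged.
$pattern = '~[0-9]+\s{0,10}\R?([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3}) --> ([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3})\s{0,10}\R(.*\R?.*\R?.*)\R{2}~';

// Cues 4 and 5 from the test string, joined with \r\n (assumed server input).
$txt = implode("\r\n", [
    '4',
    '00:00:22,854 --> 00:00:28,721',
    '',
    '',
    '5',
    '00:00:23,336 --> 00:00:31,558',
    'dddddddddddddd',
    'dddddddddddddd',
    '',
    '',
    '',
]);

preg_match_all($pattern, $txt, $matches, PREG_SET_ORDER);

// Collect the subtitle text of each cue, trimmed of surrounding line breaks.
$texts = [];
foreach ($matches as $m) {
    $texts[] = trim($m[9]);
}

var_dump(count($matches), $texts);
```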



Source: https://stackoverflow.com/questions/39511868/regex101-com-vs-myserver-different-results
