Question
I get different results on regex101.com and on my server. Could anyone tell me why?
RegExp:
[0-9]+(?:\s){0,10}(?:\r?\n?)([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3}) --> ([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3})(?:\s){0,10}(?:\r\n|\n|\r){1}(.*\r?\n?.*\r?\n?.*)(?:\n|\r)(?:\n|\r)
On Regex101 I use the 'gm' modifiers.
In PHP I use:
preg_match_all($this->Pattern, $txt, $matches, PREG_SET_ORDER);
Regex101 result (look at match 4, which is correct: group 9 captures only the empty line, without any "time line" text):
MATCH 1
1. [2-4] `00`
2. [5-7] `00`
3. [8-10] `01`
4. [11-14] `163`
5. [19-21] `00`
6. [22-24] `00`
7. [25-27] `05`
8. [28-31] `150`
9. [32-39] `aaaaaaa`
MATCH 2
1. [43-45] `00`
2. [46-48] `00`
3. [49-51] `05`
4. [52-55] `556`
5. [60-62] `00`
6. [63-65] `00`
7. [66-68] `05`
8. [69-72] `921`
9. [73-82] `bbbb
bbbb`
MATCH 3
1. [86-88] `00`
2. [89-91] `00`
3. [92-94] `07`
4. [95-98] `753`
5. [103-105] `00`
6. [106-108] `00`
7. [109-111] `08`
8. [112-115] `168`
9. [116-130] `cccccccccccccc`
MATCH 4
1. [134-136] `00`
2. [137-139] `00`
3. [140-142] `22`
4. [143-146] `854`
5. [151-153] `00`
6. [154-156] `00`
7. [157-159] `28`
8. [160-163] `721`
9. [164-164] ``
MATCH 5
1. [168-170] `00`
2. [171-173] `00`
3. [174-176] `23`
4. [177-180] `336`
5. [185-187] `00`
6. [188-190] `00`
7. [191-193] `31`
8. [194-197] `558`
9. [198-228] `dddddddddddddd
dddddddddddddd
`
MATCH 6
1. [232-234] `00`
2. [235-237] `00`
3. [238-240] `34`
4. [241-244] `228`
5. [249-251] `00`
6. [252-254] `00`
7. [255-257] `36`
8. [258-261] `296`
9. [262-276] `eeeeeeeeeeeeee`
MATCH 7
1. [280-282] `00`
2. [283-285] `00`
3. [286-288] `35`
4. [289-292] `165`
5. [297-299] `00`
6. [300-302] `00`
7. [303-305] `39`
8. [306-309] `785`
9. [310-320] `fffff
ffff`
My server's result (look at "[3] => Array": the pattern captures two "time lines" in one match):
(
[0] => Array
(
[0] => 1
00:00:01,163 --> 00:00:05,150
aaaaaaa
2
[1] => 00
[2] => 00
[3] => 01
[4] => 163
[5] => 00
[6] => 00
[7] => 05
[8] => 150
[9] => aaaaaaa
2
)
[1] => Array
(
[0] => 00:00:05,556 --> 00:00:05,921
bbbb
bbbb
[1] => 0
[2] => 00
[3] => 05
[4] => 556
[5] => 00
[6] => 00
[7] => 05
[8] => 921
[9] => bbbb
bbbb
)
[2] => Array
(
[0] => 3
00:00:07,753 --> 00:00:08,168
cccccccccccccc
4
[1] => 00
[2] => 00
[3] => 07
[4] => 753
[5] => 00
[6] => 00
[7] => 08
[8] => 168
[9] => cccccccccccccc
4
)
[3] => Array
(
[0] => 00:00:22,854 --> 00:00:28,721
5
00:00:23,336 --> 00:00:31,558
dddddddddddddd
[1] => 0
[2] => 00
[3] => 22
[4] => 854
[5] => 00
[6] => 00
[7] => 28
[8] => 721
[9] => 5
00:00:23,336 --> 00:00:31,558
dddddddddddddd
)
[4] => Array
(
[0] => 6
00:00:34,228 --> 00:00:36,296
eeeeeeeeeeeeee
7
[1] => 00
[2] => 00
[3] => 34
[4] => 228
[5] => 00
[6] => 00
[7] => 36
[8] => 296
[9] => eeeeeeeeeeeeee
7
)
[5] => Array
(
[0] => 00:00:35,165 --> 00:00:39,785
fffff
ffff
[1] => 0
[2] => 00
[3] => 35
[4] => 165
[5] => 00
[6] => 00
[7] => 39
[8] => 785
[9] => fffff
ffff
)
)
Test String:
1
00:00:01,163 --> 00:00:05,150
aaaaaaa

2
00:00:05,556 --> 00:00:05,921
bbbb
bbbb

3
00:00:07,753 --> 00:00:08,168
cccccccccccccc

4
00:00:22,854 --> 00:00:28,721


5
00:00:23,336 --> 00:00:31,558
dddddddddddddd
dddddddddddddd


6
00:00:34,228 --> 00:00:36,296
eeeeeeeeeeeeee

7
00:00:35,165 --> 00:00:39,785
fffff
ffff
Answer 1:
This happens because the line break styles differ: regex101 uses \n, while your input uses \r\n. You can easily solve this by using the unified \R pattern, which matches any kind of line break.
Note that I did not optimize your pattern; I am only showing how to solve the problem stated in the question:
'~[0-9]+\s{0,10}\R?([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3}) --> ([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3})\s{0,10}\R(.*\R?.*\R?.*)\R{2}~'
See the PHP demo
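As a rough check of the explanation above, the following sketch runs both the question's original pattern and the \R variant against the same text, once with \n and once with \r\n line breaks. The sample is a shortened stand-in modeled on the test string, and the helper name cueTexts is hypothetical. One assumption worth flagging: under PCRE's default newline setting, `.` can still consume a lone \r, so the sketch strips carriage returns from the captured text before comparing.

```php
<?php
// Original pattern from the question (only the "~" delimiters are added).
$original = '~[0-9]+(?:\s){0,10}(?:\r?\n?)([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3}) --> ([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3})(?:\s){0,10}(?:\r\n|\n|\r){1}(.*\r?\n?.*\r?\n?.*)(?:\n|\r)(?:\n|\r)~';

// \R-based pattern from the answer.
$fixed = '~[0-9]+\s{0,10}\R?([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3}) --> ([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3})\s{0,10}\R(.*\R?.*\R?.*)\R{2}~';

// Shortened sample (an assumption, modeled on the question's test string).
$lf = "1\n00:00:01,163 --> 00:00:05,150\naaaaaaa\n\n"
    . "2\n00:00:05,556 --> 00:00:05,921\nbbbb\nbbbb\n\n";

// The same text with Windows-style line breaks, as the server input presumably had.
$crlf = str_replace("\n", "\r\n", $lf);

// Hypothetical helper: return the subtitle text (group 9) of every match.
function cueTexts($pattern, $subject)
{
    preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

    return array_map(function (array $m) {
        // Drop carriage returns that "." may have captured.
        return str_replace("\r", '', $m[9]);
    }, $matches);
}

// Original pattern: the captured text depends on the line break style.
var_dump(cueTexts($original, $lf) === cueTexts($original, $crlf)); // bool(false)

// \R pattern: both line break styles yield the same captured text.
var_dump(cueTexts($fixed, $lf) === cueTexts($fixed, $crlf)); // bool(true)
```

If the subtitle text is used downstream, normalizing the input first with str_replace(["\r\n", "\r"], "\n", $txt) is a simple alternative that avoids stray carriage returns altogether.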
Source: https://stackoverflow.com/questions/39511868/regex101-com-vs-myserver-different-results