Question
I want to split a string on commas, but not on the commas that appear inside parentheses, like this:
Before:
TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D
After:
TEST_A
TEST_B
TEST_C (with A, B, C)
TEST_D
How can I split it?
Answer 1:
A regular expression is not a natural fit here, so one option is to iterate through the characters yourself.
Regular expressions are not very context-aware, which is also why they are a poor tool for parsing HTML. Walking the string and tracking the bracket depth by hand avoids that problem.
function magic_split($str) {
    $sets = array('');      // Resulting pieces
    $set_index = 0;         // Index of the piece we are writing to
    $brackets_depth = 0;    // Current nesting depth of parentheses
    $length = strlen($str);
    // Iterate through the entire string
    for ($i = 0; $i < $length; $i++) {
        // A comma outside brackets is a delimiter, so skip it
        if ($brackets_depth < 1 && $str[$i] === ',') continue;
        // Append the character to the current piece
        $sets[$set_index] .= $str[$i];
        // Track the bracket depth
        if ($str[$i] === '(') $brackets_depth++;
        if ($str[$i] === ')') $brackets_depth--;
        if (
            $i < $length - 1 &&        // Is a next char available?
            $str[$i + 1] === ',' &&    // Is it a comma?
            $brackets_depth === 0      // Are we outside brackets?
        ) $sets[++$set_index] = '';    // Start a new piece
    }
    // Remove the whitespace that followed each delimiting comma
    return array_map('trim', $sets);
}
$input = 'TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D';
$split = magic_split($input);
Answer 2:
You want to match:
- a word containing neither an opening parenthesis nor a comma: [^(,]+
- an expression between parentheses: \([^(]+\)
- which is optional and should not be captured as a separate group, so it becomes: (?:\([^(]+\))?
- a comma, followed by optional whitespace: ,[\s]*
PHP Code:
$ar = preg_split("#([^(,]+(?:\([^(]+\))?),[\s]*#", "$input,", -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
Edit: the pattern requires a comma after every item, so on its own it would not match the last item. Appending an extra comma to $input, as in the code above, fixes this.
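The pattern pieces listed in this answer can also be used to match the items directly instead of splitting, which removes the need for the trailing comma. This is only a sketch of that alternative, not the answerer's code; the function name items_by_matching is hypothetical.

```php
<?php
// Hypothetical helper: collect the items instead of splitting on delimiters.
function items_by_matching(string $input): array {
    // [^(,]+          text containing neither "(" nor ","
    // (?:\([^(]+\))?  optionally followed by one parenthesised group
    preg_match_all('#[^(,]+(?:\([^(]+\))?#', $input, $matches);
    // A match may begin with the space that followed the previous comma
    return array_map('trim', $matches[0]);
}

$input = 'TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D';
var_export(items_by_matching($input));
```

Like the original pattern, this assumes at most one, non-nested parenthesised group per item.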
Answer 3:
The correct solution to this problem will depend on exactly what your specification is for identifying individual elements.
If you expect each one to begin with TEST_, then you could solve it fairly simply with a regular expression:
$input = 'TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D';
$matches = preg_split('/,\s*(?=TEST_)/', $input);
var_dump($matches);
Output:
array(4) {
[0]=>
string(6) "TEST_A"
[1]=>
string(6) "TEST_B"
[2]=>
string(21) "TEST_C (with A, B, C)"
[3]=>
string(6) "TEST_D"
}
This splits the string on each comma followed by optional whitespace, using a lookahead assertion to test for the presence of TEST_ at the beginning of the next item.
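If the prefix is not always TEST_, the same lookahead idea could be wrapped in a small helper that takes the prefix as a parameter. This is a minimal sketch under that assumption; the function name split_before_prefix and the sample input are hypothetical. preg_quote escapes any regex metacharacters in the prefix.

```php
<?php
// Hypothetical helper: split on ", " only when the next item starts with $prefix.
function split_before_prefix(string $input, string $prefix): array {
    // Escape the prefix so characters such as "." are matched literally
    $pattern = '/,\s*(?=' . preg_quote($prefix, '/') . ')/';
    return preg_split($pattern, $input);
}

var_export(split_before_prefix('ITEM.1, ITEM.2 (x, y), ITEM.3', 'ITEM.'));
```

This only works when the prefix cannot appear directly after a comma inside the parentheses.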
Answer 4:
You merely need to split on comma-space and disregard any comma-space that sits inside parentheses. The (*SKIP)(*FAIL) verbs make the pattern consume each parenthetical expression and then discard that match, so nothing inside it can be used as a delimiter.
Code:
$text = 'TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D';
var_export(preg_split('~\([^)]*\)(*SKIP)(*FAIL)|, ~', $text));
Output:
array (
0 => 'TEST_A',
1 => 'TEST_B',
2 => 'TEST_C (with A, B, C)',
3 => 'TEST_D',
)
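The pattern above assumes the parentheses are never nested. As a hedged sketch, the same (*SKIP)(*FAIL) technique could be combined with a recursive subpattern to skip balanced, nested parentheses as well; the function name split_outside_parens and the sample input are hypothetical.

```php
<?php
// Hypothetical helper: split on commas that are outside balanced parentheses.
function split_outside_parens(string $text): array {
    // Group 1 matches "(" ... ")" and recurses into itself via (?1) for nested
    // pairs; (*SKIP)(*FAIL) then discards that match so the commas inside it
    // are never tried as delimiters. The second alternative is the delimiter.
    $pattern = '~(\((?:[^()]++|(?1))*\))(*SKIP)(*FAIL)|,\s*~';
    return preg_split($pattern, $text);
}

var_export(split_outside_parens('A, B ((x), y), C'));
```

With the non-recursive pattern, the comma after (x) in this input would be treated as a delimiter, because the skipped match would end at the first closing parenthesis.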
Source: https://stackoverflow.com/questions/28562514/how-can-split-on-a-comma-except-where-it-appears-in-brackets