Why does the following result in an array with 7 elements, 5 of them blank? I'd expect only 2 elements. Where are the 5 blank elements coming from?
$a = 'OU=RA
String.Split() is character oriented. It splits on O, U, and = as three separate separators.
Think of it as intended for input like 1,2,3,4,5. If you had ,2,3,4, it would imply there were empty entries at the start and end. If you had 1,2,,,5 it would imply two empty entries in the middle.
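A quick way to see those empty entries is to count them. This is only a minimal sketch: the [char[]] cast and the variable names are my additions, chosen so that the call behaves the same in Windows PowerShell and PowerShell Core.

```powershell
# Separators at the very start and end produce an empty entry at each end.
$edges = ',2,3,4,'.Split([char[]] ',')
$edges.Count                                     # 5 entries: '', '2', '3', '4', ''

# Three adjacent separators enclose two empty entries.
$middle = '1,2,,,5'.Split([char[]] ',')
$middle.Count                                    # 5 entries: '1', '2', '', '', '5'
@($middle | Where-Object { $_ -eq '' }).Count    # 2
```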
You can see with something like:
PS C:\> $a = 'OU=RAH,OU=RAC'
PS C:\> $a.Split('RAH')
OU=
,OU=
C
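The empty entries fall between adjacent separators: R_A_H contributes two and R_A contributes one. When a separator sits at the very start or end of the string, the split introduces an empty entry there as well.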
PowerShell's -split operator is string oriented.
PS D:\t> $a = 'OU=RAH,OU=RAC'
PS D:\t> $a -split 'OU='
RAH,
RAC
You might do better to split on the comma, then replace out OU=, or vice versa, e.g.
PS D:\t> $a = 'OU=RAH,OU=RAC'
PS D:\t> $a.Replace('OU=','').Split(',')
RAH
RAC
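The "vice versa" order (split on the comma first, then strip the prefix) might look like the sketch below. It uses the -split and -replace operators rather than the .NET methods, which is my own choice for illustration; the ^ anchor assumes OU= only needs removing at the start of each piece.

```powershell
$a = 'OU=RAH,OU=RAC'

# Split on the comma first; -replace then runs on every element of the array
# and strips a leading 'OU=' from each piece.
$names = ($a -split ',') -replace '^OU=', ''
$names    # RAH and RAC, one per line
```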
It splits the string on each character in the separator. So it's splitting on 'O', 'U' and '='.
As @mklement0 has commented, my earlier answer would not work in all cases. So here is an alternate way to get the expected items.
$a.Split(',') |% { $_.Split('=') |? { $_ -ne 'OU' } }
This code splits the string first on the comma (,), then splits each item on the equals sign (=), ignoring the items that are OU, eventually returning the expected values:
RAH
RAC
This will work even in case of:
$a = 'OU=FOO,OU=RAH,OU=RAC'
generating 3 items: FOO, RAH and RAC.
To get only 2 strings as expected, you could use the following line:
$a.Split('OU=', [System.StringSplitOptions]::RemoveEmptyEntries)
Which will give output as:
RAH,
RAC
And if you use (note the comma in the separator)
$a.Split(',OU=', [System.StringSplitOptions]::RemoveEmptyEntries)
you will get
RAH
RAC
This is probably what you want. :)
Never mind. Just realised it looks for strings on either side of 'O', 'U', and '='. There are therefore 5 empty strings (in front of the first 'O', between 'O' and 'U', between 'U' and '=', between the second 'O' and 'U', between the second 'U' and '=').
In order to split by strings (rather than a set of characters) and/or regular expressions, use PowerShell's -split operator:
PS> ('OU=RAH,OU=RAC' -split ',?OU=') -ne '' # parentheses not strictly needed
RAH
RAC
-split by default interprets its RHS as a regular expression, and ,?OU= matches both OU= by itself and ,OU=, resulting in the desired splitting, returning the tokens as an array.
Since the input starts with a match, however, -split considers the first element of the split to be the empty string. By passing the resulting array of tokens to -ne '', we filter out these empty strings.
For more on -split, including literal string matching, limiting the number of tokens returned, and use of script blocks, see Get-Help about_split.
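As a rough sketch of the other -split features mentioned here (literal matching, limiting the token count, and script blocks), something like the following should work. The variable names and the 'a,b,c,d' sample string are made up for illustration.

```powershell
$a = 'OU=RAH,OU=RAC'

# Literal (non-regex) matching via the SimpleMatch option; 0 means "no limit".
$literal = $a -split 'OU=', 0, 'SimpleMatch'
$literal.Count     # 3 tokens: '', 'RAH,', 'RAC'

# Limit the number of tokens; the last token keeps the unsplit remainder.
$limited = 'a,b,c,d' -split ',', 2
$limited           # 'a' and 'b,c,d'

# Script block: split at every character for which the block returns $true.
$byBlock = $a -split { $_ -eq ',' }
$byBlock           # 'OU=RAH' and 'OU=RAC'
```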
By contrast, in Windows PowerShell use of the .NET (FullCLR, up to 4.x) String.Split() method, as you've tried, works very differently:
'OU=RAH,OU=RAC'.Split('OU=')
OU= is interpreted as an array of characters, any of which individually acts as a separator, irrespective of the order in which the characters are specified. Leading, adjacent, and trailing separators are by default considered to separate empty tokens, so you get an array of 7 tokens:
@( '', '', '', 'RAH,', '', '', 'RAC')
Note to PowerShell Core users (PowerShell versions 6 and above):
The .NET Core String.Split() method now does have a scalar [string] overload that looks for an entire string as the separator, which PowerShell Core selects by default; to get the character-array behavior described, you must cast to [char[]] explicitly:
'OU=RAH,OU=RAC'.Split([char[]] 'OU=')
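To check the token count without relying on which overload a given PowerShell edition picks, a small sketch along these lines may help. The explicit [char[]] cast forces the character-set behavior in both editions, and the variable name is only illustrative.

```powershell
# Force the character-array overload so the result is the same everywhere.
$tokens = 'OU=RAH,OU=RAC'.Split([char[]] 'OU=')

$tokens.Count                                    # 7 tokens in total
@($tokens | Where-Object { $_ -eq '' }).Count    # 5 of them are empty
@($tokens | Where-Object { $_ -ne '' })          # 'RAH,' and 'RAC'
```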
If you construct the .Split() method call carefully, you can specify strings, but note that you still don't get regular-expression support:
PS> 'OU=RAH,OU=RAC'.split([string[]] 'OU=', 'RemoveEmptyEntries')
RAH,
RAC
This works to split by the literal string OU=, removing empty entries, but as you can see, it doesn't allow you to account for the comma (,).
You can take this further by specifying an array of strings to split by, which works in this simple case, but ultimately doesn't give you the same flexibility as the regular expressions that PowerShell's -split operator provides:
PS> 'OU=RAH,OU=RAC'.split([string[]] ('OU=', ',OU='), 'RemoveEmptyEntries')
RAH
RAC
Note that specifying an (array of) strings requires the 2-argument form of the method call, meaning you must also specify a System.StringSplitOptions enumeration value. Use 'None' to not apply any options (as of this writing, the only true option that is supported is 'RemoveEmptyEntries', as used above).
(The type-safe way to specify the option is to use, e.g., [System.StringSplitOptions]::None; however, passing the option name as a string, e.g. 'None', is a convenient shortcut.)
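To make the difference between the two option values concrete, here is a minimal sketch. The $sep variable is introduced only for illustration; storing the separator as a [string[]] up front is one way to make sure the string-array overload is chosen.

```powershell
$a   = 'OU=RAH,OU=RAC'
$sep = [string[]] @('OU=')    # literal string separator(s)

# None: the empty token before the leading 'OU=' is kept.
$all = $a.Split($sep, [System.StringSplitOptions]::None)
$all.Count         # 3 tokens: '', 'RAH,', 'RAC'

# RemoveEmptyEntries, passed here by name as a string.
$nonEmpty = $a.Split($sep, 'RemoveEmptyEntries')
$nonEmpty.Count    # 2 tokens: 'RAH,', 'RAC'
```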