I would like to convert a single array into a group of smaller arrays, based on a variable. So, 0,1,2,3,4,5,6,7,8,9 would become the four arrays 0,1,2 and 3,4,5 and 6,7,8 and 9 when the size is 3.
My current approach:
0..[math]::Round($ids.count/$size) | % {
    # slice first elements
    $x = $ids[0..($size-1)]
    # redefine array w/ remaining values
    $ids = $ids[$size..$ids.Length]
    # return elements (as an array, which isn't happening)
    $x
} | % { "IDS: $($_ -Join ",")" }
This outputs:
IDS: 0
IDS: 1
IDS: 2
IDS: 3
IDS: 4
IDS: 5
IDS: 6
IDS: 7
IDS: 8
IDS: 9
I would like it to be:
IDS: 0,1,2
IDS: 3,4,5
IDS: 6,7,8
IDS: 9
What am I missing?
You can use ,$x instead of just $x.
The about_Operators section in the documentation has this:
, Comma operator
As a binary operator, the comma creates an array. As a unary
operator, the comma creates an array with one member. Place the
comma before the member.
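As a rough illustration of that operator applied to the question, the sketch below chunks the sample array with an index-based loop (a variation on the asker's loop, not the original code) and emits each chunk with a leading unary comma so that it travels through the pipeline as one object. The names $chunks, $start, $end, and $lines are made up for this example.

```powershell
$ids  = 0..9
$size = 3

# Collect the chunks; @( ) gathers everything the loop outputs.
$chunks = @(
    for ($start = 0; $start -lt $ids.Count; $start += $size) {
        # Clamp the end index so the last chunk may be shorter.
        $end = [Math]::Min($start + $size, $ids.Count) - 1
        # Unary comma: wrap the slice so it is output as a single object.
        , $ids[$start..$end]
    }
)

$lines = $chunks | ForEach-Object { "IDS: $($_ -join ',')" }
$lines
# IDS: 0,1,2
# IDS: 3,4,5
# IDS: 6,7,8
# IDS: 9
```

Removing the leading comma makes the pipeline unroll each slice again, which reproduces the one-element-per-line output from the question.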
For the sake of completeness:
function Slice-Array
{
    param (
        [Parameter(Mandatory=$true, Position=0, ValueFromPipeline=$True)]
        [object[]] $Item,
        [int] $Size = 10
    )
    BEGIN { $Items=@()}
    PROCESS {
        # collect all pipeline input first
        foreach ($i in $Item ) { $Items += $i }
    }
    END {
        0..[math]::Floor($Items.count/$Size) | ForEach-Object {
            # take the first $Size elements, keep the rest, and output the slice as one object
            $x, $Items = $Items[0..($Size-1)], $Items[$Size..$Items.Length]; ,$x
        }
    }
}
@(0,1,2,3,4,5,6,7,8,9) | Slice-Array -Size 3 | ForEach-Object { "IDs: $($_ -Join ",")" }
Manual Selection:
$ids | Select-Object -First 3 -Skip 0
$ids | Select-Object -First 3 -Skip 3
$ids | Select-Object -First 3 -Skip 6
$ids | Select-Object -First 3 -Skip 9
# Select via looping
$idx = 0
while (($size * $idx) -lt $ids.Length) {
    $group = $ids | Select-Object -First $size -Skip ($size * $idx)
    $group -join ","
    $idx++
}
To add an explanation to Bill Stewart's effective solution:
Outputting a collection such as an array[1], either implicitly or using return, sends its elements individually through the pipeline; that is, the collection is enumerated (unrolled):
# Count objects received.
PS> (1..3 | Measure-Object).Count
3 # Array elements were sent *individually* through the pipeline.
Using the unary form of , (comma; the array-construction operator) to prevent enumeration is a conveniently concise, though somewhat obscure workaround:
PS> (, (1..3) | Measure-Object).Count
1 # By wrapping the array in a helper array, the original array was preserved.
That is, , <collection> creates a transient single-element helper array around the original collection so that the enumeration is only applied to the helper array, outputting the enclosed original collection as-is, as a single object.
A conceptually clearer, but more verbose and slower approach is to use Write-Output -NoEnumerate, which clearly signals the intent to output a collection as a single object.
PS> (Write-Output -NoEnumerate (1..3) | Measure-Object).Count
1 # Write-Output -NoEnumerate prevented enumeration.
Pitfall with respect to visual inspection:
On outputting for display, the boundaries between multiple arrays are seemingly erased again:
PS> (1..2), (3..4) # Output two arrays without enumeration
1
2
3
4
That is, even though two 2-element arrays were each sent as a single object, the output, by showing each element on its own line, makes it look like a flat 4-element array was received.
A simple way around that is to stringify each array, which turns each array into a string containing a space-separated list of its elements.
PS> (1..2), (3..4) | ForEach-Object { "$_" }
1 2
3 4
Now it is obvious that two separate arrays were received.
[1] What data types are enumerated:
Instances of data types that implement the IEnumerable interface are automatically enumerated, but there are exceptions:
Types that also implement IDictionary, such as hashtables, are not enumerated, and neither are XmlNode instances.
Conversely, instances of DataTable (which doesn't implement IEnumerable) are enumerated (as the elements of their .Rows collection) - see the source code.
Additionally, note that stdout output from external programs is enumerated line by line.
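These rules can be checked with Measure-Object, in the same way as the earlier examples. The following is only a small sketch: the variable names and the single-column table are invented for the demonstration.

```powershell
# An array implements IEnumerable and is unrolled: 3 objects arrive.
$arrayCount = (@(1, 2, 3) | Measure-Object).Count

# A hashtable also implements IDictionary, so it is sent as one object.
$hashCount = (@{ a = 1; b = 2 } | Measure-Object).Count

# A DataTable is enumerated as the rows of its .Rows collection.
$table = New-Object System.Data.DataTable
[void] $table.Columns.Add('n')
foreach ($value in 1, 2) {
    $row = $table.NewRow()
    $row['n'] = $value
    $table.Rows.Add($row)
}
$tableCount = ($table | Measure-Object).Count

"array: $arrayCount, hashtable: $hashCount, DataTable: $tableCount"
```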
Craig himself has conveniently wrapped the splitting (partitioning) functionality in a robust function:
Let me offer a better-performing evolution of it (PSv3+ syntax, renamed to Split-Array), which:
more efficiently collects the input objects using an extensible collection.
doesn't modify the collection during splitting, and instead extracts ranges of elements from it.
function Split-Array {
    param (
        [Parameter(Mandatory, ValueFromPipeline)]
        [String[]] $InputObject,
        [ValidateRange(1, [int]::MaxValue)]
        [int] $Size = 10
    )
    begin { $items = New-Object System.Collections.Generic.List[object] }
    process { $items.AddRange($InputObject) }
    end {
        $chunkCount = [Math]::Floor($items.Count / $Size)
        # Output the full-size chunks; a for loop skips this step cleanly when there are none.
        for ($chunkNdx = 0; $chunkNdx -lt $chunkCount; $chunkNdx++) {
            , $items.GetRange($chunkNdx * $Size, $Size).ToArray()
        }
        # Output the remaining elements, if any, as a final, shorter chunk.
        if ($chunkCount * $Size -lt $items.Count) {
            , $items.GetRange($chunkCount * $Size, $items.Count - $chunkCount * $Size).ToArray()
        }
    }
}
With small input collections, the optimization won't matter much, but once you get into the thousands of elements, the speed-up can be dramatic.
To give a rough sense of the performance improvement, using Time-Command (a custom function, not a built-in cmdlet):
$ids = 0..1e4 # 10,001 numbers (0 through 10,000)
$size = 3 # chunk size
Time-Command { $ids | Split-Array -size $size }, # optimized
{ $ids | Slice-Array -size $size } # original
Sample result from a single-core Windows 10 VM with Windows PowerShell 5.1 (the absolute times aren't important, but the factors are):
Command                        Secs (10-run avg.) TimeSpan         Factor
-------                        ------------------ --------         ------
$ids | Split-Array -size $size 0.150              00:00:00.1498207 1.00
$ids | Slice-Array -size $size 10.382             00:00:10.3820590 69.30
Note how the unoptimized function was almost 70 times slower.
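Part of that gap plausibly comes from how the input is collected: += on an array allocates a new array and copies the existing elements on every addition, whereas a generic list appends in place. The sketch below isolates just that difference using the built-in Measure-Command; $n, $arrayTime, and $listTime are names invented for the example, and the ratio will vary by machine and PowerShell version.

```powershell
$n = 5000

# Array with +=: every addition creates a new array and copies the old elements.
$arrayTime = Measure-Command {
    $array = @()
    foreach ($i in 1..$n) { $array += $i }
}

# Generic list: Add() appends to the existing collection.
$listTime = Measure-Command {
    $list = New-Object System.Collections.Generic.List[object]
    foreach ($i in 1..$n) { $list.Add($i) }
}

'array +=: {0:N0} ms; List.Add: {1:N0} ms' -f $arrayTime.TotalMilliseconds, $listTime.TotalMilliseconds
```

Raising $n makes the difference more visible, since the copying cost of += grows with the size of the array.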