Improve PowerShell Performance to Generate a Random File

花落未央 2020-12-10 23:09

I'd like to use PowerShell to create a random text file for use in basic system testing (upload, download, checksum, etc.). I've used the following articles and co

4 Answers
  • 2020-12-10 23:37

    Agree with @dugas that the bottleneck is calling Get-Random for every character.

    You should be able to achieve nearly the same randomness if you enlarge your character array and use the -Count parameter of Get-Random.

    If you have PowerShell V4, the .ForEach() method is considerably faster than ForEach-Object.

    I also traded Out-File for Add-Content, which should help as well.

    # select characters from 0-9, A-Z, and a-z
    $chars = [char[]] ([char]'0'..[char]'9' + [char]'A'..[char]'Z' + [char]'a'..[char]'z')
    $chars = $chars * 126
    # write file using 128 byte lines each with 126 random characters
    (1..(1mb/128)).foreach({-join (Get-Random $chars -Count 126) | add-content testfile.txt }) 
    

    That finished in about 32 seconds on my system.
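
    One detail worth spelling out, as a small sketch rather than a benchmark: Get-Random -Count picks elements without replacement, so the 62-character set on its own could never repeat a character within a line. Repeating the set first (the `$chars * 126` step above) is what allows duplicates. The variable names below are illustrative, not from the original answer.

    ```powershell
    # 62-character alphabet: 0-9, A-Z, a-z
    $set = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'.ToCharArray()

    # Drawing all 62 elements without replacement is just a shuffle:
    # every character appears exactly once
    $shuffled = Get-Random -InputObject $set -Count 62
    $distinct = @($shuffled | ForEach-Object { [int]$_ } | Sort-Object -Unique).Count

    # Repeating the set first means a character can be drawn more than once
    $pool = $set * 126                                   # 62 * 126 elements
    $line = -join (Get-Random -InputObject $pool -Count 126)
    ```

    Here `$distinct` equals the size of the set, while `$line` is a 126-character string that may contain repeated characters.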

    Edit: Set-Content vs Out-File, using the generated test file:

    $x = Get-Content testfile.txt
    
    (Measure-Command {$x | out-file testfile1.txt}).totalmilliseconds
    (Measure-Command {$x | Set-Content testfile1.txt}).totalmilliseconds
    
    504.0069
    159.0842
    
  • 2020-12-10 23:37

    Instead of using Get-Random to generate the text as per mjolinor's suggestion, I improved the speed by using GUIDs.

    Function New-RandomFile {
        Param(
            $Path = '.', 
            $FileSize = 1kb, 
            $FileName = [guid]::NewGuid().Guid + '.txt'
            ) 
        (1..($FileSize/128)).foreach({-join ([guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid -Replace "-").SubString(1, 126) }) | set-content "$Path\$FileName"
    }
    

    I ran both versions with Measure-Command. The original code took 1.36 seconds.

    This one took 491 milliseconds. Running:

    New-RandomFile -FileSize 1mb
    

    UPDATE:

    I've updated my function to use a ScriptBlock, so you can replace the 'NewGuid()' method with anything you want.

    In this scenario, I make 1kb chunks, since I know I'm never creating smaller files. This improved the speed of my function drastically!

    Set-Content forces a newline at the end, which is why you need to remove 2 characters each time you write to the file. I've replaced it with [io.file]::WriteAllText() instead.

    Function New-RandomFile_1kChunks {
        Param(
            $Path = (Resolve-Path '.').Path, 
            $FileSize = 1kb, 
            $FileName = [guid]::NewGuid().Guid + '.txt'
            ) 
    
        $Chunk = { [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid -Replace "-" }
    
        $Chunks = [math]::Ceiling($FileSize/1kb)
    
        [io.file]::WriteAllText("$Path\$FileName","$(-Join (1..($Chunks)).foreach({ $Chunk.Invoke() }))")
    
        Write-Warning "New-RandomFile: $Path\$FileName"
    
    }
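
    The Set-Content newline mentioned above can be checked directly. This is a minimal sketch that writes the same 1 KB string both ways to scratch files in the temp directory (the file names are made up for illustration) and compares the resulting sizes.

    ```powershell
    # Scratch files in the temp directory (hypothetical names, for illustration only)
    $tmp = [System.IO.Path]::GetTempPath()
    $a = Join-Path $tmp ([guid]::NewGuid().ToString('N') + '-set.txt')
    $b = Join-Path $tmp ([guid]::NewGuid().ToString('N') + '-raw.txt')

    $text = 'x' * 1kb                           # exactly 1024 ASCII characters

    Set-Content -Path $a -Value $text           # appends a newline after the value
    [System.IO.File]::WriteAllText($b, $text)   # writes the string and nothing else

    $sizeSet = (Get-Item $a).Length             # 1024 plus the platform newline
    $sizeRaw = (Get-Item $b).Length             # 1024

    Remove-Item $a, $b
    ```

    Note that .NET methods such as WriteAllText resolve relative paths against the process working directory, which can differ from the current PowerShell location. That is probably why the function resolves `$Path` to a full path first.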
    

    If you don't care whether every chunk is random, you can simply Invoke() the generation of the 1kb chunk once. This improves the speed drastically, but the file then consists of one chunk repeated, so it is not random as a whole.

    Function New-RandomFile_Fast {
        Param(
            $Path = (Resolve-Path '.').Path, 
            $FileSize = 1kb, 
            $FileName = [guid]::NewGuid().Guid + '.txt'
            ) 
    
        $Chunk = { [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
                   [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid -Replace "-" }
        $Chunks = [math]::Ceiling($FileSize/1kb)
        $ChunkString = $Chunk.Invoke()
    
        [io.file]::WriteAllText("$Path\$FileName","$(-Join (1..($Chunks)).foreach({ $ChunkString }))")
    
        Write-Warning "New-RandomFile: $Path\$FileName"
    
    }
    

    Measure-Command results for each version when generating a 10mb file:

    Executing New-RandomFile: 35.7688241 seconds.

    Executing New-RandomFile_1kChunks: 25.1463777 seconds.

    Executing New-RandomFile_Fast: 1.1626236 seconds.

  • 2020-12-10 23:47

    If you are OK with punctuation, you can use this:

    Add-Type -AssemblyName System.Web
    #get a random filename in the present working directory
    $fn = [System.IO.Path]::Combine($pwd, [GUID]::NewGuid().ToString("N") + '.txt')
    #set number of iterations
    $count = 1mb/128
    do{
      #Write the 126 chars plus eol
      [System.Web.Security.Membership]::GeneratePassword(126,0) | Out-File $fn -Append ascii
      #decrement the counter
      $count--
    }while($count -gt 0)
    

    Which gets you to around 7 seconds. Sample Output:

    0b5rc@EXV|e{kftc+1+Xn$-c%-*9q_9L}p=I=k@zrDg@HaJDcl}B(38i&m{lV@vlq%5h/a?m2X!yo]qs0=pEw:Tn4wb5F$k$O85$8F.QLvUzA{@X2-w%5(3k;BE2Qi
    

    Using a stream writer instead of Out-File -Append avoids the open/close cycle on every line and drops the same job to 62 milliseconds.

    Add-Type -AssemblyName System.Web
    #get a random filename in the present working directory
    $fn = [System.IO.Path]::Combine($pwd, [GUID]::NewGuid().ToString("N") + '.txt')
    #set number of iterations
    $count = 1mb/128
    #create a filestream
    $fs = New-Object System.IO.FileStream($fn,[System.IO.FileMode]::CreateNew)
    #create a streamwriter
    $sw = New-Object System.IO.StreamWriter($fs,[System.Text.Encoding]::ASCII,128)
    do{
         #Write the 126 chars plus eol
         $sw.WriteLine([System.Web.Security.Membership]::GeneratePassword(126,0))
         #decrement the counter
         $count--
    }while($count -gt 0)
    #close the streamwriter
    $sw.Close()
    #close the filestream
    $fs.Close()
    

    You could also use a StringBuilder and GUIDs to generate pseudorandom digits and lowercase letters.

    #get a random filename in the present working directory
    $fn = [System.IO.Path]::Combine($pwd, [GUID]::NewGuid().ToString("N") + '.txt')
    #set number of iterations
    $count = 1mb/128
    #create a filestream
    $fs = New-Object System.IO.FileStream($fn,[System.IO.FileMode]::CreateNew)
    #create a streamwriter
    $sw = New-Object System.IO.StreamWriter($fs,[System.Text.Encoding]::ASCII,128)
    do{
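        #capacity and max capacity are both 126, but four 32-char GUIDs total 128,
        #so the final Append overflows; 2> $null hides that error
        $sb = New-Object System.Text.StringBuilder 126,126
        0..3 | %{$sb.Append([GUID]::NewGuid().ToString("N"))} 2> $null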
        $sw.WriteLine($sb.ToString())
        #decrement the counter
        $count--
    }while($count -gt 0)
    #close the streamwriter
    $sw.Close()
    #close the filestream
    $fs.Close()
    

    This takes about 4 seconds and generates the following sample:

    1fef6ccabc624e4dbe13a0415764fd2c58aa873377c7465eaecabdf6ba6fdf71c55496600a374c4c8cff75be46b1fe474230231ffccc4e3aa2753391afb32c
    

    If you are hell-bent on using the same chars as in your sample, you can do so with the following:

    #get a random filename in the present working directory
    $fn = [System.IO.Path]::Combine($pwd, [GUID]::NewGuid().ToString("N") + '.txt')
    #array of valid chars
    $chars = [char[]] ([char]'0'..[char]'9' + [char]'A'..[char]'Z' + [char]'a'..[char]'z')
    #create a random object
    $rand = New-Object System.Random
    #set number of iterations
    $count = 1mb/128
    #get length of valid character array
    $charslength = $chars.length
    #create a filestream
    $fs = New-Object System.IO.FileStream($fn,[System.IO.FileMode]::CreateNew)
    #create a streamwriter
    $sw = New-Object System.IO.StreamWriter($fs,[System.Text.Encoding]::ASCII,128)
    do{
        #get 126 random chars; this is the major slowdown
        $randchars = 1..126 | %{$chars[$rand.Next(0,$charslength)]}
        #Write the 126 chars plus eol
        $sw.WriteLine([System.Text.Encoding]::ASCII.GetString($randchars))
        #decrement the counter
        $count--
    }while($count -gt 0)
    #close the streamwriter
    $sw.Close()
    #close the filestream
    $fs.Close()
    

    This takes ~27 seconds and generates the following sample:

    Fev31lweOXaYKELzWOo1YJn8LpZoxonWjxQYhgZbR62EmgjHit5J1LrvqniBB7hZj4pNonIpoCZSHYLf5H63iUUN6UhtyOQKPSViqMTvbGUomPeIR36t1drEZSHJ6O
    

    Indexing the char array one character at a time, and Out-File -Append opening and closing the file on every write, are the major slowdowns.
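
    One way to reduce the per-character cost, offered as an untimed sketch and not as part of the measurements above, is to fill a reusable char buffer with plain for loops instead of sending each character through the pipeline. The `$size` variable and the temp-directory file name are assumptions for the demo. The explicit `NewLine` setting pins the line ending to 2 bytes so each line is exactly 128 bytes on any platform.

    ```powershell
    $chars = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'.ToCharArray()
    $rand  = New-Object System.Random
    $size  = 64kb    # small demo size; use 1mb to match the runs above
    $fn    = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString('N') + '.txt')

    # overwrite mode ($false = do not append), ASCII encoding
    $sw = New-Object System.IO.StreamWriter($fn, $false, [System.Text.Encoding]::ASCII)
    $sw.NewLine = "`r`n"                      # 126 chars + 2-byte EOL = 128 bytes per line
    try {
        $line = New-Object char[] 126         # buffer reused for every line
        for ($i = 0; $i -lt $size / 128; $i++) {
            for ($j = 0; $j -lt 126; $j++) {
                $line[$j] = $chars[$rand.Next($chars.Length)]
            }
            $sw.WriteLine((-join $line))
        }
    }
    finally {
        $sw.Dispose()                         # flushes and closes the underlying file
    }
    $written = (Get-Item $fn).Length
    ```

    The try/finally ensures the file handle is released even if the loop throws, which the Close() calls at the end of the scripts above would not guarantee.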

  • 2020-12-10 23:48

    One of the bottlenecks is calling the Get-Random cmdlet once per character in the loop. On my machine that join takes ~40ms. If you change to something like:

    %{ -join ((get-random -InputObject $chars -Count 62) + (get-random -InputObject $chars -Count 62) + (get-random -InputObject $chars -Count 2)) }
    

    it is reduced to ~1ms.
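
    To reproduce this kind of comparison on your own machine, a rough sketch along these lines should do. The timings will vary by system, so no figures are claimed here, and the variable names are illustrative.

    ```powershell
    $chars = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'.ToCharArray()

    # One cmdlet invocation per character: 126 calls to build a single line
    $perChar = Measure-Command {
        $null = -join (1..126 | ForEach-Object { Get-Random -InputObject $chars })
    }

    # One invocation per line, drawing 126 characters from a repeated pool
    $batched = Measure-Command {
        $null = -join (Get-Random -InputObject ($chars * 126) -Count 126)
    }

    '{0:N1} ms per-character vs {1:N1} ms batched' -f $perChar.TotalMilliseconds, $batched.TotalMilliseconds
    ```

    A single run is noisy, so repeating each measurement a few times gives a fairer picture.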
