Batch file to split .csv file

醉酒成梦 2020-11-29 04:44

I have a very large .csv file (>500 MB) and I wish to break it up into smaller .csv files from the command prompt. (Basically trying to find a Linux "split" function in Wi

6 answers
  • 2020-11-29 05:02

    Try this out:

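    @echo off
    setLocal EnableDelayedExpansion
    
    rem Maximum number of lines per output file, and the file to split
    set limit=20000
    set "file=export.csv"
    set lineCounter=1
    set filenameCounter=1
    
    rem Separate the input file name into base name and extension
    for %%a in ("%file%") do (
        set "name=%%~na"
        set "extension=%%~xa"
    )
    set "splitFile=%name%-part%filenameCounter%%extension%"
    
    rem Caveats: for /f skips empty lines, and delayed expansion drops "!" from the data
    for /f "usebackq delims=" %%a in ("%file%") do (
        if !lineCounter! gtr !limit! (
            echo Created !splitFile!.
            set /a filenameCounter+=1
            set lineCounter=1
            rem Switch to the next part before writing, so no part exceeds the limit
            set "splitFile=!name!-part!filenameCounter!!extension!"
        )
        rem Redirection comes first so a line ending in a digit is not read as a handle
        >>"!splitFile!" echo(%%a
        set /a lineCounter+=1
    )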
    

    The script above splits the original CSV file into multiple CSV files with a limit of 20,000 lines each. To adapt it, change the file and limit variables accordingly. Hope it helps.
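
    For comparison, the same line-count split can be sketched outside of batch. This is a minimal sketch in Python, which nobody in this thread uses, so treat it as an assumption about the environment; the function name split_by_lines is hypothetical. It follows the name-partN.ext naming of the script above and works on raw bytes, so it does not need to know the file's encoding.

```python
import os
import tempfile


def split_by_lines(path, limit):
    """Split `path` into parts of at most `limit` lines; return the part names."""
    base, ext = os.path.splitext(path)
    parts = []
    out = None
    # Binary mode keeps the original line endings and needs no encoding guess.
    with open(path, "rb") as src:
        for index, line in enumerate(src):
            if index % limit == 0:
                # Every `limit` lines, close the current part and open the next.
                if out is not None:
                    out.close()
                part_name = f"{base}-part{len(parts) + 1}{ext}"
                out = open(part_name, "wb")
                parts.append(part_name)
            out.write(line)
    if out is not None:
        out.close()
    return parts


if __name__ == "__main__":
    # Demonstration on a temporary 5-line file, split 2 lines at a time.
    demo = os.path.join(tempfile.mkdtemp(), "export.csv")
    with open(demo, "wb") as f:
        f.write(b"".join(b"row%d,value%d\n" % (i, i) for i in range(5)))
    for name in split_by_lines(demo, 2):
        print(os.path.basename(name))
```

    Like the batch version, this does not repeat the CSV header row in each part.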

  • 2020-11-29 05:08

    This will give you lines 1 to 20000 in newfile1.csv
    and lines 20001 to the end in file newfile2.csv

    It overcomes the 8K character limit per line too.

    This uses a helper batch file called findrepl.bat, available from https://www.dropbox.com/s/rfdldmcb6vwi9xc/findrepl.bat

    Place findrepl.bat in the same folder as the batch file or on the path.

    It's more robust than a plain batch file, and quicker too.

    findrepl /o:1:20000 <file.csv >newfile1.csv
    findrepl /o:20001   <file.csv >newfile2.csv
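
    The same "lines 1 to N into one file, the rest into another" idea can be sketched without the helper. This is only an illustration in Python (not something this answer uses), and copy_line_range is a hypothetical name. It is not a reimplementation of findrepl.bat.

```python
import itertools
import os
import tempfile


def copy_line_range(src_path, dst_path, first, last=None):
    """Copy lines first..last (1-based, inclusive) from src_path to dst_path.

    With last=None, everything from `first` to the end of the file is copied.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # islice uses a 0-based start and an exclusive stop.
        dst.writelines(itertools.islice(src, first - 1, last))


if __name__ == "__main__":
    # Demonstration: a 5-line file cut after line 3.
    folder = tempfile.mkdtemp()
    source = os.path.join(folder, "file.csv")
    with open(source, "wb") as f:
        f.write(b"1\n2\n3\n4\n5\n")
    copy_line_range(source, os.path.join(folder, "newfile1.csv"), 1, 3)
    copy_line_range(source, os.path.join(folder, "newfile2.csv"), 4)
    print(sorted(os.listdir(folder)))
```

    Because the file is read line by line in binary mode, line length is not a concern here either.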
    
  • 2020-11-29 05:14

    Use the Cygwin command split. Examples:

    To split a file every 500 lines:

    split -l 500 [filename.ext]
    

    By default, the output files are named xaa, xab, xac, and so on; the names are not derived from the input filename.

    To generate files with numeric suffixes that end in the correct extension, use the following:

    split -l 1000 sourcefilename.ext destinationfilename -d --additional-suffix=.ext
    

    The position of -d or -l does not matter.

    • "-d" is the same as --numeric-suffixes
    • "-l" is the same as --lines

    For more: split --help

  • 2020-11-29 05:15

    A free Windows app that does this:

    http://www.addictivetips.com/windows-tips/csv-splitter-for-windows/

  • 2020-11-29 05:15

    I found this question while looking for a similar solution. I modified the answer that @Dale gave to suit my purposes. I wanted something that was a little more flexible and had some error trapping. Just thought I might put it here for anyone looking for the same thing.

    @echo off
    setLocal EnableDelayedExpansion
    GOTO checkvars
    
    rem Expected arguments: -f Filename -n RowsPerFile
    :checkvars
        IF "%~1"=="" GOTO syntaxerror
        IF NOT "%~1"=="-f"  GOTO syntaxerror
        IF "%~2"=="" GOTO syntaxerror
        IF NOT EXIST "%~2" GOTO nofile
        IF "%~3"=="" GOTO syntaxerror
        IF NOT "%~3"=="-n" GOTO syntaxerror
        IF "%~4"==""  GOTO syntaxerror
        set "param=%~4"
        rem Check that the row count is a whole number
        echo %param%| findstr /xr "[1-9][0-9]* 0" >nul && (
            goto proceed
        ) || (
            echo %param% is NOT a valid number
            goto syntaxerror
        )
    
    :proceed
        set limit=%param%
        set "file=%~2"
        rem Start above the limit so the first line opens the first output file
        set /a lineCounter=%limit% + 1
        set filenameCounter=0
    
        set name=
        set extension=
    
        for %%a in ("%file%") do (
            set "name=%%~na"
            set "extension=%%~xa"
        )
    
        rem The per-line cls and progress echo are slow on large files
        for /f "usebackq tokens=*" %%a in ("%file%") do (
            if !lineCounter! gtr !limit! (
                set "splitFile=!name!_part!filenameCounter!!extension!"
                set /a filenameCounter=!filenameCounter! + 1
                set lineCounter=1
                echo Created !splitFile!.
            )
            cls
            echo Adding Line !splitFile! - !lineCounter!
            >>"!splitFile!" echo(%%a
            set /a lineCounter=!lineCounter! + 1
        )
        echo Done!
        goto end
    :syntaxerror
        Echo Syntax: %0 -f Filename -n "Number Of Rows Per File"
        goto end
    :nofile
        echo "%~2" does not exist
        goto end
    :end
    
  • 2020-11-29 05:24

    For splitting very large files, the solution I found is an adaptation of existing answers (see the links in the script comments), with PowerShell "embedded" in a batch file. It runs fast, unlike many other things I tried (I have not tested the other options posted here).

    The way to use mysplit.bat below is

    mysplit.bat <mysize> 'myfile'

    Note: The script was intended to use the first argument as the split size, but the size is currently hardcoded at 100 MB. This should not be difficult to fix.

    Note 2: The filename should be enclosed in single quotes. Other quoting styles apparently do not work.

    Note 3: It splits the file at a given number of bytes, not at a given number of lines. For me this was good enough. A few more lines of code could probably extend each chunk up to the next CR/LF, so that the file is split on full lines (though not a constant number of them) with no sacrifice in processing time.

    Script mysplit.bat:

    @REM Using https://stackoverflow.com/questions/19335004/how-to-run-a-powershell-script-from-a-batch-file
    @REM and https://stackoverflow.com/questions/1001776/how-can-i-split-a-text-file-using-powershell
    @PowerShell  ^
        $upperBound = 100MB;  ^
        $rootName = %2;  ^
        $from = $rootName;  ^
        $fromFile = [io.file]::OpenRead($from);  ^
        $buff = new-object byte[] $upperBound;  ^
        $count = $idx = 0;  ^
        try {  ^
            do {  ^
                'Reading ' + $upperBound;  ^
                $count = $fromFile.Read($buff, 0, $buff.Length);  ^
                if ($count -gt 0) {  ^
                    $to = '{0}.{1}' -f ($rootName, $idx);  ^
                    $toFile = [io.file]::OpenWrite($to);  ^
                    try {  ^
                        'Writing ' + $count + ' to ' + $to;  ^
                        $tofile.Write($buff, 0, $count);  ^
                    } finally {  ^
                        $tofile.Close();  ^
                    }  ^
                }  ^
                $idx ++;  ^
            } while ($count -gt 0);  ^
        }  ^
        finally {  ^
            $fromFile.Close();  ^
        }  ^
    %End PowerShell%
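
    The refinement described in Note 3 (read a fixed-size chunk, then extend it to the next line break) could look roughly like the following. This is a sketch in Python rather than in the PowerShell used above, and split_by_size is a hypothetical name. It keeps the script's rootName.idx naming for the output files.

```python
import os
import tempfile


def split_by_size(path, chunk_size):
    """Split `path` into pieces of about chunk_size bytes, cut at line ends."""
    parts = []
    with open(path, "rb") as src:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            if not chunk.endswith(b"\n"):
                # Read on to the next newline so no row is cut in half.
                chunk += src.readline()
            part_name = f"{path}.{len(parts)}"
            with open(part_name, "wb") as dst:
                dst.write(chunk)
            parts.append(part_name)
    return parts


if __name__ == "__main__":
    # Demonstration: three 4-byte rows, split with a 5-byte chunk size.
    demo = os.path.join(tempfile.mkdtemp(), "myfile.csv")
    with open(demo, "wb") as f:
        f.write(b"aaa\nbbb\nccc\n")
    for name in split_by_size(demo, 5):
        print(os.path.basename(name), os.path.getsize(name))
```

    Each piece is held in memory while it is written, just like the 100 MB buffer in the script, so chunk_size should stay well below the available RAM.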
    