bat file to replace string in text file

失恋的感觉 2021-01-22 04:23

This question has been asked a lot on Stack Overflow, but I can't seem to make it work. Any hints appreciated. Here is a text file (extension .mpl) containing offendi

4 answers
  • 2021-01-22 04:29

    You defined delims=<space>. That is a bad idea if you want to preserve your lines, because each line is split at the first space.
    You should change this to FOR /F "tokens=* delims=" ....

    Your echo !str! >> testCleaned.mpl will always append one extra space to each line. It is better to use echo(!str!>>testCleaned.mpl.

    You will also lose all empty lines, and all exclamation marks in all lines.

    You could also try the code of Improved BatchSubstitute.bat

  • 2021-01-22 04:36

    The biggest problem with your existing code is that SetLocal enableDelayedExpansion is misplaced: it should be inside the loop, after set str=%%I.

    Other problems:

    • will strip lines beginning with ;
    • will strip leading spaces from each line
    • will strip blank (empty) lines
    • will print ECHO is off if any line becomes empty or contains only spaces after substitution
    • will add an extra space at the end of each line (I didn't notice this until I read jeb's answer)

    Optimization issue: using >> inside the loop can be relatively slow. It is faster to enclose the whole loop in ( ) and redirect the output once with >.

    Below is about the best you can do with Windows batch. I auto-named the output as requested and went one better: it automatically preserves the extension of the original name.

    @echo off
    SetLocal
    cd /d %~dp0
    Set "OldString=[HFloat(undefined),HFloat(undefined),HFloat(undefined)],"
    Set "NewString="
    set file="test.mpl"
    for %%F in (%file%) do set outFile="%%~nFCleaned%%~xF"
    pause
    (
      for /f "skip=2 delims=" %%a in ('find /n /v "" %file%') do (
        set "ln=%%a"
        setlocal enableDelayedExpansion
        set "ln=!ln:*]=!"
        if defined ln set "ln=!ln:%OldString%=%NewString%!"
        echo(!ln!
        endlocal
      )
    )>%outFile%
    

    Known limitations

    • limited to slightly under 8k per line, both before and after substitution
    • search string cannot include = or !, nor can it start with * or ~
    • replacement string cannot include !
    • search part of search and replace is case insensitive
    • last line will always end with newline <CR><LF> even if original did not

    All but the first limitation could be eliminated, but it would require a lot of code and would be horrifically slow. The solution would require a character-by-character search of each line. The last limitation would require some awkward test to determine whether the last line was newline terminated, and then the last line would have to be printed using the <nul SET /P "ln=!ln!" trick if no newline is wanted.
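
    For contrast, the limitations above are specific to batch string substitution. A minimal sketch in Python (not part of the original answer, and it assumes Python is installed; the function name replace_keep_endings is made up for illustration) does a case-sensitive literal replacement, accepts = and ! in either string, and leaves each line's own terminator alone, so a final line without a newline stays that way:

```python
def replace_keep_endings(text: str, old: str, new: str) -> str:
    """Case-sensitive literal replacement, line by line.

    splitlines(keepends=True) keeps each line's original terminator
    (<LF> or <CR><LF>), so nothing is converted or appended.
    """
    if not old:
        raise ValueError("search string must not be empty")
    out = []
    for line in text.splitlines(keepends=True):
        out.append(line.replace(old, new))
    return "".join(out)


# Mixed case, '=' and '!' in the data, and no newline after the last line.
sample = "a=[X!],b\r\nA=[x!],B"
result = replace_keep_endings(sample, "[X!],", "")
print(repr(result))
```

    Only the first line changes here, because the search is case sensitive and the second line contains a lowercase x.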

    Interesting feature (or limitation, depending on perspective)

    • Unix style files ending lines with <LF> will be converted to Windows style with lines ending with <CR><LF>

    There are other solutions using batch that are significantly faster, but they all have more limitations.

    Update - I've posted a new pure batch solution that is able to do case sensitive searches and has no restrictions on search or replacement string content. It does have more restrictions on line length, trailing control characters, and line format. Performance is not bad, especially if the number of replacements is low. http://www.dostips.com/forum/viewtopic.php?f=3&t=2710

    Addendum

    Based on comments below, a batch solution will not work for this particular problem because of line length limitation.

    But this code is a good basis for a batch based search and replace utility, as long as you are willing to put up with the limitations and relatively poor performance of batch.

    There are much better text processing tools available, though they are not standard with Windows. My favorite is sed within the GNU Utilities for Win32 package. The utilities are free, and do not require any installation.

    Here is a sed solution for Windows using GNU utilities

    @echo off
    setlocal
    cd /d %~dp0
    Set "OldString=\[HFloat(undefined),HFloat(undefined),HFloat(undefined)\],"
    Set "NewString="
    set file="test.mpl"
    for %%F in (%file%) do set outFile="%%~nFCleaned%%~xF"
    pause
    sed -e"s/%OldString%/%NewString%/g" <%file% >%outfile%
    


    Update 2013-02-19

    sed may not be an option if you work at a site that has rules forbidding the installation of executables downloaded from the web.

    JScript has good regular expression handling, and it is standard on all modern Windows platforms, including XP. It is a good choice for performing search and replace operations on Windows platforms.

    I have written a hybrid JScript/Batch search and replace script (REPL.BAT) that is easy to call from a batch script. A small amount of code gives a lot of powerful features; not as powerful as sed, but more than enough to handle this task, as well as many others. It is also quite fast, much faster than any pure batch solution. It also does not have any inherent line length limitations.

    Here is a batch script that uses my REPL.BAT utility to accomplish the task.

    @echo off
    setlocal
    cd /d %~dp0
    Set "OldString=[HFloat(undefined),HFloat(undefined),HFloat(undefined)],"
    Set "NewString="
    set file="test.mpl"
    for %%F in (%file%) do set outFile="%%~nFCleaned%%~xF"
    pause
    call repl OldString NewString le <%file% >%outfile%
    

    I use the L option to specify a literal search string instead of a regular expression, and the E option to pass the search and replace strings via environment variables by name, instead of using string literals on the command line.
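
    The difference between a literal search string and a regular expression is the reason the sed version escapes the square brackets while the REPL.BAT call does not. As an illustration only, here is the same idea in Python rather than JScript (re.escape plays roughly the role of the L option; the helper name replace_literal_via_regex and the sample line are hypothetical):

```python
import re

old = "[HFloat(undefined),HFloat(undefined),HFloat(undefined)],"
line = ("[1.0,2.0,3.0],"
        "[HFloat(undefined),HFloat(undefined),HFloat(undefined)],"
        "[4.0,5.0,6.0]")


def replace_literal_via_regex(text: str, search: str, replacement: str) -> str:
    """Treat both the search string and the replacement as plain text."""
    pattern = re.compile(re.escape(search))   # escape [ ] ( ) and friends
    # A function replacement is inserted as-is, so backslashes in it
    # are not interpreted as group references.
    return pattern.sub(lambda _match: replacement, text)


# Unescaped, "[...]" is read as a character class, so the pattern matches
# one character from that class followed by a comma, not the whole string.
naive = re.sub(old, "", line)

cleaned = replace_literal_via_regex(line, old, "")
print(cleaned)
```

    The naive call removes a few stray characters and leaves most of the offending text in place, while the escaped pattern removes exactly the literal string.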

    Here is the REPL.BAT utility script that the above code calls. Full documentation is included within the script.

    @if (@X)==(@Y) @end /* Harmless hybrid line that begins a JScript comment
    
    ::************ Documentation ***********
    :::
    :::REPL  Search  Replace  [Options  [SourceVar]]
    :::REPL  /?
    :::
    :::  Performs a global search and replace operation on each line of input from
    :::  stdin and prints the result to stdout.
    :::
    :::  Each parameter may be optionally enclosed by double quotes. The double
    :::  quotes are not considered part of the argument. The quotes are required
    :::  if the parameter contains a batch token delimiter like space, tab, comma,
    :::  semicolon. The quotes should also be used if the argument contains a
    :::  batch special character like &, |, etc. so that the special character
    :::  does not need to be escaped with ^.
    :::
    :::  If called with a single argument of /? then prints help documentation
    :::  to stdout.
    :::
    :::  Search  - By default this is a case sensitive JScript (ECMA) regular
    :::            expression expressed as a string.
    :::
    :::            JScript syntax documentation is available at
    :::            http://msdn.microsoft.com/en-us/library/ae5bf541(v=vs.80).aspx
    :::
    :::  Replace - By default this is the string to be used as a replacement for
    :::            each found search expression. Full support is provided for
    :::            substitution patterns available to the JScript replace method.
    :::            A $ literal can be escaped as $$. An empty replacement string
    :::            must be represented as "".
    :::
    :::            Replace substitution pattern syntax is documented at
    :::            http://msdn.microsoft.com/en-US/library/efy6s3e6(v=vs.80).aspx
    :::
    :::  Options - An optional string of characters used to alter the behavior
    :::            of REPL. The option characters are case insensitive, and may
    :::            appear in any order.
    :::
    :::            I - Makes the search case-insensitive.
    :::
    :::            L - The Search is treated as a string literal instead of a
    :::                regular expression. Also, all $ found in Replace are
    :::                treated as $ literals.
    :::
    :::            E - Search and Replace represent the name of environment
    :::                variables that contain the respective values. An undefined
    :::                variable is treated as an empty string.
    :::
    :::            M - Multi-line mode. The entire contents of stdin is read and
    :::                processed in one pass instead of line by line. ^ anchors
    :::                the beginning of a line and $ anchors the end of a line.
    :::
    :::            X - Enables extended substitution pattern syntax with support
    :::                for the following escape sequences:
    :::
    :::                \\     -  Backslash
    :::                \b     -  Backspace
    :::                \f     -  Formfeed
    :::                \n     -  Newline
    :::                \r     -  Carriage Return
    :::                \t     -  Horizontal Tab
    :::                \v     -  Vertical Tab
    :::                \xnn   -  Ascii (Latin 1) character expressed as 2 hex digits
    :::                \unnnn -  Unicode character expressed as 4 hex digits
    :::
    :::                Escape sequences are supported even when the L option is used.
    :::
    :::            S - The source is read from an environment variable instead of
    :::                from stdin. The name of the source environment variable is
    :::                specified in the next argument after the option string.
    :::
    
    ::************ Batch portion ***********
    @echo off
    if .%2 equ . (
      if "%~1" equ "/?" (
        findstr "^:::" "%~f0" | cscript //E:JScript //nologo "%~f0" "^:::" ""
        exit /b 0
      ) else (
        call :err "Insufficient arguments"
        exit /b 1
      )
    )
    echo(%~3|findstr /i "[^SMILEX]" >nul && (
      call :err "Invalid option(s)"
      exit /b 1
    )
    cscript //E:JScript //nologo "%~f0" %*
    exit /b 0
    
    :err
    >&2 echo ERROR: %~1. Use REPL /? to get help.
    exit /b
    
    ************* JScript portion **********/
    var env=WScript.CreateObject("WScript.Shell").Environment("Process");
    var args=WScript.Arguments;
    var search=args.Item(0);
    var replace=args.Item(1);
    var options="g";
    if (args.length>2) {
      options+=args.Item(2).toLowerCase();
    }
    var multi=(options.indexOf("m")>=0);
    var srcVar=(options.indexOf("s")>=0);
    if (srcVar) {
      options=options.replace(/s/g,"");
    }
    if (options.indexOf("e")>=0) {
      options=options.replace(/e/g,"");
      search=env(search);
      replace=env(replace);
    }
    if (options.indexOf("l")>=0) {
      options=options.replace(/l/g,"");
      search=search.replace(/([.^$*+?()[{\\|])/g,"\\$1");
      replace=replace.replace(/\$/g,"$$$$");
    }
    if (options.indexOf("x")>=0) {
      options=options.replace(/x/g,"");
      replace=replace.replace(/\\\\/g,"\\B");
      replace=replace.replace(/\\b/g,"\b");
      replace=replace.replace(/\\f/g,"\f");
      replace=replace.replace(/\\n/g,"\n");
      replace=replace.replace(/\\r/g,"\r");
      replace=replace.replace(/\\t/g,"\t");
      replace=replace.replace(/\\v/g,"\v");
      replace=replace.replace(/\\x[0-9a-fA-F]{2}|\\u[0-9a-fA-F]{4}/g,
        function($0,$1,$2){
          return String.fromCharCode(parseInt("0x"+$0.substring(2)));
        }
      );
      replace=replace.replace(/\\B/g,"\\");
    }
    var search=new RegExp(search,options);
    
    if (srcVar) {
      WScript.Stdout.Write(env(args.Item(3)).replace(search,replace));
    } else {
      while (!WScript.StdIn.AtEndOfStream) {
        if (multi) {
          WScript.Stdout.Write(WScript.StdIn.ReadAll().replace(search,replace));
        } else {
          WScript.Stdout.WriteLine(WScript.StdIn.ReadLine().replace(search,replace));
        }
      }
    }
    
  • 2021-01-22 04:43

    The Batch file below has the same restrictions as the previous solutions on the characters that can be processed; these restrictions are inherent to all Batch programs. However, this program should run faster if the file is large and only a few lines need replacement. Lines that do not contain the search string are not processed, but copied directly to the output file.

    @echo off
    setlocal EnableDelayedExpansion
    set "oldString=[HFloat(undefined),HFloat(undefined),HFloat(undefined)],"
    set "newString="
    findstr /N ^^ inFile.mpl > numberedFile.tmp
    find /C ":" < numberedFile.tmp > lastLine.tmp
    set /P lastLine=<lastLine.tmp
    del lastLine.tmp
    call :ProcessLines < numberedFile.tmp > outFile.mpl
    del numberedFile.tmp
    goto :EOF
    
    :ProcessLines
    set lastProcessedLine=0
    for /F "delims=:" %%a in ('findstr /N /C:"%oldString%" inFile.mpl') do (
        call :copyUpToLine %%a
        echo(!line:%oldString%=%newString%!
    )
    set /A linesToCopy=lastLine-lastProcessedLine
    for /L %%i in (1,1,%linesToCopy%) do (
        set /P line=
        echo(!line:*:=!
    )
    exit /B
    
    :copyUpToLine number
    set /A linesToCopy=%1-lastProcessedLine-1
    for /L %%i in (1,1,%linesToCopy%) do (
        set /P line=
        echo(!line:*:=!
    )
    set /P line=
    set line=!line:*:=!
    set lastProcessedLine=%1
    exit /B
    

    I would appreciate it if you could run a timing test on this and the other solutions and post the results.

    EDIT: I replaced the line set /A lastProcessedLine+=linesToCopy+1 with the equivalent but faster set lastProcessedLine=%1.
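
    The idea behind this script, touching only the lines that contain the search string and copying everything else through unchanged, can be sketched in a few lines of Python. This is an illustration that assumes Python is available; replace_in_stream is a hypothetical name, and the containment check is there to mirror the batch logic, not because it is known to be faster in Python:

```python
import io


def replace_in_stream(src, dst, old: str, new: str) -> int:
    """Copy src to dst line by line, rewriting only lines that contain old.

    Returns the number of lines that were changed.
    """
    changed = 0
    for line in src:
        if old in line:
            line = line.replace(old, new)
            changed += 1
        dst.write(line)          # untouched lines are copied verbatim
    return changed


# In-memory stand-ins for the input and output files.
src = io.StringIO("keep me\n\n[X],drop [X],twice\n! bang !\n")
dst = io.StringIO()
count = replace_in_stream(src, dst, "[X],", "")
print(count, repr(dst.getvalue()))
```

    Empty lines and exclamation marks pass through untouched, since nothing in the loop interprets them.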

  • 2021-01-22 04:50

    I'm no expert on batch files, so I can't offer a direct solution to your problem.

    However, to solve your problem, it might be simpler to use an alternative to batch files.

    For example, I'd recommend using http://www.csscript.net/ (if you know C#). This tool lets you run C# files like batch files, while giving you the power to write your script in C# instead of horrible batch file syntax :)

    Another alternative would be Python, if you know Python.

    But I guess the point is, that this kind of task may be easier in another programming language.
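
    As a rough sketch of the Python route, assuming the search string is plain ASCII and the file fits in memory (clean_file is a hypothetical helper, not something from the question), the whole task could look like this:

```python
from pathlib import Path

OLD = "[HFloat(undefined),HFloat(undefined),HFloat(undefined)],"


def clean_file(path, old: str = OLD, new: str = "") -> Path:
    """Write a copy of `path` with every literal `old` replaced by `new`.

    The output is named <stem>Cleaned<ext> next to the input, for example
    test.mpl -> testCleaned.mpl. Working on bytes avoids any newline
    translation and any guess about the file's text encoding.
    """
    src = Path(path)
    out = src.with_name(f"{src.stem}Cleaned{src.suffix}")
    data = src.read_bytes()
    out.write_bytes(data.replace(old.encode("ascii"), new.encode("ascii")))
    return out
```

    Calling clean_file("test.mpl") would then write testCleaned.mpl in the same folder. There is no per-line length limit, and the search is literal and case sensitive.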
