Batch For loop doesn't refresh the file it's pulling from

暖寄归人 2021-01-25 05:21

So I have a for loop that runs a SQL stored procedure once for every line in a file, queue.txt. That all works great; what DOESN'T, however, is that if I

4 answers
  • 2021-01-25 05:44

    Okay, so the solution to my problem that I worked out was to add an extra batch file called co-ordinator.bat. It checks whether busy.txt is present; if it is, it adds the connecting devices to a file, late.txt. At the end of each iteration of the loop, the process checks for late.txt; if it is present, it merges it into queue.txt and then uses a goto out of the loop to the top to re-initialise the for loop.

    Code as such:

    @echo off
    cd "%UserProfile%\Desktop\Scripting\"
    echo words > busy.txt
    :rerun
    
    FOR /f "delims=" %%a in ('type queue.txt') DO (
    IF NOT EXIST reset.sql (
    
    rem Create SQL command
    echo USE dbname> reset.sql
    echo EXEC dbo.sp_ResetSubscription @ClientName = '%%a'>> reset.sql
    echo EXEC dbo.sp_RunClientSnapshot @ClientName = '%%a'>> reset.sql
    echo #################### %date% - %time% ####################################################>> log.txt
    echo Reinitialising '%%a'>> log.txt
    sqlcmd -i "reset.sql">> log.txt
    echo. >> log.txt
    echo ####################################################################################################>> log.txt
    echo. >> log.txt
    
    type queue.txt | findstr /v %%a> new.txt
    type new.txt> queue.txt
    echo New list of laptops waiting:>> log.txt
    type queue.txt>> log.txt
    echo. >> log.txt
    echo ####################################################################################################>> log.txt
    echo. >> log.txt
    
    if exist reset.sql del /f /q reset.sql
    if exist late.txt (
    type late.txt>> queue.txt
    del /f /q late.txt
    goto rerun
    )
    ) 
    )
    
    if exist late.txt del /f /q late.txt
    if exist busy.txt del /f /q busy.txt
    if exist queue.txt del /f /q queue.txt
    if exist new.txt del /f /q new.txt
    
  • 2021-01-25 05:53

    @Myles Gray - Your solution has some problems.

    First the minor problems:

    1) After each iteration of the queue loop, you recreate the queue as the original queue minus the line you are currently working on (you hope! more on that later). After you recreate the queue you append it to your log. That will work, but it seems very inefficient and has the potential of making the log massive and unwieldy. Suppose you have a queue with 10,000 lines. Iteration k leaves 10,000 - k lines, so by the time you have processed your queue you will have written roughly 100 million queue lines, including roughly 50 million (9,999 × 10,000 / 2 = 49,995,000) to your log! That will take a long time to process, even without actually doing your work.

    2) You recreate the queue by using FINDSTR, preserving all lines that don't match your current ID. But this will also strip out subsequent lines if they happen to match your current ID. That might not be a problem. But you are doing a substring match. Your FINDSTR will also eliminate subsequent lines that contain your current ID anywhere within them. I have no idea what your IDs look like. But if your current ID is 123, then all of the following IDs will be stripped erroneously: 31236, 12365, etc. That is a potentially devastating problem. I say it is potential because the FOR loop has already buffered the queue, so it doesn't care, unless you abort the loop because new work has been appended to the late.txt file. Then you actually will skip those missing IDs! This could be fixed by adding the /X option to FINDSTR, which only matches whole lines. At least then you will only be skipping true duplicates.
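
    A minimal sketch of the difference, assuming a hypothetical sample.txt with made-up IDs. The /L and /C: options are added here so the ID is treated as one literal string rather than a regular expression:

    ```batch
    @echo off
    rem Hypothetical sample data: one ID merely contains the text 123.
    > sample.txt (
      echo 123
      echo 31236
      echo 456
    )

    echo Substring match, as in the original script:
    rem Drops every line that contains 123 anywhere, so 31236 is lost too.
    findstr /v "123" sample.txt

    echo Whole-line literal match:
    rem Drops only the line that is exactly 123.
    findstr /v /x /l /c:"123" sample.txt

    del sample.txt
    ```

    The first FINDSTR should list only 456, while the second should list both 31236 and 456.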

    Now the major problems, all stemming from the fact that only one process can have a file open for any kind of write (or delete) operation.

    3) Even though a FOR /F loop does not write to the file, it is designed to fail if the file is actively being written to by another process. So if your FOR loop attempts to read the queue while another process is appending to it, your queue processing script will fail. You have the busy.txt file check, but your queue writer might have already started writing before the busy.txt file has been created. The write operation might take a while, especially if many lines are being appended. While the lines are being written your queue processor could start and then you have your collision and failure.

    4) Your queue processor appends late.txt to your queue and then deletes late.txt. But there is a point in time between the append and the delete where a queue writer could append an additional line to late.txt. This late-arriving line will be deleted without having been processed!

    5) Another possibility is a writer may attempt to write to the late.txt while it is in the process of being deleted by the queue processor. The write will fail, and again your queue will be missing work.

    6) Yet another possibility is your queue processor may attempt to delete late.txt while a queue writer is appending to it. The delete will fail, and you will end up with duplicates in your queue the next time the queue processor appends late.txt to queue.txt.

    In summary, concurrency issues can lead both to missing work in your queue, as well as duplicate work in your queue. Whenever you have multiple processes making changes to a file simultaneously, you MUST establish some kind of locking mechanism to serialize the events.

    You are already using a SQL Server database. The most logical thing to do is to move your queue out of the file system and into the database. Relational databases are built from the ground up to deal with concurrency.

    That being said, it is not too difficult to use a file as a queue within Windows batch as long as you employ a locking strategy. You must make sure both your queue processor and your queue writers follow the same locking strategy.

    Below is a file based solution. I'm going to assume you only have one queue processor, and possibly multiple queue writers. With additional work you can adapt the file queue solution to support multiple queue processors. But multiple queue processors is probably easier to implement using the folder based queue that I described at the end of my first answer.

    Instead of having the queue writers write to either queue.txt or late.txt, it is easier to have the queue processor rename the existing queue and process it to completion, while the queue writers always write to queue.txt.

    This solution writes the current status to a status.txt file. You can monitor your queue processor status by issuing TYPE STATUS.TXT from a command window.

    I do some delayed expansion toggling to protect against corruption due to ! in your data. If you know that ! will never appear, then you can simply move the SETLOCAL EnableDelayedExpansion to the top and forgo the toggling.
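
    A minimal sketch of that toggling, assuming a hypothetical sample.txt whose single line contains an exclamation mark. The loop variable is copied while delayed expansion is still off, and the counter is incremented outside the SETLOCAL/ENDLOCAL pair so that ENDLOCAL does not discard the new value:

    ```batch
    @echo off
    setlocal disableDelayedExpansion

    rem Hypothetical one-line data file containing an exclamation mark.
    > sample.txt echo Laptop!42

    set "n=0"
    for /f "delims=" %%a in (sample.txt) do (
      rem Copy the loop variable while delayed expansion is off.
      set "line=%%a"
      rem SET /A reads its variables at run time, so no delayed expansion is needed.
      set /a "n+=1"
      setlocal enableDelayedExpansion
      echo !n!: !line!
      endlocal
    )

    del sample.txt
    ```

    This should print the line with its exclamation mark intact. If delayed expansion were enabled for the whole script, the ! in the data would be expected to be stripped when %%a is expanded.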

    One other optimisation - it is faster to redirect output just once for a group of statements instead of opening and closing the file for each statement.

    This code is totally untested, so there could easily be some silly bugs. But the concepts are sound. Hopefully you get the idea.

    queueProcessor.bat

    @echo off
    setlocal disableDelayedExpansion
    cd "%UserProfile%\Desktop\Scripting\"
    
    :rerun
    
    ::Safely get a copy of the current queue, exit if none or error
    call :getQueue || exit /b
    
    ::Get the number of lines in the queue to be used in status updates
    for /f %%n in ('find /c /v "" ^<inProcess.txt') do set /a "record=0, recordCount=%%n"
    
    ::Main processing loop
    for /f "delims=" %%a in (inProcess.txt) do (
    
      rem :: Update the status. Increment the counter before SETLOCAL so the new
      rem :: value survives ENDLOCAL. Delayed expansion is needed to read the
      rem :: current record number, and is toggled in case your data contains !
      set /a "record+=1"
      setlocal enableDelayedExpansion
      > status.txt echo processing !record! out of %recordCount%
      endlocal
    
      rem :: Create SQL command
      > reset.sql (
        echo USE dbname
        echo EXEC dbo.sp_ResetSubscription @ClientName = '%%a'
        echo EXEC dbo.sp_RunClientSnapshot @ClientName = '%%a'
      )
    
      rem :: Log this action and execute the SQL command
      >> log.txt (
        echo #################### %date% - %time% ####################################################
        echo Reinitialising '%%a'
        sqlcmd -i "reset.sql"
        echo.
        echo ####################################################################################################
        echo.
      )
    )
    
    ::Clean up
    del inProcess.txt
    del status.txt
    
    ::Look for more work
    goto :rerun
    
    :getQueue
    2>nul (
      >queue.lock (
        if not exist queue.txt exit /b 1
        if exist inProcess.txt (
          echo ERROR: Only one queue processor allowed at a time
          exit /b 2
        )
        rename queue.txt inProcess.txt
      )
    )||goto :getQueue
    exit /b 0
    

    queueWriter.bat

    ::Whatever your code is
    ::At some point you want to append a VALUE to the queue in a safe way
    call :appendQueue VALUE
    ::continue on until done
    exit /b
    
    :appendQueue
    2>nul (
      >queue.lock (
        >>queue.txt echo %*
      )
    )||goto :appendQueue
    

    Explanation of the lock code:

    :retry
    ::First redirect any error messages that occur within the outer block to nul
    2>nul (
    
      rem ::Next redirect all stdout within the inner block to queue.lock
      rem ::No output will actually go there. But the file will be created
      rem ::and this process will have a lock on the file until the inner
      rem ::block completes. Any other process that tries to write to this
      rem ::file will fail. If a different process already has queue.lock 
      rem ::locked, then this process will fail to get the lock and the inner
      rem ::block will not execute. Any error message will go to nul.
      >queue.lock (
    
        rem ::you can now safely manipulate your queue because you have an
        rem ::exclusive lock.
        >>queue.txt echo data 
    
        rem ::If some command within the inner block can fail, then you must
        rem ::clear the error at the end of the inner block. Otherwise this
        rem ::routine can get stuck in an endless loop. You might want to 
        rem ::add this to my code - it clears any error.
        verify >nul
    
      ) && (
    
        rem ::I've never done this before, but if the inner block succeeded,
        rem ::then I think you can attempt to delete queue.lock at this point.
        rem ::If the del succeeds then you know that no process has a lock
        rem ::at this point. This could be useful if you are trying to monitor
        rem ::the processes. If the del fails then that means some other process
        rem ::has already grabbed the lock. You need to clear the error at
        rem ::this point to prevent the endless loop
        del queue.lock || verify >nul
    
      )
    
    ) || goto :retry
    :: If the inner block failed to get the lock, then the conditional GOTO
    :: activates and it loops back to try again. It continues to loop until
    :: the lock succeeds. Note - the :retry label must be above the outer-
    :: most block.
    

    If you have a unique process ID, you can write it to queue.lock within the inner block. Then you can type queue.lock from another window to find out which process currently has (or most recently had) the lock. That should only be an issue if some process hangs.
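
    Batch has no built-in process ID, so the sketch below assumes that the computer name, script name and time are a good enough identifier. That choice is only an illustration; substitute whatever uniquely identifies your writers:

    ```batch
    @echo off
    rem Hypothetical writer: append one value to the queue, labelling the lock.
    call :appendQueue VALUE
    exit /b

    :appendQueue
    2>nul (
      >queue.lock (
        rem Stdout of this block goes to queue.lock, so this line labels the holder.
        echo %computername% %~nx0 %time%
        >>queue.txt echo %*
      )
    )||goto :appendQueue
    exit /b 0
    ```

    Issuing TYPE QUEUE.LOCK from another window should then show which writer currently has, or most recently had, the lock.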

  • 2021-01-25 05:56

    You are absolutely correct - A FOR /F loop waits for the command in the IN() clause to finish and buffers the result prior to processing the 1st line. The same is true if you read from a file within the IN() clause instead of executing a command.

    Your proposed strategy of counting the number of lines in the queue prior to the FOR loop, and then recounting after the FOR loop has completed could just about work if you stop mucking with the queue contents within the FOR loop. If the final count is greater than the original you could GOTO a :label before the FOR loop and skip the original line count in the FOR loop so you only process the appended lines. But you would still have a concurrency issue if a process writes to the queue while you are getting the line count or if it appends to the queue after you get the final count but before you delete the queue.

    There are ways to serialize events within batch when dealing with multiple processes. The key to doing this is to take advantage of the fact that only one process can have a file open for write access.

    Code like the following can be used to establish an exclusive "lock". As long as every process uses the same logic, you can guarantee you have exclusive control over one or more file system objects until you release the lock by exiting the block of code.

    :getLock
    2>nul (
      >lockName.lock (
        rem ::You now have an exclusive lock while you remain in this block of code
        rem ::You can safely count the number of lines in a queue file,
        rem ::or append lines to the queue file at this time.
      )
    )||goto :getLock
    

    I demonstrated how this could work at Re: parallel process with batch. After pressing the link, scroll up to see the original question. It seems like a very similar problem to yours.

    You might want to consider using a folder as a queue instead of a file. Each unit of work can be its own file within the folder. You can use a lock to safely increment a sequence number in a file to be used in naming each unit of work. You can guarantee the unit of work has been completely written by preparing it in a "preparation" folder and only moving it to the "queue" folder after it is complete. The advantage of this strategy is that each unit of work file can be moved to an "inProcess" folder while the processing is happening, and then it can be deleted or moved to an archive folder when finished. If the processing fails, you can recover because the file still exists in the "inProcess" folder. You are in a position to know which units of work are unstable (the dead ones in the "inProcess" folder), as well as which units of work have yet to be processed at all (those still in the "queue" folder).
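
    A rough sketch of a single processor for such a folder queue. The folder names queue, inProcess and archive and the :processUnit routine are placeholders invented for illustration, and the writers are assumed to move finished files into the queue folder as described above:

    ```batch
    @echo off
    setlocal disableDelayedExpansion

    rem Hypothetical folder names; create them if they are missing.
    for %%d in (queue inProcess archive) do if not exist "%%d\" md "%%d"

    rem Take each unit of work in name order.
    for /f "delims=" %%f in ('dir /b /a-d /on "queue\*" 2^>nul') do (
      rem If the move fails, leave the file in the queue for a later pass.
      move "queue\%%f" "inProcess\" >nul 2>nul && (
        rem Archive only on success, so failed units stay in inProcess.
        call :processUnit "inProcess\%%f" && move "inProcess\%%f" "archive\" >nul
      )
    )
    exit /b

    :processUnit
    rem Placeholder for the real work on the file passed as the first argument.
    echo Processing %~nx1
    exit /b 0
    ```

    Any file left behind in inProcess after a run marks a unit of work whose processing did not complete.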

  • 2021-01-25 06:09

    You put in your question "if another line is added to the bottom of the file..."; however, your code does not add a line, but completely replaces the entire file contents (although the new contents just have one new line added):

    FOR /f "delims=" %%a in ('type queue.txt') DO (
       IF NOT EXIST reset.sql (
    
       . . .
    
       type queue.txt | findstr /v %%a> new.txt
       rem Next line REPLACES the entire queue.txt file!
       type new.txt> queue.txt
       echo New list of laptops waiting:>> log.txt
    
       . . .
    
       if exist reset.sql del /f /q reset.sql
    
       ) 
    )
    

    You may change the method used to process the queue.txt file by redirecting it into a subroutine that reads its lines via the SET /P command in a loop assembled with GOTO. This way, lines that are added to the bottom of queue.txt inside the read loop will be read as soon as the read process reaches them.

    call :ProcessQueue < queue.txt >> queue.txt
    goto :EOF
    
    
    :ProcessQueue
       set line=
       rem Next command reads a line from queue.txt file:
       set /P line=
       if not defined line goto endProcessQueue
       rem In following code use %line% instead of %%a
       IF NOT EXIST reset.sql (
    
       . . .
    
       type queue.txt | findstr /v "%line%"> new.txt
       rem Next command ADDS new lines to queue.txt file:
       type new.txt
       echo New list of laptops waiting:>> log.txt
    
       . . .
    
       if exist reset.sql del /f /q reset.sql
    
       ) 
    goto ProcessQueue
    :endProcessQueue
    exit /B
    

    Of course, if the new lines are added by other processes the new lines will be read and processed by this Batch file automatically.

    You must be aware that this method ends at the first empty line in the queue.txt file; it also has some restrictions on the characters that it can process.
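
    A small sketch of the empty-line limitation, assuming a hypothetical sample.txt with a blank second line. The :readAll routine is a stripped-down stand-in for :ProcessQueue:

    ```batch
    @echo off
    setlocal

    rem Hypothetical sample queue whose second line is blank.
    > sample.txt (
      echo first
      echo.
      echo third
    )

    call :readAll < sample.txt
    del sample.txt
    exit /b

    :readAll
    set "line="
    rem SET /P leaves the variable undefined when it reads an empty line.
    set /P "line="
    if not defined line exit /b
    echo Read: %line%
    goto readAll
    ```

    Only the first line should be reported; the blank line ends the loop, so the third line is never reached.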

    EDIT: This is a simple example that shows how this method works:

    set i=0
    call :ProcessQueue < queue.txt >> queue.txt
    goto :EOF
    
    :ProcessQueue
       set line=
       set /P line=
       if not defined line goto endProcessQueue
       echo Line processed: %line% > CON
       set /A i=i+1
       if %i% == 1 echo First line added to queue.txt
       if %i% == 2 echo Second line added to queue.txt
    goto ProcessQueue
    :endProcessQueue
    exit /B
    

    This is queue.txt file at input:

    Original first line
    Original second line
    Original third line
    Original fourth line
    

    This is the result:

    Line processed: Original first line
    Line processed: Original second line
    Line processed: Original third line
    Line processed: Original fourth line
    Line processed: First line added to queue.txt
    Line processed: Second line added to queue.txt
    