@Myles Gray - Your solution has some problems.
First the minor problems:
1) After each iteration of the queue loop, you recreate the queue as the original queue minus the line you are currently working on (you hope! more on that later). After you recreate the queue you append it to your log. That will work, but it seems very inefficient and has the potential of making the log massive and unwieldy. Suppose you have a queue with 10,000 lines. Each rewrite has one line fewer than the last, so the queue rewrites alone total 9,999 + 9,998 + ... + 1 + 0 = 49,995,000 lines, and every rewrite is also appended to the log. By the time you have processed your queue you will have written 99,990,000 queue lines, including 49,995,000 queue lines to your log! That will take a long time to process, even without actually doing your work.
2) You recreate the queue by using FINDSTR, preserving all lines that don't match your current ID. But this will also strip out subsequent lines if they happen to match your current ID. That might not be a problem. But you are doing a substring match, so your FINDSTR will also eliminate subsequent lines that contain your current ID anywhere within them. I have no idea what your IDs look like. But if your current ID is 123, then all of the following IDs will be stripped erroneously - 31236, 12365, etc. That is a potentially devastating problem. I say potentially because the FOR loop has already buffered the queue, so it doesn't care - unless you abort the loop because new work has been appended to the late.txt file - then you actually will skip those missing IDs! This could be fixed by adding the /X option to FINDSTR. At least then you will only be skipping true duplicates.
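For illustration, a minimal sketch of the /X fix, assuming the current ID is held in a FOR variable %%a (your variable name may differ):

rem :: /X forces whole-line matches, so ID 123 no longer strips 31236 or 12365
rem :: /C: treats the ID as a single literal search string
findstr /v /x /c:"%%a" queue.txt > queue.tmp
move /y queue.tmp queue.txt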
Now the major problems - all stemming from the fact that only one process at a time can have a file open for any kind of write (or delete) operation.
3) Even though a FOR /F loop does not write to the file, it is designed to fail if the file is actively being written to by another process. So if your FOR loop attempts to read the queue while another process is appending to it, your queue processing script will fail. You have the busy.txt file check, but your queue writer might have already started writing before the busy.txt file has been created. The write operation might take a while, especially if many lines are being appended. While the lines are being written, your queue processor could start, and then you have your collision and failure.
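You can demonstrate this from two command windows (just a sketch - the 20 second ping stands in for a slow queue writer; inside a batch file the %a would be %%a):

rem :: Window 1 - holds queue.txt open for write for roughly 20 seconds
>queue.txt ping -n 21 localhost

rem :: Window 2 - both commands fail while window 1 holds the file
for /f %a in (queue.txt) do @echo %a
del queue.txt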
4) Your queue processor appends late.txt to your queue and then deletes late.txt. But there is a point in time between the append and the delete where a queue writer could append an additional line to late.txt. That late-arriving line will be deleted without ever having been processed!
5) Another possibility is a writer may attempt to write to late.txt while it is in the process of being deleted by the queue processor. The write will fail, and again your queue will be missing work.
6) Yet another possibility is your queue processor may attempt to delete late.txt while a queue writer is appending to it. The delete will fail, and you will end up with duplicates in your queue the next time the queue processor appends late.txt to queue.txt.
In summary, concurrency issues can lead both to missing work and to duplicate work in your queue. Whenever you have multiple processes making changes to a file simultaneously, you MUST establish some kind of locking mechanism to serialize the events.
You are already using a SQL Server database. The most logical thing to do is to move your queue out of the file system and into the database. Relational databases are built from the ground up to deal with concurrency.
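As a rough sketch of the idea (the dbo.ClientQueue table and its ClientName column are hypothetical - I am only illustrating the pattern): the READPAST hint lets a session skip rows that another session has locked, so the database serialises the queue for you.

rem :: Pop one item from a hypothetical dbo.ClientQueue table
>dequeue.sql (
  echo USE dbname
  echo DELETE TOP ^(1^)
  echo FROM dbo.ClientQueue WITH ^(ROWLOCK, READPAST^)
  echo OUTPUT DELETED.ClientName
)
sqlcmd -i "dequeue.sql"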
That being said, it is not too difficult to use a file as a queue within Windows batch as long as you employ a locking strategy. You must make sure both your queue processor and your queue writers follow the same locking strategy.
Below is a file based solution. I'm going to assume you only have one queue processor, and possibly multiple queue writers. With additional work you can adapt the file queue solution to support multiple queue processors. But multiple queue processors are probably easier to implement using the folder based queue that I described at the end of my first answer.
Instead of having the queue writers write to either queue.txt or late.txt, it is easier to have the queue processor rename the existing queue and process it to completion, while the queue writers always write to queue.txt.
This solution writes the current status to a status.txt file. You can monitor your queue processor status by issuing TYPE STATUS.TXT from a command window.
I do some delayed expansion toggling to protect against corruption due to ! in your data. If you know that ! will never appear, then you can simply move the SETLOCAL EnableDelayedExpansion to the top and forgo the toggling.
One other optimisation - it is faster to redirect output just once for a group of statements instead of opening and closing the file for each statement.
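For example:

rem :: Opens and closes log.txt three times
>>log.txt echo line one
>>log.txt echo line two
>>log.txt echo line three

rem :: Opens log.txt once for the whole group
>>log.txt (
  echo line one
  echo line two
  echo line three
)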
This code is totally untested, so there could easily be some silly bugs. But the concepts are sound. Hopefully you get the idea.
queueProcessor.bat
@echo off
setlocal disableDelayedExpansion
cd "%UserProfile%\Desktop\Scripting\"
:rerun
::Safely get a copy of the current queue, exit if none or error
call :getQueue || exit /b
::Get the number of lines in the queue to be used in status updates
for /f %%n in ('find /c /v "" ^<inProcess.txt') do set recordCount=%%n

::Process the queue
set record=0
for /f "usebackq delims=" %%a in ("inProcess.txt") do (

  rem :: Update the status
  set /a "record+=1"
  setlocal enableDelayedExpansion
  >status.txt echo processing !record! out of %recordCount%
  endlocal

  rem :: Create SQL command
  >reset.sql (
    echo USE dbname
    echo EXEC dbo.sp_ResetSubscription @ClientName = '%%a'
    echo EXEC dbo.sp_RunClientSnapshot @ClientName = '%%a'
  )

  rem :: Log this action and execute the SQL command
  >>log.txt (
    echo #################### %date% - %time% ####################################################
    echo Reinitialising '%%a'
    sqlcmd -i "reset.sql"
    echo.
    echo ####################################################################################################
    echo.
  )
)
::Clean up
del inProcess.txt
del status.txt
::Look for more work
goto :rerun
:getQueue
2>nul (
  >queue.lock (
    if not exist queue.txt exit /b 1
    if exist inProcess.txt (
      echo ERROR: Only one queue processor allowed at a time
      exit /b 2
    )
    rename queue.txt inProcess.txt
  )
)||goto :getQueue
exit /b 0
queueWriter.bat
::Whatever your code is
::At some point you want to append a VALUE to the queue in a safe way
call :appendQueue VALUE
::continue on until done
exit /b
:appendQueue
2>nul (
  >queue.lock (
    >>queue.txt echo %*
  )
)||goto :appendQueue
Explanation of the lock code:
:retry
::First redirect any error messages that occur within the outer block to nul
2>nul (
  rem ::Next redirect all stdout within the inner block to queue.lock.
  rem ::No output will actually go there. But the file will be created
  rem ::and this process will have a lock on the file until the inner
  rem ::block completes. Any other process that tries to write to this
  rem ::file will fail. If a different process already has queue.lock
  rem ::locked, then this process will fail to get the lock and the inner
  rem ::block will not execute. Any error message will go to nul.
  >queue.lock (
    rem ::You can now safely manipulate your queue because you have an
    rem ::exclusive lock.
    >>queue.txt echo data
    rem ::If some command within the inner block can fail, then you must
    rem ::clear the error at the end of the inner block. Otherwise this
    rem ::routine can get stuck in an endless loop. You might want to
    rem ::add this to my code - it clears any error.
    verify >nul
  ) && (
    rem ::I've never done this before, but if the inner block succeeded,
    rem ::then I think you can attempt to delete queue.lock at this point.
    rem ::If the del succeeds then you know that no process has a lock
    rem ::at this point. This could be useful if you are trying to monitor
    rem ::the processes. If the del fails then that means some other process
    rem ::has already grabbed the lock. You need to clear the error at
    rem ::this point to prevent the endless loop.
    del queue.lock || verify >nul
  )
) || goto :retry
:: If the inner block failed to get the lock, then the conditional GOTO
:: activates and it loops back to try again. It continues to loop until
:: the lock succeeds. Note - the :retry label must be above the outer-
:: most block.
If you have a unique process ID, you can write it to queue.lock within the inner block. Then you can type queue.lock from another window to find out which process currently has (or most recently had) the lock. That should only be an issue if some process hangs.
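A sketch of that idea, with a process ID improvised from the start time plus %RANDOM% since cmd has no built-in PID variable (the ID is merely distinctive, not guaranteed unique):

set "procID=%time::=%_%random%"
:retry
2>nul (
  >queue.lock (
    rem :: stdout inside this block goes to queue.lock, so this
    rem :: records which process holds (or last held) the lock
    echo %procID%
    >>queue.txt echo data
  )
) || goto :retry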