findstr sometimes hangs unexpectedly when searching large files.
I haven't confirmed the exact conditions or boundary sizes. I suspect any file larger than 2GB may be at risk.
I have had mixed experiences with this, so file size is not the only factor. This looks like it may be a variation on "FINDSTR hangs on XP and Windows 7 if redirected input does not end with LF", but as demonstrated below, this particular problem manifests even when input is not redirected.
The following command line session (Windows 7) demonstrates how findstr can hang when searching a 3GB file.
C:\Data\Temp\2014-04>echo 1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890> T100B.txt
C:\Data\Temp\2014-04>for /L %i in (1,1,10) do @type T100B.txt >> T1KB.txt
C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1KB.txt >> T1MB.txt
C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1MB.txt >> T1GB.txt
C:\Data\Temp\2014-04>echo find this line>> T1GB.txt
C:\Data\Temp\2014-04>copy T1GB.txt + T1GB.txt + T1GB.txt T3GB.txt
T1GB.txt
T1GB.txt
T1GB.txt
1 file(s) copied.
C:\Data\Temp\2014-04>dir
Volume in drive C has no label.
Volume Serial Number is D2B2-FFDF
Directory of C:\Data\Temp\2014-04
2014/04/08 04:28 PM <DIR> .
2014/04/08 04:28 PM <DIR> ..
2014/04/08 04:22 PM 102 T100B.txt
2014/04/08 04:28 PM 1 020 000 016 T1GB.txt
2014/04/08 04:23 PM 1 020 T1KB.txt
2014/04/08 04:23 PM 1 020 000 T1MB.txt
2014/04/08 04:29 PM 3 060 000 049 T3GB.txt
5 File(s) 4 081 021 187 bytes
2 Dir(s) 51 881 050 112 bytes free
C:\Data\Temp\2014-04>rem Findstr on the 1GB file does not hang
C:\Data\Temp\2014-04>findstr "this" T1GB.txt
find this line
C:\Data\Temp\2014-04>rem On the 3GB file, findstr hangs and must be aborted... even though it clearly reaches end of file
C:\Data\Temp\2014-04>findstr "this" T3GB.txt
find this line
find this line
find this line
^C
C:\Data\Temp\2014-04>
Note, I've verified in a hex editor that all lines are terminated with CRLF. The only anomaly is that the file is terminated with 0x1A due to the way copy works. Note however, that this anomaly doesn't cause a problem on "small" files.
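For anyone without a hex editor at hand, the same check can be sketched in a few lines of Python. This is only an illustration, not part of the original test: the helper names `tail_bytes` and `ends_with_line_feed` are made up here, and the script simply reads the last few bytes of each file named on the command line.

```python
import os
import sys


def tail_bytes(path, count=4):
    """Return the last `count` bytes of the file (fewer if the file is shorter)."""
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        handle.seek(max(0, size - count))
        return handle.read()


def ends_with_line_feed(path):
    """True when the final byte is LF (0x0A); a CRLF ending also satisfies this."""
    return tail_bytes(path, 1) == b"\n"


if __name__ == "__main__":
    # Hypothetical usage: python tailcheck.py T1GB.txt T3GB.txt
    for name in sys.argv[1:]:
        print(name, tail_bytes(name).hex(), ends_with_line_feed(name))
```

A file produced by plain copy concatenation, as described above, would be expected to show 1a as its final byte and report False for the line feed check.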
With additional testing I have confirmed the following:
- Using copy with the /b option for binary files prevents the addition of the 0x1A character, and findstr doesn't hang on the 3GB file.
- Terminating the 3GB file with a different character also causes findstr to hang.
- The 0x1A character doesn't cause any problems on a "small" file. (Similarly for other terminating characters.)
- Adding CRLF after 0x1A resolves the problem. (LF by itself would probably suffice.)
- Using type to pipe the file into findstr works without hanging. (This might be due to a side effect of either type or | that inserts an additional End Of Line.)
- Using redirected input with < also causes findstr to hang. But this is expected; as explained in dbenham's post: "redirected input must end in LF".
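Two of the observations above (binary concatenation avoids the 0x1A, and a line terminator after the last byte avoids the hang) can be sketched as small Python helpers. This is a rough sketch under assumptions: `concat_binary` and `ensure_trailing_crlf` are hypothetical names, and the code only mirrors the file-level changes described; it has not been checked against findstr itself.

```python
import os
import shutil


def concat_binary(sources, destination):
    """Concatenate files byte for byte, adding no end-of-file marker."""
    with open(destination, "wb") as out:
        for source in sources:
            with open(source, "rb") as part:
                shutil.copyfileobj(part, out)


def ensure_trailing_crlf(path):
    """Append CRLF unless the file already ends with LF; return True if changed."""
    if os.path.getsize(path) == 0:
        return False  # empty file: nothing to terminate
    with open(path, "r+b") as handle:
        handle.seek(-1, os.SEEK_END)
        if handle.read(1) == b"\n":
            return False
        handle.seek(0, os.SEEK_END)
        handle.write(b"\r\n")
        return True
```

For example, `concat_binary(["T1GB.txt"] * 3, "T3GB.txt")` would build the 3GB file without the trailing 0x1A, while `ensure_trailing_crlf("T3GB.txt")` would patch a file that already has it.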