file 1
A
B
C
file 2
B
C
D
file1 + file2 =
A
B
C
D
Is it possible to do this?
The solution below assumes that both input files are sorted in ascending order, using the same ordering as the IF command's comparison operators, and that neither file contains empty lines.
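@echo off
setlocal EnableDelayedExpansion

rem Sentinel meant to sort after every real line: 1024 copies of ÿ
set "lastLine=ÿ"
for /L %%i in (1,1,10) do set "lastLine=!lastLine!!lastLine!"

rem file1.txt is read via redirected stdin with SET /P, file2.txt via FOR /F
< file1.txt (
    for /F "delims=" %%a in (file2.txt) do (
        set "line2=%%a"
        if not defined line1 call :readLine1
        if "!line1!" lss "!line2!" call :advanceLine1
        if "!line1!" equ "!line2!" (
            echo !line1!
            set "line1="
        ) else (
            echo !line2!
        )
    )
    rem file2.txt is exhausted: output whatever remains of file1.txt
    call :flushFile1
)
goto :EOF

rem Read the next line of file1.txt, or the sentinel at end of file
:readLine1
set "line1="
set /P line1=
if not defined line1 set "line1=!lastLine!"
exit /B

rem Output the lines of file1.txt that sort before the current line of file2.txt
:advanceLine1
echo !line1!
call :readLine1
if "!line1!" lss "!line2!" goto advanceLine1
exit /B

rem Output the pending line, if any, and all remaining lines of file1.txt
:flushFile1
if defined line1 if "!line1!" neq "!lastLine!" echo !line1!
set "line1="
set /P line1=
if defined line1 goto flushFile1
exit /B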
If you can afford to use a case-insensitive comparison, and if you know that none of the lines is longer than 511 bytes (127 on XP), then you can use the following:
@echo off
copy file1.txt merge.txt >nul
findstr /lvxig:file1.txt file2.txt >>merge.txt
type merge.txt
For an explanation of the restrictions, see "What are the undocumented features and limitations of the Windows FINDSTR command?"
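For readers unfamiliar with the combined switch string /lvxig:, here is the same script with each FINDSTR option written out separately and commented. It is only a restatement of the commands above, using the same file names, not a different method.

```batch
@echo off
rem Start the merged file with every line of file1.txt
copy file1.txt merge.txt >nul

rem Append the lines of file2.txt that do not occur in file1.txt:
rem   /L       treat the search strings as literals, not regular expressions
rem   /V       print only the lines that do NOT match
rem   /X       a match must cover the whole line
rem   /I       ignore case
rem   /G:file  read the search strings from the given file, one per line
findstr /L /V /X /I /G:file1.txt file2.txt >>merge.txt

type merge.txt
```

Because the lines of file2.txt are appended after those of file1.txt, the output keeps the original order of each file rather than a sorted order.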
The first part (merging two text files) is possible; see the documentation of the copy command. The /b switch makes copy concatenate in binary mode, so that no end-of-file character (0x1A) is appended to the result:
copy /b file1.txt+file2.txt file1and2.txt
For part 2, you can use the sort and uniq utilities from CoreUtils for Windows. These are Windows ports of the Linux utilities.
sort file1and2.txt > filesorted.txt
uniq filesorted.txt fileunique.txt
The limitation is that the original line order is lost.
Update 1
Windows also ships with a native SORT.EXE.
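As a minimal sketch of that step with the native tool, assuming the file names used above (file1and2.txt and filesorted.txt): the full path is spelled out so that the Windows SORT.EXE runs even if a CoreUtils sort appears earlier in PATH.

```batch
@echo off
rem Sort with the native Windows SORT.EXE; /O names the output file
"%SystemRoot%\System32\sort.exe" file1and2.txt /O filesorted.txt
```

The native sort only orders the lines; removing the duplicates still needs a uniq step afterwards.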
Update 2
Here is a very simple UNIQ in CMD script
You can also take the same approach as Unix or PowerShell in pure Batch by writing a simple uniq.bat filter program:
@echo off
setlocal EnableDelayedExpansion
set "prevLine="
for /F "delims=" %%a in ('findstr "^"') do (
    if "%%a" neq "!prevLine!" (
        echo %%a
        set "prevLine=%%a"
    )
)
EDIT: The program below is a Batch-JScript hybrid version of the uniq program, which is more reliable and faster; save it in a file called uniq.bat:
@if (@CodeSection == @Batch) @then
@CScript //nologo //E:JScript "%~F0" & goto :EOF
@end
var line, prevLine = "";
while ( ! WScript.Stdin.AtEndOfStream ) {
    line = WScript.Stdin.ReadLine();
    if ( line != prevLine ) {
        WScript.Stdout.WriteLine(line);
        prevLine = line;
    }
}
With uniq.bat in place, you can use this solution:
(type file1.txt & type file2.txt) | sort | uniq > result.txt
However, the result loses the original line order.
Using PowerShell:
Get-Content file?.txt | Sort-Object | Get-Unique > result.txt
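For cmd.exe:
@echo off
type nul > temp.txt
type nul > result.txt
rem /b concatenates in binary mode, so no end-of-file character is appended
copy /b file1.txt+file2.txt temp.txt >nul
rem Append each line to result.txt only if it is not already there
for /f "delims=" %%I in (temp.txt) do findstr /X /C:"%%I" result.txt >NUL ||(echo;%%I)>>result.txt
del temp.txt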