Comparing two text files in Python

执念已碎 2021-02-05 12:20

I need to compare two files and redirect the differing lines to a third file. I know I can get the difference with the diff command, but is there a way to do it in Python? An

4 Answers
  • 2021-02-05 12:53
    # Compare two text files line by line.

    with open("test1.txt", "r") as test1_file, \
         open("test2.txt", "r") as test2_file, \
         open("test3.txt", "w") as test3_file:
        test1_lines = test1_file.readlines()  # read all lines into a list
        test2_lines = test2_file.readlines()
        # zip iterates over both lists in parallel and stops at the end of the shorter one
        for line_number, (line1, line2) in enumerate(zip(test1_lines, test2_lines), start=1):
            if line1 != line2:
                test3_file.write("Line Number:" + str(line_number) + ' ')
                test3_file.write(line1.rstrip("\n") + ' ' + line2)
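Because zip stops at the end of the shorter input, trailing lines in the longer file are never reported. A minimal sketch of one way around that, assuming itertools.zip_longest is acceptable here; the helper name differing_lines and the sample data are hypothetical and not part of the answer above:

```python
from itertools import zip_longest


def differing_lines(lines1, lines2):
    """Yield (line_number, left, right) for each position where the lines differ.

    When one input is shorter, its missing lines are reported as None.
    """
    pairs = zip_longest(lines1, lines2)  # pads the shorter input with None
    for number, (left, right) in enumerate(pairs, start=1):
        if left != right:
            yield number, left, right


# Hypothetical in-memory data standing in for the lines of two files.
first = ["a\n", "b\n", "c\n"]
second = ["a\n", "x\n", "c\n", "d\n"]
result = list(differing_lines(first, second))
print(result)  # [(2, 'b\n', 'x\n'), (4, None, 'd\n')]
```

The same generator could be fed with the lists returned by readlines() on real files.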
    
  • 2021-02-05 13:03
    import sys

    if len(sys.argv) != 3:
        print("usage: " + sys.argv[0] + " file1 file2")
        sys.exit(1)

    # Sets drop duplicate lines and do not preserve line order.
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        file1 = set(f1)
        file2 = set(f2)
    only_in_file2 = file2.difference(file1)
    only_in_file1 = file1.difference(file2)
    str1 = "file1-contains but file2 not\n"
    str2 = "file2-contains but file1 not\n"
    with open('file3', 'w') as out:
        out.write(str2)
        out.writelines(only_in_file2)
        out.write(str1)
        out.writelines(only_in_file1)
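A set-based comparison loses the original line order, so the output file may list lines in an arbitrary sequence. If order matters, one possible variation is to use the set only for membership tests. This is a sketch under that assumption; the helper name only_in_first and the sample data are hypothetical:

```python
def only_in_first(lines1, lines2):
    """Return the lines of lines1 that never occur in lines2, in original order."""
    seen = set(lines2)  # set membership tests are fast
    return [line for line in lines1 if line not in seen]


# Hypothetical in-memory data standing in for the lines of two files.
file1_lines = ["a\n", "b\n", "c\n", "d\n"]
file2_lines = ["a\n", "ba\n", "bb\n", "c\n", "def\n"]

missing_from_file2 = only_in_first(file1_lines, file2_lines)
missing_from_file1 = only_in_first(file2_lines, file1_lines)
print(missing_from_file2)  # ['b\n', 'd\n']
print(missing_from_file1)  # ['ba\n', 'bb\n', 'def\n']
```

Like the set version, this ignores line positions: a line that merely moved within the file is not reported.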
    
  • 2021-02-05 13:04

    Comparing two text files in python?

    Sure, difflib makes it easy.

    Let's set up a demo:

    f1path = 'file1'
    f2path = 'file2'
    text1 = '\n'.join(['a', 'b', 'c', 'd', ''])
    text2 = '\n'.join(['a', 'ba', 'bb', 'c', 'def', ''])
    for path, text in ((f1path, text1), (f2path, text2)):
        with open(path, 'w') as f:
            f.write(text)
    

    Now to inspect a diff. The lines that use os and time only provide a readable timestamp of each file's last modification. They are completely optional, as are the corresponding arguments to difflib.unified_diff:

    # optional imports:
    import os
    import time
    # necessary import:
    import difflib
    

    Now we just open the files, pass a list of their lines (from f.readlines) to difflib.unified_diff, join the lines it generates with an empty string, and print the result:

    with open(f1path, 'r') as f1:  # the 'U' mode flag was removed in Python 3.11
        with open(f2path, 'r') as f2:
            readable_last_modified_time1 = time.ctime(os.path.getmtime(f1path)) # not required
            readable_last_modified_time2 = time.ctime(os.path.getmtime(f2path)) # not required
            print(''.join(difflib.unified_diff(
              f1.readlines(), f2.readlines(), fromfile=f1path, tofile=f2path, 
              fromfiledate=readable_last_modified_time1, # not required
              tofiledate=readable_last_modified_time2, # not required
              )))
    

    which prints:

    --- file1       Mon Jul 27 08:38:02 2015
    +++ file2       Mon Jul 27 08:38:02 2015
    @@ -1,4 +1,5 @@
     a
    -b
    +ba
    +bb
     c
    -d
    +def
    

    Again, you can remove all the lines marked optional/not required and get the same result without the timestamps.
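For illustration, here is a minimal sketch of the timestamp-free variant, run on in-memory lists instead of files. It also extracts only the added and removed lines, which is an extra step not taken in the answer; the names old, new, diff and changed are hypothetical:

```python
import difflib

# Same content as the demo files, held in memory.
old = ["a\n", "b\n", "c\n", "d\n"]
new = ["a\n", "ba\n", "bb\n", "c\n", "def\n"]

# Without fromfiledate/tofiledate the header lines carry no timestamp.
diff = list(difflib.unified_diff(old, new, fromfile="file1", tofile="file2"))

# Skip the two header lines ("--- file1", "+++ file2"), then keep only
# lines marked as removed ("-") or added ("+"); hunk markers and context
# lines start with "@" or a space.
changed = [line for line in diff[2:] if line.startswith(("+", "-"))]
print("".join(changed), end="")
```

With the demo content this prints -b, +ba, +bb, -d and +def, one per line.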

    redirect the different lines to a third file

    Instead of printing, open a third file and write the diff to it:

            difftext = ''.join(difflib.unified_diff(
              f1.readlines(), f2.readlines(), fromfile=f1path, tofile=f2path, 
              fromfiledate=readable_last_modified_time1, # not required
              tofiledate=readable_last_modified_time2, # not required
              ))
            with open('diffon1and2', 'w') as diff_file:
                diff_file.write(difftext)
    

    and:

    $ cat diffon1and2
    --- file1       Mon Jul 27 11:38:02 2015
    +++ file2       Mon Jul 27 11:38:02 2015
    @@ -1,4 +1,5 @@
     a
    -b
    +ba
    +bb
     c
    -d
    +def
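Putting the pieces together, the whole task might be wrapped in one function. This is only a sketch: the function name write_unified_diff is hypothetical, the timestamps are left out, and the demo uses a temporary directory so it does not touch existing files:

```python
import difflib
import os
import tempfile


def write_unified_diff(path1, path2, out_path):
    """Write a unified diff of two text files to out_path.

    Returns the number of diff lines written (0 if the files are identical).
    """
    with open(path1) as f1, open(path2) as f2:
        lines1, lines2 = f1.readlines(), f2.readlines()
    diff = list(difflib.unified_diff(lines1, lines2, fromfile=path1, tofile=path2))
    with open(out_path, "w") as out:
        out.writelines(diff)
    return len(diff)


# Demo in a throwaway directory, reusing the file contents from the answer.
with tempfile.TemporaryDirectory() as tmp:
    p1 = os.path.join(tmp, "file1")
    p2 = os.path.join(tmp, "file2")
    p3 = os.path.join(tmp, "diffon1and2")
    with open(p1, "w") as f:
        f.write("a\nb\nc\nd\n")
    with open(p2, "w") as f:
        f.write("a\nba\nbb\nc\ndef\n")
    count = write_unified_diff(p1, p2, p3)
    with open(p3) as f:
        written = f.read()

print(count)  # 10: two header lines, one hunk marker, seven content lines
```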
    
  • 2021-02-05 13:13

    Check out difflib:

    This module provides classes and functions for comparing sequences. It can be used for example, for comparing files, and can produce difference information in various formats, including HTML and context and unified diffs[...]

    A command-line example is available at http://docs.python.org/library/difflib.html#difflib-interface
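As a rough illustration of the other output formats the documentation excerpt mentions, the sketch below produces a context diff and an HTML report from the same in-memory lines. The sample data and variable names are hypothetical:

```python
import difflib

# Hypothetical in-memory data standing in for the lines of two files.
old = ["a\n", "b\n", "c\n", "d\n"]
new = ["a\n", "ba\n", "bb\n", "c\n", "def\n"]

# Context diff: same idea as a unified diff, with a different layout.
context = list(difflib.context_diff(old, new, fromfile="file1", tofile="file2"))
print("".join(context), end="")

# HTML report: make_file returns a complete HTML document as a string,
# with a side-by-side table of the two inputs.
html = difflib.HtmlDiff().make_file(old, new, fromdesc="file1", todesc="file2")
print(len(html) > 0)
```

The HTML string can be written to a file and opened in a browser.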
