Compare two files for differences in Python

轻奢々 2021-01-26 14:23

I want to compare two files (take each line from the first file and look it up in the whole second file) to see the differences between them, and write the lines missing from fileA.txt to the end of fileB.tx

2 answers
  • 2021-01-26 14:43

    Read both files and convert each one to a set of lines.
    Find the union of the two sets.
    Sort the union by timestamp.
    Join the sorted lines into one newline-separated string.

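    import datetime

    file1 = "fileA.txt"
    file2 = "fileB.txt"

    # Read each file into a set of lines (newlines stripped, blank lines skipped).
    with open(file1) as f:
        sa = set(line.rstrip('\n') for line in f if line.strip())
    with open(file2) as f:
        sb = set(line.rstrip('\n') for line in f if line.strip())

    # Sort key: the leading "Mon DD HH:MM:SS" timestamp of a log line.
    def timestamp(line):
        return datetime.datetime.strptime(' '.join(line.split()[:3]), '%b %d %H:%M:%S')

    print('\n'.join(sorted(sa.union(sb), key=timestamp)))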

    Example output:

    Oct  9 12:19:16 user sshd[12744]: pam_unix(sshd:session): session opened for user root by (uid=0)
    Oct  9 12:19:16 user sshd[12744]: Accepted password for root from 213.XXX.XXX.XX7 port 60554 ssh2
    Oct  9 13:24:42 user sshd[12744]: pam_unix(sshd:session): session closed for user root
    Oct  9 13:24:42 user sshd[12744]: Received disconnect from 213.XXX.XXX.XX7: 11: disconnected by user
    Oct  9 13:25:31 user sshd[12844]: Accepted password for root from 213.XXX.XXX.XX7 port 33254 ssh2
    Oct  9 13:25:31 user sshd[12844]: pam_unix(sshd:session): session opened for user root by (uid=0)
    Oct  9 13:35:48 user sshd[12868]: Accepted password for root from 213.XXX.XXX.XX7 port 33574 ssh2
    Oct  9 13:35:48 user sshd[12868]: pam_unix(sshd:session): session opened for user root by (uid=0)
    Oct  9 13:46:58 user sshd[12844]: pam_unix(sshd:session): session closed for user root
    Oct  9 13:46:58 user sshd[12844]: Received disconnect from 213.XXX.XXX.XX7: 11: disconnected by user
    Oct  9 15:47:58 user sshd[12868]: pam_unix(sshd:session): session closed for user root
    Oct 11 22:17:31 user sshd[2655]: pam_unix(sshd:session): session opened for user root by (uid=0)
    Oct 11 22:17:31 user sshd[2655]: Accepted password for root from 17X.XXX.XXX.X19 port 5567 ssh2
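
    The question asks for the missing lines to be appended to fileB.txt rather than for a merged, sorted listing. The following is a minimal sketch of that variant and is not part of the answer above. The function name append_missing_lines is hypothetical, and the sketch assumes that both files fit in memory and that lines are compared as exact strings.

```python
from pathlib import Path


def append_missing_lines(source_path, target_path):
    """Append lines found in source_path but not in target_path to target_path.

    Returns the list of appended lines, in the order they occur in the source.
    """
    source_lines = Path(source_path).read_text().splitlines()
    target_text = Path(target_path).read_text()
    seen = set(target_text.splitlines())

    missing = []
    for line in source_lines:
        if line not in seen:
            missing.append(line)
            seen.add(line)  # avoid appending the same line twice

    if missing:
        with open(target_path, "a") as f:
            # Make sure the appended block starts on its own line.
            if target_text and not target_text.endswith("\n"):
                f.write("\n")
            f.write("\n".join(missing) + "\n")
    return missing
```

    Calling append_missing_lines("fileA.txt", "fileB.txt") leaves the existing order of fileB.txt untouched and only adds lines at the end, so the result is not sorted by time.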
    
  • 2021-01-26 14:51

    Try this in bash:

    cat fileA.txt fileB.txt | sort -M | uniq > new_file.txt
    

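    sort -M: sorts by month. An initial string, consisting of any amount of whitespace followed by a month name abbreviation, is folded to upper case and compared in the order 'JAN' < 'FEB' < ... < 'DEC'. Invalid names compare low to valid names. The LC_TIME locale determines the month spellings. Lines with the same month are then ordered by sort's last-resort comparison of the whole line, which depends on the locale's collation order.

    uniq: filters out adjacent repeated lines, which is why the input is sorted first.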

    |: passes the output of one command to another for further processing.

    This takes the two files, sorts their combined lines as described above, keeps one copy of each line, and stores the result in new_file.txt.

    Note: This is not a Python solution, but you tagged the question with linux, so I thought it might interest you. You can also find more detailed info about the commands used here.
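
    For readers who want to stay in Python, the pipeline above can be approximated as follows. This is a rough sketch, not a reimplementation of sort: the names MONTH_RANK, month_sort_key and merge_unique are hypothetical, month names are assumed to be English abbreviations, and ties within a month are broken by plain string comparison, which may differ from sort's locale-dependent ordering.

```python
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
MONTH_RANK = {name: rank for rank, name in enumerate(MONTHS, start=1)}


def month_sort_key(line):
    """Rank a line by its leading month abbreviation, then by the whole line."""
    abbrev = line.lstrip()[:3].upper()
    # Unknown month names get rank 0, so they sort below every valid month.
    return (MONTH_RANK.get(abbrev, 0), line)


def merge_unique(*line_groups):
    """Merge several iterables of lines, drop duplicates, and sort them."""
    unique = set()
    for lines in line_groups:
        unique.update(line.rstrip("\n") for line in lines)
    return sorted(unique, key=month_sort_key)
```

    Passing two open file objects, as in merge_unique(open_a, open_b), returns the merged lines as a list that can be written to new_file.txt.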
