Comparing two directories using Perl

太阳男子 2021-01-16 05:40

I am new to Perl, so excuse my noobness.

Here's what I intend to do.

$ perl dirComp.pl dir1 dir2

dir1 & dir2 are directory nam

3 answers
  •  孤城傲影
    2021-01-16 05:58

    You might want to try the ol' File::Find. It's not my favorite module (it is just funky in the way it works), but for your purposes it lets you easily find all files in two directories and compare them. Here's a brief example:

    use strict;
    use warnings;
    use feature qw(say);
    use Digest::MD5::File qw(file_md5_hex);
    
    use File::Find;
    
    use constant {
        DIR_1 => "/usr/foo",
        DIR_2 => "/usr/bar",
    };
    
    my %dir_1;
    my %dir_2;
    
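    # Record every entry under DIR_1: an MD5 checksum for plain files,
    # a marker string for everything else.
    find ( sub {
            if ( -f $File::Find::name ) {
                $dir_1{$File::Find::name} = file_md5_hex($File::Find::name);
            }
            else {
                $dir_1{$File::Find::name} = "DIRECTORY!";
            }
        }, DIR_1);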
    
    # Same again for DIR_2.
    find ( sub {
            if ( -f $File::Find::name ) {
                $dir_2{$File::Find::name} = file_md5_hex($File::Find::name);
            }
            else {
                $dir_2{$File::Find::name} = "DIRECTORY!";
            }
        }, DIR_2);
    

    This will create two hashes keyed by the file names in each directory. I used Digest::MD5::File to create an MD5 checksum. If the checksums of the two files differ, I know the files differ (although I don't know where).

    Now you have to do three things:

    1. Go through %dir_1 and see if there's an equivalent key in %dir_2. If there is not an equivalent key, you know that a file exists in %dir_1 and not in %dir_2.
    2. If there is an equivalent key in each hash, check whether the MD5 checksums agree. If they do, the files match. If they don't, the files differ. You can't say where they differ, but they differ.
    3. Finally, go through %dir_2 and check whether there's an equivalent key in %dir_1. If there is, do nothing. If there isn't, that means there's a file in %dir_2 that's not in %dir_1.
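
    The three steps above can be sketched as a single function. This is only a minimal sketch, assuming both hashes are already keyed by the same relative paths; compare_listings and the sample data are hypothetical names used for illustration, not part of the code above.

```perl
use strict;
use warnings;
use feature qw(say);

# Compare two hashes of (relative path => checksum).
# Returns three array refs: paths only in the first hash, paths only in
# the second, and paths present in both whose values differ.
sub compare_listings {
    my ($first, $second) = @_;
    my (@only_first, @only_second, @differ);

    # Steps 1 and 2: walk the first hash.
    for my $path (sort keys %$first) {
        if (not exists $second->{$path}) {
            push @only_first, $path;
        }
        elsif ($first->{$path} ne $second->{$path}) {
            push @differ, $path;
        }
    }

    # Step 3: anything in the second hash that the first one lacks.
    for my $path (sort keys %$second) {
        push @only_second, $path unless exists $first->{$path};
    }

    return (\@only_first, \@only_second, \@differ);
}

# Hypothetical sample data; real values would be MD5 hex digests.
my %dir_1 = ('a.txt' => 'aaa', 'b.txt' => 'bbb', 'c.txt' => 'ccc');
my %dir_2 = ('a.txt' => 'aaa', 'b.txt' => 'xxx', 'd.txt' => 'ddd');

my ($only_1, $only_2, $differ) = compare_listings(\%dir_1, \%dir_2);
say "Only in dir 1: @$only_1";
say "Only in dir 2: @$only_2";
say "Differ:        @$differ";
```

    With the sample data, c.txt is reported as only in the first directory, d.txt as only in the second, and b.txt as differing.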

    Just a word of warning: the keys in these two hashes won't match. You'll have to transform one to the other when doing your compare. For example, you'll have two files such as:

    /usr/bar/my/file/is/here.txt
    /usr/foo/my/file/is/here.txt
    

    As you can see, my/file/is/here.txt exists in both directories, but in my code the two hashes will have two different keys. You could either fix the two subroutines to strip the directory name off the front of the file paths, or transform one key to the other when you do your comparison. I didn't want to run through a full test (the bit of code I wrote works in my testing), so I'm not 100% sure what you'll have to do to make sure you find the matching keys.
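
    One way to strip the directory name is to store each path relative to the directory being scanned. The sketch below assumes File::Spec->abs2rel is acceptable for that, and it swaps in the core Digest::MD5 module in place of Digest::MD5::File so that it runs without extra installs. scan_dir is a hypothetical helper name. It also sets no_chdir so that a relative directory argument such as dir1 still works with the -f test.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use Digest::MD5;

# Walk $root and return a hash ref of
# (path relative to $root => MD5 hex digest). Only plain files are recorded.
sub scan_dir {
    my ($root) = @_;
    my %seen;

    find({
        no_chdir => 1,    # stay in the current directory while walking
        wanted   => sub {
            my $full = $File::Find::name;
            return unless -f $full;

            # Key by the path below $root so both listings line up.
            my $rel = File::Spec->abs2rel($full, $root);

            open my $fh, '<', $full or die "Cannot open $full: $!";
            binmode $fh;
            $seen{$rel} = Digest::MD5->new->addfile($fh)->hexdigest;
            close $fh;
        },
    }, $root);

    return \%seen;
}

# Usage: perl scan.pl some_directory
if (@ARGV) {
    my $listing = scan_dir($ARGV[0]);
    print "$_  $listing->{$_}\n" for sort keys %$listing;
}
```

    Scanning /usr/foo and /usr/bar this way would give both hashes the key my/file/is/here.txt, so they can be compared directly.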

    Oh, another warning: I pick up all entries and not just files. For directories, I can check whether the hash value is equal to DIRECTORY! or not. Alternatively, I could simply ignore everything that's not a file.

    And you might want to check for special cases. Is this a link? Is it a hard link or a soft link? What about some sort of special file? That makes things a bit more complex. However, the basics are here.
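
    Those special cases can be told apart with Perl's file test operators. What follows is a rough sketch, assuming a Unix-like filesystem where symbolic links and link counts behave in the usual way; entry_kind and link_count are hypothetical helper names.

```perl
use strict;
use warnings;

# Classify a path. The -l test comes first because -d and -f follow
# symbolic links, while -l looks at the link itself.
sub entry_kind {
    my ($path) = @_;
    return 'symlink'   if -l $path;
    return 'missing'   unless -e $path;
    return 'directory' if -d $path;
    return 'file'      if -f $path;
    return 'special';    # FIFO, socket, device node and so on
}

# Number of hard links to a path, or 0 if it cannot be examined.
# A plain file with a count above 1 has at least one other name.
sub link_count {
    my ($path) = @_;
    my @st = lstat $path;
    return @st ? $st[3] : 0;
}

# Usage: perl kinds.pl path [path ...]
for my $path (@ARGV) {
    printf "%-10s links=%d  %s\n",
        entry_kind($path), link_count($path), $path;
}
```

    A comparison script could call entry_kind inside the File::Find callback and store the result in place of the DIRECTORY! marker.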
