Avoid detecting incomplete files when watching a directory for changes in Java

庸人自扰 2020-12-31 11:34

I am watching a directory for incoming files (using FileAlterationObserver from Apache Commons IO).

class Example implements FileAlterationListener {
    public         


        
5 answers
  • 2020-12-31 11:44

    Check the file's size two or more times, a few seconds apart. If the size has not changed between checks, you can treat the write as complete and proceed with your own processing.
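
    A minimal sketch of that polling check, using only the JDK. The class name StableSizeCheck, the method waitForStableSize, and its parameters are hypothetical names chosen for illustration; the poll interval and the number of identical reads required are assumptions to tune for your environment.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: polls a file's size until it stops changing.
final class StableSizeCheck {

    /**
     * Returns true once the size has been identical for requiredStableReads
     * consecutive polls, or false if maxPolls is used up first.
     */
    static boolean waitForStableSize(Path file, int requiredStableReads,
                                     long pollMillis, int maxPolls)
            throws IOException, InterruptedException {
        long lastSize = Files.size(file);
        int stableReads = 0;
        for (int i = 0; i < maxPolls; i++) {
            Thread.sleep(pollMillis);
            long size = Files.size(file);
            if (size == lastSize) {
                stableReads++;
                if (stableReads >= requiredStableReads) {
                    return true;
                }
            } else {
                // Size changed: start counting again from the new size.
                stableReads = 0;
                lastSize = size;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Demo on a temp file that nobody else is writing to.
        Path file = Files.createTempFile("incoming", ".dat");
        try {
            Files.write(file, new byte[] {1, 2, 3});
            System.out.println("stable: " + waitForStableSize(file, 2, 50, 10));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

    From an onFileCreate callback you would typically hand the file to a worker thread and run this check there, so the monitor thread is not blocked while polling.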

  • 2020-12-31 11:48

    If you extend FileAlterationListenerAdaptor, you only need to override the callbacks you care about. Register the listener on a FileAlterationObserver and poll it with a FileAlterationMonitor:

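    // Requires Apache Commons IO on the classpath.
    import java.io.File;
    import java.io.IOException;
    import org.apache.commons.io.monitor.FileAlterationListener;
    import org.apache.commons.io.monitor.FileAlterationListenerAdaptor;
    import org.apache.commons.io.monitor.FileAlterationMonitor;
    import org.apache.commons.io.monitor.FileAlterationObserver;

    public static void main( String[] args ) throws Exception {

        // Assumption: the directory to watch is passed as the first argument.
        File dir = new File( args[0] );
        FileAlterationObserver fao = new FileAlterationObserver( dir );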
        final long interval = 500;
        FileAlterationMonitor monitor = new FileAlterationMonitor( interval );
        FileAlterationListener listener = new FileAlterationListenerAdaptor() {
    
            @Override
            public void onFileCreate( File file ) {
                try {
                    System.out.println( "File created: " + file.getCanonicalPath() );
                } catch( IOException e ) {
                    e.printStackTrace( System.err );
                }
            }
    
            @Override
            public void onFileDelete( File file ) {
                try {
                    System.out.println( "File removed: " + file.getCanonicalPath() );
                } catch( IOException e ) {
                    e.printStackTrace( System.err );
                }
            }
    
            @Override
            public void onFileChange( File file ) {
                // Fired on every poll in which the file's size or timestamp differs.
                System.out.println( "File changed: " + file.getName() );
            }
        };
        // Add listeners...
        fao.addListener( listener );
        monitor.addObserver( fao );
        monitor.start();
    }
    
  • 2020-12-31 11:56

    I had a similar problem. At first I thought I could use the FileWatcher service, but it doesn't work on remote volumes, and I had to monitor incoming files on a network-mounted drive.

    Then I thought I could simply monitor the change in file size over a period of time and consider the file done once the file size had stabilized (as fmucar suggested). But I found that in some instances on large files, the hosting system would report the full size of the file it was copying, rather than the number of bytes it had written to disk. This of course made the file appear stable, and my detector would catch the file while it was still in the process of being written.

    I eventually got the monitor to work by relying on the exception thrown when opening a FileInputStream on a file that is still being written. In my setup this reliably detected an in-progress write, even when the file was on a network-mounted drive.

          long oldSize = 0L;
          long newSize = 1L;
          boolean fileIsOpen = true;
    
          while((newSize > oldSize) || fileIsOpen){
              oldSize = this.thread_currentFile.length();
              try {
                Thread.sleep(2000);
              } catch (InterruptedException e) {
                e.printStackTrace();
              }
              newSize = this.thread_currentFile.length();
    
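              // The open is expected to fail while the file is still being written.
              // try-with-resources closes the probe stream so no handle is leaked.
              try (FileInputStream in = new FileInputStream(this.thread_currentFile)) {
                  fileIsOpen = false;
              } catch (Exception e) {
                  // Could not open the file yet: keep polling.
              }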
          }
    
          System.out.println("New file: " + this.thread_currentFile.toString());
    
  • 2020-12-31 12:05

    A generic solution to this problem seems impossible from the "consumer" end. The "producer" may temporarily close the file and then resume appending to it. Or the "producer" may crash, leaving an incomplete file in the file system.

    A reasonable pattern is to have the "producer" write to a temp file that's not monitored by the "consumer". When it's done writing, rename the file to something that's actually monitored by the "consumer", at which point the "consumer" will pick up the complete file.
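
    A minimal sketch of the producer side of that pattern, using only the JDK. The class name AtomicPublish, the method publish, and the separate staging directory are hypothetical choices for illustration. The sketch assumes the staging directory and the watched directory are on the same file system, since Files.move with ATOMIC_MOVE throws AtomicMoveNotSupportedException when the move cannot be done atomically.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical producer helper: write elsewhere, then rename into place.
final class AtomicPublish {

    /**
     * Writes data to a temp file in stagingDir (not monitored), then moves it
     * into watchedDir under its final name in a single atomic rename.
     */
    static Path publish(Path stagingDir, Path watchedDir, String name, byte[] data)
            throws IOException {
        Path temp = Files.createTempFile(stagingDir, name, ".part");
        Files.write(temp, data);
        Path target = watchedDir.resolve(name);
        // The consumer never sees a partially written file under `target`.
        return Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws Exception {
        // Demo layout: both directories live under one temp root.
        Path root = Files.createTempDirectory("publish-demo");
        Path staging = Files.createDirectory(root.resolve("staging"));
        Path watched = Files.createDirectory(root.resolve("watched"));
        Path published = publish(staging, watched, "report.csv",
                "a,b\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("published: " + published);
    }
}
```

    If a separate staging directory is not an option, a variation is to write into the watched directory under a suffix such as .part and have the consumer ignore files with that suffix until they are renamed.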

  • 2020-12-31 12:08

    I don't think you can achieve what you want unless the file system gives you some constraints and guarantees. For example, consider the following scenario:

    • File X created
    • A bunch of change events are triggered that correspond with writing out of file X
    • A lot of time passes with no updates to file X
    • File X is updated.

    If file X cannot be updated after it's written out, you can have a thread of execution that calculates the elapsed time from the last update to now, and after some interval decides that the file write is complete. But even this has issues. If the file system is hung, and the write does not occur for some time, you could erroneously conclude that the file is finished writing out.
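
    A minimal sketch of that elapsed-time rule, using only the JDK. QuietPeriodTracker, recordEvent, and drainQuiet are hypothetical names, and the sketch assumes the caller feeds it timestamps (for example from create and change callbacks), which also lets the logic be exercised without waiting. It inherits the caveat above: a stalled writer looks the same as a finished one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical helper: a file counts as complete once no event has been
// seen for it during a configurable quiet period.
final class QuietPeriodTracker {
    private final long quietMillis;
    private final Map<String, Long> lastEvent = new HashMap<>();

    QuietPeriodTracker(long quietMillis) {
        this.quietMillis = quietMillis;
    }

    /** Records a create or change event for path, observed at nowMillis. */
    synchronized void recordEvent(String path, long nowMillis) {
        lastEvent.put(path, nowMillis);
    }

    /** Removes and returns every path whose last event is at least quietMillis old. */
    synchronized List<String> drainQuiet(long nowMillis) {
        List<String> ready = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastEvent.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() >= quietMillis) {
                ready.add(entry.getKey());
                it.remove();
            }
        }
        return ready;
    }

    public static void main(String[] args) {
        QuietPeriodTracker tracker = new QuietPeriodTracker(5_000);
        tracker.recordEvent("X", 0);       // file X created
        tracker.recordEvent("X", 1_000);   // file X changed
        System.out.println(tracker.drainQuiet(3_000)); // prints []
        System.out.println(tracker.drainQuiet(6_000)); // prints [X]
    }
}
```

    In practice a scheduled task would call drainQuiet with System.currentTimeMillis() at a fixed interval and hand the returned paths to the consumer.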
