Erlang: create filewatcher

鱼传尺愫 2020-12-30 16:23

I have to implement file-watcher functionality in Erlang: there should be a process that lists the files in a specific directory and does something when new files appear.

I take

3 answers
  • 2020-12-30 16:26

    I have written such a library, based on polling. (It would be nice to extend it to use inotify on platforms where that is supported.) It was originally meant to be used in EUnit, but I turned it into a separate project instead. You can find it here:

    https://github.com/richcarl/file_monitor

  • 2020-12-30 16:31

    If you are using Linux, you can use inotify. It is a kernel service that lets you subscribe to file system events. Don't poll the filesystem; let the filesystem call you.

    You can try https://github.com/massemanet/inotify for observing your directory.

    Ulf

  • 2020-12-30 16:35

    In Erlang it is very cheap to create processes (orders of magnitude cheaper than in other systems).

    Therefore I recommend creating a new ProcessFileServer process each time a new file to process appears. When it is done, it simply terminates with exit reason normal.

    I would suggest the following structure:

                                  top_supervisor
                                          |
                  +-----------------------+-------------------------+
                  |                                                 |
           directory_supervisor                             processing_supervisor
                   |                                         simple_one_for_one
        +----------+-----...-----+                                   |
        |          |             |                       starts children transient
        |          |             |                                   |
    dir_watcher_1 dir_watcher_2 dir_watcher_n   +-------------+------+---...----+
                                                |             |                 |
                                            proc_file_1   proc_file_2       proc_file_n
    

    When a dir_watcher notices that a new file has appeared, it calls supervisor:start_child/2 on the processing_supervisor, passing extra arguments such as the file path.

    The processing_supervisor should start its children with transient restart policy.

    So if one of the proc_file servers crashes it will be restarted, but when it terminates with exit reason normal it is not restarted. You just exit with reason normal when done and crash when anything else happens.
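    The arrangement above could be sketched roughly as follows. This is only an illustration under some assumptions: the function names process_file/1, start_worker/1 and worker/1 are made up for this example, the worker is a bare proc_lib process kept in the same module so the sketch is self-contained, and the "work" is a placeholder that merely checks that the file exists.

```erlang
-module(processing_supervisor).
-behaviour(supervisor).

-export([start_link/0, process_file/1]).
-export([init/1, start_worker/1, worker/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Hypothetical helper a dir_watcher could call for each new file.
%% For a simple_one_for_one supervisor, the list [Path] is appended to
%% the argument list given in the child spec's start tuple.
process_file(Path) ->
    supervisor:start_child(?MODULE, [Path]).

init([]) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 5,    % at most 5 restarts ...
                 period => 10},     % ... within 10 seconds
    ChildSpec = #{id => proc_file,
                  start => {?MODULE, start_worker, []},
                  restart => transient,   % restart only on abnormal exit
                  shutdown => 5000,
                  type => worker},
    {ok, {SupFlags, [ChildSpec]}}.

%% Start function named in the child spec; it must link the new
%% process to the supervisor and return {ok, Pid}.
start_worker(Path) ->
    {ok, proc_lib:spawn_link(?MODULE, worker, [Path])}.

%% Placeholder work (an assumption for this sketch). A failed match
%% crashes the process, so the supervisor restarts it; returning
%% normally ends the process with exit reason normal, so it is not
%% restarted.
worker(Path) ->
    {ok, _Info} = file:read_file_info(Path),
    ok.
```

    A watcher would then call processing_supervisor:process_file(Path) for each file it discovers. A real worker would more likely be a gen_server in its own module.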

    If you don't overdo it, cyclic polling for files is OK. If the system becomes loaded because of this polling, you can look into kernel notification systems (e.g. FreeBSD kqueue, or the higher-level services built on it on Mac OS X) that send you a message when a file appears in a directory. These services come with a complexity of their own, however: they have to throw up their hands when too many events happen at once (otherwise they would be the opposite of a performance improvement). So you will need a robust polling solution as a fallback anyway.
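    The core of such a polling step is just a comparison of two directory listings. A minimal sketch, assuming a hypothetical module name dir_poll and assuming that an unreadable directory should be treated as empty (which means its contents would all be reported as new once it becomes readable again):

```erlang
-module(dir_poll).
-export([list_files/1, new_files/2]).

%% Return the entry names in Dir as an ordered set,
%% or an empty set if the directory cannot be read.
list_files(Dir) ->
    case file:list_dir(Dir) of
        {ok, Names} -> ordsets:from_list(Names);
        {error, _Reason} -> ordsets:new()
    end.

%% Names that are present in the Current listing but were not in
%% the Previous one. Both arguments must be ordered sets.
new_files(Previous, Current) ->
    ordsets:subtract(Current, Previous).
```

    A watcher would keep the previous listing in its state, call list_files/1 on every tick, and hand each name returned by new_files/2 to the processing side.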

    So don't optimize prematurely: start with polling, and add improvements (which would be isolated in the dir_watcher servers) when they become necessary.


    Regarding the comment about which behaviour to use for the dir_watcher process, since it doesn't use much of gen_server's functionality:

    • There is no problem with using only part of gen_server's possibilities; in fact it is very common not to use all of them. In your case you only set up a timer in init and use handle_info to do your work. The rest of the gen_server is just the unchanged template.

    • If you later want to change parameters such as the poll frequency, that is easy to add.

    • gen_fsm is used much less, since it only fits a quite limited model and is not very flexible. I use it only when it fits the requirement 100% (which is almost never).

    • In a case where you just want a simple plain Erlang server you can use the spawn functions in proc_lib to get just the minimal functionality to run under a supervisor.

    • An interesting way to write more natural Erlang code and still have the OTP advantages is plain_fsm. It gives you selective receive and flexible message handling, which are needed especially when handling protocols, paired with the nice features of OTP.

    Having said all this: if I were to write a dir_watcher, I'd just use a gen_server and use only what I need. The unused functionality doesn't really cost you anything, and everybody understands what it does.
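    A gen_server along these lines might look like the sketch below: a timer is set up in init and all the work happens in handle_info. The start_link/3 signature, the Notify callback and the state record are assumptions made for this example, and files already present at startup are deliberately treated as seen rather than reported.

```erlang
-module(dir_watcher).
-behaviour(gen_server).

-export([start_link/3]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-record(state, {dir, interval, notify, seen}).

%% Dir:        directory to watch
%% IntervalMs: poll period in milliseconds
%% Notify:     fun((Path) -> any()), called once per newly seen file
%%             (hypothetical hook, e.g. a fun that asks the
%%             processing_supervisor to start a child)
start_link(Dir, IntervalMs, Notify) ->
    gen_server:start_link(?MODULE, {Dir, IntervalMs, Notify}, []).

init({Dir, IntervalMs, Notify}) ->
    erlang:send_after(IntervalMs, self(), poll),
    {ok, #state{dir = Dir, interval = IntervalMs,
                notify = Notify, seen = list(Dir)}}.

%% Unused parts of the gen_server template.
handle_call(_Request, _From, State) ->
    {reply, {error, unsupported}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Each poll: list the directory, report names not seen on the
%% previous poll, then schedule the next poll.
handle_info(poll, #state{dir = Dir, interval = IntervalMs,
                         notify = Notify, seen = Seen} = State) ->
    Current = list(Dir),
    New = ordsets:subtract(Current, Seen),
    lists:foreach(fun(Name) -> Notify(filename:join(Dir, Name)) end, New),
    erlang:send_after(IntervalMs, self(), poll),
    {noreply, State#state{seen = Current}};
handle_info(_Other, State) ->
    {noreply, State}.

%% Directory listing as an ordered set; unreadable means empty here.
list(Dir) ->
    case file:list_dir(Dir) of
        {ok, Names} -> ordsets:from_list(Names);
        {error, _Reason} -> ordsets:new()
    end.
```

    Making the poll frequency changeable later would only need an extra handle_call or handle_cast clause that updates the interval field.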
