Restarting an Erlang process and preserving state

清酒与你 2021-02-06 07:01

I have a supervisor process which starts a number of child processes. Currently, when a child dies, I spawn a new process with a new Pid. This means I lose the state information of

3 answers
  • 2021-02-06 07:28

    Storing process state in an ETS table would work for keeping your state around between crashes, and I usually use the global registry for giving processes persistent names. (Player 200 would be registered as {player, 200}.) I don't recommend using the local registry because it requires that names be atoms. Atoms are never garbage collected and the atom table has a fixed size, so if you have many child processes you can exhaust the limit quickly by creating names dynamically (player_200, player_201, etc.).
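
    As a minimal sketch of that combination (not code from the original answer): the module name player_store, the table name player_state, and the shape of the state are all assumptions made for illustration. It registers the calling process under a tuple name with global and keeps state in a named ETS table that some long-lived process is assumed to own.

    ```erlang
    -module(player_store).
    -export([init_table/0, register_player/1, whereis_player/1,
             save_state/2, load_state/1]).

    %% Hypothetical table name. The table should be created by a process that
    %% outlives the children (for example the supervisor), because an ETS
    %% table is deleted when its owner dies.
    -define(TABLE, player_state).

    init_table() ->
        ets:new(?TABLE, [named_table, public, set]).

    %% Register the calling process under a tuple name, so no atoms are
    %% created dynamically. Returns yes | no.
    register_player(Id) ->
        global:register_name({player, Id}, self()).

    %% Returns pid() | undefined.
    whereis_player(Id) ->
        global:whereis_name({player, Id}).

    save_state(Id, State) ->
        ets:insert(?TABLE, {Id, State}).

    load_state(Id) ->
        case ets:lookup(?TABLE, Id) of
            [{Id, State}] -> {ok, State};
            [] -> not_found
        end.
    ```

    A restarted child would call register_player(Id) and then load_state(Id) in its init code. global drops a name automatically when the registered process dies, so the replacement process can claim the same name.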

    Storing child state in the ETS table has its own risks and issues, though. If a child crashes after an error occurs but before it saves to the ETS table, you should be all right, because the table still holds the last good state. However, what if you process data that causes the child to save garbage state, then crash on processing the next message? You'll restart the process, load the bad state from the ETS table, and crash on your next message again. There are certainly ways to deal with this, but you should be aware that it is a possibility and work around it.
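
    One possible way to work around it, sketched under assumptions: validate the saved state when the process restarts and fall back to a default if it fails the check. The module name safe_state and the invariant in valid/1 (a map with a non-negative integer hp) are hypothetical; a real check depends on what your state looks like.

    ```erlang
    -module(safe_state).
    -export([restore/3]).

    %% Return the saved state for Id if it passes validation. Otherwise
    %% delete the bad entry and return Default, so the next restart does not
    %% load the same garbage again.
    restore(Tab, Id, Default) ->
        case ets:lookup(Tab, Id) of
            [{Id, State}] ->
                case valid(State) of
                    true ->
                        State;
                    false ->
                        ets:delete(Tab, Id),
                        Default
                end;
            [] ->
                Default
        end.

    %% Hypothetical invariant, for illustration only.
    valid(#{hp := Hp}) when is_integer(Hp), Hp >= 0 -> true;
    valid(_) -> false.
    ```

    This only catches state that is detectably malformed. State that looks valid but still triggers a crash needs a different safeguard, such as a limit on restarts.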

    While Erlang hides the problems of sharing an ETS table among all processes, it does so at the cost of CPU time and potential contention. If you're pushing a lot of changes to your ETS table, you're going to pay for it in performance.

    If your children are crashing, shouldn't you be looking for a way to remove the erroneous conditions anyway? I would usually treat a process crash as something whose root cause I needed to find and fix.

  • 2021-02-06 07:35

    Using ETS tables is probably the way to go for keeping the state. Vinoski's article discusses how to make it possible to restart a crashed process while keeping the ETS table data.
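
    One common way to keep a table alive across a worker crash is the heir option of ets:new/2 together with ets:give_away/3. The sketch below illustrates that mechanism only and is not the article's code: the module name table_keeper, the table name worker_state, and the handed_over / handed_back tags are made up for the example.

    ```erlang
    -module(table_keeper).
    -export([start/0, claim/1]).

    %% Start a keeper process. It creates the table with itself as heir, so
    %% the table is handed back to it whenever the current owner dies.
    start() ->
        spawn(fun() ->
            Tab = ets:new(worker_state,
                          [set, private, {heir, self(), handed_back}]),
            idle(Tab)
        end).

    %% Called by a (re)started worker: ask the keeper for the table and wait
    %% for the ownership transfer message.
    claim(Keeper) ->
        Keeper ! {claim, self()},
        receive
            {'ETS-TRANSFER', Tab, Keeper, handed_over} -> Tab
        after 5000 ->
            exit(no_table)
        end.

    %% Keeper owns the table and waits for a worker to claim it.
    idle(Tab) ->
        receive
            {claim, Worker} ->
                ets:give_away(Tab, Worker, handed_over),
                lent(Tab)
        end.

    %% A worker owns the table. When that worker dies, the keeper inherits
    %% the table (with its contents intact) and goes back to idle.
    lent(Tab) ->
        receive
            {'ETS-TRANSFER', Tab, _DeadWorker, handed_back} ->
                idle(Tab)
        end.
    ```

    The keeper does almost nothing, so it is unlikely to crash itself. Note that ets:give_away/3 fails with badarg if the claiming worker has already died, which a production version would need to handle.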

    As @user30997 points out, the data in the table may actually be the reason the process crashed, so on restart you might want to validate the table (or set a limit on how many times the process will be restarted).
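
    If the children run under an OTP supervisor (an assumption; the question does not say), the restart limit is part of the supervisor flags. A minimal sketch, where the module name player_sup, the worker player:start_link/1, and the numbers 3 and 10 are all hypothetical:

    ```erlang
    -module(player_sup).
    -behaviour(supervisor).
    -export([start_link/0, init/1]).

    start_link() ->
        supervisor:start_link({local, ?MODULE}, ?MODULE, []).

    init([]) ->
        %% If more than 3 restarts happen within 10 seconds, the supervisor
        %% terminates its children and then itself instead of looping forever.
        SupFlags = #{strategy => one_for_one, intensity => 3, period => 10},
        %% player:start_link/1 is a hypothetical worker module, not defined here.
        Child = #{id => {player, 200},
                  start => {player, start_link, [200]},
                  restart => transient},
        {ok, {SupFlags, [Child]}}.
    ```

    With restart set to transient, the child is restarted only when it exits abnormally.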

    For associating processes with IDs, you should take a look at gproc, which is well suited to this.

  • 2021-02-06 07:42

    Use event sourcing: persist all events and replay them to reconstruct the state. If you need fast replays, take snapshots. An example: https://github.com/bryanhunter/cqrs-with-erlang/tree/ndc-oslo

    In fact, it would be nice to build a complete framework based on this example.
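
    To make the replay idea concrete, here is a minimal sketch that is independent of the linked repository. The module name replay, the event shapes {moved, Pos} and {damaged, N}, and the initial state are assumptions chosen for illustration.

    ```erlang
    -module(replay).
    -export([apply_event/2, rebuild/1, rebuild/2]).

    %% Hypothetical events for a player process.
    apply_event({moved, Pos}, State) ->
        State#{pos => Pos};
    apply_event({damaged, N}, #{hp := Hp} = State) ->
        State#{hp => Hp - N}.

    %% Rebuild the state from scratch by replaying every persisted event.
    rebuild(Events) ->
        rebuild(#{pos => {0, 0}, hp => 100}, Events).

    %% Rebuild from a snapshot by replaying only the events recorded after it.
    rebuild(Snapshot, Events) ->
        lists:foldl(fun apply_event/2, Snapshot, Events).
    ```

    A restarted process would read its event log (or the latest snapshot plus the events after it) from durable storage and call rebuild before handling new messages. How the events are persisted is left open here.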
