Restarting an Erlang process and preserving state


I have a supervisor process which starts a number of child processes. Currently, when a child dies I spawn a new process with a new Pid. This means I lose the state information of


3 Answers

渐次进展

2021-02-06 07:28

Storing process state in an ETS table works for keeping your state around between crashes, and I usually use the global registry to give processes persistent names (player 200 would be registered as {player, 200}). I don't recommend the local registry because it requires atoms as names: if you have many child processes and create the names dynamically (player_200, player_201, etc.), you can chew through the atom limit in a hurry.
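To make that concrete, here is a minimal sketch of a gen_server that registers itself globally as {player, Id} and saves its state to ETS after every change. The module name `player`, the table name `player_state`, and the "score" state are all made up for illustration. It also assumes the table was created beforehand as a public named table by a longer-lived process (for example the supervisor), since an ETS table dies with its owner.

```erlang
-module(player).
-behaviour(gen_server).

-export([start_link/1, add_score/2, score/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Hypothetical table name. Some longer-lived process must have run
%% ets:new(player_state, [named_table, public, set]) before any player starts.
-define(TABLE, player_state).

%% Register under the global name {player, Id}; no atom is created per player.
start_link(Id) ->
    gen_server:start_link({global, {player, Id}}, ?MODULE, Id, []).

add_score(Id, N) ->
    gen_server:call({global, {player, Id}}, {add_score, N}).

score(Id) ->
    gen_server:call({global, {player, Id}}, score).

init(Id) ->
    %% Reload the last saved score if there is one, otherwise start at 0.
    Score = case ets:lookup(?TABLE, Id) of
                [{Id, Saved}] -> Saved;
                []            -> 0
            end,
    {ok, {Id, Score}}.

handle_call({add_score, N}, _From, {Id, Score}) ->
    New = Score + N,
    %% Save after every change so a restarted process can pick it up.
    true = ets:insert(?TABLE, {Id, New}),
    {reply, New, {Id, New}};
handle_call(score, _From, {_Id, Score} = State) ->
    {reply, Score, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Callers only ever use the Id, so it does not matter that a restarted process has a new Pid.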

Storing child state in an ETS table has its own risks, though. If a child crashes between the moment an error occurs and the moment it would have saved to the ETS table, you should be alright. However, what if you process data that causes the child to save garbage state and then crash on the next message? You'll restart the process, load the bad state from the ETS table, and crash on your next message again. There are certainly ways to deal with this, but you should be aware of the possibility and design around it.
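One way to guard against that reload-and-crash loop is to check the saved state before trusting it. A minimal sketch, assuming the state is a map with a non-negative integer `score` (the module name `safe_state` and that invariant are invented for the example; your real check depends on what your state looks like):

```erlang
-module(safe_state).
-export([load/3]).

%% Return the saved state for Key only if it passes validation.
%% Otherwise delete the bad entry and fall back to Default.
load(Table, Key, Default) ->
    case ets:lookup(Table, Key) of
        [{Key, State}] ->
            case valid(State) of
                true ->
                    State;
                false ->
                    ets:delete(Table, Key),
                    Default
            end;
        [] ->
            Default
    end.

%% Hypothetical invariant: a map with a non-negative integer score.
valid(#{score := S}) when is_integer(S), S >= 0 -> true;
valid(_) -> false.
```

Calling `safe_state:load/3` from `init/1` means a process restarted on top of garbage state starts from the default instead of crashing again.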

While Erlang hides the problems of sharing an ETS table among processes, it does so at the cost of CPU (terms are copied into and out of the table) and potential contention. If you're pushing a lot of changes to your ETS table, you're going to pay for it in performance.

If your children are crashing, shouldn't you be looking for a way to remove the erroneous conditions anyway? I usually treat a process crash as something I need to root-cause and fix.

星月不相逢

2021-02-06 07:35

Using ETS tables is probably the way to go for keeping the state. Vinoski's article discusses how to make it possible to restart a crashed process while keeping the ETS table data.
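The built-in pieces for keeping a table alive across an owner crash are the `heir` table option and `ets:give_away/3`. The sketch below is my own minimal version of that idea, not necessarily the code from the article; the module name `table_keeper` and the message shapes are made up. A keeper process creates the table, names itself heir, and lends the table to a worker. When the worker dies, ETS hands the table back to the keeper, which can then lend it to the replacement worker.

```erlang
-module(table_keeper).
-export([start/0, give/2]).

%% Start a keeper that creates the table and names itself as heir.
start() ->
    spawn(fun() ->
        Tab = ets:new(worker_state, [set, {heir, self(), returned}]),
        loop(Tab, keeper)
    end).

%% Ask the keeper to hand the table to Worker. The request stays queued
%% in the keeper's mailbox while another worker still holds the table.
give(Keeper, Worker) ->
    Keeper ! {give, Worker},
    ok.

loop(Tab, Holder) ->
    receive
        {give, Worker} when Holder =:= keeper ->
            %% The worker receives {'ETS-TRANSFER', Tab, KeeperPid, handover}.
            try ets:give_away(Tab, Worker, handover) of
                true -> loop(Tab, worker)
            catch
                %% Worker was already dead; keep the table.
                error:badarg -> loop(Tab, keeper)
            end;
        {'ETS-TRANSFER', Tab, _DeadWorker, returned} ->
            %% The worker died and ETS returned the table to the heir.
            loop(Tab, keeper)
    end.
```

The keeper does almost nothing, which is the point: the less code it runs, the less likely it is to crash and take the table with it.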

As @user30997 points out, the data in the table may actually be the reason the process crashed, so on restart you might want to validate the table (or set a limit on how many times the process will be restarted).
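For the restart limit, the supervisor already has this built in through its `intensity` and `period` flags. A minimal sketch, assuming a worker module called `player` that exports `start_link/1` (that module and the numbers are placeholders):

```erlang
-module(player_sup).
-behaviour(supervisor).

-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% If children need more than 3 restarts within 10 seconds, the
    %% supervisor gives up, terminates its children, and exits itself.
    SupFlags = #{strategy => one_for_one, intensity => 3, period => 10},
    %% `player` is a hypothetical worker module exporting start_link/1.
    Child = #{id => {player, 200},
              start => {player, start_link, [200]},
              restart => transient},
    {ok, {SupFlags, [Child]}}.
```

With this in place, a child that keeps crashing on the same bad state stops being restarted after a few attempts instead of looping forever.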

For associating processes with IDs, take a look at gproc, which is great for this.

日久生厌

2021-02-06 07:42

Use event sourcing: persist all events and replay them to reconstruct the state. If you need fast replays, take snapshots. See this example: https://github.com/bryanhunter/cqrs-with-erlang/tree/ndc-oslo

In fact, it would be nice to build a complete framework based on this example.
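The core of the replay idea fits in a few lines. This is only a sketch of the technique and is not taken from the linked repository; the module name `account_events` and the two event shapes are invented for illustration.

```erlang
-module(account_events).
-export([apply_event/2, replay/1, replay/2]).

%% Hypothetical events: {deposited, N} and {withdrawn, N}.
%% Applying an event to the current state returns the next state.
apply_event({deposited, N}, Balance) -> Balance + N;
apply_event({withdrawn, N}, Balance) -> Balance - N.

%% Rebuild the state from the full event log, starting from 0.
replay(Events) ->
    replay(0, Events).

%% Rebuild the state from a snapshot plus the events recorded after it.
replay(Snapshot, Events) ->
    lists:foldl(fun apply_event/2, Snapshot, Events).
```

A restarted process calls `replay/1` (or `replay/2` with the latest snapshot) in its `init/1` and ends up in the same state it had before the crash, provided every event was persisted before it was applied.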
