fault-tolerance

Are Erlang/OTP messages reliable? Can messages be duplicated?

痴心易碎 submitted on 2019-12-03 00:42:04
Question: Long version: I'm new to Erlang, and considering using it for a scalable architecture. I've found many proponents of the platform touting its reliability and fault tolerance. However, I'm struggling to understand exactly how fault tolerance is achieved in this system, where messages are queued in transient memory. I understand that a supervisor hierarchy can be arranged to respawn deceased processes, but I've been unable to find much discussion of the implications of respawning on works-in

Building a fault-tolerant soft real-time web application with Erlang/OTP

蓝咒 submitted on 2019-11-30 12:33:49
Question: I would like to build a fault-tolerant soft real-time web application for a pizza delivery shop. It should help the pizza shop to accept phone calls from customers, put them as orders into the system (via a CRM web client) and help the dispatchers to assign delivery drivers to the orders. These goals are nothing unusual, but I would like to make the service available 24/7, i.e. to make it fault-tolerant. Moreover, I would like to make it work very fast and to be very responsive. Below is a

Fault tolerance in MPICH/OpenMPI

核能气质少年 submitted on 2019-11-30 08:17:49
Question: I have two questions. Q1. Is there a more efficient way to handle error situations in MPI than checkpoint/rollback? I see that if a node "dies", the program halts abruptly. Is there any way to continue execution after a node dies (no issue if it comes at the cost of accuracy)? Q2. I read in "http://stackoverflow.com/questions/144309/what-is-the-best-mpi-implementation" that OpenMPI has better fault tolerance and that MPICH-2 has recently come up with similar features..

Building a fault-tolerant soft real-time web application with Erlang/OTP

≡放荡痞女 submitted on 2019-11-30 02:27:31
I would like to build a fault-tolerant soft real-time web application for a pizza delivery shop. It should help the pizza shop to accept phone calls from customers, put them as orders into the system (via a CRM web client) and help the dispatchers to assign delivery drivers to the orders. These goals are nothing unusual, but I would like to make the service available 24/7, i.e. to make it fault-tolerant. Moreover, I would like to make it work very fast and to be very responsive. Below is a very simple architecture view for such an application. The problem is that I do not know how to use all

Fault tolerance in MPICH/OpenMPI

只谈情不闲聊 submitted on 2019-11-29 06:30:00
I have two questions. Q1. Is there a more efficient way to handle error situations in MPI than checkpoint/rollback? I see that if a node "dies", the program halts abruptly. Is there any way to continue execution after a node dies (no issue if it comes at the cost of accuracy)? Q2. I read in "http://stackoverflow.com/questions/144309/what-is-the-best-mpi-implementation" that OpenMPI has better fault tolerance and that MPICH-2 has recently come up with similar features. Does anybody know what they are and how to use them? Is it a "mode"? Can they help in the situation

What's up with the [OptionalField] Attribute?

|▌冷眼眸甩不掉的悲伤 submitted on 2019-11-29 03:09:35
Question: As I understand it, I have to adorn a new member in a newer version of my class with the [OptionalField] attribute when I deserialize an older version of the class that lacks this member. However, the code below throws no exception even though the InnerTranslator property was added after the class was serialized. I check for the property being null in the onDeserialization method (which confirms that it was not serialized), but I would have expected the code to throw an exception because of that.

Compiling an application for use in highly radioactive environments

谁都会走 submitted on 2019-11-27 16:32:55
We are compiling an embedded C/C++ application that is deployed in a shielded device in an environment bombarded with ionizing radiation. We are using GCC and cross-compiling for ARM. When deployed, our application generates some erroneous data and crashes more often than we would like. The hardware is designed for this environment, and our application has run on this platform for several years. Are there changes we can make to our code, or compile-time improvements that can be made, to identify/correct soft errors and memory corruption caused by single event upsets? Have any other developers