fault-tolerance

How is the detection of terminated nodes in Erlang working? How is net_ticktime influencing the control of node liveness in Erlang?

亡梦爱人 submitted on 2019-12-12 10:57:02
Question: I set the net_ticktime value to 600 seconds: net_kernel:set_net_ticktime(600). The Erlang documentation for net_ticktime = TickTime says: Specifies the net_kernel tick time. TickTime is given in seconds. Once every TickTime/4 second, all connected nodes are ticked (if anything else has been written to a node) and if nothing has been received from another node within the last four (4) tick times that node is considered to be down. This ensures that nodes which are not responding, for reasons such as
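The arithmetic in the quoted documentation can be sketched as follows. This is an illustration in Python, not Erlang code; the function name tick_schedule and the dictionary keys are hypothetical. The detection window of TickTime ± TickTime/4 is not part of the excerpt above; it is recalled from the kernel application documentation and should be checked against your OTP version.

```python
def tick_schedule(net_ticktime_s: float) -> dict:
    """Derive the timings described in the quoted net_ticktime text."""
    # Connected nodes are ticked once every TickTime/4 seconds.
    tick_interval_s = net_ticktime_s / 4
    return {
        "tick_interval_s": tick_interval_s,
        # "Nothing received within the last four tick times" is
        # roughly TickTime seconds of silence.
        "silence_before_down_s": 4 * tick_interval_s,
        # Assumption (not in the excerpt): a failing node is detected
        # somewhere between TickTime - TickTime/4 and TickTime + TickTime/4.
        "detect_min_s": net_ticktime_s - tick_interval_s,
        "detect_max_s": net_ticktime_s + tick_interval_s,
    }


# The value from the question: net_kernel:set_net_ticktime(600).
schedule = tick_schedule(600)
print(schedule)
```

With a net_ticktime of 600, ticks go out every 150 seconds, so under these assumptions a dead node may go unnoticed for several minutes.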

Is it not possible to make a C++ application “Crash Proof”?

拜拜、爱过 submitted on 2019-12-12 10:47:11
Question: Let's say we have an SDK in C++ that accepts some binary data (like a picture) and does something. Is it not possible to make this SDK "crash-proof"? By crash I primarily mean forceful termination by the OS upon a memory access violation, due to invalid input passed by the user (such as abnormally short junk data). I have no experience with C++, but when I googled, I found several approaches that sounded like a solution (use a vector instead of an array, configure the compiler so that automatic

What assumptions can I make about global time on Azure?

蓝咒 submitted on 2019-12-11 20:39:01
Question: I want my Azure role to reprocess data in case of sudden failures. I am considering the following option. For every block of data to process I have a database table row, and I could add a column meaning "time of last ping from a processing node". When a node grabs a data block for processing, it sets the "processing" state and sets that time to "current time"; it is then the node's responsibility to update that time, say, every minute. Then periodically some node will ask for "all blocks that have
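A minimal sketch of the "last ping" scheme described above, using an in-memory list in place of the database table. Everything here is hypothetical: the Block class, the claim, ping and find_stale helpers, and the five-minute STALE_AFTER timeout. The staleness rule (state is "processing" and the last ping is older than the timeout) is an assumption, since the excerpt is cut off before it states the query.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

STALE_AFTER = timedelta(minutes=5)  # hypothetical timeout, not from the question


@dataclass
class Block:
    block_id: int
    state: str = "pending"               # "pending", "processing" or "done"
    last_ping: Optional[datetime] = None  # "time of last ping from a processing node"


def claim(block: Block, now: datetime) -> None:
    """A node grabs a block: mark it as processing and record the time."""
    block.state = "processing"
    block.last_ping = now


def ping(block: Block, now: datetime) -> None:
    """The processing node refreshes the timestamp, e.g. once a minute."""
    block.last_ping = now


def find_stale(blocks: List[Block], now: datetime,
               stale_after: timedelta = STALE_AFTER) -> List[Block]:
    """Assumed rule: blocks still 'processing' whose last ping is too old."""
    return [
        b for b in blocks
        if b.state == "processing"
        and b.last_ping is not None
        and now - b.last_ping > stale_after
    ]


t0 = datetime(2019, 12, 11, 20, 0, tzinfo=timezone.utc)
blocks = [Block(1), Block(2)]
for b in blocks:
    claim(b, t0)
ping(blocks[0], t0 + timedelta(minutes=4))   # node 1 is still alive
stale = find_stale(blocks, t0 + timedelta(minutes=6))
stale_ids = [b.block_id for b in stale]
print(stale_ids)
```

Every helper takes `now` as a parameter rather than reading a local clock. That keeps open the option of taking all timestamps from one source, such as the database server, so that the comparison does not depend on clocks agreeing across nodes.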

Good scalable fault-tolerant in-memory database with LINQ support for .NET

一世执手 submitted on 2019-12-11 01:45:43
Question: Are there any good in-memory transactional databases that support LINQ and SQL Server persistence? I'd like to create a full representation of a large data store in memory and have it commit to a SQL Server database in a lazy fashion, but still keep some level of fault tolerance by scaling it out horizontally. I don't want to rely on non-relational datagrams like CouchDB. Answer 1: SQLite supports in-memory databases, has transaction support, and has a LINQ provider as well. As for the SQL Server

quartz jobDetail requestRecovery

旧巷老猫 submitted on 2019-12-06 23:00:26
Question: The documentation for the JobDetail.requestsRecovery property states the following: "Instructs the Scheduler whether or not the Job should be re-executed if a 'recovery' or 'fail-over' situation is encountered." Now, what is a 'recovery' situation or a 'fail-over' situation? How are they different? Does the recovery happen only if the JVM crashes during job execution, or does it also happen if the job execution fails because of an exception? Answer 1: A "recovery situation" is the generic term; one kind

How to configure fault tolerance programmatically for a spring tasklet (not a chunk)

一曲冷凌霜 submitted on 2019-12-06 02:21:54
Question: Programmatically configuring fault tolerance for a chunk works roughly as follows:
    stepBuilders.get("step")
        .<Partner,Partner>chunk(1)
        .reader(reader())
        .processor(processor())
        .writer(writer())
        .listener(logProcessListener())
        .faultTolerant()
        .skipLimit(10)
        .skip(UnknownGenderException.class)
        .listener(logSkipListener())
        .build();
The trick is that adding "chunk" switches the chain to a SimpleStepBuilder, which offers the "faultTolerant" method. My question is how to do that if you

How to discover that a Scala remote actor is died?

霸气de小男生 submitted on 2019-12-05 20:48:43
Question: In Scala, an actor can be notified when another (remote) actor terminates by setting the trapExit flag and invoking the link() method with the second actor as a parameter. In this case, when the remote actor ends its job by calling exit(), the first one is notified by receiving an Exit message. But what happens when the remote actor terminates in a less graceful way (e.g. the VM where it is running crashes)? In other words, how can the local actor discover that the remote one is no longer

quartz jobDetail requestRecovery

萝らか妹 submitted on 2019-12-05 04:19:44
The documentation for the JobDetail.requestsRecovery property states the following: "Instructs the Scheduler whether or not the Job should be re-executed if a 'recovery' or 'fail-over' situation is encountered." Now, what is a 'recovery' situation or a 'fail-over' situation? How are they different? Does the recovery happen only if the JVM crashes during job execution, or does it also happen if the job execution fails because of an exception? zerologiko: A "recovery situation" is the generic term; one kind of recovery situation is the "fail-over". A fail-over is a process used by fault-tolerance

Handling Faults in Akka actors

与世无争的帅哥 submitted on 2019-12-04 11:05:38
Question: I have a very simple example with an Actor (SimpleActor) that performs a periodic task by sending a message to itself. The message is scheduled in the actor's constructor. In the normal case (i.e., without faults) everything works fine. But what if the Actor has to deal with faults? I have another Actor (SimpleActorWithFault) that can have faults. In this case, I'm generating one myself by throwing an exception. When a fault happens (i.e., SimpleActorWithFault throws an

How to configure fault tolerance programmatically for a spring tasklet (not a chunk)

杀马特。学长 韩版系。学妹 submitted on 2019-12-04 07:12:54
Programmatically configuring fault tolerance for a chunk works roughly as follows:
    stepBuilders.get("step")
        .<Partner,Partner>chunk(1)
        .reader(reader())
        .processor(processor())
        .writer(writer())
        .listener(logProcessListener())
        .faultTolerant()
        .skipLimit(10)
        .skip(UnknownGenderException.class)
        .listener(logSkipListener())
        .build();
The trick is that adding "chunk" switches the chain to a SimpleStepBuilder, which offers the "faultTolerant" method. My question is how to do that if you just have a tasklet (no reader, processor, writer). Defining a tasklet works as follows: stepBuilders.get(