distributed-computing

When do I use a consensus algorithm like Paxos vs. something like a vector clock?

半世苍凉 submitted on 2020-04-08 19:02:36

Question: I've been reading a lot about different strategies to guarantee consistency between nodes in distributed systems, but I'm having a bit of trouble figuring out when to use which algorithm. With what kind of system would I use something like a vector clock? Which system is ideal for using something like Paxos? Are the two mutually exclusive?

Answer 1: There's a distributed system of 2 nodes that store data. The data is replicated to both nodes so that if one node dies, the data is not lost…
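
As context for the vector-clock half of the question, here is a minimal sketch (my own illustration, not taken from the answer) of how a vector clock lets two replicas detect that they accepted concurrent, conflicting writes:

    class VectorClock:
        """Minimal vector clock: one counter per known node."""

        def __init__(self, node_id, nodes):
            self.node_id = node_id
            self.clock = {n: 0 for n in nodes}

        def tick(self):
            # Local event (e.g. a write accepted on this replica): bump our own counter.
            self.clock[self.node_id] += 1

        def merge(self, other_clock):
            # On receiving a replicated update: element-wise maximum, then tick.
            for n, c in other_clock.items():
                self.clock[n] = max(self.clock.get(n, 0), c)
            self.tick()

        def happened_before(self, other_clock):
            # True if this version is causally older than `other_clock`.
            return (all(self.clock[n] <= other_clock.get(n, 0) for n in self.clock)
                    and self.clock != other_clock)

    a = VectorClock("A", ["A", "B"])
    b = VectorClock("B", ["A", "B"])
    a.tick()   # write accepted at node A
    b.tick()   # concurrent write accepted at node B
    conflict = (not a.happened_before(b.clock)) and (not b.happened_before(a.clock))
    print("concurrent update, needs conflict resolution:", conflict)   # True

Vector clocks only detect that two versions are concurrent; resolving the conflict (and getting all nodes to agree on a single value, which is where Paxos comes in) is a separate problem.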

Is the CAP theorem a red herring?

假如想象 submitted on 2020-03-19 19:32:23

Question: I am told that I have to give up transactional guarantees in large distributed systems because the CAP theorem says I can't have them. I think this is wrong, for the following reasons: Internet routing is amazingly reliable. The CAP theorem only applies to network partitions where two groups of live machines can't communicate. Almost all real network partitions consist of catastrophic failures, or cases where one of the partitions is very small and the other is very large and the small one can…

Using Paxos to synchronize a large file across nodes

爱⌒轻易说出口 submitted on 2020-01-24 17:29:09

Question: I'm trying to use Paxos to maintain consensus between nodes on a file that is around 50 MB in size and constantly being modified at individual nodes. I'm running into issues of practicality. Requirements:

- Sync a 50 MB+ file across hundreds of nodes.
- Have changes to this file, which can be made from any node and aren't likely to directly compete with each other, propagated across the network in a few seconds at most.
- New nodes that join the network can, within a few minutes (<1 hour), build up the…
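
One way to keep this practical (an assumption on my part, not something stated in the question) is to run consensus over small change records instead of over the 50 MB file itself: each node proposes its edit as a log entry, the consensus layer orders the entries, and every node rebuilds the file by replaying the agreed log. A rough sketch of that data model, with the Paxos step itself left abstract:

    import hashlib
    import json

    class ReplicatedFileState:
        """The file is modelled as named sections; nodes agree only on small edit
        records (via Paxos or another consensus layer, not shown here)."""

        def __init__(self):
            self.log = []        # committed edit records, in consensus order
            self.sections = {}   # materialised view of the file

        @staticmethod
        def make_edit(section, new_value):
            # This small record, not the whole file, is what gets proposed to Paxos.
            return {"section": section, "value": new_value}

        def apply_committed(self, edit):
            # Called on every node, in the same order, once consensus commits the edit.
            self.log.append(edit)
            self.sections[edit["section"]] = edit["value"]

        def checksum(self):
            # Lets a newly joined node verify it has replayed the full log correctly.
            blob = json.dumps(self.sections, sort_keys=True).encode()
            return hashlib.sha256(blob).hexdigest()

    node = ReplicatedFileState()
    node.apply_committed(ReplicatedFileState.make_edit("header", "version=2"))
    print(node.checksum())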

How do I index codistributed arrays in a spmd block

感情迁移 submitted on 2020-01-23 10:43:15

Question: I am doing a very large calculation (atmospheric absorption) that has a lot of individual narrow peaks that all get added up at the end. For each peak, I have pre-calculated the range over which the value of the peak-shape function is above my chosen threshold, and I am then going line by line and adding the peaks to my spectrum. A minimum example is given below:

    X = 1:1e7;
    K = numel(a);                % count the number of peaks I have
    spectrum = zeros(size(X));
    for k = 1:K
        grid = X >= rng(1,k) & X <= rng…

Distributed Tensorflow Estimator execution does not trigger evaluation or export

徘徊边缘 submitted on 2020-01-23 04:06:46

Question: I am testing distributed training with TensorFlow Estimators. In my example I fit a simple sine function with a custom estimator using tf.estimator.train_and_evaluate. After training and evaluation I want to export the model to have it ready for TensorFlow Serving. However, evaluation and export are only triggered when executing the estimator in a non-distributed way. The model and estimators are defined as follows:

    def my_model(features, labels, mode):
        # define simple dense network
        net = …
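
For reference, evaluation and export with tf.estimator.train_and_evaluate are normally wired up through an EvalSpec and an exporter. The sketch below assumes the estimator, input functions, and serving_input_receiver_fn from the question exist elsewhere (those names are placeholders):

    import tensorflow as tf

    # `estimator`, `train_input_fn`, `eval_input_fn` and `serving_input_receiver_fn`
    # are assumed to be defined as in the question.
    exporter = tf.estimator.LatestExporter(
        name="sinus_model",
        serving_input_receiver_fn=serving_input_receiver_fn)

    train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=10000)
    eval_spec = tf.estimator.EvalSpec(
        input_fn=eval_input_fn,
        steps=100,
        exporters=exporter,
        throttle_secs=60)

    # In TF_CONFIG-based distributed mode, evaluation and export run only on the
    # task whose type is "evaluator"; plain workers and the chief only train.
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

The comment in the last block is the usual explanation for this symptom: if the cluster spec has no "evaluator" task, nothing in the cluster ever runs evaluation or export.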

How to add our custom library to Apache Spark?

孤街浪徒 submitted on 2020-01-17 03:41:26

Question: I want to add the GeoSpark library to Apache Spark. How do I add the GeoSpark library from the Spark shell?

Answer 1:

    $ ./bin/spark-shell --master local[4] --jars code.jar

The --jars option will distribute your local custom jar to the cluster automatically.

Source: https://stackoverflow.com/questions/34367085/how-to-add-our-custom-library-to-apache-spark
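
As a usage note (general Spark behavior, not part of the original answer): --jars accepts a comma-separated list of jar paths, works the same way with spark-submit, and corresponds to the spark.jars configuration property. For example, with a placeholder path rather than the actual GeoSpark artifact name:

    $ ./bin/spark-submit --master local[4] --jars /path/to/geospark.jar my_app.py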

Can anyone help me understand how MPI communicator and group partitioning works? [closed]

浪子不回头ぞ submitted on 2020-01-16 20:19:33

Question: Can anyone help me get my head around MPI groups and inter- and intra-communicators? I have already gone through the MPI documentation (http://www.mpi-forum.org/docs/mpi-2.2/mpi22-report.pdf) and I couldn't make good sense of these concepts. I would especially appreciate any code…
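
Since the question asks for code, here is a small sketch using mpi4py (the Python bindings for MPI; the even/odd grouping is just an example) that splits MPI_COMM_WORLD into two intra-communicators and then connects them with an inter-communicator:

    from mpi4py import MPI

    world = MPI.COMM_WORLD
    rank = world.Get_rank()

    # Split the world into two groups: even world ranks get color 0, odd ranks color 1.
    # Each call returns an intra-communicator covering only the caller's group.
    color = rank % 2
    subcomm = world.Split(color=color, key=rank)
    print(f"world rank {rank}: color {color}, "
          f"local rank {subcomm.Get_rank()} of {subcomm.Get_size()}")

    # An inter-communicator links the two groups. The local leader is rank 0 of
    # each subcomm; the remote leader is given by its rank in the peer communicator
    # (world rank 1 for the even group, world rank 0 for the odd group).
    remote_leader = 1 - color
    intercomm = subcomm.Create_intercomm(0, world, remote_leader, 0)
    print(f"world rank {rank}: remote group has {intercomm.Get_remote_size()} ranks")

Run it with at least two ranks, e.g. mpiexec -n 4 python groups_demo.py (a hypothetical script name), so that both colors are populated.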

Distributed Processing of Volumetric Image Data

本秂侑毒 submitted on 2020-01-16 02:04:19

Question: For the development of an object-recognition algorithm, I need to repeatedly run a detection program on a large set of volumetric image files (MR scans). The detection program is a command-line tool. If I run it on my local computer, on a single file and single-threaded, it takes about 10 seconds. Processing results are written to a text file. A typical run would be:

- 10,000 images at 300 MB each = 3 TB
- 10 seconds per image on a single core = 100,000 seconds = about 27 hours

What can I do to get the…
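
Before distributing across machines, a single box can already be kept busy by running the command-line tool on several files at once. The sketch below is only an illustration; the tool name, flags, paths, and file extension are made up:

    import subprocess
    from concurrent.futures import ThreadPoolExecutor
    from pathlib import Path

    DETECT_TOOL = "./detect"            # hypothetical name of the detection binary
    IMAGE_DIR = Path("/data/scans")     # hypothetical location of the MR scans
    RESULT_DIR = Path("/data/results")

    def run_detection(image_path):
        # The tool is an external process, so threads are enough to keep cores busy.
        out_file = RESULT_DIR / (image_path.stem + ".txt")
        result = subprocess.run([DETECT_TOOL, str(image_path), "-o", str(out_file)])
        return image_path, result.returncode

    if __name__ == "__main__":
        images = sorted(IMAGE_DIR.glob("*.nii"))   # assumed file extension
        with ThreadPoolExecutor(max_workers=8) as pool:
            for path, code in pool.map(run_detection, images):
                if code != 0:
                    print(f"detection failed for {path}")

Scaling beyond one machine then becomes a matter of handing the same per-file jobs to a batch scheduler or work queue, since each image is processed independently.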