distributed | 易学教程

GUI recommendations for eventual consistency?


Question: When using a distributed, scalable architecture, eventual consistency is often a requirement. How should the GUI deal with this eventual consistency? Users are used to clicking Save and seeing the result instantly; with eventual consistency that is not possible. How should the GUI handle such scenarios? Please note that the question applies to both desktop applications and web applications. PS: I'm working with the Microsoft platform, but I imagine the question applies to any technology...

Spread vs MPI vs zeromq?


Question: In one of the answers to "Broadcast like UDP with the Reliability of TCP", a user mentions the Spread messaging API. I've also run across one called ØMQ, and I have some familiarity with MPI. So, my main question is: why would I choose one over the other? More specifically, why would I choose Spread or ØMQ when there are mature implementations of MPI to be had? Answer 1: MPI was designed for tightly-coupled compute clusters with fast, reliable networks. Spread and ØMQ are designed for large

How to import column value in Cassandra like one having such values “13/01/09 23:13”?


Question: Query: CREATE TABLE IF NOT EXISTS "TEMP_tmp".temp ( "Date_Time" timestamp, PRIMARY KEY ("Date_Time") ); The CSV contains values such as "13/01/09 23:13". Error: Failed to import 1 rows: ParseError - Failed to parse 13/01/09 23:13 : invalid literal for long() with base 10: '13/01/09 23:13', given up without retries. What data type should I use? Answer 1: The default cqlsh timestamp format is year-month-day hour:min:sec+timezone, for example 2017-02-01 05:28:36+0000. You either change your date format to above or
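The first option in the answer, rewriting the CSV values into the default cqlsh layout, could be sketched roughly as follows. This is only an illustration: the helper name `to_cqlsh_timestamp` is hypothetical, and it assumes that "13/01/09 23:13" means day/month/two-digit-year and that the times are in UTC, neither of which the question states.

```python
from datetime import datetime, timezone

# Assumption: source values are day/month/two-digit-year, recorded in UTC.
SOURCE_FORMAT = "%d/%m/%y %H:%M"
# Layout from the answer: year-month-day hour:min:sec+timezone
CQLSH_FORMAT = "%Y-%m-%d %H:%M:%S%z"


def to_cqlsh_timestamp(value: str) -> str:
    """Convert one CSV value into the default cqlsh timestamp layout."""
    parsed = datetime.strptime(value, SOURCE_FORMAT)
    # Attach an explicit UTC offset so %z renders as +0000.
    return parsed.replace(tzinfo=timezone.utc).strftime(CQLSH_FORMAT)


print(to_cqlsh_timestamp("13/01/09 23:13"))
```

If the values are actually year/month/day, only `SOURCE_FORMAT` would need to change (to `"%y/%m/%d %H:%M"`).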

What would be the best way to match up two object instances between two different applications in a J2EE server?


Question: I have a J2EE application where I basically want two objects, created by two separate servlets, to communicate directly, and I need these instances to be stable, i.e. to "know" each other during the session. The sequence is roughly: the client sends a request to servlet #1, which creates object A. The client sends a second request (after the first returns) to servlet #2, which creates object B. Object B finds A using JNDI, and the two objects interact. The client now continues to send requests to object

How can I create a system for distributed calculations?


Question: I am a student at the Faculty of Cybernetics and I want to write a project in Java. I want to create a system for distributed computing. It will contain the following components: 1. The user's main program (different for each concrete situation). 2. The user's task program, which can only solve one small task (also different for each case). 3. My program, which will interact with the user's main program to learn which tasks need to be solved. 4. My program, which will interact with the user's task program to tell

Distributed JMS based logging .. falling flat?


Question: In our fancy ESB, each request is logged via a common infrastructure based on JMS. Here is what happens in a nutshell: the service gets a request; the service prepares some data in a LogData object; the service calls the database; the time taken for the DB interaction is captured in the LogData object; the service is ready to send the response; the LogData object is sent to a messaging destination; the service sends the response. Very rosy! Yes, for paper architects. Here is the actual issue: The JMS service provider

How to find why a task fails in dask distributed?


Question: I am developing a distributed computing system using dask.distributed. Tasks that I submit to it with the Executor.map function sometimes fail, while others, seemingly identical, run successfully. Does the framework provide any means to diagnose problems? Update: By failing I mean that the counter of failed tasks increases in the Bokeh web UI provided by the scheduler. The counter of finished tasks increases too. The function that is run by Executor.map returns None. It communicates to a database,

Distributed TensorFlow [Async, Between-Graph Replication]: what exactly is the interaction between workers and servers regarding Variable updates


Question: I've read the Distributed TensorFlow doc and this question on Stack Overflow, but I still have some doubts about the dynamics behind the distributed training that can be done with TensorFlow and its parameter server architecture. This is a snippet of code from the Distributed TensorFlow doc: if FLAGS.job_name == "ps": server.join() elif FLAGS.job_name == "worker": # Assigns ops to the local worker by default. with tf.device(tf.train.replica_device_setter( worker_device="/job:worker/task:%d" % FLAGS

NoSql With My Own Custom Binary Files?


Question: Originally, I had to deal with just 1.5 TB of data. Since I just needed fast writes and reads (without any SQL), I designed my own flat binary file format (implemented in Python) and easily (and happily) saved my data and manipulated it on one machine. Of course, for backup purposes, I added 2 machines to be used as exact mirrors (using rsync). Presently, my needs are growing, and there's a need to build a solution that would successfully scale up to 20 TB (and even more) of data. I would

Spark 1.0.2 (also 1.1.0) hangs on a partition


Question: I've got a weird problem in Apache Spark and I would appreciate some help. After reading data from HDFS (and doing some conversion from JSON to objects), the next stage (processing said objects) fails after 2 partitions have been processed (out of 512 in total). This happens on large-ish datasets (the smallest I have noticed is about 700 MB, but it could be lower; I haven't narrowed it down yet). EDIT: 700 MB is the tgz file size; uncompressed it's 6 GB. EDIT 2: The same thing happens on