Where do I start with distributed computing?

星月不相逢 2021-01-31 06:18

I'm interested in learning techniques for distributed computing. As a Java developer, I'm probably willing to start with Hadoop. Could you please recommend some books/tutorial

7 answers
  • 2021-01-31 06:40

    Hadoop is not necessarily the best tool for all distributed computing problems. Despite its power, it also has a pretty steep learning curve and cost of ownership. You might want to clarify your requirements and look for suitable alternatives in the Java world, such as HTCondor, JPPF or GridGain (my apologies to those I do not mention).

  • 2021-01-31 06:40

    If you are looking to learn a distributed computing platform that is less complicated than Hadoop, you can try Zillabyte. You only need to know some Ruby or Python to build apps on the platform.

    As LoLo said, Hadoop is a powerful solution, but it can be rough to start with.

    For material on distributed computing, try http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-824-distributed-computer-systems-engineering-spring-2006/syllabus/. The course also recommends several further resources.

  • 2021-01-31 06:42

    The All Things Hadoop Podcast (http://allthingshadoop.com/podcast) has some good content and good guests. A lot of it is geared toward getting started with distributed computing.

  • 2021-01-31 06:43

    MIT 6.824 is the best material. Reading only the Google papers related to Hadoop is not enough; you need to work through a systematic course if you want to go deeper.

  • 2021-01-31 06:53

    You could start by reading some papers on MapReduce and distributed computing to gain a better understanding of the field. Here are some I would recommend:

    • MapReduce: Simplified Data Processing on Large Clusters, http://www.usenix.org/events/osdi04/tech/full_papers/dean/dean_html/

    • Bigtable: A Distributed Storage System for Structured Data, http://www.usenix.org/events/osdi06/tech/chang/chang_html/

    • Dryad: Distributed data-parallel programs from sequential building blocks, http://pdos.csail.mit.edu/6.824-2007/papers/isard-dryad.pdf

    • The Landscape of Parallel Computing Research: A View from Berkeley, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.67.8705&rep=rep1&type=pdf

    On the other hand, if you want to understand Hadoop itself better, you could start by reading the source code of the Hadoop MapReduce framework.

  • 2021-01-31 06:54

    Here are some resources from the Yahoo! Developer Network:

    a tutorial:

    http://developer.yahoo.com/hadoop/tutorial/

    an introductory course (requires Silverlight, sigh):

    http://yahoo.hosted.panopto.com/CourseCast/Viewer/Default.aspx?id=281cbf37-eed1-4715-b158-0474520014e6
