Question:
I am new to Hadoop and want to know what the differences are between Hadoop-common, Hadoop-core and Hadoop-client.
By the way, for a given class, how do I know which Maven artifact contains it? For example, which one contains org.apache.hadoop.io.Text?
Answer 1:
To provide some additional detail on the differences between Hadoop-common, Hadoop-core and Hadoop-client, from a high-level perspective:
- Hadoop-common refers to the commonly used utilities and libraries that support the Hadoop modules.
- Hadoop-core is the same as Hadoop-common; it was renamed to Hadoop-common in July 2009, per https://hadoop.apache.org/.
- Hadoop-client refers to the client libraries used to communicate with Hadoop's common components (HDFS, MapReduce, YARN); it also brings in supporting libraries such as logging and codecs.
Generally speaking, developers who build apps that submit to YARN, run a MapReduce job, or access files in HDFS use the Hadoop-client libraries.
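For the second part of the question (which artifact contains a given class), one way to check is to ask the JVM where it loaded the class from. The following is a minimal sketch, not an official Hadoop or Maven tool: the `ArtifactLocator` class and its `locate` method are hypothetical names introduced here, and the sketch assumes the candidate artifacts are already on the runtime classpath. The file name of the printed jar then indicates the artifact.

```java
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical helper (not part of Hadoop): reports where a class was loaded from.
class ArtifactLocator {

    // Returns the location (usually a jar URL) of the given class,
    // or Optional.empty() if the class is not on the classpath.
    static Optional<String> locate(String className) {
        try {
            // Load without running static initializers; we only need the metadata.
            Class<?> cls = Class.forName(
                    className, false, ArtifactLocator.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null) {
                // Classes from the JDK itself typically have no code source.
                return Optional.of("(JDK runtime, no code source)");
            }
            return Optional.of(String.valueOf(src.getLocation()));
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.io.Text";
        System.out.println(name + " -> "
                + locate(name).orElse("not found on the classpath"));
    }
}
```

Run with the Hadoop jars on the classpath, this should print the path of the jar that provides `org.apache.hadoop.io.Text`; without them it reports that the class was not found. In Hadoop 2.x and later that class ships in hadoop-common.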
Answer 2:
To build a Hadoop MapReduce application, you need only the hadoop-client dependency (use the new API). Dependencies such as hadoop-hdfs, hadoop-common, hadoop-mapreduce-client-app and hadoop-yarn-api are resolved transitively from it.
Answer 3:
From Techopedia:
Hadoop Common refers to the collection of common utilities and libraries that support other Hadoop modules. It is an essential part or module of the Apache Hadoop Framework, along with the Hadoop Distributed File System (HDFS), Hadoop YARN and Hadoop MapReduce.
Like all other modules, Hadoop Common assumes that hardware failures are common and that these should be automatically handled in software by the Hadoop Framework.
Hadoop Common is also known as Hadoop Core.
The Hadoop Client libraries help to load data into the cluster, submit MapReduce jobs describing how that data should be processed, and then retrieve or view the results of the job when it's finished. Have a look at this article
This Apache link provides the list of dependencies of the Hadoop Client library.