Pig vs Hive vs Native Map Reduce


无人及你 2020-12-14 01:55

I have a basic understanding of what the Pig and Hive abstractions are, but I don't have a clear idea of the scenarios that call for Hive, Pig, or native MapReduce.

I went thr

7 answers

时光说笑 (original poster)

2020-12-14 02:35
Scenarios where Hadoop MapReduce is preferred to Hive or Pig:
1. When you need explicit control over the driver program
2. When the job requires a custom Partitioner
3. When a library of Java Mappers or Reducers already exists for the job
4. When you need good testability while combining many large data sets
5. When the application has legacy code requirements that dictate the physical structure
6. When the job needs optimization at a particular stage of processing, using tricks such as in-mapper combining
7. When the job makes tricky use of the distributed cache (replicated join), cross products, groupings, or joins
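To make the in-mapper combining trick from the list above concrete, here is a minimal sketch in plain Java. It deliberately uses no Hadoop classes, so the class and method names (InMapperCombiningSketch, naiveMap, combiningMap) are hypothetical and exist only for illustration. In a real Hadoop Mapper, the map would typically be a field that is filled inside map() and flushed in cleanup(); that wiring is assumed here, not shown.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of in-mapper combining for word count.
// No Hadoop classes are used; all names here are illustrative only.
class InMapperCombiningSketch {

    // Naive mapper: emits one (word, 1) pair per token,
    // so every occurrence of a word is sent to the shuffle.
    static List<Map.Entry<String, Integer>> naiveMap(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    emitted.add(new AbstractMap.SimpleEntry<>(word, 1));
                }
            }
        }
        return emitted;
    }

    // In-mapper combining: partial counts are accumulated in memory
    // while the input split is processed, and only one (word, count)
    // pair per distinct word is emitted at the end.
    static Map<String, Integer> combiningMap(List<String> lines) {
        Map<String, Integer> partialCounts = new LinkedHashMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    partialCounts.merge(word, 1, Integer::sum);
                }
            }
        }
        return partialCounts;
    }

    public static void main(String[] args) {
        // A tiny stand-in for one input split.
        List<String> split = Arrays.asList("pig hive pig", "hive pig mapreduce");
        System.out.println("naive pairs emitted:    " + naiveMap(split).size());
        System.out.println("combined pairs emitted: " + combiningMap(split).size());
        System.out.println(combiningMap(split));
    }
}
```

The gain is fewer intermediate pairs to serialize and shuffle; the cost is that the partial counts must fit in the mapper's memory. This kind of stage-level control is hard to express directly in Pig or Hive.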
Pros and cons of Pig/Hive:
1. Hadoop MapReduce requires more development effort than Pig or Hive.
2. On the downside, Pig and Hive jobs are slower than a fully tuned Hadoop MapReduce program.
3. When executing jobs with Pig or Hive, Hadoop developers need not worry about version mismatches.
4. There is very little room for the developer to introduce Java-level bugs when coding in Pig or Hive.
Have a look at this post for a Pig vs Hive comparison.