I have a managed Hive table that contains only one 150 MB file. When I run "select count(*) from tbl" on it, it uses 2 mappers. I want to increase the number of mappers.
Try adding the following:
set hive.merge.mapfiles=false;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
I combined @javadba's answer with the one I received from the Hive mailing list. Here's the solution:
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set mapred.map.tasks = 20;
select count(*) from dw_stage.st_dw_marketing_touch_pi_metrics_basic;
From the mailing list:
It seems that Hive is using the old Hadoop MapReduce API, so mapred.max.split.size won't work.
I will dig into the source code later.
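To see why mapred.map.tasks can change the mapper count here, the following is a rough Python sketch of the split arithmetic used by the old-API FileInputFormat (org.apache.hadoop.mapred), as I understand it: the requested map count becomes a "goal" split size, which is then capped by the block size and floored by the minimum split size. The function names, the 128 MB block size, and the assumption of a single splittable, uncompressed file are mine for illustration; other input formats (ORC, or the combining input format) compute splits differently, and the setting is only a hint.

```python
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size


def compute_split_size(goal_size, min_size, block_size):
    # Old-API rule: never below min_size, never above the block size.
    return max(min_size, min(goal_size, block_size))


def count_splits(file_size, requested_maps, block_size=128 * MB, min_size=1):
    """Estimate the number of splits for one splittable file.

    requested_maps stands in for mapred.map.tasks; block_size and
    min_size are illustrative assumptions, not values from the question.
    """
    goal_size = file_size // max(requested_maps, 1)
    split_size = compute_split_size(goal_size, min_size, block_size)

    splits = 0
    remaining = file_size
    # Carve off full-size splits while more than 1.1 splits' worth remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left becomes the final split.
    if remaining > 0:
        splits += 1
    return splits


if __name__ == "__main__":
    for maps in (1, 2, 20):
        print(maps, count_splits(150 * MB, maps))
```

Under these assumptions, a hint of 20 yields a goal size of 7.5 MB for the 150 MB file, which is below the block size, so the file is cut into 20 splits. Raising the hint lowers the goal size, which is how the setting increases the number of mappers without touching mapred.max.split.size.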