I want to share a large in-memory static data set (a RAM Lucene index) among my map tasks in Hadoop. Is there a way for several map/reduce tasks to share the same JVM?
In $HADOOP_HOME/conf/mapred-site.xml, add the following property:
mapred.job.reuse.jvm.num.tasks #
Replace # with the number of tasks each JVM may run before it is discarded (the default is 1, meaning no reuse), or with -1 to place no limit on reuse.
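As a rough sketch, assuming the classic (MRv1) configuration layout, the entry in mapred-site.xml might look like the following. The value -1 is only an illustrative choice; any positive task count would work the same way.

```xml
<!-- Goes inside the <configuration> element of mapred-site.xml -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <!-- -1 = no limit on reuse; 1 (the default) = a fresh JVM per task -->
  <value>-1</value>
</property>
```

As far as I know, a reused JVM runs tasks of the same job one after another rather than concurrently, so a static index loaded by one task should still be available to the next task that lands in that JVM, but it is not shared across JVMs or across jobs.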