How to avoid Spark executor from getting lost and yarn container killing it due to memory limit?
I have the following code which fires hiveContext.sql() most of the time. My task is I want to create few tables and insert values into after processing for all hive table partition. So I first fire show partitions and using its output in a for-loop, I call a few methods which creates the table (if it doesn't exist) and inserts into them using hiveContext.sql . Now, we can't execute hiveContext in an executor, so I have to execute this in a for-loop in a driver program, and should run serially one by one. When I submit this Spark job in YARN cluster, almost all the time my executor gets lost