Beginner ES Question here
What is the workflow, or what are the steps, for pushing a Spark DataFrame to Elasticsearch?
From research, I believe I need to use
This worked for me. I had my data in df.
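# Elasticsearch treats _id as a metadata field, so drop any column with that name before writing.
df = df.drop('_id')
# conf is a dict holding the index, doc_type, host, and port.
(
    df.write.format("org.elasticsearch.spark.sql")
    .option("es.resource", '%s/%s' % (conf['index'], conf['doc_type']))
    .option("es.nodes", conf['host'])
    .option("es.port", conf['port'])
    .save()
)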
I submitted the job with this command:
/path/to/spark-submit --master spark://master:7077 --jars ./jar_files/elasticsearch-hadoop-5.6.4.jar --driver-class-path ./jar_files/elasticsearch-hadoop-5.6.4.jar main_df.py
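If you do this in more than one place, the write above could be wrapped in a small helper. This is only a sketch: es_write_options and write_to_es are hypothetical names of my own, conf is assumed to be a plain dict with the same four keys as above, and the mode argument is an addition that defaults to Spark's usual "errorifexists" so the behaviour should match the original call.

```python
def es_write_options(conf):
    """Build the connector options from a plain dict (hypothetical helper)."""
    return {
        "es.resource": "%s/%s" % (conf["index"], conf["doc_type"]),
        "es.nodes": str(conf["host"]),
        # Stringify the port so an int in conf is handled the same as a str.
        "es.port": str(conf["port"]),
    }


def write_to_es(df, conf, mode="errorifexists"):
    """Write a Spark DataFrame to Elasticsearch (hypothetical helper).

    Assumes the elasticsearch-hadoop jar is on the classpath, as in the
    spark-submit command above.
    """
    (
        df.drop("_id")  # no-op if the DataFrame has no _id column
        .write.format("org.elasticsearch.spark.sql")
        .options(**es_write_options(conf))
        .mode(mode)
        .save()
    )
```

Usage would then be write_to_es(df, conf). Passing mode="append" should be what you want when the index already holds documents, but check that against the connector version you are running.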