How to Push a Spark DataFrame to Elasticsearch (PySpark)

温柔的废话 2021-02-06 09:13

Beginner Elasticsearch question here.

What are the steps for pushing a Spark DataFrame to Elasticsearch?

From research, I believe I need to use

2 answers
  •  不知归路
    2021-02-06 09:54

    This worked for me. My data was in a DataFrame named df.

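    # Drop the _id column before writing, presumably because Elasticsearch
    # reserves _id as a metadata field.
    df = df.drop('_id')
    # conf is assumed to be a dict-like object holding the connection settings.
    df.write.format(
        "org.elasticsearch.spark.sql"  # data source from elasticsearch-hadoop
    ).option(
        "es.resource", '%s/%s' % (conf['index'], conf['doc_type'])  # index/type
    ).option(
        "es.nodes", conf['host']
    ).option(
        "es.port", conf['port']
    ).save()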
    

    I submitted the job with this command:

    /path/to/spark-submit --master spark://master:7077 --jars ./jar_files/elasticsearch-hadoop-5.6.4.jar --driver-class-path ./jar_files/elasticsearch-hadoop-5.6.4.jar main_df.py
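
As a minimal sketch, the write in the answer above could be wrapped in a small helper so that the connector options are built in one place. The function names es_write_options and write_to_es and the sample values in conf are hypothetical, not part of the original answer. The sketch assumes the elasticsearch-hadoop jar is already on the classpath, as in the spark-submit command, and that df is an existing Spark DataFrame.

```python
def es_write_options(conf):
    """Build the elasticsearch-hadoop options used in the answer from a dict.

    The keys of `conf` (index, doc_type, host, port) mirror the answer.
    """
    return {
        "es.resource": "%s/%s" % (conf["index"], conf["doc_type"]),
        "es.nodes": conf["host"],
        "es.port": str(conf["port"]),  # passed as a string for consistency
    }


def write_to_es(df, conf):
    """Write a Spark DataFrame to Elasticsearch using the options above."""
    # Drop _id only if the column is present, as the answer does before writing.
    if "_id" in df.columns:
        df = df.drop("_id")
    (
        df.write
        .format("org.elasticsearch.spark.sql")
        .options(**es_write_options(conf))
        .save()
    )


# Hypothetical settings; replace with the values for your cluster.
conf = {
    "index": "my_index",
    "doc_type": "my_type",
    "host": "localhost",
    "port": 9200,
}
```

With a DataFrame in hand, the call would be write_to_es(df, conf). Keeping the option dictionary in its own function makes it possible to inspect the settings without a running cluster.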
