Question:
Our Elasticsearch cluster holds a large amount of data, and we use Spark to compute over it via elasticsearch-hadoop,
following https://www.elastic.co/guide/en/elasticsearch/hadoop/current/spark.html
Currently we have to read all columns of an index. Is there a way to read only the columns we need?
Answer 1:
Yes, you can set the config parameter "es.read.field.include" or "es.read.field.exclude" to include or exclude fields. Full details are in the elasticsearch-hadoop configuration documentation. The example below assumes Spark 2 or higher.
import org.apache.spark.sql.SparkSession

// Build a session that only reads the "foo" and "bar" fields from Elasticsearch
val sparkSession: SparkSession = SparkSession
  .builder()
  .appName("jobName")
  .config("es.nodes", "elastichostc1n1.example.com")
  .config("es.read.field.include", "foo,bar")
  .getOrCreate()
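With the session configured, you can then load the index as a DataFrame and only the included fields will be read. A minimal sketch: the index name "myIndex" is a placeholder, and the option can also be passed per-read instead of on the session:

// "myIndex" is a hypothetical index name; replace with your own.
// es.* settings can also be supplied per-read via .option(...)
val df = sparkSession.read
  .format("org.elasticsearch.spark.sql")
  .option("es.read.field.include", "foo,bar")
  .load("myIndex")

df.printSchema()  // should list only the foo and bar fields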
Source: https://stackoverflow.com/questions/43772732/how-to-read-a-few-columns-of-elasticsearch-by-spark