Load CSV data into a DataFrame and convert it to an array using Apache Spark (Java)

误落风尘 2021-01-21 17:20

I have a CSV file with the following data:

1,2,5  
2,4  
2,3 

I want to load it into a DataFrame whose schema is an array of strings.

The outpu

2 answers
  •  感情败类
    2021-01-21 17:31

    You can use the VectorAssembler class to combine columns into a single vector of features, which is particularly useful with pipelines:

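    import org.apache.spark.ml.feature.VectorAssembler

    // Combine the listed input columns into a single "features" vector column.
    val assembler = new VectorAssembler()
      .setInputCols(Array("city", "status", "vendor"))
      .setOutputCol("features")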
    

    https://spark.apache.org/docs/2.2.0/ml-features.html#vectorassembler
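
Note that VectorAssembler works on columns that already exist and are numeric, boolean, or vector typed, so it may not cover the ragged rows in the question by itself. What the question seems to need is a per-line split into a variable-length array of strings. The following is a minimal plain-Java sketch of that transformation, without Spark; the class name `RaggedCsvLines` and its methods are hypothetical and only illustrate the idea, and trimming the trailing spaces is an assumption about the desired result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the name and methods are illustrative, not part of Spark.
class RaggedCsvLines {

    // Split one raw line on commas and trim whitespace around each field,
    // so "1,2,5  " becomes ["1", "2", "5"].
    static String[] splitLine(String line) {
        String[] fields = line.trim().split(",");
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return fields;
    }

    // Convert every non-blank line into its own variable-length array.
    static List<String[]> parseLines(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                rows.add(splitLine(line));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // The three sample rows from the question, including trailing spaces.
        List<String> lines = Arrays.asList("1,2,5  ", "2,4  ", "2,3 ");
        for (String[] row : parseLines(lines)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In Spark itself, the same per-line split can probably be expressed by reading the file as text with `spark.read().text(path)`, which produces a single string column named `value`, and then selecting `functions.split(col("value"), ",")`, which should yield a column of type `array<string>`.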
