Why does reading a CSV file with empty values lead to IndexOutOfBoundException?

無奈伤痛 2021-01-19 19:10

I have a CSV file with the following structure:

Name | Val1 | Val2 | Val3 | Val4 | Val5
John     1      2
Joe      1      2
David    1      2            10    11
         


        
4 Answers
  •  不知归路
    2021-01-19 19:51

    This is not an answer to your question, but it may help solve your problem.

    From the question I see that you are trying to create a dataframe from a CSV.

    Creating a DataFrame from a CSV file is easy with the spark-csv package.

    With spark-csv, the following Scala code can be used to read a CSV:

        val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load(csvFilePath)

    For your sample data I got the following result:

    +-----+----+----+----+----+----+
    | Name|Val1|Val2|Val3|Val4|Val5|
    +-----+----+----+----+----+----+
    | John|   1|   2|    |    |    |
    |  Joe|   1|   2|    |    |    |
    |David|   1|   2|    |  10|  11|
    +-----+----+----+----+----+----+
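The empty trailing cells in this output are a likely trouble spot when lines are parsed by hand. The question's parsing code is not shown, so the following is only a guess at the cause: on the JVM, `String.split(",")` drops trailing empty strings, so a row that ends in empty values yields fewer fields than the header, and indexing a later column then throws an index-out-of-bounds exception. The sketch below illustrates this with plain Scala; the comma-separated rows and the names `SplitDemo`, `naiveFields`, and `allFields` are hypothetical and only modelled on the sample data.

```scala
object SplitDemo {
  // Hypothetical comma-separated rows modelled on the question's sample data.
  val shortRow = "John,1,2,,,"
  val fullRow  = "David,1,2,,10,11"

  // Default split (limit 0) discards trailing empty strings.
  def naiveFields(line: String): Array[String] = line.split(",")

  // A negative limit keeps every field, including trailing empty ones.
  def allFields(line: String): Array[String] = line.split(",", -1)

  def main(args: Array[String]): Unit = {
    println(naiveFields(shortRow).length) // 3: indexing field 3, 4 or 5 would throw
    println(allFields(shortRow).length)   // 6
    println(naiveFields(fullRow).length)  // 6: last field is non-empty, nothing dropped
  }
}
```

If manual splitting is in fact what the original code does, passing `-1` as the limit keeps the field count equal to the header's. A CSV reader such as spark-csv avoids the issue altogether.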
    

    You can also have the column types inferred with the inferSchema option in the latest version. See this answer.
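As a rough sketch of schema inference, the read call can carry an extra `inferSchema` option. This assumes a Spark 1.x setup with a spark-csv version that supports that option on the classpath; the object name `ReadCsvWithSchema`, the app name, the local master setting, and taking the path from the first command-line argument are all choices made for this illustration only.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object ReadCsvWithSchema {
  def main(args: Array[String]): Unit = {
    // Path to the CSV file, passed as the first argument (illustrative choice).
    val csvFilePath = args(0)

    // Local Spark 1.x context; adjust the master for a real cluster.
    val conf = new SparkConf().setAppName("read-csv").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    val df = sqlContext.read
      .format("com.databricks.spark.csv")
      .option("header", "true")      // first line holds the column names
      .option("inferSchema", "true") // let spark-csv guess column types
      .load(csvFilePath)

    df.printSchema() // shows the inferred type of each column
    df.show()

    sc.stop()
  }
}
```

Without `inferSchema`, every column is read as a string, which matches the blank (rather than null) cells in the output above.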
