How to use Delta Lake with spark-shell?

2021-01-15 20:37

I'm trying to write a Spark DataFrame as a DeltaTable. It works fine in my IDE (IntelliJ), but with the same dependencies and versions it's not working in spark-shell.

3 Answers
  • 2021-01-15 20:49
    # Launch spark-shell with the Delta Lake package on the classpath
    bin/spark-shell --packages io.delta:delta-core_2.11:0.6.1

    // Then, inside the shell:
    import io.delta.tables._
    import org.apache.spark.sql.functions._

    // Load an existing Delta table by its path
    val deltaTable = DeltaTable.forPath("/tmp/delta-table")
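
    Since the question was about writing a DataFrame as a Delta table in the first place, here is a minimal sketch based on the Delta Lake quickstart (the path /tmp/delta-table is just an example):

    // Write a small DataFrame out in Delta format, then read it back
    val data = spark.range(0, 5)
    data.write.format("delta").save("/tmp/delta-table")

    val df = spark.read.format("delta").load("/tmp/delta-table")
    df.show()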
    
  • 2021-01-15 20:55

    As per the official documentation:

    Delta Lake requires Apache Spark version 2.4.2 or above

    Please upgrade your Spark version to at least 2.4.2 in IntelliJ IDEA, or these issues will keep showing up. The latest version as of this writing is 2.4.4.
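
    In an sbt-based IntelliJ project that typically means bumping the Spark dependency; a sketch (artifact names assume Scala 2.11, and the versions are illustrative):

    // build.sbt -- Spark 2.4.2+ as required by Delta Lake
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-sql"  % "2.4.4" % Provided,
      "io.delta"         %% "delta-core" % "0.5.0"
    )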

    As per the official documentation:

    Run spark-shell with the Delta Lake package:

    bin/spark-shell --packages io.delta:delta-core_2.11:0.5.0

    From my own experience: also pass --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension to enable Delta Lake's SQL commands, e.g. DESCRIBE DETAIL and GENERATE, as shown below.

    The entire command to run spark-shell with Delta Lake 0.5.0 should be as follows:

    ./bin/spark-shell \
      --packages io.delta:delta-core_2.11:0.5.0 \
      --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension
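
    With the extension enabled, Delta Lake's SQL commands work directly in the shell; a quick sketch, assuming a Delta table already exists at the example path /tmp/delta-table:

    // Show table metadata (format, location, size, etc.)
    spark.sql("DESCRIBE DETAIL '/tmp/delta-table'").show()

    // Generate a symlink manifest (e.g. for Presto/Athena integration)
    spark.sql("GENERATE symlink_format_manifest FOR TABLE delta.`/tmp/delta-table`")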
    
  • 2021-01-15 21:13

    Spark itself has a dependency on Jackson, and the version you're instructing spark-shell to use is incompatible. Per https://github.com/apache/spark/blob/v2.4.0/pom.xml, 2.4.0 uses Jackson 2.6.7. Is there a particular reason that you need Jackson 2.10 in this case?
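
    If your project genuinely needs Jackson classes at runtime, the safer route is to pin the transitive Jackson artifacts to the versions your Spark build ships with rather than forcing a newer Jackson onto spark-shell; a sketch for sbt (versions match Spark 2.4.x's pom and are illustrative):

    // build.sbt -- force all Jackson artifacts to Spark 2.4.x's versions
    dependencyOverrides ++= Seq(
      "com.fasterxml.jackson.core"   %  "jackson-core"         % "2.6.7",
      "com.fasterxml.jackson.core"   %  "jackson-databind"     % "2.6.7.1",
      "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.6.7.1"
    )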
