Question
I am trying to use Delta Lake in a Zeppelin notebook with PySpark, but the module cannot be imported. For example:
%pyspark
from delta.tables import *
It fails with the following error:
ModuleNotFoundError: No module named 'delta'
However, saving and reading a DataFrame in the delta format works without any problem, and the module loads successfully when using Scala Spark (%spark).
Is there any way to use Delta Lake in Zeppelin with PySpark?
Answer 1:
I finally managed to load it in Zeppelin with PySpark. The jar file has to be added explicitly:
%pyspark
sc.addPyFile("**LOCATION_OF_DELTA_LAKE_JAR_FILE**")
from delta.tables import *
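This presumably works because the Delta Lake jar ships the Python package (delta/tables.py) next to the compiled classes, so adding the jar with sc.addPyFile puts it on the Python path. A minimal sketch for checking that before adding the jar; the helper name jar_has_python_module is hypothetical, and the jar layout is an assumption that may differ between Delta Lake versions:

```python
import zipfile


def jar_has_python_module(jar_path, module_file="delta/tables.py"):
    """Return True if the archive at jar_path contains module_file.

    A jar is a zip archive, so the standard zipfile module can list it.
    The default module_file is an assumption about the jar's layout.
    """
    with zipfile.ZipFile(jar_path) as jar:
        return module_file in jar.namelist()
```

In a %pyspark paragraph it could be used as a guard, for example: if jar_has_python_module(jar_path): sc.addPyFile(jar_path), where jar_path is the same jar location as in the answer above.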
Source: https://stackoverflow.com/questions/59170595/how-to-import-delta-lake-module-in-zeppelin-notebook-and-pyspark