NameError: name 'dbutils' is not defined in pyspark

梦如初夏 2021-01-18 09:17

I am running a PySpark job in Databricks cloud. As part of this job I need to write some of the CSV files to the Databricks filesystem (DBFS), and I also need to use some of the

3 Answers
  • 2021-01-18 10:00

    To access the DBUtils module in a way that works both locally and on Azure Databricks clusters, in Python, use the following get_dbutils():

    def get_dbutils(spark):
      try:
        from pyspark.dbutils import DBUtils
        dbutils = DBUtils(spark)
      except ImportError:
        import IPython
        dbutils = IPython.get_ipython().user_ns["dbutils"]
      return dbutils
    

    See: https://docs.microsoft.com/en-us/azure/databricks/dev-tools/databricks-connect
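    The try/except ImportError fallback inside get_dbutils() is a general pattern: attempt the preferred import path first, then fall back. A minimal, Databricks-independent sketch of that pattern (the helper name first_available is hypothetical, not part of any Databricks API):

    ```python
    import importlib


    def first_available(module_names):
        """Return the first module in module_names that imports cleanly,
        mirroring get_dbutils()'s try-one-path-then-fall-back structure."""
        for name in module_names:
            try:
                return importlib.import_module(name)
            except ImportError:
                continue  # move on to the next candidate, as get_dbutils falls back to IPython
        raise ImportError(f"none of {module_names} could be imported")


    # On a machine without pyspark installed, this falls through to the stdlib json module:
    mod = first_available(["pyspark.dbutils", "json"])
    ```

    The same structure is why get_dbutils() works in both environments: under Databricks Connect the `pyspark.dbutils` import succeeds, while inside a notebook it fails and the already-injected `dbutils` global is fetched from the IPython user namespace instead.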

  • 2021-01-18 10:02

    Try this:

    def get_dbutils(spark):
      try:
        from pyspark.dbutils import DBUtils
        dbutils = DBUtils(spark)
      except ImportError:
        import IPython
        dbutils = IPython.get_ipython().user_ns["dbutils"]
      return dbutils

    dbutils = get_dbutils(spark)
    
  • 2021-01-18 10:19

    Yes! You could use this:

    pip install DBUtils
    import DBUtils

    Note, however, that the DBUtils package on PyPI is a database connection-pooling library and is unrelated to the `dbutils` object that Databricks injects into notebooks; installing it will not resolve this NameError.
    