Question
On Databricks I would like to install a Maven library through commands in a Python notebook if it is not already installed.
If it were a Python PyPI library I would do something like the following:
# Get a list of the libraries that are already available
library_name_list = dbutils.library.list()
# Suppose the library of interest is "scikit-learn"
if "scikit-learn" not in library_name_list:
    # Install the library
    dbutils.library.installPyPI("scikit-learn")
How can I do the same for the Maven library "com.microsoft.azure.kusto:spark-kusto-connector:2.0.0", i.e. check whether it is already installed and install it if it is not?
I can install the Maven library using the UI by going to "Clusters" -> "Libraries" -> "Install New" -> "Maven". But I would like to do it programmatically through a script.
Answer 1:
Note: Library utilities (dbutils.library.installPyPI("")) allow you to install Python libraries and create an environment scoped to a notebook session.
To install libraries from the Maven repository programmatically, you can use the Databricks CLI in Azure Databricks. The steps are:
Step 1: Look up the library in the Maven Repository, pick the version you need, and note its coordinates (groupId, artifactId, and version).
Step 2: Get the cluster ID using the Databricks CLI.
To get the cluster ID: databricks clusters list
Step 3: Use the following Databricks CLI command to install "com.microsoft.azure.kusto:spark-kusto-connector:2.0.0" on the cluster.
Syntax: databricks libraries install --cluster-id "Cluster ID" --maven-coordinates "GroupId:ArtifactId:Version" (e.g. org.jsoup:jsoup:1.7.2)
To install the Maven library using the Databricks CLI: databricks libraries install --cluster-id "1013-095611-mazes551" --maven-coordinates "com.microsoft.azure.kusto:spark-kusto-connector:2.0.0"
To check the installed libraries on the cluster: databricks libraries list --cluster-id "1013-095611-mazes551"
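The list and install commands above can be combined into a check-then-install routine driven from Python. The following is only a minimal sketch, not a definitive implementation. It assumes the Databricks CLI is on the PATH and already configured, and that `databricks libraries list --cluster-id ...` prints JSON containing a `library_statuses` array whose entries look like `{"library": {"maven": {"coordinates": "..."}}}` (the shape of the Libraries API cluster status response). The function names `maven_library_present`, `run_cli`, and `ensure_maven_library` are made up for illustration.

```python
import json
import subprocess


def maven_library_present(status, coordinates):
    """Return True if `coordinates` is listed among the cluster's Maven libraries.

    `status` is the parsed JSON printed by `databricks libraries list`.
    The `library_statuses` layout is an assumption; verify it against your CLI output.
    """
    for entry in status.get("library_statuses", []):
        maven = entry.get("library", {}).get("maven", {})
        if maven.get("coordinates") == coordinates:
            return True
    return False


def run_cli(args):
    """Run the Databricks CLI with `args` and return its stdout.

    Assumes the `databricks` executable is installed and configured.
    """
    result = subprocess.run(
        ["databricks", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def ensure_maven_library(cluster_id, coordinates, run=run_cli):
    """Request installation of `coordinates` unless it is already listed.

    Returns True if an install was requested, False if it was already present.
    """
    listing = run(["libraries", "list", "--cluster-id", cluster_id])
    status = json.loads(listing) if listing.strip() else {}
    if maven_library_present(status, coordinates):
        return False
    run(["libraries", "install", "--cluster-id", cluster_id,
         "--maven-coordinates", coordinates])
    return True
```

With the cluster from the steps above, the call would be ensure_maven_library("1013-095611-mazes551", "com.microsoft.azure.kusto:spark-kusto-connector:2.0.0"). The sketch treats any listed entry as present regardless of its reported status, so a library whose installation failed would not be retried. Inspect each entry's status field if that matters.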
For other ways to install packages in Azure Databricks, see: How to install a library on a databricks cluster using some command in the notebook?
Source: https://stackoverflow.com/questions/64763497/installing-maven-library-on-databricks-via-python-commands-and-dbutils