Question
[Previously in this post I asked how to provision a Databricks service without any workspace. Now I'm asking how to provision a service with a workspace, as the first scenario seems infeasible.]
As a cloud admin, I've been asked to write a script using the Azure Python SDK that will provision a Databricks service for one of our big data dev teams.
I can't find much online about Databricks within the Azure Python SDK other than https://azuresdkdocs.blob.core.windows.net/$web/python/azure-mgmt-databricks/0.1.0/azure.mgmt.databricks.operations.html
and
https://azuresdkdocs.blob.core.windows.net/$web/python/azure-mgmt-databricks/0.1.0/azure.mgmt.databricks.html
These appear to offer some help provisioning a workspace, but I am not quite there yet.
What am I missing?
EDITS:
Thanks to @Laurent Mazuel and @Jim Xu for their help.
Here's the code I'm running now, and the error I'm receiving:
client = DatabricksClient(credentials, subscription_id)
workspace_obj = client.workspaces.get("example_rg_name", "example_databricks_workspace_name")
WorkspacesOperations.create_or_update(
workspace_obj,
"example_rg_name",
"example_databricks_workspace_name",
custom_headers=None,
raw=False,
polling=True
)
error:
TypeError: create_or_update() missing 1 required positional argument: 'workspace_name'
I'm a bit puzzled by that error as I've provided the workspace name as the third parameter, and according to this documentation, that's just what this method requires.
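One likely reading of this TypeError is that `create_or_update` was called on the `WorkspacesOperations` class rather than on an instance such as `client.workspaces`. In that case the first argument is consumed as `self`, every other argument shifts one position to the left, and the last parameter is left unfilled. The sketch below reproduces the effect with a stand-in class; the class body is hypothetical and only the parameter order (`parameters`, `resource_group_name`, `workspace_name`) is taken from the calls shown in this thread.

```python
class WorkspacesOperations:
    """Stand-in for the SDK class; not the real implementation."""

    def create_or_update(self, parameters, resource_group_name, workspace_name):
        # Echo the arguments back so the binding is visible.
        return (parameters, resource_group_name, workspace_name)


params = {"location": "westus"}  # placeholder request body

# Called on the class: `params` is bound to `self`, so 'workspace_name' is never supplied.
try:
    WorkspacesOperations.create_or_update(
        params, "example_rg_name", "example_databricks_workspace_name"
    )
    class_call_error = None
except TypeError as exc:
    class_call_error = str(exc)

# Called on an instance: `self` is supplied automatically and all three arguments line up.
ops = WorkspacesOperations()
result = ops.create_or_update(
    params, "example_rg_name", "example_databricks_workspace_name"
)

print(class_call_error)
print(result)
```

With the real SDK, the equivalent of the instance call is `client.workspaces.create_or_update(...)`.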
I also tried the following code:
client = DatabricksClient(credentials, subscription_id)
workspace_obj = client.workspaces.get("example_rg_name", "example_databricks_workspace_name")
client.workspaces.create_or_update(
workspace_obj,
"example_rg_name",
"example_databricks_workspace_name"
)
Which results in:
Traceback (most recent call last):
File "./build_azure_visibility_core.py", line 112, in <module>
ca_databricks.create_or_update_databricks(SUB_PREFIX)
File "/home/gitlab-runner/builds/XrbbggWj/0/SA-Cloud/azure-visibility-core/expd_az_databricks.py", line 34, in create_or_update_databricks
self.databricks_workspace_name
File "/home/gitlab-runner/builds/XrbbggWj/0/SA-Cloud/azure-visibility-core/azure-visibility-core/lib64/python3.6/site-packages/azure/mgmt/databricks/operations/workspaces_operations.py", line 264, in create_or_update
**operation_config
File "/home/gitlab-runner/builds/XrbbggWj/0/SA-Cloud/azure-visibility-core/azure-visibility-core/lib64/python3.6/site-packages/azure/mgmt/databricks/operations/workspaces_operations.py", line 210, in _create_or_update_initial
body_content = self._serialize.body(parameters, 'Workspace')
File "/home/gitlab-runner/builds/XrbbggWj/0/SA-Cloud/azure-visibility-core/azure-visibility-core/lib64/python3.6/site-packages/msrest/serialization.py", line 589, in body
raise ValidationError("required", "body", True)
msrest.exceptions.ValidationError: Parameter 'body' can not be None.
ERROR: Job failed: exit status 1
So the exception is raised at line 589 of serialization.py, but I don't see what in my code is causing it. Thanks to all who have been generous enough to assist!
Answer 1:
You need to create a Databricks client; workspaces are attached to it:
client = DatabricksClient(credentials, subscription_id)
workspace = client.workspaces.get(resource_group_name, workspace_name)
I don't think creating a service without a workspace is even possible. If you try to create a Databricks service in the portal, you will see that a workspace name is required there as well.
So, using the SDK, I would look at the documentation for client.workspaces.create_or_update.
(I work at MS in the SDK team)
Answer 2:
With help from @Laurent Mazuel and support engineers at Microsoft, I have a solution:
from azure.mgmt.databricks import DatabricksClient

# Full ID of the managed resource group the workspace will use
managed_resource_group_ID = "/subscriptions/" + subscription_id + "/resourceGroups/" + managed_rg_name
client = DatabricksClient(credentials, subscription_id)
# No prior workspaces.get() call is needed; it would fail if the workspace does not exist yet
client.workspaces.create_or_update(
    {
        "managedResourceGroupId": managed_resource_group_ID,
        "sku": {"name": "premium"},
        "location": location
    },
    rg_name,
    databricks_workspace_name
).wait()  # block until provisioning finishes
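As a minimal sketch, the request body above could be assembled by a small helper so that the managed resource group ID and the dictionary keys are built in one place. The function names `managed_resource_group_id` and `build_workspace_parameters` are hypothetical and not part of the Azure SDK; the keys and the ID format are the ones used in the solution above.

```python
def managed_resource_group_id(subscription_id, managed_rg_name):
    """Build the ARM-style ID of the managed resource group (hypothetical helper)."""
    return f"/subscriptions/{subscription_id}/resourceGroups/{managed_rg_name}"


def build_workspace_parameters(subscription_id, managed_rg_name, location, sku_name="premium"):
    """Return the dict passed as the first argument to create_or_update (hypothetical helper)."""
    return {
        "managedResourceGroupId": managed_resource_group_id(subscription_id, managed_rg_name),
        "sku": {"name": sku_name},
        "location": location,
    }


# Placeholder values for illustration only.
parameters = build_workspace_parameters(
    "00000000-0000-0000-0000-000000000000", "example_managed_rg", "westus"
)
print(parameters)
```

The resulting `parameters` dictionary would then be passed as the first argument to `client.workspaces.create_or_update`, followed by the resource group name and the workspace name.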
Source: https://stackoverflow.com/questions/62902691/how-to-use-the-azure-python-sdk-to-provision-a-databricks-service