Azure Databricks: Accessing Blob Storage Behind Firewall

Posted by 生来就可爱ヽ(ⅴ<●) on 2020-01-24 12:53:48

Question


I am reading files on an Azure Blob Storage account (gen 2) from an Azure Databricks notebook. Both services are in the same region (West Europe). Everything works fine until I add a firewall in front of the storage account. I have opted to allow "trusted Microsoft services".

However, running the notebook now ends up with an access denied error:

com.microsoft.azure.storage.StorageException: This request is not authorized to perform this operation.

I tried accessing the storage directly from Spark and also mounting it with dbutils, but the result is the same in both cases.

I would have assumed that Azure Databricks counts as a trusted Microsoft service. Furthermore, I couldn't find solid information on the IP ranges for Databricks regions that could be added to the firewall rules.


Answer 1:


No, Azure Databricks does not count as a trusted Microsoft service. You can check the list of trusted Microsoft services that the storage account firewall supports in the Azure Storage documentation.

From a networking perspective, here are two suggestions:

  1. Find the published Azure datacenter IP address ranges and narrow them down to the region where your Azure Databricks workspace is located. Whitelist those ranges in the storage account firewall.

  2. Deploy Azure Databricks in your own Azure Virtual Network (Preview), then whitelist the VNet address range in the firewall of the storage account. You can refer to "Configure Azure Storage firewalls and virtual networks" for details. You can also use network security groups (NSGs) to restrict inbound and outbound traffic for this VNet. Note: this requires deploying Azure Databricks into your own VNet.
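For the first suggestion, the regional IP list can be long and may contain overlapping ranges, so it can help to merge it before entering it into the firewall. Below is a minimal sketch using Python's standard `ipaddress` module. The helper names `build_ip_rules` and `is_allowed` are hypothetical, and the ranges are documentation placeholders (RFC 5737), not real Azure addresses. As far as I know, Azure's documentation also states that IP network rules have no effect on requests coming from the same region as the storage account, so this option may not help when both services are in West Europe.

```python
import ipaddress

def build_ip_rules(cidrs):
    """Merge adjacent or overlapping IPv4 CIDR ranges into a minimal list."""
    networks = [ipaddress.ip_network(c, strict=False) for c in cidrs]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]

def is_allowed(address, rules):
    """Return True if the address falls inside any of the allowed ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(r) for r in rules)

# Placeholder ranges only (RFC 5737 documentation blocks), NOT Azure ranges.
example_ranges = ["203.0.113.0/25", "203.0.113.128/25", "198.51.100.0/24"]
rules = build_ip_rules(example_ranges)

print(rules)
print(is_allowed("203.0.113.200", rules))
```

The two `/25` halves collapse into a single `/24`, which keeps the number of firewall entries down. The real input would be the ranges for your region taken from the downloadable Azure IP list.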

Hope this helps.




Answer 2:


The described scenario only works if you deploy Azure Databricks in your own Azure Virtual Network (VNet). With that setup you can use service endpoints and add your Databricks VNet to the storage account's firewall rules. With the default deployment this is not supported and not possible. See the Azure Databricks documentation for more details and a description of how to get the VNet injection feature enabled.
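To make the service endpoint approach more concrete, here is a rough sketch of the `networkAcls` fragment that a storage account resource definition would carry once the Databricks subnets are allowed. It only builds the data structure and does not call Azure. The subscription ID, resource group, VNet, and subnet names are all placeholders, and the helper names are hypothetical. It assumes the workspace uses two subnets and that the `Microsoft.Storage` service endpoint is already enabled on them.

```python
import json

def subnet_id(subscription_id, resource_group, vnet, subnet):
    """Build the ARM resource ID of a subnet."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/virtualNetworks/{vnet}/subnets/{subnet}"
    )

def network_acls(subnet_ids, ip_rules=()):
    """Deny by default, then allow the listed subnets and IP ranges."""
    return {
        "defaultAction": "Deny",
        "bypass": "AzureServices",  # the "trusted Microsoft services" exception
        "virtualNetworkRules": [{"id": s, "action": "Allow"} for s in subnet_ids],
        "ipRules": [{"value": r, "action": "Allow"} for r in ip_rules],
    }

# All names below are placeholders for illustration.
sub = "00000000-0000-0000-0000-000000000000"
subnets = [
    subnet_id(sub, "my-rg", "databricks-vnet", "public-subnet"),
    subnet_id(sub, "my-rg", "databricks-vnet", "private-subnet"),
]
acls = network_acls(subnets)

print(json.dumps(acls, indent=2))
```

The resulting dictionary mirrors what you would set under the storage account's networking properties in an ARM template. The same rules can also be added through the portal under the storage account's firewall and virtual network settings.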

Enabling the mentioned exception does not work, because Azure Databricks is not in the list of trusted services for Blob Storage. See the Azure Storage documentation for the services that can still access the storage account with the exception enabled.



Source: https://stackoverflow.com/questions/54018584/azure-databricks-accessing-blob-storage-behind-firewall
