Is it possible to connect to databricks deltalake tables from adf


Question


I'm looking for a way to connect to Databricks Delta Lake tables from ADF and other Azure services (like Data Catalog). I don't see a Databricks data store listed among the ADF data sources.

On a similar question - Is it possible to read an Azure Databricks table from Azure Data Factory?

@simon_dmorias seems to have suggested using an ODBC connection to connect to Databricks tables.

I tried to set up the ODBC connection, but it requires an integration runtime (IR) to be set up. There are two options I see when creating the IR: Self-Hosted and Linked Self-Hosted. I tried to create the self-hosted IR, but it requires installation on my local desktop and seems more intended for an on-premises ODBC connection. I couldn't use the IR in my linked services.

I have been able to connect Power BI to Databricks Delta Lake tables and plan to use the same credentials here. Here is the reference link:

https://docs.azuredatabricks.net/user-guide/bi/power-bi.html

Any guidance would be helpful.


Answer 1:


You can, but it is quite complex. You need to use the ODBC connector in Azure Data Factory with a self-hosted integration runtime.

ADF can connect using ODBC (https://docs.microsoft.com/en-us/azure/data-factory/connector-odbc). It does require a self-hosted IR. Assuming you have the right drivers installed, you can configure the ODBC connection to a Databricks cluster.

The connection details for the ODBC settings can be found on the cluster settings screen in the Databricks workspace (https://docs.microsoft.com/en-us/azure/azure-databricks/connect-databricks-excel-python-r).

The process is very similar to what you posted for Power BI.
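As a sanity check before wiring those settings into the ADF ODBC linked service, you can test the same connection from the machine hosting the self-hosted IR. The sketch below is only illustrative: it assumes the Simba Spark ODBC driver is installed, and the hostname, HTTP path, token, and table name are placeholders you would replace with the values from your cluster's JDBC/ODBC tab.

```python
# Minimal sketch: test an ODBC connection to a Databricks cluster with pyodbc,
# using the same settings an ADF ODBC linked service would need.
# Host, HTTP path, token, and table name are placeholders (not from the original post).
import pyodbc

conn_str = (
    "Driver=Simba Spark ODBC Driver;"                    # name of the installed driver
    "Host=adb-<workspace-id>.azuredatabricks.net;"       # workspace hostname (placeholder)
    "Port=443;"
    "HTTPPath=sql/protocolv1/o/<org-id>/<cluster-id>;"   # from the cluster's JDBC/ODBC tab (placeholder)
    "SSL=1;"
    "ThriftTransport=2;"    # HTTP transport
    "AuthMech=3;"           # username/password auth; Databricks expects UID=token
    "UID=token;"
    "PWD=<personal-access-token>"                        # Databricks personal access token (placeholder)
)

with pyodbc.connect(conn_str, autocommit=True) as conn:
    cursor = conn.cursor()
    # Query a Delta table; the database and table names here are hypothetical.
    cursor.execute("SELECT * FROM default.my_delta_table LIMIT 10")
    for row in cursor.fetchall():
        print(row)
```

If this script returns rows, the same driver, host, HTTP path, and token values should work when entered into the ADF ODBC linked service that runs on that IR.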




Answer 2:


Please refer to the Azure Data Factory section of the official Azure Databricks documentation under User Guide > Developer Tools > Managing Dependencies in Data Pipelines. There you will see two Azure documents on creating a Databricks notebook with the Databricks Notebook Activity and running it to transform data in Azure Data Factory, listed below (a minimal notebook sketch follows the list). I think they will help you meet your needs.

  1. Run a Databricks notebook with the Databricks Notebook Activity in Azure Data Factory
  2. Transform data by running a Databricks notebook
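For context, the notebook that the Databricks Notebook Activity runs is an ordinary Databricks notebook. The cell below is a hypothetical sketch, not taken from those documents: the table name and output path are made up, and `spark` and `dbutils` are the objects Databricks provides inside a notebook.

```python
# Hypothetical Databricks notebook cell that an ADF Databricks Notebook Activity could trigger.
# Table and output path names are placeholders.

# An optional widget lets ADF pass a value in via the activity's baseParameters.
dbutils.widgets.text("output_path", "/mnt/output/sales_summary")
output_path = dbutils.widgets.get("output_path")

# Read a Delta table registered in the metastore (placeholder name).
df = spark.read.table("default.my_delta_table")

# Example transformation: aggregate before handing data downstream.
summary = df.groupBy("country").count()

# Write the result back as Delta so other services can pick it up.
summary.write.format("delta").mode("overwrite").save(output_path)
```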


Source: https://stackoverflow.com/questions/57917908/is-it-possible-to-connect-to-databricks-deltalake-tables-from-adf
