Execute Python scripts in Azure Data Factory

牧云@^-^@ submitted on 2020-07-05 11:04:28

Question


I have my data stored in blobs, and I have written a Python script that does some computations and creates another CSV file. How can I execute this script in Azure Data Factory?


Answer 1:


Mighty, you can use an Azure Data Factory V2 Custom Activity for this requirement. A Custom Activity lets you directly execute a command that invokes your Python script.

Please refer to this sample on GitHub.
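As a rough illustration of what such a Custom Activity definition might look like, here is a minimal sketch that builds the activity JSON in Python. It assumes the activity runs on an Azure Batch pool and pulls the script from a storage folder. The activity name, the linked service names (`AzureBatchLinkedService`, `StorageLinkedService`), the folder path, and the script name `main.py` are all hypothetical placeholders, not values from the original answer.

```python
import json


def build_custom_activity(command, batch_linked_service, storage_linked_service, folder_path):
    """Return a Custom Activity definition as a plain dict.

    All names passed in are placeholders chosen for illustration.
    """
    return {
        "name": "RunPythonScript",  # hypothetical activity name
        "type": "Custom",
        # Linked service for the compute that runs the command (assumed: Azure Batch)
        "linkedServiceName": {
            "referenceName": batch_linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            # Command line executed on the compute node
            "command": command,
            # Storage account and folder that hold the script files
            "resourceLinkedService": {
                "referenceName": storage_linked_service,
                "type": "LinkedServiceReference",
            },
            "folderPath": folder_path,
        },
    }


activity = build_custom_activity(
    command="python main.py",                        # hypothetical script name
    batch_linked_service="AzureBatchLinkedService",  # hypothetical
    storage_linked_service="StorageLinkedService",   # hypothetical
    folder_path="scripts/python",                    # hypothetical
)
print(json.dumps(activity, indent=4))
```

The printed JSON would go into the `activities` array of a pipeline. Python itself has to be available on the nodes that run the command.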

Hope it helps you.




Answer 2:


Another option is to use a DatabricksSparkPython activity. This makes sense if you want to scale out, but it may require some code changes to support PySpark. The prerequisite, of course, is an Azure Databricks workspace. You upload your script to DBFS and then trigger it from Azure Data Factory. The following example triggers the script pi.py:

{
    "activity": {
        "name": "MyActivity",
        "description": "MyActivity description",
        "type": "DatabricksSparkPython",
        "linkedServiceName": {
            "referenceName": "MyDatabricksLinkedservice",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "pythonFile": "dbfs:/docs/pi.py",
            "parameters": [
                "10"
            ],
            "libraries": [
                {
                    "pypi": {
                        "package": "tensorflow"
                    }
                }
            ]
        }
    }
}

See the documentation for more details.
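The entries in "parameters" are handed to the Python file as command-line arguments, so the script can read them from `sys.argv`. The sketch below shows that hand-off under some assumptions: the contents of pi.py are not given in the answer, so the function names and the Monte Carlo estimate are hypothetical, and plain Python is used instead of PySpark to keep the example self-contained.

```python
import random
import sys


def parse_partitions(argv, default=2):
    """Read the first command-line parameter (e.g. "10") as an integer."""
    if len(argv) > 1:
        try:
            return int(argv[1])
        except ValueError:
            pass  # fall back to the default on a non-numeric argument
    return default


def estimate_pi(partitions, samples_per_partition=10_000, seed=None):
    """Monte Carlo estimate of pi; more partitions means more samples."""
    rng = random.Random(seed)
    total = partitions * samples_per_partition
    inside = 0
    for _ in range(total):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / total


def main(argv):
    partitions = parse_partitions(argv)
    print("Pi is roughly", estimate_pi(partitions))


if __name__ == "__main__":
    # With "parameters": ["10"], sys.argv[1] would be "10".
    main(sys.argv)
```

In a real Databricks job the sampling loop would typically be distributed across the cluster with PySpark, which is where the code modifications mentioned above come in.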



Source: https://stackoverflow.com/questions/52271088/execute-python-scripts-in-azure-datafactory
