Question
I am trying to use the Get Metadata activity in Azure Data Factory to get blob filenames and copy them to an Azure SQL database table. I am following this tutorial: https://www.mssqltips.com/sqlservertip/6246/azure-data-factory-get-metadata-example/
Here is my pipeline. Under Copy Data > Source, I point to the location of the blob files in my Blob storage. I need to specify my source dataset as binary because the files are *.jpeg files.
For Copy Data > Sink, which is the Azure SQL database, I enabled the option "Auto Create table".
In my Sink dataset config, I had to choose a table because validation won't pass unless a table in my SQL database is selected, even though this table is not related at all to the blob filenames that I want to get.
Question 1: Am I supposed to create a new table in the SQL DB beforehand, with columns matching the blob filenames that I want to extract?
Then I tried to validate the pipeline and got this error:
Copy_Data_1
Sink must be binary when source is binary dataset.
Question 2: How can I resolve this error? I had to select binary as the file type of the source because it is one of the steps when creating the source dataset. When I chose the sink dataset, an Azure SQL table, I was not asked to select a dataset type, so the two don't seem to match.
Thank you very much in advance.
With the new pipeline, I can now get the itemName of each file in the JSON output.
Now I have added a Copy Data activity just after the Get_File_Name2 activity and connected the two, to try to use the JSON output as the source dataset.
However, I need to choose the source dataset location before I can specify the type as JSON. As far as I understand, this JSON output comes from the Get_File_Name2 activity and is not yet stored in Blob storage. How do I make the Copy Data activity read this JSON output as its source dataset?
Update 10/14/2020: Here is my new Stored Procedure activity. I added the parameter as suggested; however, I changed its name to JsonData because my stored procedure requires this parameter.
This is my stored procedure.
I get this error at the stored procedure:
{
"errorCode": "2402",
"message": "Execution fail against sql server. Sql error number: 13609. Error Message: JSON text is not properly formatted. Unexpected character 'S' is found at position 0.",
"failureType": "UserError",
"target": "Stored procedure1",
"details": []
}
But when I check the input, it seems to have read the JSON string itemName successfully.
But when I check the output, it's not there.
Answer 1:
Actually, you could pass the Get Metadata output JSON as a parameter and then call the stored procedure: Get Metadata --> Stored Procedure
You just need to focus on the coding of the stored procedure.
Get Metadata output childItems:
{
"childItems": [
{
"name": "DeploymentFiles.zip",
"type": "File"
},
{
"name": "geodatalake.pdf",
"type": "File"
},
{
"name": "test2.xlsx",
"type": "File"
},
{
"name": "word.csv",
"type": "File"
}
]
}
Stored Procedure:
@activity('Get Metadata1').output.childItems
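The value this expression produces is an array, while the error reported in the question ("Unexpected character 'S' is found at position 0") says the text that reached the procedure did not begin with [ or {. Below is a minimal Python sketch, not ADF or T-SQL, of that difference: a serialized array parses, whereas text starting with 'S' fails at position 0. The sample string "System.Object[]" and the helper first_bad_position are hypothetical, since the question does not show what the procedure actually received. Wrapping the expression in ADF's string() function might be what is needed, but that is an assumption.

```python
import json

# Shape of the Get Metadata childItems output, taken from the example above.
child_items = [
    {"name": "DeploymentFiles.zip", "type": "File"},
    {"name": "geodatalake.pdf", "type": "File"},
]

def first_bad_position(text):
    """Return None if text is valid JSON, else the offset where parsing failed."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as err:
        return err.pos

# Serialized JSON text: this is what a JSON parser can read.
good_text = json.dumps(child_items)

# Hypothetical non-JSON text starting with 'S' (an assumption, not from the question).
bad_text = "System.Object[]"

print(first_bad_position(good_text))
print(first_bad_position(bad_text))
```

If the parameter arrives as real JSON text, the parser has something to work with; anything else fails on the very first character, which matches the shape of the reported error.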
For how to create the stored procedure (getting data from a JSON object), you could refer to this blog: Retrieve JSON Data from SQL Server using a Stored Procedure.
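As a rough illustration of what such a procedure has to do, here is a minimal Python sketch that takes the childItems JSON text as a single parameter and inserts one row per file name. It uses Python's built-in sqlite3 as a stand-in for Azure SQL, so it is not the actual stored procedure; the table BlobFiles, its columns, and the function insert_file_names are hypothetical names. In T-SQL the same unpacking would typically be done with OPENJSON.

```python
import json
import sqlite3

def insert_file_names(conn, json_data):
    """Stand-in for the stored procedure: insert one row per entry in the JSON array."""
    items = json.loads(json_data)
    rows = [(item["name"], item["type"]) for item in items]
    conn.executemany(
        "INSERT INTO BlobFiles (FileName, ItemType) VALUES (?, ?)", rows
    )
    conn.commit()
    return len(rows)

# In-memory database and hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BlobFiles (FileName TEXT, ItemType TEXT)")

# JSON text shaped like the childItems output shown earlier.
json_data = json.dumps([
    {"name": "DeploymentFiles.zip", "type": "File"},
    {"name": "geodatalake.pdf", "type": "File"},
    {"name": "test2.xlsx", "type": "File"},
    {"name": "word.csv", "type": "File"},
])

inserted = insert_file_names(conn, json_data)
names = [row[0] for row in conn.execute(
    "SELECT FileName FROM BlobFiles ORDER BY FileName"
)]
print(inserted, names)
```

The point of the sketch is the shape of the work: the whole array arrives as one text parameter, and the procedure itself turns it into rows, so no Copy Data activity or binary/JSON dataset pairing is involved.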
Source: https://stackoverflow.com/questions/64227251/azure-data-factory-get-metadata-to-get-blob-filenames-and-transfer-them-to-azure