I am trying to copy data from a SQL table in an on-premises SQL Server and upload it to a DocumentDB collection using a custom activity in an Azure Data Factory pipeline. Can anyone tell me how I can achieve this?
I was able to solve the problem. The solution is to put the copy logic in the custom activity itself, reading from the on-premises SQL Server and writing to DocumentDB, as in the code below:
using System;
using System.Data;
using System.Data.SqlClient;
using System.Security;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public async Task CopyDataFromTo(string source)
{
    try
    {
        DataTable dtSource = new DataTable();
        string EndpointUrl = "https://yourendpoint.documents.azure.com:443/";
        string AuthorizationKey = "*****";

        // The DocumentClient constructor used below takes the account key as a SecureString.
        SecureString authKey = new SecureString();
        foreach (char c in AuthorizationKey.ToCharArray())
        {
            authKey.AppendChar(c);
        }

        // Pull the source rows from the on-premises SQL Server ("source" is the connection string).
        SqlDataAdapter adapSource = new SqlDataAdapter("SELECT * FROM YourTable", source);
        adapSource.Fill(dtSource);

        // Create the client once and reuse it for every row instead of once per document.
        using (var client = new DocumentClient(new Uri(EndpointUrl), authKey))
        {
            foreach (DataRow Dr in dtSource.Rows)
            {
                var docFirst = new
                {
                    UserID = Int32.Parse(Dr["ColumnOne"].ToString()),
                    UserAlias = Dr["ColumnTwo"].ToString()
                };

                Document newDocument = await client.CreateDocumentAsync(
                    UriFactory.CreateDocumentCollectionUri("DatabaseName", "CollectionName"), docFirst);
            }
        }
    }
    catch (Exception)
    {
        // Rethrow without resetting the stack trace (throw Ex; would lose it).
        throw;
    }
}
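For context, an ADF v1 .NET custom activity is a class implementing IDotNetActivity that runs on an Azure Batch pool or HDInsight cluster. Below is a minimal sketch of how the method above might be wired into one, assuming the source connection string is passed through the activity's extended properties; the class name and "SourceConnectionString" property name are placeholders, not necessarily what was used here:

using System.Collections.Generic;
using Microsoft.Azure.Management.DataFactories.Models;
using Microsoft.Azure.Management.DataFactories.Runtime;

// Sketch only: one way CopyDataFromTo could be invoked from an ADF v1 custom activity.
// "SourceConnectionString" is an assumed extended-property name, not a fixed convention.
public class CopySqlToDocumentDbActivity : IDotNetActivity
{
    public IDictionary<string, string> Execute(
        IEnumerable<LinkedService> linkedServices,
        IEnumerable<Dataset> datasets,
        Activity activity,
        IActivityLogger logger)
    {
        // Read the on-premises connection string from the activity's extended properties.
        var dotNetActivity = (DotNetActivity)activity.TypeProperties;
        string source = dotNetActivity.ExtendedProperties["SourceConnectionString"];

        logger.Write("Starting copy from on-prem SQL Server to DocumentDB.");

        // IDotNetActivity.Execute is synchronous, so block on the async copy method.
        CopyDataFromTo(source).Wait();

        logger.Write("Copy completed.");
        return new Dictionary<string, string>();
    }

    // The CopyDataFromTo method shown above would be a member of this class.
}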
Thanks Charles. Turns out you are right. The solution I implemented was:
Part 1:
Implemented a Data Factory pipeline to move data from the on-premises databases to staging DocumentDB collections.
Part 2:
Used a custom activity to combine data from the different staged collections in DocumentDB into a new DocumentDB collection with the required output data (a rough sketch of that combine step follows).
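A minimal sketch of what such a combine step could look like, assuming two hypothetical staged collections ("UsersStaged" and "OrdersStaged" in a database "StagingDb") joined on a UserID property and written to a "Combined" collection; the actual database, collection, and property names will differ:

using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Shape of the documents staged by Part 1 (property names are placeholders).
public class StagedUser
{
    public int UserID { get; set; }
    public string UserAlias { get; set; }
}

public async Task CombineStagedCollectionsAsync(DocumentClient client)
{
    Uri usersUri = UriFactory.CreateDocumentCollectionUri("StagingDb", "UsersStaged");
    Uri ordersUri = UriFactory.CreateDocumentCollectionUri("StagingDb", "OrdersStaged");
    Uri combinedUri = UriFactory.CreateDocumentCollectionUri("StagingDb", "Combined");

    // Read the staged user documents.
    var users = client.CreateDocumentQuery<StagedUser>(
        usersUri, "SELECT c.UserID, c.UserAlias FROM c").ToList();

    foreach (var user in users)
    {
        // Pull the staged documents from the other collection that belong to this user.
        var orders = client.CreateDocumentQuery<dynamic>(ordersUri,
            new SqlQuerySpec("SELECT * FROM c WHERE c.UserID = @id",
                new SqlParameterCollection { new SqlParameter("@id", user.UserID) }))
            .ToList();

        // Write one combined document per user into the output collection.
        await client.CreateDocumentAsync(combinedUri, new
        {
            UserID = user.UserID,
            UserAlias = user.UserAlias,
            Orders = orders
        });
    }
}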
I got this to work with conventional Azure Data Factory (ADF) tasks. No custom task is required. I wouldn't make things more complicated than they need to be, particularly with these components, which can be hard to debug.
The following sample shows the required linked services, datasets, and pipeline:
Linked Service of type On Premises SQL Server:
{
    "name": "OnPremLinkedService",
    "properties": {
        "type": "OnPremisesSqlServer",
        "description": "",
        "typeProperties": {
            "connectionString": "Data Source=<servername - required for credential encryption>;Initial Catalog=<databasename - required for credential encryption>;Integrated Security=False;User ID=<username>;Password=<password>;",
            "gatewayName": "<Name of the gateway that the Data Factory service should use to connect to the on-premises SQL Server database - required for credential encryption>",
            "userName": "<Specify user name if you are using Windows Authentication>",
            "password": "<Specify password for the user account>"
        }
    }
}
Linked Service of type DocumentDB:
{
    "name": "DocumentDbLinkedService",
    "properties": {
        "type": "DocumentDb",
        "typeProperties": {
            "connectionString": "AccountEndpoint=<EndpointUrl>;AccountKey=<AccessKey>;Database=<Database>"
        }
    }
}
Input Dataset of type SqlServerTable:
{
    "name": "SQLServerDataset",
    "properties": {
        "structure": [
            {
                "name": "Id",
                "type": "Int32"
            },
            {
                "name": "FirstName",
                "type": "String"
            },
            {
                "name": "MiddleName",
                "type": "String"
            },
            {
                "name": "LastName",
                "type": "String"
            }
        ],
        "published": false,
        "type": "SqlServerTable",
        "linkedServiceName": "OnPremLinkedService",
        "typeProperties": {
            "tableName": "dbo.Users"
        },
        "availability": {
            "frequency": "Day",
            "interval": 1
        },
        "external": true,
        "policy": {}
    }
}
Output Dataset of type DocumentDbCollection:
{
    "name": "PersonDocumentDbTableOut",
    "properties": {
        "structure": [
            {
                "name": "Id",
                "type": "Int32"
            },
            {
                "name": "Name.First",
                "type": "String"
            },
            {
                "name": "Name.Middle",
                "type": "String"
            },
            {
                "name": "Name.Last",
                "type": "String"
            }
        ],
        "published": false,
        "type": "DocumentDbCollection",
        "linkedServiceName": "DocumentDbLinkedService",
        "typeProperties": {
            "collectionName": "Person"
        },
        "availability": {
            "frequency": "Day",
            "interval": 1
        }
    }
}
Pipeline with a Copy activity using SqlSource and DocumentDbCollectionSink (a sample of the resulting document shape follows the pipeline JSON):
{
    "name": "PipelineTemplate 3",
    "properties": {
        "description": "On prem to DocDb test",
        "activities": [
            {
                "type": "Copy",
                "typeProperties": {
                    "source": {
                        "type": "SqlSource"
                    },
                    "sink": {
                        "type": "DocumentDbCollectionSink",
                        "writeBatchSize": 2,
                        "writeBatchTimeout": "00:00:00"
                    },
                    "translator": {
                        "type": "TabularTranslator",
                        "columnMappings": "id: id, FirstName: Name.First, MiddleName: Name.Middle, LastName: Name.Last"
                    }
                },
                "inputs": [
                    {
                        "name": "SQLServerDataset"
                    }
                ],
                "outputs": [
                    {
                        "name": "PersonDocumentDbTableOut"
                    }
                ],
                "policy": {
                    "timeout": "1.00:00:00",
                    "concurrency": 1,
                    "retry": 3
                },
                "scheduler": {
                    "frequency": "Day",
                    "interval": 1
                },
                "name": "CopyActivityTemplate"
            }
        ],
        "start": "2016-10-05T00:00:00Z",
        "end": "2016-10-05T00:00:00Z",
        "isPaused": false,
        "hubName": "adfdocdb2_hub",
        "pipelineMode": "Scheduled"
    }
}
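To illustrate the nested Name.First / Name.Middle / Name.Last column mappings, a single source row (the values below are made up) should land in the Person collection as a document shaped roughly like this; exactly how the id column surfaces depends on the id mapping above:

{
    "id": "1",
    "Name": {
        "First": "John",
        "Middle": "Q",
        "Last": "Public"
    }
}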
Actually, a custom activity cannot access on-premises data today.
Similar question here: On-Prem SQL connection throwing SqlException in Datafactory custom activity
The solution is to copy the on-premises data to the cloud first, then run the custom activity against the cloud storage. wBob shared a good sample above.
If you have to complete it in one activity, you can set up a VNet and ExpressRoute to connect your Azure public cloud with your on-premises environment.