How does SqlBulkCopy Work

轮回少年 2021-01-04 09:32

I am familiar with the C# SqlBulkCopy class, where you can call the WriteToServer method and pass it a DataTable.

My question is what underlying mechanism in

3 answers
  •  执笔经年
    2021-01-04 10:03

    It took 7 years, but we finally have an answer...

    Expounding upon Sam Anwar's answer, I can confirm it is converting the data to a raw byte stream and writing it to SQL as if it were streaming in from a file. How it tricks SQL into thinking it's reading a file is beyond me.

    I wanted to do a bulk insert from inside a query, to speed up a slow clustered index insert. Upon finding your post here, somehow I became disturbingly intrigued, so I spent the past several hours studying it.

    The execution path that actually writes data to the server seems to be:

    Your Code:

    1. Your code calls System.Data.SqlClient.SqlBulkCopy.WriteToServer()

    inside System.Data.SqlClient.SqlBulkCopy:

    1. which calls WriteRowSourceToServerAsync()
    2. which calls WriteRowSourceToServerCommon() to map the columns and WriteToServerInternalAsync() to write the data
    3. which calls WriteToServerInternalRestContinuedAsync()
    4. which calls AnalyzeTargetAndCreateUpdateBulkCommand() (this is the answer; it is covered at the end of this post) and CopyBatchesAsync()
    5. which (CopyBatchesAsync) calls SubmitBulkUpdateCommand()

    inside System.Data.SqlClient.TdsParser:

    1. which calls System.Data.SqlClient.TdsParser.TdsExecuteSQLBatch()
    2. which calls WriteString() or similar methods to convert the data into a byte array
    3. which calls WriteByteArray()
    4. which calls WritePacket()
    5. which calls WriteSni()
    6. which calls SNIWritePacket()
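The string-to-bytes-to-packets part of this chain can be sketched in a few lines. This is only an illustration, not the SqlClient code: the class and method names are hypothetical, and the 8-byte header layout and the packet type values (0x01 for a SQL batch, 0x07 for bulk load data) are taken from my reading of the MS-TDS specification. It ignores the ALL_HEADERS section, the SPID, and all of the buffering SqlClient really does.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch of "WriteString -> WriteByteArray -> WritePacket".
public static class TdsSketch
{
    public const byte SqlBatchPacketType = 0x01; // assumption: per MS-TDS
    public const byte BulkLoadPacketType = 0x07; // assumption: per MS-TDS
    public const int HeaderLength = 8;

    // Text travels as UTF-16LE bytes (two bytes per char).
    public static byte[] EncodeString(string value)
    {
        return Encoding.Unicode.GetBytes(value);
    }

    // Cut a payload into packets of at most packetSize bytes,
    // each one starting with an 8-byte header.
    public static List<byte[]> SplitIntoPackets(byte packetType, byte[] payload, int packetSize)
    {
        if (packetSize <= HeaderLength)
            throw new ArgumentOutOfRangeException(nameof(packetSize));

        var packets = new List<byte[]>();
        int maxBody = packetSize - HeaderLength;
        int offset = 0;
        byte packetId = 1;

        do
        {
            int bodyLength = Math.Min(maxBody, payload.Length - offset);
            bool isLast = offset + bodyLength >= payload.Length;
            int total = HeaderLength + bodyLength;

            var packet = new byte[total];
            packet[0] = packetType;
            packet[1] = (byte)(isLast ? 0x01 : 0x00); // status: 0x01 marks end of message
            packet[2] = (byte)(total >> 8);           // total length, big-endian
            packet[3] = (byte)(total & 0xFF);
            // bytes 4-5 (SPID) and byte 7 (window) are left as zero in this sketch
            packet[6] = packetId;
            Buffer.BlockCopy(payload, offset, packet, HeaderLength, bodyLength);

            packets.Add(packet);
            offset += bodyLength;
            packetId = unchecked((byte)(packetId + 1));
        } while (offset < payload.Length);

        return packets;
    }
}
```

For example, `TdsSketch.SplitIntoPackets(TdsSketch.SqlBatchPacketType, TdsSketch.EncodeString("insert bulk ..."), 4096)` would return a single packet, because the encoded text fits well within one 4096-byte packet.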

    inside System.Data.SqlClient.SNINativeMethodWrapper:

    1. which calls System.Data.SqlClient.SNINativeMethodWrapper.SNIWritePacket()
    2. which extern calls SNIWriteAsyncWrapper() or SNIWriteSyncOverAsync()

    Now here's where it gets tricky. I think this follows, but how I got there is a bit hacky. I opened the file properties on my copy of sni.dll, went to the details tab, and inside the Product Version property I found a reference to a "commit hash" of d0d5c7b49271cadb6d97de26d8e623e98abdc8db.

    So I googled that hash, and a NuGet search turned up a package whose title includes "System.Data.SqlClient.sni". That implies a namespace System.Data.SqlClient.SNI, whose source I also found, but it doesn't have the right methods and doesn't actually seem to communicate with a server.

    So this is where I ran out of know-how; this is as deep as I could get before it goes into native code I can't find anywhere. And although I'm not sure what all that other noise up above was...

    1. Remember that WriteToServerInternalRestContinuedAsync() also calls AnalyzeTargetAndCreateUpdateBulkCommand(),
    2. which concatenates a SQL query inside a StringBuilder named updateBulkCommandText (line 544 in the linked source).
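To make the shape of that concatenated query concrete, here is a minimal sketch of building an INSERT BULK statement with a StringBuilder. It is not the SqlClient implementation: the class, method, and parameter names are hypothetical, the column types are supplied by the caller rather than read from the target table's metadata, and the statement layout follows my understanding of the documented INSERT BULK syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper that mimics the idea behind updateBulkCommandText.
public static class InsertBulkSketch
{
    public static string BuildCommandText(
        string destinationTable,
        IReadOnlyList<KeyValuePair<string, string>> columns, // column name -> SQL type
        IReadOnlyList<string> options)                       // e.g. "TABLOCK", "KEEP_NULLS"
    {
        if (columns.Count == 0)
            throw new ArgumentException("At least one column is required.", nameof(columns));

        var text = new StringBuilder();
        text.Append("insert bulk ").Append(destinationTable).Append(" (");

        for (int i = 0; i < columns.Count; i++)
        {
            if (i > 0) text.Append(", ");
            // Bracket-quote the column name, doubling any closing bracket inside it.
            text.Append('[').Append(columns[i].Key.Replace("]", "]]")).Append("] ");
            text.Append(columns[i].Value);
        }
        text.Append(')');

        if (options.Count > 0)
            text.Append(" with (").Append(string.Join(", ", options)).Append(')');

        return text.ToString();
    }
}
```

Called with the table `dbo.People`, the columns `Id int` and `Name nvarchar(50)`, and the single option `TABLOCK`, this returns `insert bulk dbo.People ([Id] int, [Name] nvarchar(50)) with (TABLOCK)`. No file path appears anywhere in the statement; the rows follow on the same connection as a binary stream.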

    TLDR: Ultimately, it appears to just execute an INSERT BULK statement (which does not require a file) and does not actually use BULK INSERT (which does). Note that these two commands look very similar.

    An important note in the Microsoft docs:

    "Used by external tools to upload a binary data stream. This option is not intended for use with tools such as SQL Server Management Studio, SQLCMD, OSQL, or data access application programming interfaces such as SQL Server Native Client."

    Which I interpret as "use at your own risk and don't expect help". Which is almost as good as a green light, in all fairness.
