I have an xml as follows:
<Name>Best of Pop</Name>
<Studio>ABC studio</Studio>
This can't be done in a single pass anyway, as you can't insert into two tables in the same DML statement (well, outside of Triggers and the OUTPUT clause, neither of which would help here). But it can be done efficiently in two passes. The fact that the <Name> element within <Record> is unique is the key, as that allows us to use the Record table as the lookup table for the second pass (i.e. when we are getting the Artist rows).
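Since the second pass joins on the record name, a duplicate <Name> in the incoming document would make the first pass fail on the unique index or, without that index, attach each artist to every record sharing the name. A minimal pre-check might look like the sketch below. It assumes the same document shape as the example in this answer; the @ImportData contents are a shortened, made-up sample that deliberately contains a duplicate. The derived table is used because SQL Server does not allow XML data type methods directly in a GROUP BY clause.

```sql
DECLARE @ImportData XML;
SET @ImportData = N'
<Records>
  <Record><Name>Best of Pop</Name><Studio>ABC studio</Studio></Record>
  <Record><Name>Best of Pop</Name><Studio>XYZ studio</Studio></Record>
</Records>';

-- List any record names that occur more than once in the document.
-- The inner query shreds the XML; the outer query does the grouping.
SELECT names.[Name], COUNT(*) AS [Occurrences]
FROM (
    SELECT col.value(N'(./Name/text())[1]', N'NVARCHAR(400)') AS [Name]
    FROM @ImportData.nodes(N'/Records/Record') tab(col)
) names
GROUP BY names.[Name]
HAVING COUNT(*) > 1;
```

If this returns no rows, the names in the document are unique and the two-pass approach can rely on them as the lookup key.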
First, you need to (well, should) create a UNIQUE INDEX on Record (Name ASC). In my example below I am using a UNIQUE CONSTRAINT, but that is only because I am using a table variable instead of a temp table to make the example code more easily rerunnable (without needing an explicit IF EXISTS DROP at the top). This index will help the performance of the second pass.
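For a permanent table, the index could be created along these lines. This is only a sketch: the schema and object names (dbo.Record, PK_Record, UIX_Record_Name) are assumptions for illustration, and the column definitions simply mirror the table variable used in the example.

```sql
-- Hypothetical permanent table; dbo.Record and the constraint/index
-- names are assumptions, not taken from the question.
CREATE TABLE dbo.Record
(
    RecordId INT NOT NULL IDENTITY(1, 1)
        CONSTRAINT PK_Record PRIMARY KEY,
    [Name]   NVARCHAR(400) NOT NULL,
    Studio   NVARCHAR(400) NULL
);

-- Enforces uniqueness of Name and gives the second pass
-- an index to use when joining on the record name.
CREATE UNIQUE NONCLUSTERED INDEX UIX_Record_Name
    ON dbo.Record ([Name] ASC);
```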
The example uses OPENXML, as that will most likely be more efficient than using the .nodes() function since the same document needs to be traversed twice. The last parameter of the OPENXML function, the 2, specifies that the document is "Element-based", since the default parsing looks for "Attribute-based" mappings.
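To illustrate what that last parameter changes, here is a small sketch with a made-up document in which each Record has both a Name attribute and a <Name> child element. The variable names and the sample values are assumptions for illustration only. With no column pattern in the WITH clause, the flag alone decides where the [Name] column comes from.

```sql
DECLARE @DocID INT, @Sample XML;
SET @Sample = N'
<Records>
  <Record Name="FromAttribute">
    <Name>FromElement</Name>
  </Record>
</Records>';

EXEC sp_xml_preparedocument @DocID OUTPUT, @Sample;

-- flags = 1 (attribute-centric): [Name] is mapped to the Name attribute.
SELECT [Name]
FROM OPENXML (@DocID, N'/Records/Record', 1)
     WITH ([Name] NVARCHAR(400));

-- flags = 2 (element-centric): [Name] is mapped to the <Name> child element.
SELECT [Name]
FROM OPENXML (@DocID, N'/Records/Record', 2)
     WITH ([Name] NVARCHAR(400));

-- Always release the parsed document when done.
EXEC sp_xml_removedocument @DocID;
```

Note that when an explicit column pattern is supplied in the WITH clause, as in the main example, that pattern takes precedence over the mapping implied by the flag.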
DECLARE @DocumentID INT, @ImportData XML;
SET @ImportData = N'
<Records>
<Record>
<Name>Best of Pop</Name>
<Studio>ABC studio</Studio>
<Artists>
<Artist>
<ArtistName>John</ArtistName>
<Age>36</Age>
</Artist>
<Artist>
<ArtistName>Jessica</ArtistName>
<Age>20</Age>
</Artist>
</Artists>
</Record>
<Record>
<Name>Nursery rhymes</Name>
<Studio>XYZ studio</Studio>
<Artists>
<Artist>
<ArtistName>Judy</ArtistName>
<Age>10</Age>
</Artist>
<Artist>
<ArtistName>Rachel</ArtistName>
<Age>15</Age>
</Artist>
</Artists>
</Record>
</Records>';
DECLARE @Record TABLE (RecordId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
Name NVARCHAR(400) UNIQUE,
Studio NVARCHAR(400));
DECLARE @Artist TABLE (ArtistId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
RecordId INT NOT NULL,
ArtistName NVARCHAR(400), Age INT);
EXEC sp_xml_preparedocument @DocumentID OUTPUT, @ImportData;
-- First pass: extract "Record" rows
INSERT INTO @Record (Name, Studio)
SELECT Name, Studio
FROM OPENXML (@DocumentID, N'/Records/Record', 2)
WITH (Name NVARCHAR(400) './Name/text()',
Studio NVARCHAR(400) './Studio/text()');
-- Second pass: extract "Artist" rows
INSERT INTO @Artist (RecordId, ArtistName, Age)
SELECT rec.RecordId, art.ArtistName, art.Age
FROM OPENXML (@DocumentID, N'/Records/Record/Artists/Artist', 2)
WITH (Name NVARCHAR(400) '../../Name/text()',
ArtistName NVARCHAR(400) './ArtistName/text()',
Age INT './Age/text()') art
INNER JOIN @Record rec
ON rec.[Name] = art.[Name];
EXEC sp_xml_removedocument @DocumentID;
-------------------
SELECT * FROM @Record ORDER BY [RecordId];
SELECT * FROM @Artist ORDER BY [RecordId];
EDIT:
With the new requirement to use the .nodes() function instead of OPENXML, the following will work:
DECLARE @ImportData XML;
SET @ImportData = N'
<Records>
<Record>
<Name>Best of Pop</Name>
<Studio>ABC studio</Studio>
<Artists>
<Artist>
<ArtistName>John</ArtistName>
<Age>36</Age>
</Artist>
<Artist>
<ArtistName>Jessica</ArtistName>
<Age>20</Age>
</Artist>
</Artists>
</Record>
<Record>
<Name>Nursery rhymes</Name>
<Studio>XYZ studio</Studio>
<Artists>
<Artist>
<ArtistName>Judy</ArtistName>
<Age>10</Age>
</Artist>
<Artist>
<ArtistName>Rachel</ArtistName>
<Age>15</Age>
</Artist>
</Artists>
</Record>
</Records>';
IF (OBJECT_ID('tempdb..#Record') IS NOT NULL)
BEGIN
DROP TABLE #Record;
END;
IF (OBJECT_ID('tempdb..#Artist') IS NOT NULL)
BEGIN
DROP TABLE #Artist;
END;
CREATE TABLE #Record (RecordId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
Name NVARCHAR(400) UNIQUE,
Studio NVARCHAR(400));
CREATE TABLE #Artist (ArtistId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
RecordId INT NOT NULL,
ArtistName NVARCHAR(400),
Age INT);
-- First pass: extract "Record" rows
INSERT INTO #Record (Name, Studio)
SELECT col.value(N'(./Name/text())[1]', N'NVARCHAR(400)') AS [Name],
col.value(N'(./Studio/text())[1]', N'NVARCHAR(400)') AS [Studio]
FROM @ImportData.nodes(N'/Records/Record') tab(col);
-- Second pass: extract "Artist" rows
;WITH artists AS
(
SELECT col.value(N'(../../Name/text())[1]', N'NVARCHAR(400)') AS [RecordName],
col.value(N'(./ArtistName/text())[1]', N'NVARCHAR(400)') AS [ArtistName],
col.value(N'(./Age/text())[1]', N'INT') AS [Age]
FROM @ImportData.nodes(N'/Records/Record/Artists/Artist') tab(col)
)
INSERT INTO #Artist (RecordId, ArtistName, Age)
SELECT rec.RecordId, art.ArtistName, art.Age
FROM artists art
INNER JOIN #Record rec
ON rec.[Name] = art.RecordName;
-- OR --
-- INSERT INTO #Artist (RecordId, ArtistName, Age)
SELECT rec.RecordId,
col.value(N'(./ArtistName/text())[1]', N'NVARCHAR(400)') AS [ArtistName],
col.value(N'(./Age/text())[1]', N'INT') AS [Age]
FROM @ImportData.nodes(N'/Records/Record/Artists/Artist') tab(col)
INNER JOIN #Record rec
ON rec.Name = col.value(N'(../../Name/text())[1]', N'NVARCHAR(400)');
-------------------
SELECT * FROM #Record ORDER BY [RecordId];
SELECT * FROM #Artist ORDER BY [RecordId];
There are two options for inserting into #Artist shown above. The first uses a CTE to abstract the XML extraction away from the INSERT / SELECT query. The other is a simplified version, similar to your query in UPDATE 2 of the question.