I need to store the text of all of the stored procedures in a database into an XML data type. When I use, FOR XML PATH
, the text within in the stored procedure
SELECT
1 as Tag,
0 as Parent,
[View].name AS 'StoredProcedure!1!Name',
[Module].definition AS 'StoredProcedure!1!Definition!cdata'
FROM sys.views AS [View]
INNER JOIN sys.sql_modules AS [Module] ON [Module].object_id = [View].object_id
FOR XML EXPLICIT
Sample of the output from Adventureworks2012:
<StoredProcedure Name="vStoreWithContacts">
<Definition><![CDATA[
CREATE VIEW [Sales].[vStoreWithContacts] AS
SELECT
s.[BusinessEntityID]
,s.[Name]
,ct.[Name] AS [ContactType]
,p.[Title]
,p.[FirstName]
,p.[MiddleName]
,p.[LastName]
,p.[Suffix]
,pp.[PhoneNumber]
,pnt.[Name] AS [PhoneNumberType]
,ea.[EmailAddress]
,p.[EmailPromotion]
FROM [Sales].[Store] s
INNER JOIN [Person].[BusinessEntityContact] bec
ON bec.[BusinessEntityID] = s.[BusinessEntityID]
INNER JOIN [Person].[ContactType] ct
ON ct.[ContactTypeID] = bec.[ContactTypeID]
INNER JOIN [Person].[Person] p
ON p.[BusinessEntityID] = bec.[PersonID]
LEFT OUTER JOIN [Person].[EmailAddress] ea
ON ea.[BusinessEntityID] = p.[BusinessEntityID]
LEFT OUTER JOIN [Person].[PersonPhone] pp
ON pp.[BusinessEntityID] = p.[BusinessEntityID]
LEFT OUTER JOIN [Person].[PhoneNumberType] pnt
ON pnt.[PhoneNumberTypeID] = pp.[PhoneNumberTypeID];
]]></Definition>
</StoredProcedure>
<StoredProcedure Name="vStoreWithAddresses">
<Definition><![CDATA[
CREATE VIEW [Sales].[vStoreWithAddresses] AS
SELECT
s.[BusinessEntityID]
,s.[Name]
,at.[Name] AS [AddressType]
,a.[AddressLine1]
,a.[AddressLine2]
,a.[City]
,sp.[Name] AS [StateProvinceName]
,a.[PostalCode]
,cr.[Name] AS [CountryRegionName]
FROM [Sales].[Store] s
INNER JOIN [Person].[BusinessEntityAddress] bea
ON bea.[BusinessEntityID] = s.[BusinessEntityID]
INNER JOIN [Person].[Address] a
ON a.[AddressID] = bea.[AddressID]
INNER JOIN [Person].[StateProvince] sp
ON sp.[StateProvinceID] = a.[StateProvinceID]
INNER JOIN [Person].[CountryRegion] cr
ON cr.[CountryRegionCode] = sp.[CountryRegionCode]
INNER JOIN [Person].[AddressType] at
ON at.[AddressTypeID] = bea.[AddressTypeID];
]]></Definition>
As you note there are no 
 / 
 / "/ etc
and NewLine characters is represented as new line
If you want to store data with special characters within XML, there are two options (plus a joke option)
CDATA
base64
or similar would work too :-)The only reason for CDATA
(at least for me) is manually created content (copy'n'paste or typing). Whenever you build your XML automatically, you should rely on the implicitly applied escaping.
Why does it bother you, how the data is looking within the XML?
If you read this properly (not with SUBSTRING
or other string based methods), you will get it back in the original look.
Try this:
DECLARE @TextWithSpecialCharacters NVARCHAR(100)=N'€ This is' + CHAR(13) + 'strange <ups, angular brackets! > And Ampersand &&&';
SELECT @TextWithSpecialCharacters FOR XML PATH('test');
returns
€ This is
strange <ups, angular brackets! > And Ampersand &&&
But this...
SELECT (SELECT @TextWithSpecialCharacters FOR XML PATH('test'),TYPE).value('/test[1]','nvarchar(100)');
...returns
€ This is
strange <ups, angular brackets! > And Ampersand &&&
Microsoft decided not even to support this with FOR XML
(except EXPLICIT
, which is a pain in the neck...)
Read two related answers (by me :-) about CDATA)
When I use, FOR XML PATH, the text within in the stored procedure contains serialized data characters like and for CRLF and ", etc.
Yes, because that's how XML works. To take a clearer example, suppose your sproc contained this text:
IF @someString = '<' THEN
then to store it in XML, there must be some kind of encoding applied, since you can't have a bare <
in the middle of your XML (I hope you can see why).
The real question is then not 'how do I stop my text being encoded when I store it as XML', but rather (as you guess might be the case):
Or when I parse the xml data type to recreate the stored procedure can I deserialize it so that it does not have those characters?
Yes, this is the approach you should be looking at.
You don't how us how you're getting your text out of the XML at the moment. The key thing to remember is that you can't (or rather shouldn't) treat XML as 'text with extra bits' - you should use methods that understand XML.
If you're extracting the text in T-SQL itself, use the various XQuery options. If in C#, use any of the various XML libraries. Just don't do a substring operation and expect that to work...
An example, if you are extracting in T-SQL:
DECLARE @someRandomText nvarchar(max) = 'I am some arbitrary text, eg a sproc definition.
I contain newlines
And arbitrary characters such as < > &
The end.';
-- Pack into XML
DECLARE @asXml xml = ( SELECT @someRandomText FOR XML PATH ('Example'), TYPE );
SELECT @asXml;
-- Extract
DECLARE @textOut nvarchar(max) = ( SELECT @asXml.value('.', 'nvarchar(max)') ) ;
SELECT @textOut;
But you can find many many tutorials on how to get values out of xml-typed data; this is just an example.