Output XML Files with encoding UTF-8 using SQL Server

逝去的感伤 2021-01-22 08:15

I have a query that generates XML files and loads them to FTP with .

I need to switch encoding to UTF-8 as follows:

         


        
2 Answers
  • 2021-01-22 08:43
    SET ANSI_NULLS ON
    GO
    
    SET QUOTED_IDENTIFIER ON
    GO
    
    
    CREATE PROCEDURE [dbo].[MyXMLTest]
    @FileDestinationDir VARCHAR(2000)
    
    -- to call procedure specify your own file path 
    -- EXEC [Audit_DBA].[dbo].[MyXMLTest] 'E:\NLP\GovwinIQ_Ontology\NewFolder'
    
    AS 
    
    SET QUOTED_IDENTIFIER ON
    
    IF OBJECT_ID (N'InputTemp.dbo.XMLTest', N'U') IS NOT NULL
    DROP TABLE InputTemp.dbo.XMLTest;
    
    CREATE TABLE InputTemp.dbo.XMLTest
    
    (
    [Id] INT NOT NULL,
    [FirstName] VARCHAR(100) NOT NULL,
    [LastName] VARCHAR(100) NOT NULL,
    [Address] VARCHAR(100) NOT NULL
    );
    
    INSERT INTO InputTemp.dbo.XMLTest ([Id], [FirstName], [LastName], [Address])
    VALUES (12, 'Zhuk', 'Termik', '123 Gam Str, Boston, NY');
    
    --SELECT * FROM InputTemp.dbo.XMLTest
    
    DECLARE @FilePath VARCHAR(4000)
    
    DECLARE @SQLStr NVARCHAR(4000),
            @Cmd NVARCHAR(4000),
            @Ret INT
    
    DECLARE @Id INT;
    
    SELECT @Id = 12;
    
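    -- Build the query for bcp: the XML declaration is prepended on string level to the FOR XML result.
    -- @Id is concatenated as a plain number; STR() would pad it with leading spaces.
    SELECT @SQLStr = 
    'SELECT N''<?xml version=''''1.0'''' encoding=''''UTF-8''''?>'' + (SELECT CAST((SELECT [Id], [FirstName], [LastName], [Address] FROM InputTemp.dbo.XMLTest AS Body WHERE Id = ' + CAST(@Id AS VARCHAR(10)) + ' FOR XML AUTO, ELEMENTS) AS NVARCHAR(MAX)))'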
    
    SELECT @SQLStr AS SQLStr
    
    SELECT @FilePath = @FileDestinationDir+'\NewFolder'+ltrim(rtrim(str(@Id)))+'.xml' 
    
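    -- -c with -C65001 writes character data using code page 65001 (UTF-8); -T uses a trusted connection
    SELECT @Cmd = ' bcp " ' + @SQLStr + '" queryout '+@FilePath+' -c  -C65001 -r "" -T -S ' +@@ServerName 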
    
    EXEC @Ret = master.dbo.xp_cmdshell @Cmd 
    
    IF OBJECT_ID (N'InputTemp.dbo.XMLTest', N'U') IS NOT NULL
    DROP TABLE InputTemp.dbo.XMLTest;
    
    GO
    
  • 2021-01-22 08:50

    There are some things to know:

    • SQL Server does not support export to UTF-8 via BCP before version 2016 (or 2014 with SP2).
    • You cannot add the xml-declaration (<?xml blah ?>) to a native SQL Server XML-typed variable or column. This will either fail ("...switch the encoding") or the xml-declaration will disappear.
    • You can add the xml-declaration on string level to an XML that has been cast to NVARCHAR(MAX). But you cannot cast (convert) this string back to XML without failing or losing the declaration.
    • Internally, SQL Server keeps any XML as UCS-2 (very close to UTF-16) in any case.
    • SQL Server's VARCHAR (CHAR) type is not UTF-8 but extended ASCII (depending on the collation; UTF-8 collations only arrived with SQL Server 2019).
    • On string level you can write literally anything into the xml-declaration, just as you can create something that looks like XML but is not well-formed. It is just an unchecked string.
    • The encoding stated in the xml-declaration only matters as a marker of the actual file encoding when the XML is written to disk or handled as a byte stream.
    • You can write encoding="x" and store the file with an encoding of y, but you shouldn't.
    • SQL Server will cast a string with a UTF-8 declaration to XML when the string is VARCHAR, and a string with a UTF-16 declaration when it is NVARCHAR, but you cannot cross the two (read this related answer). Other encodings will very likely lead to a "cannot switch the encoding" error.
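
    The casting rules in the list above can be checked with a short script. This is only a sketch: the element name and value are made up for illustration, and the last statement is left commented out because it is the one expected to raise the encoding error.

    ```sql
    -- VARCHAR input with a UTF-8 declaration casts to XML ...
    DECLARE @fromVarchar XML =
        CAST('<?xml version="1.0" encoding="UTF-8"?><root>a</root>' AS XML);

    -- ... and NVARCHAR input with a UTF-16 declaration does too.
    DECLARE @fromNvarchar XML =
        CAST(N'<?xml version="1.0" encoding="UTF-16"?><root>a</root>' AS XML);

    -- Cast back to a string to inspect what was stored:
    -- the declaration should be gone in both cases.
    SELECT CAST(@fromVarchar  AS NVARCHAR(MAX)) AS FromVarchar,
           CAST(@fromNvarchar AS NVARCHAR(MAX)) AS FromNvarchar;

    -- Crossing the two (NVARCHAR input, UTF-8 declaration) is expected to fail
    -- with an "unable to switch the encoding" error:
    -- SELECT CAST(N'<?xml version="1.0" encoding="UTF-8"?><root>a</root>' AS XML);
    ```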

    About your code

    • You should change @SQLStr and @cmd to NVARCHAR(MAX), otherwise you might run into trouble with non-Latin characters.
    • As you are using a CURSOR, you should fill an XML-typed variable with the result of your statement, cast this to NVARCHAR(MAX) and add the declaration to this string. Do not cast the result back to XML.
    • Read the BCP docs. Stating -w will write Unicode (wide characters, i.e. UTF-16), which is not UTF-8 (what you write into the declaration has no effect here).
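
    The second point could look roughly like the following. It is a minimal sketch with hypothetical sample data standing in for the real per-row query inside the cursor loop; the variable names @xml and @doc are assumptions, not taken from the question.

    ```sql
    -- Hypothetical sample data; in the real procedure this would be
    -- the FOR XML query executed for the current cursor row.
    DECLARE @xml XML =
    (
        SELECT 12 AS [Id], N'Zhuk' AS [FirstName]
        FOR XML PATH('Body'), TYPE
    );

    -- Prepend the declaration on string level.
    -- Do not cast @doc back to XML: that would fail or drop the declaration.
    DECLARE @doc NVARCHAR(MAX) =
        N'<?xml version="1.0" encoding="UTF-8"?>' + CAST(@xml AS NVARCHAR(MAX));

    SELECT @doc AS XmlDocument;
    ```

    The declaration here is only a label. The file still has to be written as UTF-8 by whatever exports it, otherwise the declared and the actual encoding disagree.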

    Hint:

    Read this related answer, showing utf-8 export with BCP on SQL-Server 2016
