Read any JSON into list of key-value pairs (EAV format) in SQL Server

前端 未结 1 1976
陌清茗
陌清茗 2021-01-21 16:04

Looking for a way to read any JSON without knowing the internals with SQL Server\'s JSON methods, I came up with an approach I\'d like to share.

Here is the question bri

相关标签:
1条回答
  • 2021-01-21 16:23

    First we create a declared table variable and fill it with some sample JSONs to simulate the issue (I added some arrays to the samples to reflect JSON paths for arrays):

    DECLARE @table TABLE(ID INT IDENTITY, AnyJSON NVARCHAR(MAX));
    INSERT INTO @table VALUES
    (N' {
        "correlationId": "c3xOeEEQQCCA9sEx7-u6FA",
        "eventCreateTime": "2020-05-12T15:38:23.717Z",
        "time": 1589297903717,
        "owner": {
            "ownergeography": {
                "city": "abc",
                "country": "abc"
            },
            "ownername": {
                "firstname": "abc",
                "lastname": "def"
            },
            "clientApiKey": "xxxxx",
            "businessProfileApiKey": null,
            "userId": null
        },
        "campaignType": "Mobile push"
    }')
    ,(N'[{
        "correlationIds": [
            {
                "campaignId": [1,2,3],
                "correlationId": [{"a":"b"},{"c":"d"},{"e":"f"}]
            }
        ],
        "variantId": 1278915,
        "utmCampaign": "",
        "ua.os.major": "8"
        }
        ,{
        "correlationIds": [
            {
                "campaignId": [1,2,3],
                "correlationId": [{"a":"b"},{"c":"d"},{"e":"f"}]
            }
        ],
        "variantId": 1278915,
        "utmCampaign": "",
        "ua.os.major": "8"
        }]')
    ,(N'{
        "correlationId": "ls7XmuuiThWzktUeewqgWg",
        "eventCreateTime": "2020-05-12T12:40:20.786Z",
        "time": 1589287220786,
        "modifiedBy": {
            "clientId": null,
            "clientApiKey": "xxx",
            "businessProfileApiKey": null,
            "userId": null
        },
        "campaignType": "Mobile push"
    }');
    

    --The query

    WITH recCTE AS
    (
        SELECT ID
              ,NestLevel   = 0 
              ,ObjectIndex = CAST(1 AS bigint)                                                          
              ,SortString  = CAST(N'sort'                       COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 
    
              ,JsonPath    = CAST(N'$'                          COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
              ,JsonKey     = CAST(N'$'                          COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 
              ,JsonValue   = CAST(AnyJSON                       COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 
              ,JsonType    = CAST(CASE WHEN LEFT(TRIM(AnyJSON),1)=N'[' THEN 4 ELSE 0 END AS TINYINT)
              ,NestedJSON  = CAST(CASE WHEN ISJSON(AnyJSON)=1 
                                       THEN AnyJSON 
                                       ELSE NULL END            COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 
        FROM @table t
    
        UNION ALL
    
        SELECT r.ID
              ,r.NestLevel+1
              ,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) 
              ,CAST(CONCAT(r.SortString,REPLACE(STR(ROW_NUMBER() OVER(ORDER BY (SELECT NULL)),5),' ','0')) COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
    
              ,CAST(CONCAT(r.JsonPath, CASE WHEN r.JsonType=4 --<-- see the docs for OPENJSON()
                                            THEN CONCAT('[',A.[key],']') 
                                            ELSE '.' + A.[key] END)                       COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
              ,CAST(A.[key]                                                               COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
              ,CAST(r.JsonValue                                                           COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
              ,A.[type] 
              ,CAST(A.[value]                                                             COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
        FROM recCTE r
        CROSS APPLY OPENJSON(r.NestedJSON) A
        WHERE ISJSON(r.NestedJSON)=1
    )
    SELECT ID
          ,NestLevel
          ,ObjectIndex
          ,JsonPath
          ,JsonKey
          ,NestedJSON AS JsonValue
          ,SortString --<-- just to illustrate the sorting, not needed in the output
    FROM recCTE 
    WHERE ISJSON(NestedJSON)=0
    ORDER BY ID,SortString;
    

    The result

    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | ID | JsonPath                                  | JsonKey         | JsonValue                | SortString                      |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 1  | $.correlationId                           | correlationId   | c3xOeEEQQCCA9sEx7-u6FA   | 0    1                          |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 1  | $.eventCreateTime                         | eventCreateTime | 2020-05-12T15:38:23.717Z | 0    2                          |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 1  | $.time                                    | time            | 1589297903717            | 0    3                          |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 1  | $.owner.ownergeography.city               | city            | abc                      | 0    4    1    1                |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 1  | $.owner.ownergeography.country            | country         | abc                      | 0    4    1    2                |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 1  | $.owner.ownername.firstname               | firstname       | abc                      | 0    4    2    1                |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 1  | $.owner.ownername.lastname                | lastname        | def                      | 0    4    2    2                |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 1  | $.owner.clientApiKey                      | clientApiKey    | xxxxx                    | 0    4    3                     |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 1  | $.campaignType                            | campaignType    | Mobile push              | 0    5                          |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[0].correlationIds[0].campaignId[0]      | 0               | 1                        | 0    1    1    1    1    1      |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[0].correlationIds[0].campaignId[1]      | 1               | 2                        | 0    1    1    1    1    2      |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[0].correlationIds[0].campaignId[2]      | 2               | 3                        | 0    1    1    1    1    3      |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[0].correlationIds[0].correlationId[0].a | a               | b                        | 0    1    1    1    2    1    1 |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[0].correlationIds[0].correlationId[1].c | c               | d                        | 0    1    1    1    2    2    1 |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[0].correlationIds[0].correlationId[2].e | e               | f                        | 0    1    1    1    2    3    1 |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[0].variantId                            | variantId       | 1278915                  | 0    1    2                     |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[0].utmCampaign                          | utmCampaign     |                          | 0    1    3                     |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[0].ua.os.major                          | ua.os.major     | 8                        | 0    1    4                     |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[1].correlationIds[0].campaignId[0]      | 0               | 1                        | 0    2    1    1    1    1      |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[1].correlationIds[0].campaignId[1]      | 1               | 2                        | 0    2    1    1    1    2      |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[1].correlationIds[0].campaignId[2]      | 2               | 3                        | 0    2    1    1    1    3      |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[1].correlationIds[0].correlationId[0].a | a               | b                        | 0    2    1    1    2    1    1 |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[1].correlationIds[0].correlationId[1].c | c               | d                        | 0    2    1    1    2    2    1 |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[1].correlationIds[0].correlationId[2].e | e               | f                        | 0    2    1    1    2    3    1 |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[1].variantId                            | variantId       | 1278915                  | 0    2    2                     |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[1].utmCampaign                          | utmCampaign     |                          | 0    2    3                     |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 2  | $[1].ua.os.major                          | ua.os.major     | 8                        | 0    2    4                     |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 3  | $.correlationId                           | correlationId   | ls7XmuuiThWzktUeewqgWg   | 0    1                          |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 3  | $.eventCreateTime                         | eventCreateTime | 2020-05-12T12:40:20.786Z | 0    2                          |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 3  | $.time                                    | time            | 1589287220786            | 0    3                          |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 3  | $.modifiedBy.clientApiKey                 | clientApiKey    | xxx                      | 0    4    2                     |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    | 3  | $.campaignType                            | campaignType    | Mobile push              | 0    5                          |
    +----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
    

    The idea in short:

    • we use a recursive CTE to walk this down.
    • The query will test any fragment ([value] coming from OPENJSON) for being valid JSON.
    • If the fragment is valid, this walks deeper and deeper.
    • The column SortString is needed to get a final sort order.
    • The CAST() and COLLATE helps to avoid data type mismatch. Recursive CTEs are very picky with this...

    Hint: If you deal with bigger JSONs you might need to set OPTION (MAXRECURSION 0) at the end of your query.

    Enjoy :-)

    Something similar for XML

    Here is a similar answer about how to read an unknown XML.

    0 讨论(0)
提交回复
热议问题