Optimal way to concatenate/aggregate strings

前端未结

关注

 7  1235

I\'m finding a way to aggregate strings from different rows into a single row. I\'m looking to do this in many different places, so having a function to facilitate this woul

相关标签:

7条回答

走了就别回头了

2020-11-22 11:48
You can use += to concatenate strings, for example:
```
declare @test nvarchar(max)
set @test = ''
select @test += name from names
```
if you select @test, it will give you all names concatenated
0 讨论(0)
发布评论:

提交评论
- 加载中...
清歌不尽

2020-11-22 11:52
For those of us who found this ~~and are not using Azure SQL Database~~:

STRING_AGG() in PostgreSQL, SQL Server 2017 and Azure SQL
https://www.postgresql.org/docs/current/static/functions-aggregate.html
https://docs.microsoft.com/en-us/sql/t-sql/functions/string-agg-transact-sql

GROUP_CONCAT() in MySQL
http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat

(Thanks to @Brianjorden and @milanio for Azure update)

Example Code:
```
select Id
, STRING_AGG(Name, ', ') Names 
from Demo
group by Id
```
SQL Fiddle: http://sqlfiddle.com/#!18/89251/1
0 讨论(0)
发布评论:

提交评论
- 加载中...

迷失自我

2020-11-22 11:54

Although @serge answer is correct but i compared time consumption of his way against xmlpath and i found the xmlpath is so faster. I'll write the compare code and you can check it by yourself. This is @serge way:

DECLARE @startTime datetime2;
DECLARE @endTime datetime2;
DECLARE @counter INT;
SET @counter = 1;

set nocount on;

declare @YourTable table (ID int, Name nvarchar(50))

WHILE @counter < 1000
BEGIN
    insert into @YourTable VALUES (ROUND(@counter/10,0), CONVERT(NVARCHAR(50), @counter) + 'CC')
    SET @counter = @counter + 1;
END

SET @startTime = GETDATE()

;WITH Partitioned AS
(
    SELECT 
        ID,
        Name,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name) AS NameNumber,
        COUNT(*) OVER (PARTITION BY ID) AS NameCount
    FROM @YourTable
),
Concatenated AS
(
    SELECT ID, CAST(Name AS nvarchar) AS FullName, Name, NameNumber, NameCount FROM Partitioned WHERE NameNumber = 1

    UNION ALL

    SELECT 
        P.ID, CAST(C.FullName + ', ' + P.Name AS nvarchar), P.Name, P.NameNumber, P.NameCount
    FROM Partitioned AS P
        INNER JOIN Concatenated AS C ON P.ID = C.ID AND P.NameNumber = C.NameNumber + 1
)
SELECT 
    ID,
    FullName
FROM Concatenated
WHERE NameNumber = NameCount

SET @endTime = GETDATE();

SELECT DATEDIFF(millisecond,@startTime, @endTime)
--Take about 54 milliseconds

And this is xmlpath way:

DECLARE @startTime datetime2;
DECLARE @endTime datetime2;
DECLARE @counter INT;
SET @counter = 1;

set nocount on;

declare @YourTable table (RowID int, HeaderValue int, ChildValue varchar(5))

WHILE @counter < 1000
BEGIN
    insert into @YourTable VALUES (@counter, ROUND(@counter/10,0), CONVERT(NVARCHAR(50), @counter) + 'CC')
    SET @counter = @counter + 1;
END

SET @startTime = GETDATE();

set nocount off
SELECT
    t1.HeaderValue
        ,STUFF(
                   (SELECT
                        ', ' + t2.ChildValue
                        FROM @YourTable t2
                        WHERE t1.HeaderValue=t2.HeaderValue
                        ORDER BY t2.ChildValue
                        FOR XML PATH(''), TYPE
                   ).value('.','varchar(max)')
                   ,1,2, ''
              ) AS ChildValues
    FROM @YourTable t1
    GROUP BY t1.HeaderValue

SET @endTime = GETDATE();

SELECT DATEDIFF(millisecond,@startTime, @endTime)
--Take about 4 milliseconds

0 讨论(0)

不思量自难忘°

2020-11-22 11:57
Update: Ms SQL Server 2017+, Azure SQL Database

You can use: STRING_AGG.

Usage is pretty simple for OP's request:
```
SELECT id, STRING_AGG(name, ', ') AS names
FROM some_table
GROUP BY id
```
Read More

Well my old non-answer got rightfully deleted (left in-tact below), but if anyone happens to land here in the future, there is good news. They have implimented STRING_AGG() in Azure SQL Database as well. That should provide the exact functionality originally requested in this post with native and built in support. @hrobky mentioned this previously as a SQL Server 2016 feature at the time.

--- Old Post: Not enough reputation here to reply to @hrobky directly, but STRING_AGG looks great, however it is only available in SQL Server 2016 vNext currently. Hopefully it will follow to Azure SQL Datababse soon as well..
0 讨论(0)
发布评论:

提交评论
- 加载中...

情书的邮戳

2020-11-22 12:00

SOLUTION

The definition of optimal can vary, but here's how to concatenate strings from different rows using regular Transact SQL, which should work fine in Azure.

;WITH Partitioned AS
(
    SELECT 
        ID,
        Name,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name) AS NameNumber,
        COUNT(*) OVER (PARTITION BY ID) AS NameCount
    FROM dbo.SourceTable
),
Concatenated AS
(
    SELECT 
        ID, 
        CAST(Name AS nvarchar) AS FullName, 
        Name, 
        NameNumber, 
        NameCount 
    FROM Partitioned 
    WHERE NameNumber = 1

    UNION ALL

    SELECT 
        P.ID, 
        CAST(C.FullName + ', ' + P.Name AS nvarchar), 
        P.Name, 
        P.NameNumber, 
        P.NameCount
    FROM Partitioned AS P
        INNER JOIN Concatenated AS C 
                ON P.ID = C.ID 
                AND P.NameNumber = C.NameNumber + 1
)
SELECT 
    ID,
    FullName
FROM Concatenated
WHERE NameNumber = NameCount

EXPLANATION

The approach boils down to three steps:

Number the rows using OVER and PARTITION grouping and ordering them as needed for the concatenation. The result is Partitioned CTE. We keep counts of rows in each partition to filter the results later.
Using recursive CTE (Concatenated) iterate through the row numbers (NameNumber column) adding Name values to FullName column.
Filter out all results but the ones with the highest NameNumber.

Please keep in mind that in order to make this query predictable one has to define both grouping (for example, in your scenario rows with the same ID are concatenated) and sorting (I assumed that you simply sort the string alphabetically before concatenation).

I've quickly tested the solution on SQL Server 2012 with the following data:

INSERT dbo.SourceTable (ID, Name)
VALUES 
(1, 'Matt'),
(1, 'Rocks'),
(2, 'Stylus'),
(3, 'Foo'),
(3, 'Bar'),
(3, 'Baz')

The query result:

ID          FullName
----------- ------------------------------
2           Stylus
3           Bar, Baz, Foo
1           Matt, Rocks

0 讨论(0)

你的背包

2020-11-22 12:13

I found Serge's answer to be very promising, but I also encountered performance issues with it as-written. However, when I restructured it to use temporary tables and not include double CTE tables, the performance went from 1 minute 40 seconds to sub-second for 1000 combined records. Here it is for anyone who needs to do this without FOR XML on older versions of SQL Server:

DECLARE @STRUCTURED_VALUES TABLE (
     ID                 INT
    ,VALUE              VARCHAR(MAX) NULL
    ,VALUENUMBER        BIGINT
    ,VALUECOUNT         INT
);

INSERT INTO @STRUCTURED_VALUES
SELECT   ID
        ,VALUE
        ,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY VALUE) AS VALUENUMBER
        ,COUNT(*) OVER (PARTITION BY ID)    AS VALUECOUNT
FROM    RAW_VALUES_TABLE;

WITH CTE AS (
    SELECT   SV.ID
            ,SV.VALUE
            ,SV.VALUENUMBER
            ,SV.VALUECOUNT
    FROM    @STRUCTURED_VALUES SV
    WHERE   VALUENUMBER = 1

    UNION ALL

    SELECT   SV.ID
            ,CTE.VALUE + ' ' + SV.VALUE AS VALUE
            ,SV.VALUENUMBER
            ,SV.VALUECOUNT
    FROM    @STRUCTURED_VALUES SV
    JOIN    CTE 
        ON  SV.ID = CTE.ID
        AND SV.VALUENUMBER = CTE.VALUENUMBER + 1

)
SELECT   ID
        ,VALUE
FROM    CTE
WHERE   VALUENUMBER = VALUECOUNT
ORDER BY ID
;

0 讨论(0)

1 2 下一页

Optimal way to concatenate/aggregate strings

Example Code: