Mapping fields of weakly related tables in SQL

南楼画角 提交于 2019-12-07 08:32:46

问题


I'm searching for an SQL-Query that can map a set of items of an individual size to a set off buckets of individual size.

I would like to satisfy the following conditions:

  • The size of a bucket has to be bigger or equal the size of an item.
  • Every bucket can contain only one item or it is left empty.
  • Every item can only be placed in one bucket.
  • No item can be split to multiple buckets.
  • I want to fill the buckets in a way, that the smallest unused buckets are filled first.
  • Then initial item and bucket sets can be ordered by size or id, but are not incremental
  • Sizes and ids of initial bucket and item sets can be arbitrary and do not start at a known minimum value
  • The result has to be always correct, when there is a valid mapping
  • The result is allowed to be incorrect if the is no valid mapping (for example if there are more items than buckets), but I would appreciate, when the result is an empty set or has another property/signal that indicates an incorrect result.

To give you an example, let's say my bucket and items tables look like that:

Bucket:                     Item:
+---------------------+     +---------------------+
| BucketID | Size     |     | ItemID   | Size     |
+---------------------+     +---------------------+
| 1        | 2        |     | 1        | 2        |
| 2        | 2        |     | 2        | 2        |
| 3        | 2        |     | 3        | 5        |
| 4        | 4        |     | 4        | 11       |
| 5        | 4        |     | 5        | 12       |
| 6        | 7        |     +---------------------+
| 7        | 9        |
| 8        | 11       |
| 9        | 11       |
| 10       | 12       |
+---------------------+

Then, I'd like to have a mapping that is returning the following result table:

Result:
+---------------------+
| BucketID | ItemID   |
+---------------------+
| 1        | 1        |
| 2        | 2        |
| 3        | NULL     |
| 4        | NULL     |
| 5        | NULL     |
| 6        | 3        |
| 7        | NULL     |
| 8        | 4        |
| 9        | NULL     |
| 10       | 5        |
+---------------------+

Since there is no foreign key relation or something I could fix the columns to their corresponding bucket (but only the relation Bucket.Size >= Item.Size) I'm have a lot of trouble describing the result with a valid SQL query. Whenever I use joins or sub selects, I get items in buckets, that are to big (like having an item of size 2 in a bucket of size 12, while a bucket of size 2 is still available) or I get the same item in multiple buckets.

I spent some time now to find the solution myself and I am close to say, that it is better not to declare the problem in SQL but in an application, that is just fetching the tables.

Do you think this task is possible in SQL? And if so, I would really appreciate if you can help me out with a working query.

Edit : The query should be compatible to at least Oracle, Postgres and SQLite databases

Edit II: An SQL Fiddle with the given test set above an example query, that returns a wrong result, but is close, to what the result could look like http://sqlfiddle.com/#!15/a6c30/1


回答1:


Using the table definition from @SoulTrain (but requiring the data to be sorted in advance):

; WITH ORDERED_PAIRINGS AS (
    SELECT i.ITEMID, b.BUCKETID, ROW_NUMBER() OVER (ORDER BY i.SIZE, i.ITEMID, b.SIZE, b.BUCKETID) AS ORDERING, DENSE_RANK() OVER (ORDER BY b.SIZE, b.BUCKETID) AS BUCKET_ORDER, DENSE_RANK() OVER (PARTITION BY b.BUCKETID ORDER BY i.SIZE, i.ITEMID) AS ITEM_ORDER
    FROM @ITEM i
    JOIN @BUCKET b
      ON i.SIZE <= b.SIZE
), ITEM_PLACED AS (
    SELECT ITEMID, BUCKETID, ORDERING, BUCKET_ORDER, ITEM_ORDER, CAST(1 as int) AS SELECTION
    FROM ORDERED_PAIRINGS
    WHERE ORDERING = 1
    UNION ALL
    SELECT *
    FROM (
        SELECT op.ITEMID, op.BUCKETID, op.ORDERING, op.BUCKET_ORDER, op.ITEM_ORDER, CAST(ROW_NUMBER() OVER(ORDER BY op.BUCKET_ORDER) as int) as SELECTION
        FROM ORDERED_PAIRINGS op
        JOIN ITEM_PLACED ip
          ON op.ITEM_ORDER = ip.ITEM_ORDER + 1
         AND op.BUCKET_ORDER > ip.BUCKET_ORDER
    ) AS sq
    WHERE SELECTION = 1
)
SELECT *
FROM ITEM_PLACED



回答2:


Try this... I was able to implement this using recursive CTE, all in 1 single SQL statement

The only assumption I had was that the Bucket and Item data set are sorted.

DECLARE @BUCKET TABLE
    (
     BUCKETID INT
     , SIZE INT
    )

    DECLARE @ITEM TABLE
    (
     ITEMID INT
     , SIZE INT
    )
    ;  
    INSERT INTO @BUCKET
    SELECT 1,2 UNION ALL
    SELECT 2,2 UNION ALL
    SELECT 3,2 UNION ALL
    SELECT 4,4 UNION ALL
    SELECT 5,4 UNION ALL
    SELECT 6,7 UNION ALL
    SELECT 7,9 UNION ALL
    SELECT 8, 11 UNION ALL
    SELECT 9, 11 UNION ALL
    SELECT 10,12 

    INSERT INTO @ITEM
    SELECT 1,2 UNION ALL
    SELECT 2,2 UNION ALL
    SELECT 3,5 UNION ALL
    SELECT 4,11 UNION ALL
    SELECT 5,12;

    WITH TOTAL_BUCKETS
    AS (
        SELECT MAX(BUCKETID) CNT
        FROM @BUCKET
        ) -- TO GET THE TOTAL BUCKETS COUNT TO HALT THE RECURSION
        , CTE
    AS (
        --INVOCATION PART
        SELECT BUCKETID
            , (
                SELECT MIN(ITEMID)
                FROM @ITEM I2
                WHERE I2.SIZE <= (
                        SELECT SIZE
                        FROM @BUCKET
                        WHERE BUCKETID = (1)
                        )
                ) ITEMID --PICKS THE FIRST ITEM ID MATCH FOR THE BUCKET SIZE
            , BUCKETID + 1 NEXT_BUCKETID --INCREMENT FOR NEXT BUCKET ID 
            , (
                SELECT ISNULL(MIN(ITEMID), 0)
                FROM @ITEM I2
                WHERE I2.SIZE <= (
                        SELECT SIZE
                        FROM @BUCKET
                        WHERE BUCKETID = (1)
                        )
                ) --PICK FIRST ITEM ID MATCH
            + (
                CASE 
                    WHEN (
                            SELECT ISNULL(MIN(ITEMID), 0)
                            FROM @ITEM I3
                            WHERE I3.SIZE <= (
                                    SELECT SIZE
                                    FROM @BUCKET
                                    WHERE BUCKETID = (1)
                                    )
                            ) IS NOT NULL
                        THEN 1
                    ELSE 0
                    END
                ) NEXT_ITEMID --IF THE ITEM IS PLACED IN THE BUCKET THEN INCREMENTS THE FIRST ITEM ID
            , (
                SELECT SIZE
                FROM @BUCKET
                WHERE BUCKETID = (1 + 1)
                ) NEXT_BUCKET_SIZE --STATES THE NEXT BUCKET SIZE
        FROM @BUCKET B
        WHERE BUCKETID = 1

        UNION ALL

        --RECURSIVE PART
        SELECT NEXT_BUCKETID BUCKETID
            , (
                SELECT ITEMID
                FROM @ITEM I2
                WHERE I2.SIZE <= NEXT_BUCKET_SIZE
                    AND I2.ITEMID = NEXT_ITEMID
                ) ITEMID -- PICKS THE ITEM ID IF IT IS PLACED IN THE BUCKET
            , NEXT_BUCKETID + 1 NEXT_BUCKETID --INCREMENT FOR NEXT BUCKET ID 
            , NEXT_ITEMID + (
                CASE 
                    WHEN (
                            SELECT I3.ITEMID
                            FROM @ITEM I3
                            WHERE I3.SIZE <= NEXT_BUCKET_SIZE
                                AND I3.ITEMID = NEXT_ITEMID
                            ) IS NOT NULL
                        THEN 1
                    ELSE 0
                    END
                ) NEXT_ITEMID --IF THE ITEM IS PLACED IN THE BUCKET THEN INCREMENTS THE CURRENT ITEM ID
            , (
                SELECT SIZE
                FROM @BUCKET
                WHERE BUCKETID = (NEXT_BUCKETID + 1)
                ) NEXT_BUCKET_SIZE --STATES THE NEXT BUCKET SIZE
        FROM CTE
        WHERE NEXT_BUCKETID <= (
                SELECT CNT
                FROM TOTAL_BUCKETS
                ) --HALTS THE RECURSION
        )
    SELECT 
        BUCKETID
        , ITEMID
    FROM CTE



回答3:


I'd say that a single SQL query is maybe not the tool for the job as the buckets are "consumed" by allocating items to them. You can use SQL, but not a single query. suggestion in pseudocode below:

Have a cursor on ITEM: 
    within the FETCH loop for that {
    SELECT in BUCKET the bucket with minimum bucket id and bucket size >= item size 
    INSERT bucket id, item id to MAPPING
}

If you need the NULL (unoccupied) buckets, you can locate them via a further 
INSERT into MAPPING (....)
SELECT <bucket id>, NULL
from    BUCKET 
where <bucket id> not in (SELECT <bucket id> from MAPPING);


来源:https://stackoverflow.com/questions/22539962/mapping-fields-of-weakly-related-tables-in-sql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!