select 30 random rows where sum amount = x

感情败类 2020-12-10 02:39

I have a table

items
id int unsigned auto_increment primary key,
name varchar(255),
price DECIMAL(6,2)

I want to get at least 30 random ite

7 Answers
  • 2020-12-10 03:17

    There is a solution if your product list satisfies the following assumption:

    You have products at every price between 0.00 and 500.00, e.g. 0.01, 0.02 and so on up to 499.99, or perhaps 0.05, 0.10 and so on up to 499.95.

    The algorithm is based on the following:

    In a collection of n positive numbers that sum up to S, at least one of them is no greater than S divided by n (S/n).

    In this case, the steps are:

    1. Select a product randomly where price < 500/30. Get its price, let's say X.
    2. Select a product randomly where price < (500 - X)/29. Get its price, say Y.
    3. Select a product randomly where price < (500 - X - Y)/28.

    Repeat this until you have 29 products. For the last product, select one where price = remaining price (or where price <= remaining price, ordered by price descending, and hopefully you get close enough).
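
    The steps above can be sketched outside the database to check the logic. This is a minimal Python sketch, not the stored procedures themselves: the function name `pick_29` and the in-memory catalogue holding every price from 0.01 to 499.99 are assumptions made for illustration.

```python
import random
from decimal import Decimal

def pick_29(prices, total=Decimal("500.00"), count=30):
    """Greedily pick count-1 prices; return (picked, remaining)."""
    remaining = total
    picked = []
    # slots_left runs from 30 down to 2, which gives 29 picks.
    for slots_left in range(count, 1, -1):
        cap = remaining / slots_left
        candidates = [p for p in prices if p < cap]
        if not candidates:
            raise ValueError("no product priced below the current cap")
        choice = random.choice(candidates)
        picked.append(choice)
        remaining -= choice
    return picked, remaining

# Hypothetical catalogue: one product at every price from 0.01 to 499.99.
prices = [Decimal(cents) / 100 for cents in range(1, 50000)]
picked, remaining = pick_29(prices)
print(len(picked), "products picked; the last one must cost", remaining)
```

    Because each pick is below remaining/slots_left, the remaining amount after 29 picks stays above 500/30, so under the stated assumption a last product with exactly that price exists.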

    For the table items:

    Get a random product below a maximum price:

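    CREATE PROCEDURE getRandomProduct (IN maxPrice DECIMAL(8,2), OUT productId INT, OUT productPrice DECIMAL(8,2))
    BEGIN
       -- maxPrice is DECIMAL so that remainingPrice/x is not rounded to an integer.
       -- Both results are OUT parameters; no local DECLARE may shadow them.
       SET productId = 0;
       -- Pick one random product priced below maxPrice.
       SELECT id, price INTO productId, productPrice
       FROM items
       WHERE price < maxPrice
       ORDER BY RAND()
       LIMIT 1;
    END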
    

    Get 29 random products:

    CREATE PROCEDURE get29products(OUT str VARCHAR(512), OUT remainingPrice DECIMAL(8,2))
    BEGIN
      DECLARE x INT;
      SET x = 30;
      SET str = '';
      SET remainingPrice = 500.00;

      -- x is the number of products still to pick; the loop runs for x = 30 down to 2 (29 picks).
      REPEAT
        CALL getRandomProduct(remainingPrice/x, @id, @price);
        -- str collects the picked ids as ',id1,id2,...'
        SET str = CONCAT(str, ',', @id);
        SET x = x - 1;
        SET remainingPrice = remainingPrice - @price;
        UNTIL x <= 1
      END REPEAT;
    END
    

    Call the procedure:

    CALL `get29products`(@p0, @p1); SELECT @p0 AS `str`, @p1 AS `remainingPrice`;
    

    In the end, try to find the last product that brings the total to 500.

    Alternatively, you could select 28 and use the solution on the linked question you provided to get a couple of products that sum to the remaining price.

    Note that duplicate products are allowed. To avoid duplicates, you could extend getRandomProduct with an additional IN parameter of the products already found and add a condition NOT IN to exclude them.

    Update: You could overcome the above limitation, so that you always find collections that sum to 500, by using a cron process as described in the 2nd section below.

    2nd section: Using a cron process

    Building on @Michael Zukowski's suggestion, you could

    • create a table to hold the collections found
    • define a cron process that runs the above algorithm a number of times (for example 10 times), e.g. every 5 minutes
    • if a collection is found that matches the sum, add it to the new table

    This way you can find collections that always sum exactly to 500. When a user makes a request, you could select a random collection from the new table.

    Even with a match rate of 20%, a cron process that runs the algorithm 10 times every 5 minutes could find more than 500 collections in 24 hours.
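
    The estimate can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the ones from the paragraph above.

```python
runs_per_tick = 10            # algorithm runs per cron invocation
ticks_per_day = 24 * 60 // 5  # one invocation every 5 minutes -> 288 per day
match_rate_percent = 20       # share of runs that hit the target sum exactly

runs_per_day = runs_per_tick * ticks_per_day
expected_matches = runs_per_day * match_rate_percent // 100
print(runs_per_day, expected_matches)
```

    That is 2880 runs and about 576 expected matches per day, which is where "more than 500" comes from.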

    Using a cron process has the following advantages and disadvantages in my opinion:

    Advantages

    • find exact matches
    • no process on client request
    • even with a low match rate, you can find several collections

    Disadvantages

    • if the price data is updated frequently, stored collections can become inconsistent, and a cron process may not work at all
    • you have to discard or filter old collections
    • it will probably not be random per client, as different clients will probably see the same collection
  • 2020-12-10 03:18

    The closest answer I can provide is this

    set @cnt = 0;
    set @cursum = 0;
    set @cntchanged = 0;
    set @uqid = 1;
    set @maxsumid = 1;
    set @maxsum = 0;
    select 
        t.id,
        t.name,
        t.cnt
    from (
        select 
            id + 0 * if(@cnt = 30, (if(@cursum > @maxsum, (@maxsum := @cursum) + (@maxsumid := @uqid), 0)) + (@cnt := 0) + (@cursum := 0) + (@uqid := @uqid + 1), 0) id, 
            name,  
            @uqid uniq_id,
            @cursum := if(@cursum + price <= 500, @cursum + price + 0 * (@cntchanged := 1) + 0 * (@cnt := @cnt + 1), @cursum + 0 * (@cntchanged := 0)) as cursum, if(@cntchanged, @cnt, 0) as cnt  
        from (select id, name, price from items order by rand() limit 10000) as orig
    ) as t
    
    where t.cnt > 0 and t.uniq_id = @maxsumid
    ;
    

    How does it work? First we select 10k randomly ordered rows from items. Then we add up item prices, skipping any item that would push the running sum above 500, until we have 30 items. Once we have 30 items we start over, and repeat until we have walked through all 10k selected rows. Along the way we remember the group with the largest sum. At the end we select the 30 items of that group, the one closest to the target of 500. I am not sure this is what you originally wanted, but finding a sum of exactly 500 would require too much effort on the DB side.
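
    The same scan is easier to follow outside SQL. The following is a rough Python sketch of the idea as described, not a line-by-line port of the query: `best_group` is a hypothetical name, prices are kept in integer cents to avoid rounding, and the catalogue is made-up data.

```python
import random

def best_group(prices_cents, group_size=30, limit_cents=50000, sample_size=10000):
    """Scan a random sample and return the complete group with the largest sum."""
    rows = random.sample(prices_cents, min(sample_size, len(prices_cents)))
    best, best_sum = [], 0
    current, current_sum = [], 0
    for price in rows:
        # Take the item only if the running sum stays within the limit.
        if current_sum + price <= limit_cents:
            current.append(price)
            current_sum += price
            if len(current) == group_size:
                # A group is complete: keep it if it beats the best so far.
                if current_sum > best_sum:
                    best, best_sum = current, current_sum
                current, current_sum = [], 0
    return best, best_sum

# Hypothetical data: 2000 prices between 0.01 and 30.00, in cents.
catalogue = [random.randint(1, 3000) for _ in range(2000)]
group, total = best_group(catalogue)
print(len(group), total / 100)
```

    If no group of 30 is ever completed, the sketch returns an empty list, so a caller would need to handle that case.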

  • 2020-12-10 03:20

    If you have read the MySQL manual, you might have seen ORDER BY RAND() used to randomize the rows.

    That approach works fine and is fast if you only have, say, 1000 rows. As soon as you have 10000 rows, the overhead of sorting the rows becomes significant. Don't forget: we only sort to throw nearly all the rows away.

    A great post handling several cases, from simple, to gaps, to non-uniform with gaps.

    Here is how you can do it perfectly:

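    -- Columns are qualified with i1 because id exists in both i1 and i2.
    -- Note: price = 500 matches single rows priced at exactly 500; it does not constrain the sum.
    SELECT i1.id, i1.name, i1.price
     FROM `items` AS i1 JOIN
        (SELECT CEIL(RAND() *
                     (SELECT MAX(id)
                        FROM `items`)) AS id) AS i2
     WHERE i1.id >= i2.id AND i1.price = 500
     ORDER BY i1.id ASC
    LIMIT 30;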
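
    The random-offset idea behind this query can be tried with SQLite from Python. This is only a sketch under assumptions: the in-memory table, its sample rows and the helper name `random_run` are made up for illustration, and the price filter is left out. Note that it returns a run of consecutive ids starting at a random point, so the rows are not independent draws, and it does nothing about the price total.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO items (name, price) VALUES (?, ?)",
    [(f"item{i}", i * 0.5) for i in range(1, 1001)],
)

def random_run(conn, n=30):
    """Return up to n rows starting at a randomly chosen id, without sorting the table."""
    max_id = conn.execute("SELECT MAX(id) FROM items").fetchone()[0]
    start = random.randint(1, max_id)
    return conn.execute(
        "SELECT id, name, price FROM items WHERE id >= ? ORDER BY id LIMIT ?",
        (start, n),
    ).fetchall()

rows = random_run(conn)
print(len(rows), rows[0])
```

    When the random start falls near the largest id, fewer than 30 rows come back.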
    
  • 2020-12-10 03:23

    Depending on the average price and the price distribution you could try something like this:

    1. Randomly select slightly fewer items than you want in total (e.g. 25 out of 30). Retry until their total amount is less than x.

    2. Then use the concept linked in your question to find a combination that provides the remaining amount.
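
    The two steps can be sketched in Python as follows. This is an illustration under assumptions, not the method from the linked question: `subset_with_sum` is a plain dynamic-programming subset search written for this example, `pick_items` and its parameters are hypothetical, and amounts are integers (e.g. cents).

```python
import random

def subset_with_sum(prices, target, size):
    """Return indices of `size` distinct items summing to `target`, or None."""
    # reachable[(k, s)] = indices of k items whose prices sum to s
    reachable = {(0, 0): ()}
    for i, p in enumerate(prices):
        # Iterate over a snapshot so each item is used at most once per path.
        for (k, s), idxs in list(reachable.items()):
            key = (k + 1, s + p)
            if k < size and s + p <= target and key not in reachable:
                reachable[key] = idxs + (i,)
    return reachable.get((size, target))

def pick_items(prices, total, count, random_part, max_tries=1000):
    """Step 1: random_part random items below total. Step 2: exact fill for the rest."""
    for _ in range(max_tries):
        idx = random.sample(range(len(prices)), random_part)
        partial = sum(prices[i] for i in idx)
        if partial >= total:
            continue  # retry until the random part leaves room
        chosen = set(idx)
        rest = [i for i in range(len(prices)) if i not in chosen]
        tail = subset_with_sum([prices[i] for i in rest], total - partial, count - random_part)
        if tail is not None:
            return idx + [rest[j] for j in tail]
    return None

# Hypothetical data: 40 items priced 1..40, 8 items wanted with a total of 200.
prices = list(range(1, 41))
result = pick_items(prices, total=200, count=8, random_part=5)
print(result)
```

    The search table grows with the target amount and the number of items still needed, which is one reason to keep the non-random part small.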

  • 2020-12-10 03:32

    If you want to be efficient, stop wasting your time and go for eventual consistency. Create a console script that does what you want to accomplish by any means necessary, then run this script from cron or any other scheduling software once in a while.

    With 100 or 1000 visitors, would you want your query to be executed every time? That is time and resource consuming. Randomly ordered queries cannot be cached by the DBMS either. Go for eventual consistency: create a table to hold those records, and on each run lock it for writing, purge it, and load it with a new set, every 5 minutes for instance.

    At least this is how I do it in heavily loaded applications. In the code it's a matter of running a plain SELECT query.
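
    A refresh step of that kind might look like the sketch below. It uses SQLite from Python purely for illustration; the table name `featured_items`, the function `refresh_collection` and the sample ids are assumptions, and a single transaction stands in for the write lock mentioned above.

```python
import sqlite3

def refresh_collection(conn, item_ids):
    """Replace the stored collection in one transaction."""
    with conn:  # commits on success, rolls back if an error is raised
        conn.execute("DELETE FROM featured_items")
        conn.executemany(
            "INSERT INTO featured_items (item_id) VALUES (?)",
            [(i,) for i in item_ids],
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE featured_items (item_id INTEGER PRIMARY KEY)")

refresh_collection(conn, [3, 17, 42])  # first scheduled run
refresh_collection(conn, [5, 8])       # a later run replaces the old set

# What a page request would do: a plain SELECT.
rows = [r[0] for r in conn.execute("SELECT item_id FROM featured_items ORDER BY item_id")]
print(rows)
```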

  • 2020-12-10 03:33
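    1. First select all the rows where sum = 500.
    2. Run the query with mysqli_query.

    Then run the following code:

    $arr = array();
    // Collect the ids of all rows returned by the query.
    while ($row = mysqli_fetch_array($result))
    {
        array_push($arr, $row['id']);
    }
    $arr2 = array();
    // Stop at 30 ids, or earlier if there are fewer distinct ids to choose from.
    $target = min(30, count(array_unique($arr)));
    while (count($arr2) != $target)
    {
        // random_int (PHP 7+) is inclusive on both ends, hence count - 1.
        $cnt = random_int(0, count($arr) - 1);
        // Only add ids that have not been picked yet.
        if (!in_array($arr[$cnt], $arr2))
        {
            array_push($arr2, $arr[$cnt]);
        }
    }
    print_r($arr2);

    Here $arr2 is the required array.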
