问题
Problem: In total, I have 1-300 positions to fill in, I have 50 items, each item has 6 unique positions to choose (300 total positions). The average position for each of these items needs to be in range 145-155 (the closer to 150 the better)
Constraints: For each item, each of its 6 positions must fall within a list of fixed ranges (called runs), but not be within the same run,
e.g. The runs are: [(1,36), (37,73), (74,110), (111,148), (149,186), (187,225), (226,262), (263, 300)]
. So item 1 can have position 1, 38, 158, 198, 238, 271 - so not within the same run twice.
The positions chosen should be random - or as random as possible
The issue I'm having: Constraining the randomly selected positions to be an average of 150 (or very close) seems to be really difficult. I've tried various different implementations of the code but most will result in it hanging close to the end (due to not having enough positions to choose from), or don't get very close. My best attempt just involves if statements I've placed to at least try to restrict the range but it still only gets a range of around 130-170. This feels like a really poor method and I was wondering if there was a way to do it algorithmically instead of just pasta'ing if statements in hopes something will work.
My best attempt with random band-aid restrictions: https://pyfiddle.io/fiddle/dfc6dfd1-1e12-46df-957c-d7ae0e94fbe3/?i=true
^ as you can see the averages vary, with most things being in an acceptable range and some just being off/really off
I've spent several weeks on this but genuinely cannot think of anything to restrict it to an appropriate average (145-155) and was hoping for any help here
回答1:
This approach uses dynamic programming to randomly extract as good a group as it can 50 times. It then throws the imperfect ones back in with a few good ones and tries the same thing. It keeps doing this while relaxing the definition of perfect until it gets 50 acceptable groups.
It is slow. (About 20 seconds on my laptop.) But it frequently gets all groups to have the exact same average. Across a number of runs I did not see a single group out of the range [150.0, 151.0]
.
#! /usr/bin/env python3
import math
import collections
import random
BaseNode = collections.namedtuple('BaseNode', 'sum value ways run prev_node')
class Node(BaseNode):
def merge(self, other):
if self.sum != other.sum:
print(self)
print(other)
raise Exception('Can only merge nodes with the same sum.')
ways = self.ways + other.ways
if self.ways <= random.randint(1, ways):
return Node(sum=self.sum, value=self.value, ways=ways,
run=self.run, prev_node=self.prev_node)
else:
return Node(sum=other.sum, value=other.value, ways=ways,
run=other.run, prev_node=other.prev_node)
def extract_group(self):
values = []
node = self
while node.value is not None:
node.run.remove(node.value)
values.append(node.value)
node = node.prev_node
return sorted(values)
def random_next_group_from_runs (runs):
runs_by_count = {}
# organize by count
for run in runs:
count = len(run)
if count in runs_by_count:
runs_by_count[count].append(run)
else:
runs_by_count[count] = [run]
required_runs = []
optional_runs = []
largest = max(runs_by_count.keys())
if 6 < len(runs_by_count[largest]):
largest = largest + 1
else:
required_runs = runs_by_count[largest]
for count, these_runs in runs_by_count.items():
if count < largest:
optional_runs.extend(these_runs)
# We start with the empty sum.
node_by_sum_by_count = {0: {0: Node(sum=0, value=None, ways=1, run=None, prev_node=None)}}
# We update to use a value from each required run.
for run in required_runs:
new_node_by_sum_by_count = {}
for count, node_by_sum in node_by_sum_by_count.items():
if count+1 not in new_node_by_sum_by_count:
new_node_by_sum_by_count[count+1] = {}
new_node_by_sum = new_node_by_sum_by_count[count+1]
for s, node in node_by_sum.items():
for i in run:
new_node = Node(sum=s+i, value=i, ways=node.ways, run=run, prev_node=node)
if s+i not in new_node_by_sum:
new_node_by_sum[s+i] = new_node
else:
# This merge hides the random choice of which one to take.
new_node_by_sum[s+i] = new_node_by_sum[s+i].merge(new_node)
node_by_sum_by_count = new_node_by_sum_by_count
# We may use a value from each optional run.
for run in optional_runs:
new_node_by_sum_by_count = {}
for count, node_by_sum in node_by_sum_by_count.items():
# The options where we do not use this run.
if count not in new_node_by_sum_by_count:
new_node_by_sum_by_count[count] = {}
new_node_by_sum = new_node_by_sum_by_count[count]
for s, node in node_by_sum.items():
if s not in new_node_by_sum:
new_node_by_sum[s] = node
else:
new_node_by_sum[s] = new_node_by_sum[s].merge(node)
# The options where we do use this run.
if count+1 not in new_node_by_sum_by_count:
new_node_by_sum_by_count[count+1] = {}
new_node_by_sum = new_node_by_sum_by_count[count+1]
for s, node in node_by_sum.items():
for i in run:
new_node = Node(sum=s+i, value=i, ways=node.ways, run=run, prev_node=node)
if s+i not in new_node_by_sum:
new_node_by_sum[s+i] = new_node
else:
# This merge hides the random choice of which one to take.
new_node_by_sum[s+i] = new_node_by_sum[s+i].merge(new_node)
node_by_sum_by_count = new_node_by_sum_by_count
# Widening sums close to 903
avg = 903
def try_sum():
yield avg
i = 1
while True:
yield avg - i
yield avg + i
i = i+1
# We only want groups with 6 elements.
node_by_sum = node_by_sum_by_count[6]
for i in try_sum():
if i in node_by_sum:
return node_by_sum[i].extract_group()
runs = [
set(range(1, 37)),
set(range(37, 74)),
set(range(74, 111)),
set(range(111, 149)),
set(range(149, 187)),
set(range(187, 226)),
set(range(226, 263)),
set(range(263, 301)),
]
in_run = {}
for i in range(len(runs)):
for j in runs[i]:
in_run[j] = i
#runs = [ set(range(i*36+1, i*36+37)) for i in range(8) ]
groups = []
bad_groups = []
attempt = 0
while attempt == 0 or 0 < len(bad_groups):
attempt = attempt + 1
# We add a few groups to bad groups in principle.
for i in range(attempt):
if 20 < len(groups):
bad_groups.append(groups.pop())
for group in bad_groups:
for i in group:
runs[in_run[i]].add(i)
bad_groups = []
while len(groups) + len(bad_groups) < 50:
group = random_next_group_from_runs(runs)
if abs(sum(group) - 903) <= math.floor(attempt/5.0):
groups.append(group)
else:
bad_groups.append(group)
random.shuffle(groups)
for group in groups:
print([sum(group), group])
回答2:
Take an item, choose a random run and a random position inside that run. 'Put' the item in its place. The next run is non-random: take the 'mirrored' one. I mean if the first run was 0 - (1,36), then next is 7 - (263, 300). If the first is 5 - (187,225), then next is 2 - (74,110). The position in the 'mirrored' run can be random, but since run is pretty wide, I believe you should also divide run into sub-runs and mirror positions also.
import numpy as np
from random import randint
runs = np.array([(1,36), (37,73), (74,110), (111,148), (149,186), (187,225), (226,262), (263, 300)])
#number of runs
numOfRuns = len(runs)
#number of subruns in a run
numberOfSubruns = 4
#maximum error when 'mirroring'
maxMirroringError = 0
#number of items
numOfItems = 50
#number of positions, must be even
numOfPos = 6
#50 items, 6 positions each
items=np.zeros((numOfItems,numOfPos))
for ii in range(numOfItems):
#create an array containing "used" runs.
usedruns=np.zeros(numOfRuns,dtype=bool)
for jj in range(0,numOfPos,2):
while True:
#choose run index
run_ind = randint(0,numOfRuns-1)
#check if it is not used
if not usedruns[run_ind]:
break;
#make run used for this item
usedruns[run_ind] = True
#choose position index
pos_ind = randint(runs[run_ind][0],runs[run_ind][1])
#store the result
items[ii][jj] = pos_ind
#we need this adjustment in the case of maxMirroringError!=0
#or else we get an infinite loop
adjuster=0
while True:
#find mirrored run
mirrored_run_ind = numOfRuns - 1 - run_ind+ randint(-(maxMirroringError+adjuster),maxMirroringError+adjuster)
#apply constraints for mirrored_run_ind to be in [0,numOfRuns)
mirrored_run_ind = np.min([mirrored_run_ind,numOfRuns-1])
mirrored_run_ind = np.max([mirrored_run_ind,0])
adjuster +=1
if not usedruns[mirrored_run_ind]:
break;
#make run used for this item
usedruns[mirrored_run_ind] = True
#get size of a particular subrun
subrun_size = (runs[run_ind][1]-runs[run_ind][0]) // numberOfSubruns
#find subrun's index
subrun_ind = (pos_ind-runs[run_ind][0]) // subrun_size
#find mirrored subrun's index
mirrored_subrun_ind = numberOfSubruns - subrun_ind - 1
#apply constraints for mirrored_run_ind to be in [0,numberOfSubruns)
mirrored_subrun_ind = np.min([mirrored_subrun_ind,numberOfSubruns-1])
mirrored_subrun_ind = np.max([mirrored_subrun_ind,0])
#find mirrored pos
#lower bound
a = (runs[mirrored_run_ind][1]-runs[mirrored_run_ind][0]) * mirrored_subrun_ind // numberOfSubruns
#upper bound is lower bound + mirrored subrun size
b = (runs[mirrored_run_ind][1]-runs[mirrored_run_ind][0]) * (mirrored_subrun_ind + 1) // numberOfSubruns
#index itself
mirrored_pos_ind = runs[mirrored_run_ind][0] + randint(a,b)
#store the result
items[ii][jj+1] = mirrored_pos_ind
#check for negatives
if (items<0).any():
print ('negative value encountered!')
raise;
#check for same runs
storedruns=np.zeros(numOfRuns,dtype=bool)
for pos in items[ii]:
for kk in range(numOfRuns):
if pos>=runs[kk][0] and pos<=runs[kk][1]:
if storedruns[kk]:
print('Same runs for an item encountered!')
raise;
else:
storedruns[kk] = kk;
#print an average of position of an item
# print(np.average(items[ii,:]))
print('\n')
# print a standard deviation of average of items
print(np.std(np.average(items,axis=1)))
# print('\n')
# print(items)
You can change numberOfSubruns & maxMirroringError
and see what fits your case. Your case (total randomness) is numberOfSubruns=1, maxMirroringError = numOfRuns
(equals 8). If you set numberOfSubruns = 4, maxMirroringError = 0
, you will get average very close to 148.
For your convenience, I put standard deviation calculation of averaged positions to the end.
来源:https://stackoverflow.com/questions/59505520/how-to-constrain-items-with-multiple-randomly-selected-positions-so-that-the-ave