randomly sort a list with bias

问题

I have a list as follows:

i = [
        {'id': '1', 'P': 0.5},
        {'id': '2', 'P': 0.4},
        {'id': '3', 'P': 0.8},
        ...
        {'id': 'x', 'P': P(x)}
    ]

When I do the following:

i.sort(key=lambda x: x['P'], reverse=True)

The list gets sorted based on P where the element with the largest P is in front. but what if I want to make it seem randomized so that even a element with a small P value (with very small probability) can be the first item in the list?
Is it possible to implement that using the sort() function or do I have to write my own?

回答1:

As I alluded to in a comment, you could sort with a level of random bias by sampling a bias factor from a standard normal distribution. You can then add this bias (which is symmetrical either side of zero) to your P value.

import numpy as np

#Give some control over the level of rearrangement - larger equals more rearrangement
bias_factor = 0.5

i.sort(key=lambda x: x['P'] + np.random.randn()*bias_factor, reverse=True)

Or if you just want to use the standard library:

from random import gauss

#Give some control over the level of rearrangement - larger equals more rearrangement
sigma = 0.5

i.sort(key=lambda x: x['P'] + gauss(0, sigma), reverse=True)

回答2:

According to my understanding, you want to have the list sorted but still want some some randomness. A list cannot be sorted as well as random at the same time. The nearest can be achieved via something like

i.sort(key=lambda x: x['P']*random.triangular(0.6, 1.4), reverse=True)

This while ensuring that the order is not exactly random as well as not sorted in a similar way.

The values 0.6 and 1.4 can be changed depending on the deviation you want.

回答3:

Here is my take on it: Suppose that item 1 has similarity score 0.3 and item 2 has similarity score 0.4. Then, it seems reasonable that when choosing between these two items, item 1 should be before item 2 with probability 0.3 / (0.3 + 0.4). This way the higher-scored item is usually before the lower-scored. The following functions implement a variant of merge-sort which incorporates this idea:

import random

def pick(a,b,key = None):
    ka = key(a) if key else a
    kb = key(b) if key else b
    if ka+kb == 0:
        return random.int(0,1)
    else:
        p = ka/(ka+kb)
        return 0 if random.random() <= p else 1

def randMerge(xs,ys,key = None):
    merged = []
    i = j = 0
    while i < len(xs) and j < len(ys):
        k = pick(xs[i],ys[j],key)
        if k == 0:
            merged.append(xs[i])
            i += 1
        else:
            merged.append(ys[j])
            j += 1
    if i == len(xs):
        merged.extend(ys[j:])
    else:
        merged.extend(xs[i:])
    return merged

def randSort(items,key = None):
    if len(items) < 2:
        return items
    else:
        n = len(items)//2
        xs = items[:n]
        ys = items[n:]
        return randMerge(randSort(xs,key),randSort(ys,key),key)

For testing:

i = [
        {'id': '1', 'P': 0.5},
        {'id': '2', 'P': 0.4},
        {'id': '3', 'P': 0.8},
        {'id': '4', 'P': 0.9}
    ]

For example:

>>> for j in range(5): print(randSort(i,key = lambda x: x['P']))

[{'P': 0.5, 'id': '1'}, {'P': 0.9, 'id': '4'}, {'P': 0.8, 'id': '3'}, {'P': 0.4, 'id': '2'}]
[{'P': 0.8, 'id': '3'}, {'P': 0.5, 'id': '1'}, {'P': 0.9, 'id': '4'}, {'P': 0.4, 'id': '2'}]
[{'P': 0.9, 'id': '4'}, {'P': 0.5, 'id': '1'}, {'P': 0.8, 'id': '3'}, {'P': 0.4, 'id': '2'}]
[{'P': 0.5, 'id': '1'}, {'P': 0.8, 'id': '3'}, {'P': 0.9, 'id': '4'}, {'P': 0.4, 'id': '2'}]
[{'P': 0.8, 'id': '3'}, {'P': 0.4, 'id': '2'}, {'P': 0.9, 'id': '4'}, {'P': 0.5, 'id': '1'}]

来源：https://stackoverflow.com/questions/38455356/randomly-sort-a-list-with-bias

标签

python

list

sorting

random