Question
I want to generate a list of, say, length 10000 from two items ('yes', 'no'), and the code I have does that. The problem is that it generates roughly 50% yes and 50% no. How can I modify this code so that I can set the percentage of the time it selects yes? Suppose I want yes 36.7% of the time; it should then select no the remaining 63.3% of the time. The code is below:
import random
category = ('yes','no')
length_of_field = 10000
print(length_of_field)
print(type(category))
category_attribute = [random.choice(category) for _ in range(length_of_field)]
print('\ncategory:')
print(len(category_attribute))
print(type(category_attribute))
from collections import Counter
counts = Counter(category_attribute)
a = counts.keys()
b = counts.values()
print(a, b)
Answer 1:
import numpy as np
alist = np.random.choice(["no", "yes"], 1000, p=[0.633, 0.367])
Or with the built-in random module (random.choices requires Python 3.6+):
import random
alist = random.choices(["no", "yes"], weights=[0.633, 0.367], k=1000)
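To confirm that the weights behave as intended, the observed proportion can be counted, much as the question does with Counter. This is only a minimal sketch: the names sample and yes_fraction and the fixed seed are illustrative assumptions, not part of the original answer.

```python
import random
from collections import Counter

random.seed(0)  # assumption: fixed seed only so the sketch is repeatable

n = 10000
# Weighted draw: "no" with weight 0.633, "yes" with weight 0.367
sample = random.choices(["no", "yes"], weights=[0.633, 0.367], k=n)

counts = Counter(sample)
yes_fraction = counts["yes"] / n

print(counts)
print(f"yes fraction: {yes_fraction:.3f}")  # should land close to 0.367
```

The fraction is random, so it will be near 36.7% rather than exactly equal to it.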
Or with an explicit loop:
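def generate_some_dist(p, n):
    '''
    p: probability (0 to 1) of generating "yes"
    n: number of items to generate
    '''
    a = []
    for _ in range(n):
        # random.random() is in [0, 1), so "< p" is true with probability p
        if random.random() < p:
            a.append("yes")
        else:
            a.append("no")
    return a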
a = generate_some_dist(.367, 10000)
Or as a list comprehension:
p = 0.367
n = 1000
a = ["yes" if random.random() < p else "no" for _ in range(n)]
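All of the approaches above draw each item independently, so the share of "yes" is only approximately 36.7%. If the list needs to contain exactly that share, one possible variation (not from the original answer) is to build the counts first and then shuffle. The function name exact_split and its seed parameter are hypothetical, chosen here for illustration.

```python
import random

def exact_split(p, n, seed=None):
    """Return a shuffled list of n items with round(p * n) "yes" entries."""
    rng = random.Random(seed)   # local generator; seed is optional
    n_yes = round(p * n)        # number of "yes" entries, fixed up front
    items = ["yes"] * n_yes + ["no"] * (n - n_yes)
    rng.shuffle(items)          # randomize the order in place
    return items

a = exact_split(0.367, 10000, seed=0)
print(a.count("yes"), a.count("no"))  # 3670 6330
```

The trade-off is that the items are no longer independent draws: the total is fixed, and only the order is random.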
Source: https://stackoverflow.com/questions/58346227/generating-a-random-list-from-a-tuple-but-being-able-to-select-percentage-of-eac