random-sample

Sampling without replacement with unequal probabilities: linear run time possible?

只谈情不闲聊 submitted on 2019-12-12 10:13:35
Question: In search of a faster weighted sampling without replacement, the following question came up: Is there an algorithm that implements random sampling without replacement with unequal selection probabilities in time linear in the size of the input? An O(n log n) implementation has been suggested in an answer to this question. Can this be improved? Source: https://stackoverflow.com/questions/15114898/sampling-without-replacement-with-unequal-probabilites-linear-run-time-possib
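For context, here is a minimal Python sketch of one well-known scheme for weighted sampling without replacement, the key-based method of Efraimidis and Spirakis: each item gets the key u^(1/w) with u uniform in [0, 1), and the k items with the largest keys are kept. The function name and parameters are illustrative assumptions, not taken from the question. Selecting the top k with a heap costs O(n log k), so this does not settle whether a strictly linear algorithm exists.

```python
import heapq
import random


def weighted_sample_without_replacement(items, weights, k, rng=random):
    """Return k distinct items, favouring items with larger weights.

    Hypothetical helper: every item with a positive weight w gets the
    key u ** (1 / w), and the k largest keys win.
    """
    if k > len(items):
        raise ValueError("k cannot exceed the number of items")
    keyed = (
        (rng.random() ** (1.0 / w), item)
        for item, w in zip(items, weights)
        if w > 0  # items with zero weight can never be selected
    )
    # nlargest keeps a heap of size k, so the cost is O(n log k).
    best = heapq.nlargest(k, keyed, key=lambda pair: pair[0])
    return [item for _, item in best]
```

For example, `weighted_sample_without_replacement(list("abcde"), [1, 2, 3, 4, 5], 3)` returns three distinct letters, with "e" appearing more often than "a" over repeated calls.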

How can I get exactly n random lines from a file with Perl?

孤街醉人 submitted on 2019-12-12 08:54:00
Question: Following up on this question, I need to get exactly n lines at random out of a file (or stdin). This would be similar to head or tail, except I want some from the middle. Now, other than looping over the file with the solutions to the linked question, what's the best way to get exactly n lines in one run? For reference, I tried this:

    #!/usr/bin/perl -w
    use strict;
    my $ratio = shift;
    print $ratio, "\n";
    while (<>) {
        print if ((int rand $ratio) == 1);
    }

where $ratio is the rough percentage of
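A standard single-pass technique for drawing exactly n lines is reservoir sampling. The sketch below is in Python rather than Perl, and the function name is an assumption made for illustration. It keeps the first n lines and then replaces a random slot with decreasing probability, so every line ends up in the result with equal probability.

```python
import random


def reservoir_sample(lines, n, rng=random):
    """Return up to n lines chosen uniformly at random in one pass."""
    reservoir = []
    for i, line in enumerate(lines):
        if i < n:
            # Fill the reservoir with the first n lines.
            reservoir.append(line)
        else:
            # Pick a slot in 0..i (inclusive); the new line replaces an
            # old one only when the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = line
    return reservoir
```

Any iterable of lines works as input, for example an open file object or `sys.stdin`. If the input has fewer than n lines, all of them are returned.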

How to randomly select multiple small and non-overlapping matrices from a large matrix?

独自空忆成欢 submitted on 2019-12-11 23:09:16
Question: Let's say I have a large N x M matrix A (e.g. 1000 x 1000). Selecting k random elements without replacement from A is relatively straightforward in MATLAB:

    A = rand(1000,1000);   % Generate random data
    k = 5;                 % Number of elements to be sampled
    sizeA = numel(A);      % Number of elements in A
    idx = randperm(sizeA); % Random permutation
    B = A(idx(1:k));       % Random selection of k elements from A

However, I'm looking for a way to expand the above concept so that I could randomly select k non

Weighted sampling with 2 vectors

我的未来我决定 submitted on 2019-12-11 20:38:11
Question: (1) I have two column vectors, e.g.

    x = [283167.778 289387.207 289705.322]
    y = [9121643.314 9098348.666 9099832.621]

(2) I'd like to do a weighted random sampling using these vectors: when I select the element 289387.207 in vector x, I must also choose the element 9098348.666 in vector y. (3) Also, I have a weight vector w for each element in vectors x and y. How can I implement this in MATLAB? Thanks!

Answer 1: For random selection:

    sel_idx= randi(3);
    outputx = x(sel_idx);
    outputy =
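The idea of drawing one shared index and using it for both vectors extends to the weighted case by drawing that index according to w. Below is a minimal sketch in Python rather than MATLAB. The function name and the example weights are assumptions made for illustration.

```python
import random


def sample_paired(x, y, w, rng=random):
    """Draw one index with probability proportional to w and return
    the matching elements of x and y."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    # One weighted draw of an index keeps the two vectors aligned.
    idx = rng.choices(range(len(x)), weights=w, k=1)[0]
    return x[idx], y[idx]


x = [283167.778, 289387.207, 289705.322]
y = [9121643.314, 9098348.666, 9099832.621]
w = [0.2, 0.5, 0.3]  # hypothetical weights, not given in the question
print(sample_paired(x, y, w))
```

Because only the index is random, the element of x at a given position is always returned together with the element of y at that position.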

Creating Sets of Samples From Given dataframe using condition R

懵懂的女人 submitted on 2019-12-11 19:51:26
Question: I have an input table with more than 750K rows. It has a field called quarter. I want to create a sample such that I get 10% of the records from each quarter. The main attributes of the data.frame are: "SERIAL_NBR" "MODELNO" "War.Start.Monthly" "Start.Qua.Yr" is the field where the quarter is mentioned. Is there any way to generate sample data that contains 10% of the records for each quarter? Using the sample function I can get a sample regardless of the quarter. The code for that would be: raw_claim
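Taking a fixed fraction from every group is usually called stratified sampling: split the records by the grouping field, then sample within each group. The sketch below shows the idea in plain Python rather than R. It assumes the records are dictionaries, and the helper name is hypothetical. Only the field name "Start.Qua.Yr" comes from the question.

```python
import random
from collections import defaultdict


def sample_fraction_per_group(records, key, fraction, rng=random):
    """Return roughly `fraction` of the records from every group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)

    sampled = []
    for members in groups.values():
        # Round to the nearest whole number of records per group.
        n = round(len(members) * fraction)
        sampled.extend(rng.sample(members, n))
    return sampled
```

With 100 records in one quarter and 50 in another, `sample_fraction_per_group(records, "Start.Qua.Yr", 0.10)` returns 10 records from the first quarter and 5 from the second.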

Select element from collection with probability proportional to element value

一笑奈何 submitted on 2019-12-11 04:17:00
Question: I have a list of vertices, from which I have to pick a random vertex with probability proportional to deg(v), where deg(v) is the degree of vertex v. The pseudocode for this operation looks like this: Select u ∈ L with probability deg(u) / Σ_{v ∈ L} deg(v), where u is the randomly selected vertex, L is the list of vertices and v is a vertex in L. The problem is that I don't understand how to do it. Can someone explain to me how to get this random vertex? I would greatly appreciate if someone can
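One common way to implement this selection is with cumulative sums: lay the degrees end to end on a line of length Σ deg(v), draw a uniform point on that line, and return the vertex whose segment contains it. The Python sketch below illustrates this. The function name and the dictionary representation of degrees are assumptions made for illustration.

```python
import bisect
import itertools
import random


def pick_proportional_to_degree(vertices, degree, rng=random):
    """Pick a vertex u with probability deg(u) / sum of all degrees.

    `degree` maps each vertex to its degree (hypothetical layout).
    """
    cumulative = list(itertools.accumulate(degree[v] for v in vertices))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("at least one vertex must have positive degree")
    r = rng.random() * total  # uniform point on [0, total)
    # First position whose cumulative degree is strictly greater than r.
    idx = bisect.bisect_right(cumulative, r)
    # Guard against floating-point rounding at the upper end.
    return vertices[min(idx, len(vertices) - 1)]
```

A vertex of degree 0 owns a segment of length zero and is never chosen, while a vertex of degree 4 is chosen twice as often as one of degree 2.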

Weighted sampling with replacement in Java

守給你的承諾、 submitted on 2019-12-10 20:53:53
Question: Is there a function in Java, or in a library such as Apache Commons Math, which is equivalent to the MATLAB function randsample? More specifically, I want to find a function randSample which returns a vector of independent and identically distributed random variables according to the probability distribution which I specify. For example:

    int[] a = randSample(new int[]{0, 1, 2}, 5, new double[]{0.2, 0.3, 0.5})
    //        { 0 w.p. 0.2
    // a[i] = { 1 w.p. 0.3
    //        { 2 w.p. 0.5

The output is the same as the
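To make the intended semantics concrete, here is a sketch in Python rather than Java, so it does not answer which Java library to use. It mirrors the argument order of the hypothetical randSample above and delegates to the standard library function `random.choices`, which draws independently with replacement according to the given weights.

```python
import random


def rand_sample(values, n, probabilities, rng=random):
    """Return n i.i.d. draws from `values`, where values[i] is drawn
    with probability proportional to probabilities[i]."""
    if len(values) != len(probabilities):
        raise ValueError("values and probabilities must have equal length")
    return rng.choices(values, weights=probabilities, k=n)


a = rand_sample([0, 1, 2], 5, [0.2, 0.3, 0.5])
print(a)  # five values, each of them 0, 1 or 2
```

Each element of the result is drawn independently, so the same value can appear several times.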

Boost random::discrete_distribution How to change weights once constructed?

南楼画角 submitted on 2019-12-10 18:57:27
Question: It is possible to give weights/probabilities in boost::random::discrete_distribution, e.g.

    double probabilities[] = { 0.5, 0.1, 0.1, 0.1, 0.1, 0.1 };
    boost::random::discrete_distribution<> dist (probabilities);

Once the object dist is constructed: (1) How can I change one of the weights, e.g. 0.5 to 0.3? (2) How can I reassign all the weights at once?

Answer 1: Create a new distribution object and use that instead. Source: https://stackoverflow.com/questions/8925545/boost-randomdiscrete

Shuffling a list with a constraint

删除回忆录丶 submitted on 2019-12-10 13:54:42
Question: Preparing a new psychophysical experiment, I have 48 original stimuli displayed 4 times (4 conditions), resulting in 192 trials. Trying to randomize the order of presentation during the experiment, I need to maximize the distance between the 4 displays of the same original stimulus. Please consider:

    Table[{j, i}, {j, Range[48]}, {i, Range[4]}]

where j is the original stimulus number and i the condition. Output sample:

    {{1, 1}, {1, 2}, {1, 3}, {1, 4}, {2, 1}, {2, 2}, {2, 3}, {2, 4}, ... {47, 1},

Random sample table with Hive, but including matching rows

蓝咒 submitted on 2019-12-10 11:55:42
Question: I have a large table containing a userID column and other user variable columns, and I would like to use Hive to extract a random sample of users based on their userID. Furthermore, these users will sometimes appear on multiple rows, and if a randomly selected userID is contained in other parts of the table I would like to extract those rows too. I had a look at the Hive sampling documentation and I see that something like this can be done to extract a 1% sample:

    SELECT * FROM source TABLESAMPLE