random-sample | 易学教程

Sampling without replacement with unequal probabilities: is linear run time possible?

Question: In search of a faster weighted sampling without replacement, the following question came up: is there an algorithm that implements random sampling without replacement with unequal selection probabilities in time linear in the size of the input? An O(n log n) implementation has been suggested in an answer to this question; can this be improved? Source: https://stackoverflow.com/questions/15114898/sampling-without-replacement-with-unequal-probabilites-linear-run-time-possib
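
For orientation, here is a minimal Python sketch of one well-known way to do weighted sampling without replacement: the Efraimidis-Spirakis key method, where each item gets the key u^(1/w) and the k largest keys win. It is not taken from the linked question or its answers, and it runs in O(n log k) rather than linear time, so it illustrates the problem rather than settling it. The function name and parameters are illustrative assumptions.

```python
import heapq
import random


def weighted_sample_without_replacement(items, weights, k, rng=random):
    """Draw k distinct items, favouring items with larger weights.

    Sketch of the Efraimidis-Spirakis key method (an assumption of this
    example, not the method from the question).
    """
    keyed = []
    for item, w in zip(items, weights):
        if w <= 0:
            continue  # items with non-positive weight can never be drawn
        u = 1.0 - rng.random()  # uniform in (0, 1], avoids u == 0
        keyed.append((u ** (1.0 / w), item))
    if k > len(keyed):
        raise ValueError("k exceeds the number of items with positive weight")
    # Keep the k largest keys; heapq.nlargest costs O(n log k).
    best = heapq.nlargest(k, keyed, key=lambda pair: pair[0])
    return [item for _, item in best]
```

A call such as `weighted_sample_without_replacement(list("abcde"), [1, 2, 3, 4, 5], 3)` returns three distinct letters, with heavier letters more likely to appear.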

How can I get exactly n random lines from a file with Perl?

Question: Following up on this question, I need to get exactly n lines at random out of a file (or stdin). This would be similar to head or tail, except I want some from the middle. Now, other than looping over the file with the solutions to the linked question, what's the best way to get exactly n lines in one run? For reference, I tried this: #!/usr/bin/perl -w use strict; my $ratio = shift; print $ratio, "\n"; while (<>) { print if ((int rand $ratio) == 1); } where $ratio is the rough percentage of
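
One standard way to get exactly n lines in a single pass is reservoir sampling (Algorithm R). The sketch below is in Python rather than Perl and is not taken from the answers to the question; `sample_lines` is a hypothetical name.

```python
import random


def sample_lines(lines, n, rng=random):
    """Return n lines chosen uniformly at random from an iterable, in one pass.

    If the input has fewer than n lines, all of them are returned.
    """
    reservoir = []
    for i, line in enumerate(lines):
        if i < n:
            # Fill the reservoir with the first n lines.
            reservoir.append(line)
        else:
            # Line i (0-based) replaces a random slot with probability n / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = line
    return reservoir
```

Passing `sys.stdin` or an open file object as `lines` samples from a stream without reading it into memory first. The returned lines are not in file order.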

How to randomly select multiple small and non-overlapping matrices from a large matrix?

Question: Let's say I have a large N x M matrix A (e.g. 1000 x 1000). Selecting k random elements without replacement from A is relatively straightforward in MATLAB: A = rand(1000,1000); % Generate random data k = 5; % Number of elements to be sampled sizeA = numel(A); % Number of elements in A idx = randperm(sizeA); % Random permutation B = A(idx(1:k)); % Random selection of k elements from A However, I'm looking for a way to expand the above concept so that I could randomly select k non
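
Going by the title, the goal is k small, non-overlapping submatrices. One simple approach, sketched here in Python under the assumption that all blocks share the same size h x w, is rejection sampling: draw a top-left corner at random and keep it only if its block does not overlap any block already accepted. The names `sample_blocks`, `h`, `w` and `max_tries` are hypothetical, and indices are 0-based, unlike MATLAB.

```python
import random


def sample_blocks(n_rows, n_cols, h, w, k, max_tries=10_000, rng=random):
    """Pick top-left corners of k non-overlapping h-by-w blocks (0-based)."""
    corners = []
    for _ in range(max_tries):
        if len(corners) == k:
            break
        r = rng.randrange(n_rows - h + 1)
        c = rng.randrange(n_cols - w + 1)
        # Equal-sized blocks are disjoint iff their corners are at least
        # h apart in rows or at least w apart in columns.
        if all(abs(r - r0) >= h or abs(c - c0) >= w for r0, c0 in corners):
            corners.append((r, c))
    if len(corners) < k:
        raise RuntimeError("could not place k blocks within max_tries")
    return corners
```

Each returned corner `(r, c)` identifies the block covering rows `r` to `r + h - 1` and columns `c` to `c + w - 1`. Rejection sampling works well when the blocks cover a small share of the matrix and slows down as the matrix fills up.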

Weighted sampling with 2 vectors

Question: (1) I have two column vectors, e.g. x = [283167.778 289387.207 289705.322] y = [9121643.314 9098348.666 9099832.621] (2) I'd like to do weighted random sampling using these vectors: when I select the element 289387.207 in vector x, I must also choose the element 9098348.666 in vector y. (3) I also have a weight vector w with one weight for each element of x and y. How can I implement this in MATLAB? Thanks! Answer 1: For random selection: sel_idx= randi(3); outputx = x(sel_idx); outputy =
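
The idea of drawing one index and using it for both vectors extends directly to the weighted case. The Python sketch below illustrates this; the weights in `w` are made up for the example, since the question does not give any values, and `pick_pair` is a hypothetical name.

```python
import random

x = [283167.778, 289387.207, 289705.322]
y = [9121643.314, 9098348.666, 9099832.621]
w = [0.2, 0.5, 0.3]  # hypothetical weights, not from the question


def pick_pair(x, y, w, rng=random):
    """Draw one index with probability proportional to w and return (x[i], y[i])."""
    i = rng.choices(range(len(x)), weights=w, k=1)[0]
    return x[i], y[i]
```

Because a single index selects from both vectors, the paired elements always stay together.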

Creating Sets of Samples From Given dataframe using condition R

Question: I have an input table with more than 750K rows. It has a field called quarter. I want to create a sample such that I get 10% of the records from each quarter. The main attributes of the data.frame are: "SERIAL_NBR" "MODELNO" "War.Start.Monthly" "Start.Qua.Yr", where "Start.Qua.Yr" is the field in which the quarter is recorded. Is there any way to generate sample data that contains 10% of the records for each quarter? Using the sample function I can get a sample regardless of the quarter. The code for that would be: raw_claim
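
What is being asked for is a stratified sample: group the rows by quarter, then sample the same fraction within each group. The sketch below shows the idea in Python, not R, with rows represented as dictionaries. The function name and the rounding rule (at least one row per group) are assumptions of this example.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, fraction, rng=random):
    """Sample roughly `fraction` of the rows within each group defined by `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sample = []
    for members in groups.values():
        # Assumed rounding rule: nearest integer, but never fewer than one row.
        size = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, size))
    return sample
```

With `key="Start.Qua.Yr"` and `fraction=0.1`, every quarter contributes about 10% of its own rows, regardless of how large it is compared with the other quarters.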

Select element from collection with probability proportional to element value

Question: I have a list of vertices, from which I have to pick a random vertex with probability proportional to deg(v), where deg(v) is the degree of vertex v. The pseudocode for this operation looks like this: Select u ∈ L with probability deg(u) / Σ_{v∈L} deg(v), where u is the randomly selected vertex, L is the list of vertices and v is a vertex in L. The problem is that I don't understand how to do it. Can someone explain to me how to get this random vertex? I would greatly appreciate if someone can
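
The selection rule in the pseudocode can be implemented with cumulative sums: lay the degrees end to end on a line of length Σ deg(v), draw a uniform point on that line, and return the vertex whose segment contains it. The Python sketch below does this. The name `pick_by_degree` and the dictionary `degree` are illustrative assumptions.

```python
import bisect
import itertools
import random


def pick_by_degree(vertices, degree, rng=random):
    """Return a vertex u with probability degree[u] / sum of all degrees."""
    cumulative = list(itertools.accumulate(degree[v] for v in vertices))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("at least one vertex must have positive degree")
    r = rng.random() * total  # uniform point in [0, total)
    # The first cumulative sum strictly greater than r marks the chosen segment.
    return vertices[bisect.bisect_right(cumulative, r)]
```

For degrees 1, 2 and 3 the cumulative sums are 1, 3 and 6, so the three vertices own the intervals [0, 1), [1, 3) and [3, 6) and are picked with probabilities 1/6, 2/6 and 3/6.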

Weighted sampling with replacement in Java

Question: Is there a function in Java, or in a library such as Apache Commons Math, that is equivalent to the MATLAB function randsample? More specifically, I want to find a function randSample which returns a vector of independent and identically distributed random variables drawn from a probability distribution that I specify. For example: int[] a = randSample(new int[]{0, 1, 2}, 5, new double[]{0.2, 0.3, 0.5}) // { 0 w.p. 0.2 // a[i] = { 1 w.p. 0.3 // { 2 w.p. 0.5 The output is the same as the
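
For comparison, the behaviour described above (i.i.d. draws with replacement from a discrete distribution) is what Python's standard `random.choices` provides. The wrapper below mirrors the `randSample` signature from the question. It is a Python illustration of the desired semantics, not a Java answer, and `rand_sample` is a hypothetical name.

```python
import random


def rand_sample(population, k, probabilities, rng=random):
    """Return k i.i.d. draws from population, with replacement.

    Element population[i] is drawn with probability proportional to
    probabilities[i]; the weights do not have to sum to 1.
    """
    return rng.choices(population, weights=probabilities, k=k)
```

A call such as `rand_sample([0, 1, 2], 5, [0.2, 0.3, 0.5])` returns a list of five values, each independently 0, 1 or 2 with the stated probabilities.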

Boost random::discrete_distribution How to change weights once constructed?

Question: OK, it is possible to give weights/probabilities in boost::random::discrete_distribution, e.g. double probabilities[] = { 0.5, 0.1, 0.1, 0.1, 0.1, 0.1 }; boost::random::discrete_distribution<> dist (probabilities); Once the object dist is constructed: (1) How can I change one of the weights, e.g. 0.5 to 0.3? (2) How can I reassign all the weights at once? Answer 1: Create a new distribution object and use that instead. Source: https://stackoverflow.com/questions/8925545/boost-randomdiscrete

Shuffling a list with a constraint

Question: In preparing a new psychophysical experiment, I have 48 original stimuli, each displayed 4 times (4 conditions), resulting in 192 trials. To randomize the order of presentation during the experiment, I need to maximize the distance between the 4 displays of the same original stimulus. Please consider: Table[{j, i}, {j, Range[48]}, {i, Range[4]}] where j is the original stimulus number and i the condition. Sample output: {{1, 1}, {1, 2}, {1, 3}, {1, 4}, {2, 1}, {2, 2}, {2, 3}, {2, 4}, ... {47, 1},
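
A common heuristic for this kind of design is block randomization: split the 192 trials into 4 blocks of 48, show every stimulus exactly once per block, and shuffle within each block. The Python sketch below illustrates it. It is not taken from the question's answers and it does not maximize the distance, since a stimulus that ends one block can open the next. The helper `min_repeat_gap` only measures the result, and both function names are assumptions.

```python
import random


def block_randomized_trials(n_stimuli=48, n_conditions=4, rng=random):
    """Return (stimulus, condition) pairs, one shuffled block per repetition."""
    # Give each stimulus its own random order of conditions across blocks.
    condition_order = {}
    for j in range(1, n_stimuli + 1):
        conditions = list(range(1, n_conditions + 1))
        rng.shuffle(conditions)
        condition_order[j] = conditions
    trials = []
    for block in range(n_conditions):
        stimuli = list(range(1, n_stimuli + 1))
        rng.shuffle(stimuli)  # every stimulus appears exactly once per block
        trials.extend((j, condition_order[j][block]) for j in stimuli)
    return trials


def min_repeat_gap(trials):
    """Smallest number of positions between two displays of the same stimulus."""
    last_seen = {}
    gaps = []
    for position, (j, _) in enumerate(trials):
        if j in last_seen:
            gaps.append(position - last_seen[j])
        last_seen[j] = position
    return min(gaps)
```

Generating several candidate orders and keeping the one with the largest `min_repeat_gap` is a cheap way to push the repeats further apart without solving the optimization exactly.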

Random sample table with Hive, but including matching rows

Question: I have a large table containing a userID column and other user variable columns, and I would like to use Hive to extract a random sample of users based on their userID. Furthermore, some of these users appear on multiple rows, and if a randomly selected userID also appears in other parts of the table I would like to extract those rows too. I had a look at the Hive sampling documentation and I see that something like this can be done to extract a 1% sample: SELECT * FROM source TABLESAMPLE
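
The requirement amounts to sampling at the user level rather than the row level: first draw a random subset of distinct userID values, then keep every row that belongs to a chosen user. The sketch below shows that logic in Python on an in-memory list of dictionaries. It is not Hive code, and the function name and rounding rule are assumptions of this example.

```python
import random


def sample_users_with_all_rows(rows, fraction, rng=random):
    """Sample a fraction of distinct userIDs and return all rows for those users."""
    user_ids = sorted({row["userID"] for row in rows})
    if not user_ids:
        return []
    # Assumed rounding rule: nearest integer, but at least one user.
    n_users = max(1, round(fraction * len(user_ids)))
    chosen = set(rng.sample(user_ids, n_users))
    return [row for row in rows if row["userID"] in chosen]
```

In SQL terms this corresponds to sampling the distinct user IDs and joining the result back to the full table.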