random-sample

Randomly Pick Lines From a File Without Slurping It With Unix

…衆ロ難τιáo~ submitted on 2019-11-28 03:03:59
I have a file with 10^7 lines, and I want to pick 1/100 of its lines at random. This is the AWK code I have, but it slurps the whole file into memory beforehand, and my PC's memory cannot handle that. Is there another approach?

    awk 'BEGIN{srand()}
    !/^$/{ a[c++]=$0}
    END {
      for ( i=1;i<=c ;i++ ) {
        num=int(rand() * c)
        if ( a[num] ) {
          print a[num]
          delete a[num]
          d++
        }
        if ( d == c/100 ) break
      }
    }' file

cadrian

If you have that many lines, are you sure you want exactly 1%, or would a statistical estimate be enough? In the second case, just keep each line at random with a probability of 1%...

    awk 'BEGIN
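The per-line idea in the reply above can be sketched in Python as a single streaming pass that never holds the whole file in memory. This is a minimal sketch, not the answerer's code: the function names `bernoulli_sample` and `reservoir_sample` are hypothetical, and the second function uses reservoir sampling (Algorithm R), a different technique that is not mentioned in the excerpt, for the case where an exact number of lines is required.

```python
import random
from typing import Iterable, Iterator, List


def bernoulli_sample(lines: Iterable[str], probability: float,
                     rng: random.Random) -> Iterator[str]:
    """Yield each non-empty line independently with the given probability.

    The number of lines returned is only approximately
    probability * total, but memory use is constant.
    """
    for line in lines:
        # Skip empty lines, mirroring the !/^$/ filter in the AWK code.
        if line.rstrip("\r\n") and rng.random() < probability:
            yield line


def reservoir_sample(lines: Iterable[str], k: int,
                     rng: random.Random) -> List[str]:
    """Return exactly k non-empty lines (fewer if the input is shorter).

    Reservoir sampling (Algorithm R): memory use is O(k), not O(file size).
    """
    reservoir: List[str] = []
    seen = 0
    for line in lines:
        if not line.rstrip("\r\n"):
            continue
        seen += 1
        if len(reservoir) < k:
            reservoir.append(line)
        else:
            # Replace a random slot with probability k / seen.
            slot = rng.randrange(seen)
            if slot < k:
                reservoir[slot] = line
    return reservoir
```

Both functions accept any iterable of lines, so an open file handle can be passed directly, for example `bernoulli_sample(handle, 0.01, random.Random())`. The first matches the "statistical estimate" case; the second is one option when exactly 1% is needed and the line count is known in advance.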

How to generate random numbers with no repeat javascript

℡╲_俬逩灬. submitted on 2019-11-28 02:17:17
I am using the following code, which generates a random number between 0 and totalFriends. I would like to get random numbers that are not repeated. Any idea how? This is the code I am using:

    FB.getLoginStatus(function(response) {
        var profilePicsDiv = document.getElementById('profile_pics');
        FB.api({ method: 'friends.get' }, function(result) {
            // var result =resultF.data;
            // console.log(result);
            var user_ids="" ;
            var totalFriends = result.length;
            // console.log(totalFriends);
            var numFriends = result ? Math.min(25, result.length) : 0;
            // console.log(numFriends);
            if (numFriends > 0) {
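One common way to get non-repeating random numbers is to sample without replacement instead of drawing independent numbers. The sketch below illustrates the idea in Python rather than JavaScript; `pick_unique_indices` is a hypothetical helper, and the cap of 25 in the usage mirrors the `Math.min(25, result.length)` in the question.

```python
import random
from typing import List


def pick_unique_indices(total: int, count: int, rng: random.Random) -> List[int]:
    """Return `count` distinct indices from range(total), in random order.

    If fewer than `count` indices exist, all of them are returned.
    """
    count = min(count, total)
    # random.sample draws without replacement, so no index can repeat.
    return rng.sample(range(total), count)


if __name__ == "__main__":
    # Example: choose up to 25 distinct positions in a list of 300 friends.
    print(pick_unique_indices(300, 25, random.Random()))
```

Sampling indices and then looking up the corresponding items avoids the retry loop that is needed when each number is drawn independently and checked against the ones already used.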

Can Random.nextGaussian() sample values from a distribution with different mean and standard deviation?

本秂侑毒 submitted on 2019-11-27 17:23:50
Question: This is a combined Java and basic math question. The documentation for Random.nextGaussian() states that it samples from a normal distribution with mean 0 and standard deviation 1. What if I wanted to sample from a normal distribution with a different mean and variance?

Answer 1: The short answer is

    Random r = new Random();
    double mySample = r.nextGaussian()*desiredStandardDeviation+desiredMean;

For example, this answer is given here: http://www.javamex.com/tutorials/random_numbers/gaussian
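The scaling in the answer works because if Z is standard normal, then Z * s + m is normal with mean m and standard deviation s (so variance s squared). A minimal Python sketch of the same transformation follows; `sample_normal` is a hypothetical name, and `random.Random.gauss(0.0, 1.0)` stands in for Java's `nextGaussian()`.

```python
import random


def sample_normal(rng: random.Random, mean: float, std_dev: float) -> float:
    """Shift and scale a standard normal draw to the desired mean and std dev."""
    standard = rng.gauss(0.0, 1.0)  # mean 0, standard deviation 1
    return standard * std_dev + mean


if __name__ == "__main__":
    rng = random.Random()
    draws = [sample_normal(rng, mean=5.0, std_dev=2.0) for _ in range(10000)]
    # The sample average should land close to the requested mean of 5.0.
    print(sum(draws) / len(draws))
```

Note that the multiplier is the standard deviation, not the variance; when starting from a variance, take its square root first.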

Fixing set.seed for an entire session

对着背影说爱祢 submitted on 2019-11-27 15:39:15
Question: I am using R to construct an agent-based model with a Monte Carlo process. This means I have many functions that use a random engine of some kind. In order to get reproducible results, I must fix the seed. But, as far as I understand, I must set the seed before every random draw or sample. This is a real pain in the neck. Is there a way to fix the seed?

    set.seed(123)
    print(sample(1:10,3))
    # [1] 3 8 4
    print(sample(1:10,3))
    # [1] 9 10 1
    set.seed(123)
    print(sample(1:10,3))
    # [1] 3 8 4

Answer 1: There

from data table, randomly select one row per group

偶尔善良 submitted on 2019-11-27 14:50:56
I'm looking for an efficient way to select rows from a data table such that I have one representative row for each unique value in a particular column. Let me propose a simple example:

    require(data.table)
    y = c('a','b','c','d','e','f','g','h')
    x = sample(2:10,8,replace = TRUE)
    z = rep(y,x)
    dt = as.data.table( z )

My objective is to subset data table dt by sampling one row for each letter a-h in column z.

The OP provided only a single column in the example. Assuming that there are multiple columns in the original dataset, we group by 'z', sample 1 row from the sequence of rows per group, get the
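The group-then-sample idea can be illustrated outside R as well. The following is a rough Python sketch, not the data.table solution the answer goes on to describe: rows are assumed to be dictionaries, `one_row_per_group` is a hypothetical helper, and it keeps a size-one reservoir per group so that each row in a group is equally likely to be chosen in a single pass.

```python
import random
from typing import Dict, Hashable, Iterable


def one_row_per_group(rows: Iterable[dict], key: str,
                      rng: random.Random) -> Dict[Hashable, dict]:
    """Pick one row uniformly at random for each distinct value of `key`."""
    chosen: Dict[Hashable, dict] = {}
    counts: Dict[Hashable, int] = {}
    for row in rows:
        group = row[key]
        counts[group] = counts.get(group, 0) + 1
        # The n-th row of a group replaces the current pick with probability 1/n.
        if rng.randrange(counts[group]) == 0:
            chosen[group] = row
    return chosen


if __name__ == "__main__":
    rng = random.Random()
    # Build a table like the example: each letter repeated 2 to 10 times.
    table = [{"z": letter, "row_id": i}
             for letter in "abcdefgh"
             for i in range(rng.randint(2, 10))]
    for group, row in sorted(one_row_per_group(table, "z", rng).items()):
        print(group, row)
```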

Generate a random sample of points distributed on the surface of a unit sphere

℡╲_俬逩灬. submitted on 2019-11-27 13:58:44
Question: I am trying to generate random points on the surface of a sphere using numpy. I have reviewed the post that explains uniform distribution here. However, I need ideas on how to generate the points only on the surface of the sphere. I have the coordinates (x, y, z) and the radius of each of these spheres. I am not very well versed in mathematics at this level and am trying to make sense of the Monte Carlo simulation. Any help will be much appreciated. Thanks, Parin

Answer 1: Based on the last approach on
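The answer's own method is cut off in this excerpt, so the sketch below shows one well-known approach and does not claim to be the one it refers to: draw three independent standard normal values, normalise the vector to unit length, then scale by the radius and shift by the centre. It uses only the Python standard library rather than numpy; `random_point_on_sphere` is a hypothetical name.

```python
import math
import random
from typing import Tuple

Point = Tuple[float, float, float]


def random_point_on_sphere(center: Point, radius: float,
                           rng: random.Random) -> Point:
    """Return a point uniformly distributed on the surface of a sphere."""
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 0.0:  # guard against the (practically impossible) zero vector
            break
    scale = radius / norm
    cx, cy, cz = center
    return (cx + x * scale, cy + y * scale, cz + z * scale)


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        print(random_point_on_sphere((1.0, -2.0, 3.0), 2.5, rng))
```

The normal distribution is rotationally symmetric, which is why the normalised vector has no preferred direction. Normalising three uniform draws instead would cluster points toward the corners of the enclosing cube.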

random sampling - matrix

天大地大妈咪最大 submitted on 2019-11-27 08:00:54
Question: How can I take a sample of n random points from a matrix populated with 1's and 0's?

    a=rep(0:1,5)
    b=rep(0,10)
    c=rep(1,10)
    dataset=matrix(cbind(a,b,c),nrow=10,ncol=3)
    dataset
          [,1] [,2] [,3]
     [1,]    0    0    1
     [2,]    1    0    1
     [3,]    0    0    1
     [4,]    1    0    1
     [5,]    0    0    1
     [6,]    1    0    1
     [7,]    0    0    1
     [8,]    1    0    1
     [9,]    0    0    1
    [10,]    1    0    1

I want to be sure that the positions (row, col) from where I take the N samples are random. I know sample {base}, but it doesn't seem to allow me to do that; other methods I know are spatial methods
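One way to make the positions themselves random is to sample flat cell indices without replacement and convert each index back to a (row, column) pair. The sketch below shows this in Python with a list-of-lists matrix; it is an illustration under that assumption, not R code, `sample_positions` is a hypothetical helper, and indices are 0-based here whereas R's are 1-based.

```python
import random
from typing import List, Tuple


def sample_positions(matrix: List[List[int]], n: int,
                     rng: random.Random) -> List[Tuple[int, int, int]]:
    """Return n distinct random cells as (row, col, value) tuples.

    Raises ValueError if n exceeds the number of cells.
    """
    rows, cols = len(matrix), len(matrix[0])
    flat_indices = rng.sample(range(rows * cols), n)
    result = []
    for index in flat_indices:
        row, col = divmod(index, cols)  # flat index -> (row, col)
        result.append((row, col, matrix[row][col]))
    return result


if __name__ == "__main__":
    # Same layout as the example: alternating 0/1, all 0, all 1.
    dataset = [[i % 2, 0, 1] for i in range(10)]
    for cell in sample_positions(dataset, 5, random.Random()):
        print(cell)
```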

Iterator to produce unique random order?

拜拜、爱过 submitted on 2019-11-27 06:29:44
Question: The problem is stated as follows: we have a very large number of items that are traversed through an iterator pattern, which dynamically constructs or fetches the requested item. The number of items is large, so they cannot be kept in memory (as a list, for example). What procedure can the iterator follow in order to produce a random order of the items each time the iterator is called? A unique random order means that eventually all items are traversed only once but returned
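No answer survives in this excerpt, so the following is only one possible direction, offered as a sketch: map each index through a keyed bijection on range(n), so that every item is visited exactly once without storing a shuffled list. The code builds the bijection from a small Feistel network with cycle-walking; `shuffled_indices`, the round count, and the use of BLAKE2 as the round function are all assumptions chosen for illustration, and the resulting order is pseudorandom rather than a uniformly random permutation.

```python
import hashlib
from typing import Iterator


def shuffled_indices(n: int, seed: int, rounds: int = 4) -> Iterator[int]:
    """Yield every index in range(n) exactly once, in a seed-dependent order.

    Memory use is constant: nothing proportional to n is stored.
    """
    # Split each index into two halves of half_bits bits each.
    half_bits = max(1, ((n - 1).bit_length() + 1) // 2)
    mask = (1 << half_bits) - 1

    def round_value(value: int, round_no: int) -> int:
        data = f"{seed}:{round_no}:{value}".encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") & mask

    def encrypt(x: int) -> int:
        # A Feistel network is a bijection on [0, 2**(2 * half_bits)).
        left, right = x >> half_bits, x & mask
        for round_no in range(rounds):
            left, right = right, left ^ round_value(right, round_no)
        return (left << half_bits) | right

    for i in range(n):
        x = encrypt(i)
        while x >= n:  # cycle-walk until the value falls back inside range(n)
            x = encrypt(x)
        yield x


if __name__ == "__main__":
    print(list(shuffled_indices(10, seed=2019)))
```

A different seed gives a different order, and the same seed reproduces the same order, so the iterator can be restarted or resumed from a position by remembering only the seed and a counter.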

Select a random sample of results from a query result

限于喜欢 submitted on 2019-11-27 06:20:32
This question asks about getting a random(ish) sample of records on SQL Server, and the answer was to use TABLESAMPLE. Is there an equivalent in Oracle 10? If there isn't, is there a standard way to get a random sample of results from a query set? For example, how can one get 1,000 random rows from a query that would normally return millions?

    SELECT * FROM ( SELECT * FROM mytable ORDER BY dbms_random.value ) WHERE rownum <= 1000

grokster

The SAMPLE clause will give you a random sample percentage of all rows in a table. For example, here we obtain 25% of the rows:

    SELECT * FROM emp SAMPLE(25)

The

Binary random array with a specific proportion of ones?

做~自己de王妃 submitted on 2019-11-27 03:42:53
What is an efficient (probably "vectorized", in Matlab terminology) way to generate a random array of zeros and ones with a specific proportion? Especially with NumPy? As my case is specifically 1/3, my code is:

    import numpy as np
    a=np.mod(np.multiply(np.random.randomintegers(0,2,size)),3)

But is there any built-in function that could handle this more efficiently, at least for the case of K/N where K and N are natural numbers?

Yet another approach, using np.random.choice:

    >>> np.random.choice([0, 1], size=(10,), p=[1./3, 2./3])
    array([0, 1, 1, 1, 1, 0, 0, 0, 0, 0])

A simple way to do this
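Two details are worth noting about the `np.random.choice` call quoted above: as written, `p=[1./3, 2./3]` gives the value 1 a probability of 2/3, and because each element is drawn independently, the proportion of ones only matches on average. When exactly K ones out of N are wanted, one option is to build the array with the right counts and shuffle it. The sketch below does this with the Python standard library instead of NumPy; `binary_array_exact` is a hypothetical name.

```python
import random
from typing import List


def binary_array_exact(n: int, k: int, rng: random.Random) -> List[int]:
    """Return a list of length n containing exactly k ones in random positions."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    values = [1] * k + [0] * (n - k)
    rng.shuffle(values)  # in-place Fisher-Yates shuffle
    return values


if __name__ == "__main__":
    # Exactly one third ones: 10 ones in an array of 30.
    print(binary_array_exact(30, 10, random.Random()))
```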