Generate a set of random unique integers from an interval

离开以前 2021-02-03 21:31

I am trying to build some machine learning models,

so I need training data and validation data

so suppose I have N number of examples, I want to select rando

3 Answers
  • 2021-02-03 21:57

    From the raster package:

    raster::sampleInt(242, 10, replace = FALSE)
    ##  95 230 148 183  38  98 137 110 188  39
    

    Base R's sample.int may fail if the limit is too large, at least in older versions of R:

    sample.int(1e+12, 10)
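
    Whether sample.int accepts a limit this large can depend on the R version. A minimal sketch, assuming you only want to find out without aborting the script (big.draw is a hypothetical name used for illustration):

    ```r
    # Try to draw 10 unique values from 1..1e12.
    # If the call errors, report the message and continue with NULL.
    big.draw <- tryCatch(
      sample.int(1e12, 10),
      error = function(e) {
        message("sample.int failed: ", conditionMessage(e))
        NULL
      }
    )

    # When the draw succeeded, confirm that no value is repeated.
    if (!is.null(big.draw)) {
      stopifnot(anyDuplicated(big.draw) == 0)
    }
    ```

    If big.draw comes back NULL, raster::sampleInt from above is one alternative to try.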
    
  • 2021-02-03 22:03

    sample (or sample.int) does this:

    sample.int(100, 10)
    # [1] 58 83 54 68 53  4 71 11 75 90
    

    will generate ten unique random integers from the range 1–100, since sampling is without replacement by default. If repeated values are acceptable, use replace = TRUE, which samples with replacement:

    sample.int(20, 10, replace = TRUE)
    # [1] 10  2 11 13  9  9  3 13  3 17
    

    More generally, sample samples n observations from a vector of arbitrary values.
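
    For an interval that does not start at 1, one option is to sample positions and shift them. This is a rough sketch, not the only way to do it; lower, upper, k and values are made-up names and numbers for illustration:

    ```r
    # Hypothetical interval bounds and sample size, chosen for illustration.
    lower <- 50L
    upper <- 120L
    k <- 10L

    # Draw k unique positions from 1..(upper - lower + 1), then shift them
    # so that they fall inside lower..upper.
    draw <- lower - 1L + sample.int(upper - lower + 1L, k)

    # Caution: sample(x, ...) treats a single number x >= 1 as 1:x, so
    # indexing an explicit vector via sample.int is the safer pattern.
    values <- c(7, 13, 21, 34)
    picked <- values[sample.int(length(values), 2)]
    ```

    Because sample.int samples without replacement by default, draw contains no repeated values.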

  • 2021-02-03 22:16

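    If I understand correctly, you are trying to create a hold-out sample. This is usually done using probabilities: each row is assigned to the training set independently, so the size of the training set is only approximately the requested fraction. So if you have n.rows samples and want a fraction training.fraction of them to be used for training, you may do something like this: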

    select.training <- runif(n=n.rows) < training.fraction
    data.training <- my.data[select.training, ]
    data.testing <- my.data[!select.training, ]
    

    If you want to specify the exact number of training cases, you may do something like:

    # replace = FALSE makes sure the indices are unique
    indices.training <- sample(x = seq_len(n.rows), size = training.size, replace = FALSE)
    data.training <- my.data[indices.training, ]
    # a negative index means "take everything except these rows"
    data.testing <- my.data[-indices.training, ]
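
    To make the exact-size variant concrete, here is a self-contained sketch. The built-in iris data set, the seed and the 80/20 split are all assumptions made for illustration, and indices.validation and data.validation are hypothetical names:

    ```r
    set.seed(42)  # arbitrary seed, only so the split is reproducible

    my.data <- iris                        # built-in data set as a stand-in
    n.rows <- nrow(my.data)
    training.size <- floor(0.8 * n.rows)   # assumed 80/20 split

    # Unique row indices for the training set (sampling without replacement).
    indices.training <- sample.int(n.rows, training.size)

    # setdiff avoids the pitfall that my.data[-integer(0), ] selects no rows
    # at all when the training index vector happens to be empty.
    indices.validation <- setdiff(seq_len(n.rows), indices.training)

    data.training <- my.data[indices.training, ]
    data.validation <- my.data[indices.validation, ]
    ```

    The two index sets are disjoint and together cover every row, so no example ends up in both training and validation.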
    