How can the SciKit-Learn Random Forest sub-sample size be equal to the original training data size?
In the documentation of the SciKit-Learn Random Forest classifier, it is stated that

"The sub-sample size is always the same as the original input sample size but the samples are drawn with replacement if bootstrap=True (default)."

What I don't understand is this: if the sub-sample size is always the same as the input sample size, then how can we talk about a random selection? There is no selection here, because we use all (and naturally the same) samples at each training. Am I missing something here?

Lol4t0

I believe this part of the docs answers your question:

In random forests (see RandomForestClassifier