kernel-density

R - simulate data for probability density distribution obtained from kernel density estimate

馋奶兔 posted on 2019-12-04 14:53:31
First off, I'm not entirely sure this is the correct place to post this; perhaps it belongs in a more statistics-focused forum. However, as I'm planning to implement this in R, I figured it would be best to post it here. My apologies if I'm wrong. What I'm trying to do is the following: I want to simulate data for a total of 250,000 observations, assigning each one a continuous (non-integer) value in line with a kernel density estimate derived from discrete empirical data, with original values ranging from -5 to +5. Here's a plot of the distribution I want to use. It's quite
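One common way to simulate from a Gaussian kernel density estimate is a smoothed bootstrap: pick an observed value at random, then add normal noise whose standard deviation equals the bandwidth. The sketch below is in Python with only the standard library, not the poster's R code; the `observed` values, the function name `sample_from_kde`, and the rule-of-thumb bandwidth are all assumptions made for illustration. The same idea should carry over to R with `sample()` and `rnorm()`.

```python
import random
import statistics


def sample_from_kde(data, size, bandwidth=None, rng=None):
    """Draw `size` values from a Gaussian KDE fitted to `data`.

    Sampling from a Gaussian KDE is the same as resampling the data with
    replacement and adding N(0, bandwidth^2) noise to each draw.
    """
    if rng is None:
        rng = random.Random()
    if bandwidth is None:
        # Assumption: normal-reference rule of thumb, 1.06 * sd * n^(-1/5).
        bandwidth = 1.06 * statistics.stdev(data) * len(data) ** -0.2
    return [rng.choice(data) + rng.gauss(0.0, bandwidth) for _ in range(size)]


# Hypothetical discrete scores on the -5..+5 scale (not the poster's data).
observed = [-5, -3, -2, -2, -1, 0, 0, 0, 1, 1, 2, 3, 3, 5]

simulated = sample_from_kde(observed, size=250_000, rng=random.Random(42))
```

Note that the simulated values can fall outside -5 to +5, because the Gaussian kernel has unbounded support. If the range must be respected, out-of-range draws would need to be rejected and redrawn, or handled in some other way.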

Find local minimum in bimodal distribution with r

主宰稳场 posted on 2019-12-04 14:12:33
Question: My data are pre-processed image data and I want to separate two classes. In theory (and hopefully in practice) the best threshold is the local minimum between the two peaks of the bimodally distributed data. My test data is: http://www.file-upload.net/download-9365389/data.txt.html I tried to follow this thread: I plotted the histogram and calculated the kernel density function:
    datafile <- read.table("....txt")
    data <- datafile$V1
    hist(data)
    d <- density(data) # returns the density data with
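A possible way to locate that threshold is to evaluate the density on a grid, find the two highest local maxima, and take the grid point with the lowest density between them. The following is a minimal sketch in Python (standard library only) on synthetic data, not the poster's file; the function names, the grid size, and the bandwidth of 0.9 are assumptions. In R, the same search could be run on the `x` and `y` components returned by `density()`.

```python
import math
import random


def gaussian_kde(data, bandwidth):
    """Return a function that evaluates the Gaussian KDE of `data` at x."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


def valley_between_peaks(data, bandwidth, grid_size=512):
    """Return the grid point of lowest density between the two highest peaks."""
    lo, hi = min(data), max(data)
    step = (hi - lo) / (grid_size - 1)
    xs = [lo + i * step for i in range(grid_size)]
    f = gaussian_kde(data, bandwidth)
    ys = [f(x) for x in xs]

    # Interior local maxima of the gridded density.
    peaks = [
        i for i in range(1, grid_size - 1) if ys[i - 1] < ys[i] >= ys[i + 1]
    ]
    if len(peaks) < 2:
        raise ValueError("density has fewer than two peaks at this bandwidth")

    # Keep the two highest peaks and order them left to right.
    left, right = sorted(sorted(peaks, key=lambda i: ys[i], reverse=True)[:2])
    lowest = min(range(left, right + 1), key=lambda i: ys[i])
    return xs[lowest]


# Synthetic bimodal sample standing in for the image data (hypothetical).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(300)]
data += [rng.gauss(6.0, 1.0) for _ in range(300)]

threshold = valley_between_peaks(data, bandwidth=0.9)
```

The result depends on the bandwidth: too small a value produces spurious peaks, while too large a value can merge the two modes into one, in which case the function raises an error.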

How to plot a density estimate on top of the histogram? [duplicate]

风格不统一 posted on 2019-12-04 13:34:41
This question already has answers here (closed 7 years ago). Possible duplicate: Fitting a density curve to a histogram in R
x is an NA-free numeric vector. I run:
    > hist(x,density(x), prob=TRUE)
The error message I get:
    Error in rank(x, ties.method = "min", na.last = "keep") : unimplemented type 'list' in 'greater'
It was suggested that I set prob=TRUE when calling hist. If you can explain that as well, that would be great. Thank you.
You need to call hist and density separately. Something like this:
    hist(x, prob=TRUE)
    lines(density(x))
Source: https://stackoverflow.com/questions/12945951/how-to-plot

Thread error: can't start new thread

自作多情 posted on 2019-12-04 12:05:37
Here's an MWE of a much larger piece of code I'm using. It performs a Monte Carlo integration over a KDE (kernel density estimate) for all values located below a certain threshold (the integration method was suggested in this question: Integrate 2D kernel density estimate). It does this iteratively for a number of points in a list and returns a list of the results.
    import numpy as np
    from scipy import stats
    from multiprocessing import Pool
    import threading
    # Define KDE integration function.
    def kde_integration(m_list):
        # Put some of the values from the m_list into two new lists.
        m1, m2 = [], []
        for

Density plots with multiple groups

半腔热情 posted on 2019-12-04 10:06:31
I am trying to produce something similar to densityplot() from the lattice package, using ggplot2, after running multiple imputation with the mice package. Here is a reproducible example:
    require(mice)
    dt <- nhanes
    impute <- mice(dt, seed = 23109)
    x11()
    densityplot(impute)
Which produces: I would like to have some more control over the output (and I am also using this as a learning exercise for ggplot). So, for the bmi variable, I tried this:
    bar <- NULL
    for (i in 1:impute$m) {
      foo <- complete(impute,i)
      foo$imp <- rep(i,nrow(foo))
      foo$col <- rep("#000000",nrow(foo))
      bar <- rbind(bar,foo)
    }
    imp <

Exact kernel density value for any point in R [duplicate]

ε祈祈猫儿з posted on 2019-12-03 20:39:00
This question already has answers here (closed 2 years ago): Find the probability density of a new data point using "density" function in R (3 answers); Density Value for each Return (3 answers)
I was wondering if there is a base R way to obtain the exact kernel density at any desired point. As an example, how can I get the exact kernel density at the following 3 points, -2, 0 and +2, on the x-axis of a plot like the one below?
    set.seed(2937107)
    plot( density(rnorm(1e4)) )
Use linear interpolation to find it.
    d <- density(rnorm(10000))
    approx(d$x, d$y, xout = c(-2, 0, 2))
The precision of interpolation can be
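As an alternative to interpolating between grid points, the Gaussian kernel estimate can be evaluated directly at any x as f(x) = 1/(n h) * sum of K((x - x_i)/h), where K is the standard normal density. The sketch below does this in Python with the standard library only; the function name `kde_at` and the bandwidth `h = 0.15` are assumptions for illustration (in R, one would presumably reuse the bandwidth stored in `d$bw`). As far as I know, R's `density()` computes its grid values with a binned approximation, so a direct sum like this may differ slightly from the interpolated result.

```python
import math
import random


def kde_at(points, data, bandwidth):
    """Evaluate the Gaussian KDE of `data` directly at each value in `points`."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)
        for x in points
    ]


# 10,000 standard normal draws, mirroring rnorm(1e4) in the question.
rng = random.Random(2937107)
sample = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

h = 0.15  # hypothetical bandwidth, chosen only for this example
values = kde_at([-2.0, 0.0, 2.0], sample, h)
```

Each evaluation costs one pass over the data, so this is practical for a handful of points but slower than interpolation when many points are needed.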

How to plot kernel density plot of dates in Pandas?

≡放荡痞女 posted on 2019-12-03 17:47:25
Question: I have a pandas dataframe where each observation has a date (as a column of entries in datetime64 format). These dates are spread over a period of about 5 years. I would like to plot a kernel density plot of the dates of all the observations, with the years labelled on the x-axis. I have figured out how to create a time-delta relative to some reference date and then create a density plot of the number of hours/days/years between each observation and the reference date: df['relativeDate']

Getting values from kernel density estimation in R

≯℡__Kan透↙ posted on 2019-12-03 07:35:11
Question: I am trying to get density estimates for the log of stock prices in R. I know I can plot it using plot(density(x)). However, I actually want values for the function. I'm trying to implement the kernel density estimation formula. Here's what I have so far:
    a <- read.csv("boi_new.csv", header=FALSE)
    S = a[,3] # takes column of increments in stock prices
    dS=S[!is.na(S)] # omits first empty field
    N = length(dS) # Sample size
    rseed = 0 # Random seed
    x = rep(c(1:5),N/5) # Inputted data
    set.seed

How to plot kernel density plot of dates in Pandas?

末鹿安然 posted on 2019-12-03 06:27:39
I have a pandas dataframe where each observation has a date (as a column of entries in datetime64 format). These dates are spread over a period of about 5 years. I would like to plot a kernel density plot of the dates of all the observations, with the years labelled on the x-axis. I have figured out how to create a time-delta relative to some reference date and then create a density plot of the number of hours/days/years between each observation and the reference date:
    df['relativeDate'].astype('timedelta64[D]').plot(kind='kde')
But this isn't exactly what I want: If I convert to year-deltas
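One way to keep real dates on the axis is to convert each date to a day number, estimate the density on that numeric scale, and convert the evaluation grid back to dates before plotting. The sketch below shows only that conversion step, using the Python standard library rather than pandas; the function name `date_kde`, the 30-day bandwidth, and the synthetic dates are assumptions, not the poster's data or code.

```python
import math
import random
from datetime import date, timedelta


def date_kde(dates, bandwidth_days, grid_size=200):
    """Gaussian KDE over dates; returns (grid of dates, density per day)."""
    days = [d.toordinal() for d in dates]  # dates -> integer day numbers
    lo, hi = min(days), max(days)
    step = (hi - lo) / (grid_size - 1)
    norm = 1.0 / (len(days) * bandwidth_days * math.sqrt(2.0 * math.pi))

    grid, dens = [], []
    for i in range(grid_size):
        x = lo + i * step
        y = norm * sum(
            math.exp(-0.5 * ((x - d) / bandwidth_days) ** 2) for d in days
        )
        grid.append(date.fromordinal(round(x)))  # day number -> date
        dens.append(y)
    return grid, dens


# Hypothetical observations spread over roughly five years.
rng = random.Random(1)
obs_dates = [
    date(2010, 1, 1) + timedelta(days=rng.randrange(5 * 365))
    for _ in range(500)
]

grid, dens = date_kde(obs_dates, bandwidth_days=30.0)
year_ticks = sorted({d.year for d in grid})  # candidate x-axis labels
```

The `grid` list holds ordinary `date` objects, so a plotting library that understands dates can label the x-axis by year directly instead of by elapsed days.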

Getting values from kernel density estimation in R

夙愿已清 posted on 2019-12-02 21:04:29
I am trying to get density estimates for the log of stock prices in R. I know I can plot it using plot(density(x)). However, I actually want values for the function. I'm trying to implement the kernel density estimation formula. Here's what I have so far:
    a <- read.csv("boi_new.csv", header=FALSE)
    S = a[,3] # takes column of increments in stock prices
    dS=S[!is.na(S)] # omits first empty field
    N = length(dS) # Sample size
    rseed = 0 # Random seed
    x = rep(c(1:5),N/5) # Inputted data
    set.seed(rseed) # Sets random seed for reproducibility
    QL <- function(dS){
      h = density(dS)$bw # bandwidth chosen by density()
      r = log(dS^2)