missing-data | 易学教程

How to extract a complete dataset from the Amelia package

In the mice package, you can extract a completed dataset with the complete() command, as follows:

    install.packages("mice")
    library("mice")
    imp1 <- mice(nhanes, m = 10)
    fill1 <- complete(imp1, 1)
    fill2 <- complete(imp1, 2)
    fillall <- complete(imp1, "long")

But can someone tell me how to extract a complete dataset in the Amelia package?

    install.packages("Amelia")
    library("Amelia")
    imp2 <- amelia(freetrade, m = 5, ts = "year", cs = "country")

The str() function is always helpful here. You'll see that the complete datasets are stored in the imputations element of the object returned by amelia():

    > str(imp2, 1)
    List of 12 $
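As a minimal sketch of the point about the imputations element: each completed dataset can be pulled out of that list by index, much like complete() in mice. This assumes the Amelia package and its bundled freetrade data are installed. The .imp column and the stacking step are illustrative choices for mimicking the "long" format, not part of Amelia's API.

```r
library(Amelia)

data(freetrade)
set.seed(1)  # imputations are random draws; fix the seed for reproducibility
imp2 <- amelia(freetrade, m = 5, ts = "year", cs = "country", p2s = 0)

# Each completed dataset is one data frame in the imputations list
fill1 <- imp2$imputations[[1]]
fill2 <- imp2$imputations[[2]]

# Rough equivalent of complete(imp, "long"): stack all m datasets and
# tag each row with the imputation it came from (.imp is our own label)
fillall <- do.call(rbind, Map(function(d, i) cbind(.imp = i, d),
                              imp2$imputations,
                              seq_along(imp2$imputations)))
```

Amelia also provides write.amelia() for writing the imputed datasets out to files, which may be more convenient when the analysis happens outside R.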

R: populating and/or duplicating rows based upon other columns

Question: My question is based upon this question. I have data as below. I want to fill cells by first looking down and then looking up, as long as the bom is the same. In the case of bom=A, I want to fill up rows as shown. But in the case of bom=B, as the type_p column is different, I want to duplicate rows and fill the blanks.

    bom=c(rep("A",4),rep("B",3))
    Part=c("","lambda","beta","","tim","tom","")
    type_p=c("","sub","sub","","sub","pan","")
    ww=c(1,2,3,4,1,2,3)
    df=data.frame(bom,Part,type_p,ww)

    > df
    bom Part

interactions terms in multiple imputations (Amelia or other mi packages)

I have a question about interaction terms in multiple imputation. My understanding is that the imputation model is supposed to include all information that is used in the later analysis, including any transformations or interactions of variables (the Amelia user guide also makes this statement). But when I include the interaction term int=x1*x2 in the imputation, the imputed value for int is not equal to x1*x2. For example, when I have a binary variable x2 and a continuous variable x1, int should be zero when x2 is zero. That is not the case for the imputed values of int. So how do I treat

Add missing values in time series efficiently

I have 500 datasets (panel data). In each I have a time series (week) across different shops (store). Within each shop, I need to add the missing time series observations. A sample of my data would be:

    store week value
    1     1    50
    1     3    52
    1     4    10
    2     1    4
    2     4    84
    2     5    2

which I would like to look like:

    store week value
    1     1    50
    1     2    0
    1     3    52
    1     4    10
    2     1    4
    2     2    0
    2     3    0
    2     4    84
    2     5    2

I currently use the following code (which works, but takes a very long time on my data):

    stores<-unique(mydata$store)
    for (i in 1:length(stores)){
      mydata <- merge(
        expand.grid(week=min(mydata$week):max(mydata$week)),
        mydata, all=TRUE
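One way to avoid merging inside a loop is to build the full store-by-week grid once and merge a single time. The following is a base R sketch under the assumption, read off the desired output above, that each store is padded only between its own first and last week and that missing values become 0. The names grid and filled are illustrative.

```r
# Sample data from the question
mydata <- data.frame(store = c(1, 1, 1, 2, 2, 2),
                     week  = c(1, 3, 4, 1, 4, 5),
                     value = c(50, 52, 10, 4, 84, 2))

# One row per store and week, spanning each store's own week range
grid <- do.call(rbind, lapply(split(mydata, mydata$store), function(d) {
  data.frame(store = d$store[1], week = seq(min(d$week), max(d$week)))
}))

# A single left join onto the grid; weeks without data come back as NA
filled <- merge(grid, mydata, by = c("store", "week"), all.x = TRUE)
filled$value[is.na(filled$value)] <- 0
filled <- filled[order(filled$store, filled$week), ]
rownames(filled) <- NULL
```

With 500 datasets, the same steps could be wrapped in a function and applied to each one with lapply().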

Identifying rows in data.frame with only NA values in R

Question: I have a data.frame with 15,000 observations of 34 ordinal variables that contain NA values. I am performing clustering for a market segmentation study and need the rows with only NAs removed. After taking out the userID, I got an error message saying to omit 2099 rows with only NAs before clustering. I found a link for removing rows with all NA values, but I need to identify which of the 2099 rows have all NA values. Here is the link to the discussion on removing rows with all NA values: Remove Rows with NAs in
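A minimal base R sketch of identifying, rather than only dropping, the all-NA rows. The toy data frame and its column names are made up for illustration. With real data, the check would be run on the survey columns only, with userID kept aside so the flagged rows can be traced back.

```r
# Toy stand-in for the survey data: rows 2 and 4 contain only NAs
df <- data.frame(q1 = c(1, NA, 3, NA),
                 q2 = c(NA, NA, 2, NA),
                 q3 = c(2, NA, NA, NA))

# TRUE for rows where no column holds an observed value
all_na <- rowSums(!is.na(df)) == 0

# Row numbers of the all-NA rows
na_rows <- which(all_na)

# The data with those rows removed, ready for clustering
df_clean <- df[!all_na, , drop = FALSE]
```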

missing post data by using HttpWebRequest

I have a problem posting data with HttpWebRequest. There is a string (e.g. key1=value1&key2=value2&key3=value3) that I have to post to a site (e.g. www.*.com/edit). I don't know why, but sometimes nothing goes wrong, and sometimes the first pair, key1=value1, is missing, and only key2=value2&key3=value3 can be found in HttpAnalyzer.

    public static string SubmitData(string Url, string FormData, CookieContainer _Cc, string ContentType)
    {
        Stream RequestStream = null, ResponseStream = null;
        StreamReader Sr = null;
        HttpWebRequest HRequest = (HttpWebRequest)WebRequest.Create(Url);
        try
        {
            HRequest

How do you make a heat map and cluster with NA values?

I am trying to make a heat map from my data but am struggling to code it properly. My matrix is filled with log(x+1) values, so I don't encounter log(0) errors. However, due to the nature of my data I have a lot of 0 values, and they mask any trends the heat map could be showing. Because of that, I want to colour any 0 values grey or black and colour the rest of my data along a blue-white-red spectrum. Here is the code I am using:

    RHeatmap <- read.delim("~/Desktop/RHeatmap.txt", row.names=1, stringsAsFactors = FALSE)
    my_palette <- colorRampPalette(c("blue", "white", "red"))

How can I get missing values recorded as NULL when importing from csv

I have multiple large CSV files, each of which has missing values in many places. When I import a CSV file into SQLite, I would like the missing values to be recorded as NULL, because another application expects missing data to be indicated by NULL. My current method does not produce the desired result. An example CSV file (test.csv) is:

    12|gamma|17|delta
    67||19|zeta
    96|eta||theta
    98|iota|29|

The first line is complete; each of the other lines has (or is meant to show!) a single missing item. When I import using:

    .headers on
    .mode column
    .nullvalue NULL
    CREATE TABLE t ( id1

Multi-level regression model on multiply imputed data set in R (Amelia, zelig, lme4)

I am trying to run a multi-level model on multiply imputed data (created with Amelia); the sample is based on a clustered sample with group = 24, N = 150.

    library("ZeligMultilevel")
    ML.model.0 <- zelig(dv~1 + tag(1|group), model="ls.mixed", data=a.out$imputations)
    summary(ML.model.0)

This code produces the following error:

    Error in object[[1]]$result$call : $ operator not defined for this S4 class

If I run an OLS regression, it works:

    model.0 <- zelig(dv~1, model="ls", data=a.out$imputations)
    m.0 <- coef(summary(model.0))
    print(m.0, digits = 2)
         Value Std. Error t-stat p-value
    [1,] 45 0.34

Imputing missing values using ARIMA model

Question: I am trying to impute missing values in a time series with an ARIMA model in R. I tried this code, but without success.

    library(forecast)  # provides auto.arima()
    x <- AirPassengers
    x[90:100] <- NA
    fit <- auto.arima(x)
    fitted(fit)[90:100]  ## this is giving me NAs
    plot(x)
    lines(fitted(fit), col="red")

The fitted model is not imputing the missing values. Any idea how this is done?

Answer 1: fitted gives in-sample one-step forecasts. The "right" way to do what you want is via a Kalman smoother. A rough approximation good enough for most purposes
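The Kalman smoother route mentioned in the answer can be sketched with base R's stats package alone. This is an illustration under assumptions, not the answerer's own code. It swaps auto.arima() for stats::arima() with a hand-picked "airline" ARIMA(0,1,1)(0,1,1)[12] specification, and it rebuilds the smoothed series by multiplying the smoothed states by the model's observation vector Z.

```r
x <- AirPassengers
x[90:100] <- NA

# Assumed model order (not selected by auto.arima); arima() tolerates NAs
fit <- arima(x, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))

# Smoothed state estimates given the whole series, one row per time point
ks <- KalmanSmooth(x, fit$model)

# Map the states back to the observation scale
smoothed <- drop(ks$smooth %*% fit$model$Z)

# Fill only the gaps; observed values stay untouched
x_imp <- x
miss <- which(is.na(x))
x_imp[miss] <- smoothed[miss]
```

Unlike fitted(), which only looks backward, the smoother uses observations on both sides of the gap, so the filled values join up with the series where it resumes.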