missing-data

How to extract complete datasets from the Amelia package

两盒软妹~` submitted on 2019-12-06 05:10:00
In the mice package, you can extract a completed dataset with the complete() command as follows:

    install.packages("mice")
    library("mice")
    imp1 = mice(nhanes, 10)
    fill1 = complete(imp1, 1)
    fill2 = complete(imp1, 2)
    fillall = complete(imp1, "long")

But can someone tell me how to extract the completed datasets in the Amelia package?

    install.packages("Amelia")
    library("Amelia")
    imp2 = amelia(freetrade, m = 5, ts = "year", cs = "country")

The str() function is always helpful here. You'll see that the complete datasets are stored in the imputations element of the object returned by amelia():

    > str(imp2, 1)
    List of 12 $
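
As a rough counterpart to the complete() calls in the question, the imputations element can be indexed like any other list. The sketch below assumes the Amelia package is installed and reuses the question's freetrade call. The names fill1, fill2 and fillall mirror the mice example, and the imp column added to the stacked data is an illustrative choice, not something Amelia creates.

```r
library(Amelia)
data(freetrade)

# Same call as in the question; p2s = 0 only silences the progress output.
imp2 <- amelia(freetrade, m = 5, ts = "year", cs = "country", p2s = 0)

# Each completed dataset is one element of the imputations list.
fill1 <- imp2$imputations[[1]]   # first completed dataset
fill2 <- imp2$imputations$imp2   # second one, accessed by name

# Rough equivalent of complete(imp, "long"): stack all completed datasets
# and record which imputation each row came from (the imp column is our own).
fillall <- do.call(rbind, lapply(seq_along(imp2$imputations), function(i) {
  cbind(imp = i, imp2$imputations[[i]])
}))
```

Stacking keeps the result a plain data frame, so it can be split by imp again later for per-imputation analyses.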

R: populating and/or duplicating rows based upon other columns

99封情书 submitted on 2019-12-06 03:04:37
Question: My question is based upon this question. I have data as below. I want to fill cells by first looking down and then looking up, as long as the bom is the same. In the case of bom=A, I want to fill up rows as shown. But in the case of bom=B, as the type_p column differs, I want to duplicate rows and fill the blanks.

    bom = c(rep("A",4), rep("B",3))
    Part = c("","lambda","beta","","tim","tom","")
    type_p = c("","sub","sub","","sub","pan","")
    ww = c(1,2,3,4,1,2,3)
    df = data.frame(bom, Part, type_p, ww)
    > df
    bom Part

Interaction terms in multiple imputation (Amelia or other MI packages)

旧巷老猫 submitted on 2019-12-06 02:09:36
I have a question about interaction terms in multiple imputation. My understanding is that the imputation model is supposed to include all information that is used in the later analysis, including any transformations or interactions of variables (the Amelia user guide also makes this statement). But when I include the interaction term int=x1*x2 in the imputation, the imputed value for int is not equal to x1*x2. For example, when I have a binary variable x2 and a continuous variable x1, int should be zero when x2 is zero. That is not the case for the imputed values of int. So how do I treat

Add missing values in time series efficiently

空扰寡人 submitted on 2019-12-05 21:16:49
I have 500 datasets (panel data). In each, I have a time series (week) across different shops (store). Within each shop, I need to add the missing time series observations. A sample of my data would be:

    store week value
    1     1    50
    1     3    52
    1     4    10
    2     1    4
    2     4    84
    2     5    2

which I would like to look like:

    store week value
    1     1    50
    1     2    0
    1     3    52
    1     4    10
    2     1    4
    2     2    0
    2     3    0
    2     4    84
    2     5    2

I currently use the following code (which works, but takes a very long time on my data):

    stores <- unique(mydata$store)
    for (i in 1:length(stores)) {
      mydata <- merge(
        expand.grid(week = min(mydata$week):max(mydata$week)),
        mydata, all = TRUE
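
One way to avoid merging inside a loop is to build the full store/week grid once and merge a single time. The following is a minimal base R sketch using the sample data from the question. It assumes each store should span from its own first to its own last observed week, which is what the desired output suggests, and the names grid and filled are illustrative.

```r
# Sample data from the question
mydata <- data.frame(
  store = c(1, 1, 1, 2, 2, 2),
  week  = c(1, 3, 4, 1, 4, 5),
  value = c(50, 52, 10, 4, 84, 2)
)

# For each store, list every week between its first and last observed week.
grid <- do.call(rbind, lapply(split(mydata, mydata$store), function(d) {
  data.frame(store = d$store[1], week = seq(min(d$week), max(d$week)))
}))

# A single left join onto the grid; weeks absent from mydata get NA for value.
filled <- merge(grid, mydata, by = c("store", "week"), all.x = TRUE)

# Replace the NAs introduced by the join with 0, as in the desired output.
filled$value[is.na(filled$value)] <- 0
filled <- filled[order(filled$store, filled$week), ]
rownames(filled) <- NULL
filled
```

For the 500 datasets, the same steps could be wrapped in a function and applied to each one with lapply().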

Identifying rows in data.frame with only NA values in R

放肆的年华 submitted on 2019-12-05 18:09:38
Question: I have a data.frame with 15,000 observations of 34 ordinal variables, including NA values. I am performing clustering for a market segmentation study and need the rows that contain only NAs removed. After taking out the userID, I got an error message saying to omit 2099 rows with only NAs before clustering. I found a link on removing rows with all NA values, but I need to identify which of the 2099 rows have all NA values. Here is the link to the discussion on removing rows with all NA values: Remove Rows with NAs in
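
A small base R sketch of how such rows could be identified follows. The toy data frame and its column names (userID, q1, q2, q3) are made up for illustration. The idea is to count the NAs per row over the non-ID columns and flag rows where that count equals the number of those columns.

```r
# Hypothetical toy data: userID plus three survey variables
df <- data.frame(
  userID = c("u1", "u2", "u3", "u4"),
  q1 = c(1, NA, 3, NA),
  q2 = c(NA, NA, 2, NA),
  q3 = c(2, NA, NA, NA)
)

# Columns to check: everything except the ID
vars <- setdiff(names(df), "userID")

# TRUE where every checked column is NA
all_na <- rowSums(is.na(df[vars])) == length(vars)

na_rows <- unname(which(all_na))  # row positions: 2 4
na_ids  <- df$userID[all_na]      # matching IDs: "u2" "u4"

# Data without the all-NA rows, ready for clustering
df_clean <- df[!all_na, , drop = FALSE]
```

Keeping the logical vector all_na around makes it easy both to report the affected users and to drop them.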

Missing POST data when using HttpWebRequest

半城伤御伤魂 submitted on 2019-12-05 16:18:01
I have a problem posting data with HttpWebRequest. I have a string (e.g. key1=value1&key2=value2&key3=value3) that I post to a site (e.g. www.*.com/edit). Sometimes nothing goes wrong, but sometimes the first pair, key1=value1, is missing and only key2=value2&key3=value3 shows up in HttpAnalyzer.

    public static string SubmitData(string Url, string FormData, CookieContainer _Cc, string ContentType)
    {
        Stream RequestStream = null, ResponseStream = null;
        StreamReader Sr = null;
        HttpWebRequest HRequest = (HttpWebRequest)WebRequest.Create(Url);
        try
        {
            HRequest

How do you make a heat map and cluster with NA values?

南笙酒味 submitted on 2019-12-05 13:01:42
I am trying to make a heat map from my data but am struggling to code it properly. My matrix is filled with log(x+1) values so that I don't encounter log(0) errors. However, due to the nature of my data, I have many 0 values, and they mask any trends the heat map could be showing. Because of that, I want to colour any 0 values grey or black and colour the rest of my data along a blue-white-red spectrum. Here is the code I am using:

    RHeatmap <- read.delim("~/Desktop/RHeatmap.txt", row.names=1, stringsAsFactors = FALSE)
    my_palette <- colorRampPalette(c("blue", "white", "red"))

How can I get missing values recorded as NULL when importing from csv

你。 submitted on 2019-12-05 12:01:07
I have multiple large CSV files, each of which has missing values in many places. When I import a CSV file into SQLite, I would like the missing values recorded as NULL, because another application expects missing data to be indicated by NULL. My current method does not produce the desired result. An example CSV file (test.csv) is:

    12|gamma|17|delta
    67||19|zeta
    96|eta||theta
    98|iota|29|

The first line is complete; each of the other lines has (or is meant to show!) a single missing item. When I import using:

    .headers on
    .mode column
    .nullvalue NULL
    CREATE TABLE t ( id1

Multi-level regression model on multiply imputed data set in R (Amelia, zelig, lme4)

*爱你&永不变心* submitted on 2019-12-05 07:28:32
I am trying to run a multi-level model on multiply imputed data (created with Amelia); the data come from a clustered sample with group = 24, N = 150.

    library("ZeligMultilevel")
    ML.model.0 <- zelig(dv~1 + tag(1|group), model="ls.mixed", data=a.out$imputations)
    summary(ML.model.0)

This code produces the following error:

    Error in object[[1]]$result$call : $ operator not defined for this S4 class

If I run an OLS regression, it works:

    model.0 <- zelig(dv~1, model="ls", data=a.out$imputations)
    m.0 <- coef(summary(model.0))
    print(m.0, digits = 2)
    Value Std. Error t-stat p-value
    [1,] 45 0.34

Imputing missing values using ARIMA model

让人想犯罪 __ submitted on 2019-12-05 04:57:24
Question: I am trying to impute missing values in a time series with an ARIMA model in R. I tried this code, but without success.

    x <- AirPassengers
    x[90:100] <- NA
    fit <- auto.arima(x)
    fitted(fit)[90:100] ## this is giving me NAs
    plot(x)
    lines(fitted(fit), col="red")

The fitted model is not imputing the missing values. Any idea how this is done?

Answer 1: fitted gives in-sample one-step forecasts. The "right" way to do what you want is via a Kalman smoother. A rough approximation good enough for most purposes