Question
Thank you for your time. I have the following data (snippet). It comes from longitudinal data on work status, reshaped into a wide-format file: each column represents one month, each row an individual.
Code:
j1992_12 = c(1, 10, 1, 7, 1, 1)
j1993_01 = c( 1, 1, 1, NA, 3, 1)
j1993_02 = c( 1, 1, 1, NA, 3, 1)
j1993_03 = c( 1, 8, 1, NA, 3, 1)
j1993_04 = c( 1, 8, 1, NA, 3, 1)
j1993_05 = c( 1, 8, 1, NA, 3, 1)
j1993_06 = c( 1, 8, 1, NA, 3, 1)
j1993_07 = c( 1, 8, 1, NA, 3, 1)
j1993_08 = c( 1, 8, 1, NA, 3, 1)
j1993_09 = c( 1, 8, 1, NA, 3, 1)
j1993_10 = c( 1, 8, 1, NA, 3, 1)
j1993_11 = c( 1, 8, 1, NA, 3, 1)
j1993_12 = c( 1, 8, 1, NA, 3, 1)
j1994_01 = c( 1, 8, 1, 7, 3, 1)
DF93= data.frame(j1992_12, j1993_01, j1993_02, j1993_03, j1993_04, j1993_05, j1993_06, j1993_07, j1993_08, j1993_09, j1993_10, j1993_11, j1993_12, j1994_01)
Output:
   j1992_12 j1993_01 j1993_02 j1993_03 j1993_04 j1993_05 j1993_06 j1993_07 j1993_08 j1993_09 j1993_10 j1993_11 j1993_12 j1994_01
R1        1        1        1        1        1        1        1        1        1        1        1        1        1        1
R2       10        1        1        8        8        8        8        8        8        8        8        8        8        8
R3        1        1        1        1        1        1        1        1        1        1        1        1        1        1
R4        7       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA        7
R5        1        3        3        3        3        3        3        3        3        3        3        3        3        3
R6        1        1        1        1        1        1        1        1        1        1        1        1        1        1
I want to check for occurrences of 12 consecutive months of NA, as in row R4. For those rows, I would then like to check whether the last value of the preceding year (j1992_12) equals the first value of the following year (j1994_01). If so, I assume there was no change in work status, and all 12 months should get the value given in the last month of the preceding year. If not, everything should stay untouched.
Method so far:
DF93_2 = DF93
DF93_2[,2:13] <- ifelse (is.na( DF93[,2:13]) && (DF93[,1]==DF93[,14]), DF93[,1] , DF93[,2:13])
I now see that if I try it with just a single column, as in the code below, it replaces the whole column. How can I get R to replace row-wise only?
DF93_2[,2] <- ifelse (is.na( DF93[,2:13]) && (DF93[,1]==DF93[,14]), DF93[,1] , DF93[,2])
If someone could please give me a hint where the flaw in my understanding of R is, I would be very grateful.
EDIT: Only the original file is in long (longitudinal) format; the current format is wide, which is what I need for a time series analysis. It has already been cross-checked against survey data for all years (18 years, from 1992 to 2010), so I would rather not transform it back into long format. I am looking for a solution with conditions like those described above, which I can adjust as the condition changes.
After further testing, I think the problem lies in detecting 12 consecutive NAs in a row. I just cannot find a solution for that. If you have any idea, please share. Thank you!
Answer 1:
EWAZ99_2[,15:26] <- ifelse ( is.na( EWAZ99[,15:26]) & (EWAZ99[,14]==EWAZ99[,27]), EWAZ99[,14] , EWAZ99[,15:26])
I think this is what you are looking for.
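The relevant change in the one-liner above is & instead of &&: & compares element by element, whereas && is meant for a single TRUE/FALSE (older R versions silently used only the first element of longer inputs; R 4.3.0 and later raise an error). As a hedged sketch of the same idea on the question's DF93 rather than EWAZ99 (which is not shown in this thread), the following works on a matrix copy and additionally requires all twelve months to be NA. The names m, all_na, same_ends and fill_row are made up for illustration, and the code assumes all columns are numeric.

```r
# Rebuild the example data from the question (rows = persons, columns = months).
DF93 <- as.data.frame(rbind(
  c(1,  rep(1, 13)),
  c(10, 1, 1, rep(8, 11)),
  c(1,  rep(1, 13)),
  c(7,  rep(NA, 12), 7),
  c(1,  rep(3, 13)),
  c(1,  rep(1, 13))
))
names(DF93) <- c("j1992_12", sprintf("j1993_%02d", 1:12), "j1994_01")

# Work on a numeric matrix copy (assumption: every column is numeric).
m <- as.matrix(DF93)

all_na    <- rowSums(is.na(m[, 2:13])) == 12   # all 12 months of 1993 missing
same_ends <- (m[, 1] == m[, 14]) %in% TRUE     # NA comparisons count as FALSE
fill_row  <- all_na & same_ends                # element-wise AND, one value per row

# Fill only the qualifying rows with their j1992_12 value.
m[fill_row, 2:13] <- m[fill_row, 1]
DF93_2 <- as.data.frame(m)
```

With the example data only the fourth row qualifies, so only that row should change.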
Answer 2:
Not sure if I understood you correctly; does something like this help?
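naAction <- function(x) {
  # Only act on rows that contain at least one NA
  if (any(is.na(x))) {
    # Fill the gaps when the first and last values of the row agree
    if (x[1] == x[length(x)]) {
      x[is.na(x)] <- x[1]
    }
  }
  x
}
# Apply per row (margin 1); apply() returns the rows as columns, so transpose back
t(apply(DF93, 1, naAction))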
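One detail worth spelling out when using apply() for this kind of task: margin 1 hands the function one row at a time, margin 2 one column at a time, and the per-row results come back as columns, so they need t() to regain the original orientation. A small sketch on made-up toy data (the matrix m and the helper fill_if_ends_match are illustrative only, not from the question):

```r
# Toy matrix: row 1 has matching ends (1 ... 1), row 2 does not (2 ... 3).
m <- matrix(c(1, NA, 1,
              2, NA, 3), nrow = 2, byrow = TRUE)

# Hypothetical helper: fill NAs only when first and last value agree.
fill_if_ends_match <- function(x) {
  if (anyNA(x) && isTRUE(x[1] == x[length(x)])) {
    x[is.na(x)] <- x[1]
  }
  x
}

raw_rows <- apply(m, 1, fill_if_ends_match)  # one result column per input row
by_row   <- t(raw_rows)                      # transpose back to 2 x 3
by_col   <- apply(m, 2, fill_if_ends_match)  # column-wise: compares persons, not months

dim(raw_rows)  # 3 2
by_row
```

Here the row-wise call fills the gap in the first row and leaves the second row alone, which is the behaviour the question asks for.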
Answer 3:
Here's one way:
as.data.frame(t(apply(DF93, 1, function(x)
  if (x[1] == tail(x, 1) && all(is.na(head(x, -1)[-1])))
    replace(x, is.na(x), x[1]) else x)))
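Since the question mentions 18 years of monthly columns, the same row-wise rule can be wrapped in a loop over years. A minimal sketch, assuming all columns are numeric and follow the jYYYY_MM naming seen in the question; fill_year_gaps is a hypothetical helper, not something taken from the answers here:

```r
# Hypothetical helper: for each year, fill the 12 months only when all of them
# are NA and the neighbouring months (December before, January after) agree.
# Assumes numeric columns named jYYYY_MM.
fill_year_gaps <- function(df, years) {
  m <- as.matrix(df)
  for (y in years) {
    inner  <- sprintf("j%d_%02d", y, 1:12)
    before <- sprintf("j%d_12", y - 1)
    after  <- sprintf("j%d_01", y + 1)
    # Skip years for which some of the needed columns do not exist.
    if (!all(c(inner, before, after) %in% colnames(m))) next
    all_na <- rowSums(is.na(m[, inner, drop = FALSE])) == 12
    same   <- (m[, before] == m[, after]) %in% TRUE  # NA comparison -> FALSE
    rows   <- all_na & same
    m[rows, inner] <- m[rows, before]
  }
  as.data.frame(m)
}

# Example data from the question.
DF93 <- as.data.frame(rbind(
  c(1,  rep(1, 13)),
  c(10, 1, 1, rep(8, 11)),
  c(1,  rep(1, 13)),
  c(7,  rep(NA, 12), 7),
  c(1,  rep(3, 13)),
  c(1,  rep(1, 13))
))
names(DF93) <- c("j1992_12", sprintf("j1993_%02d", 1:12), "j1994_01")

DF93_filled <- fill_year_gaps(DF93, 1993)
```

Years whose neighbouring columns are missing are skipped, so passing the full range of years in the file should be harmless. Unlike the if() version above, the comparison here treats an NA in either neighbouring month as "no match" instead of stopping with an error.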
Source: https://stackoverflow.com/questions/27298706/how-to-substitute-several-na-with-values-within-the-df-using-if-else-in-r