Question
I'm sure this can be fixed with a few bytes, but I've spent hours on this simple thing and can't get out of it. I don't use R often.
I have 5 asciigrid files that represent 5 raster images. Some pixels have values, others are NA. For example, the first image might be something like:
NA NA NA NA NA
NA NA 2 3 NA
NA 0.2 0.3 1 NA
NA NA 4 NA NA
and the second might be:
NA NA NA NA NA
NA NA 5 1 NA
NA 0.1 12 12 NA
NA NA 6 NA NA
As you can see, the NA positions are always the same and I'm 100% sure about that. What I want to do:
- read the files with read.asciigrid();
- get their values in a long array, using values() from the raster package;
- create a matrix with 5 rows, each of them holding the values of the corresponding map;
- linear fit each column and get the coefficients. Each column will represent a pixel, and will have 5 values corresponding to the 5 maps.
- create two new raster images with the coefficient values.
I'm stuck at lm. Specifically, it says: Error in lm.fit(...): 0 (non-NA) cases. However, from what I know about the input maps, there should be either columns with all NAs or columns with no NA at all, as follows:
NA NA NA NA 0.2 2 NA ... (lots of other columns)
NA NA NA NA 2 2.1 NA
NA NA NA NA 3 0.5 NA
NA NA NA NA 12 6 NA
NA NA NA NA 0.4 2 NA
I'd expect the output to be:
NA NA NA NA .. .. NA
so I can create a new raster image with the coefficients and keep the NA position. Where am I wrong? Pasting my code below. Thank you.
library(sp)
library(raster)
library(fields)
names = c('...','...','...','...','...')
x = c(10,20,30,40,50)
x = log(x)
y = vector('list',length=length(x))
rasters = vector('list',length=length(x))
for (name in names) {
  ind = which(name == names)
  rasters[ind] = read.asciigrid(name)
  rasters[ind] = raster(rasters[[ind]])
  y[[ind]] = values(rasters[[ind]])
}
y = t(simplify2array(y))
lModel = lm(y ~ x)  # Error here!
This is the output of str(y):
num [1:5, 1:1260630] NA NA NA NA NA NA NA NA NA NA ... (at some point there will be numbers here)
Edit
Thanks to @RobertH I learned about raster::stack and raster::calc. I have tried:
x <- log(c(10,20,30,40,50))
fun <- function(y) { lm(y ~ x)$coefficients }
r <- calc(s, fun)
This gives an obscure Cannot use this function error on a .calcTest call. I looked at raster:::.calcTest with no success. I have tried handling the case where all y values are NA, like so:
fun = function(y) {
  if (any(!is.na(y))) {
    lm(y ~ x)$coefficients
  } else {
    NA
  }
}
r <- calc(s,fun)
Now it runs for some minutes, but then I get Error in setValues(out, x) : values must be numeric, integer, logical or factor. However, it is common to set raster values to NA! I can't see what's wrong here.
Answer 1:
This is how you can get the raster data:
library(raster)
names = c('...','...','...','...','...')
s <- stack(names)
y <- values(s)
You could now do something like this.
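x <- log(c(10,20,30,40,50))
# keep only the rows (cells) that are not all NA
i <- rowSums(is.na(y)) < ncol(y)
# apply() returns a 2 x n matrix: one column of coefficients per cell
coef <- apply(y[i, ], 1, function(y) lm(y ~ x)$coefficients)
aa <- matrix(NA, ncol=2, nrow=length(i))
# transpose so each row holds (intercept, slope) for one cell
aa[i, ] <- t(coef)
b <- brick(s, nl=2)
values(b) <- aa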
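To see the shapes involved without any raster files, here is a minimal base-R sketch. The matrix y_demo and its values are made up for illustration (three cells, five layers, laid out like values(s)); it is not the asker's data.

```r
x <- log(c(10, 20, 30, 40, 50))

# Hypothetical values: one row per cell, one column per layer
y_demo <- rbind(
  c(NA, NA, NA, NA, NA),
  c(0.2, 2, 3, 12, 0.4),
  c(2, 2.1, 0.5, 6, 2)
)

# Keep only the cells that have at least one non-NA value
i <- rowSums(is.na(y_demo)) < ncol(y_demo)

# apply() over rows returns one COLUMN per cell (a 2 x 2 matrix here)
coef_demo <- apply(y_demo[i, , drop = FALSE], 1,
                   function(v) lm(v ~ x)$coefficients)

# Transpose so that each cell is a row again: (intercept, slope)
aa <- matrix(NA_real_, nrow = length(i), ncol = 2)
aa[i, ] <- t(coef_demo)
```

The first row of aa stays NA, so the NA positions of the input survive in the coefficient layers.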
But you do not need to do that. To do regression like this, I would do
fun <- function(y) { lm(y ~ x)$coefficients }
r <- calc(s, fun)
But because you have cells with only NA values (across the layers) this will fail (like in the apply above). You need to write a function to catch these cases:
funa <- function(y) {
  if(all(is.na(y))) {
    c(NA, NA)
  } else {
    lm(y ~ x)$coefficients
  }
}
r <- calc(s, funa)
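Why c(NA, NA) rather than a single NA? A small base-R sketch, with made-up cell values, suggests an answer: when a function returns results of different lengths, apply() gives back a list instead of a matrix. That is presumably what trips setValues in the question's second attempt, although I have not traced it inside calc. fun_q below is the question's version and cells is a hypothetical two-cell input.

```r
x <- log(c(10, 20, 30, 40, 50))

# Hypothetical cell values: one row per cell, one column per layer
cells <- rbind(
  rep(NA_real_, 5),
  c(0.2, 2, 3, 12, 0.4)
)

# Version from the question: length 1 for empty cells, length 2 otherwise
fun_q <- function(y) {
  if (any(!is.na(y))) lm(y ~ x)$coefficients else NA
}

# Version from this answer: always length 2
funa <- function(y) {
  if (all(is.na(y))) c(NA, NA) else lm(y ~ x)$coefficients
}

ragged  <- apply(cells, 1, fun_q)  # lengths differ, so apply() returns a list
regular <- apply(cells, 1, funa)   # equal lengths, so apply() returns a matrix

is.list(ragged)
is.matrix(regular)
```

Returning the same number of values for every cell keeps the result rectangular, which is what a two-layer output needs.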
Or, for a much faster approach:
# design matrix: a column of ones (intercept) plus the predictor x
X <- cbind(1, x)
invXtX <- solve(t(X) %*% X) %*% t(X)
quickfun <- function(i) (invXtX %*% i)
m <- calc(s, quickfun)
names(m) <- c('intercept', 'slope')
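As a sanity check on the linear-algebra shortcut, the following base-R sketch compares it with lm for a single cell. The vector y_cell is a made-up example; the only assumption is that the design matrix is built from the predictor x, with a column of ones for the intercept.

```r
x <- log(c(10, 20, 30, 40, 50))

X <- cbind(1, x)                      # design matrix: intercept column plus x
invXtX <- solve(t(X) %*% X) %*% t(X)  # 2 x 5 matrix, computed once

y_cell <- c(0.2, 2, 3, 12, 0.4)       # hypothetical values of one cell

fast <- as.vector(invXtX %*% y_cell)  # least squares via matrix product
slow <- unname(coef(lm(y_cell ~ x)))  # same fit via lm()
all.equal(fast, slow)

# An all-NA cell simply propagates to NA coefficients, no special case needed
na_coef <- as.vector(invXtX %*% rep(NA_real_, 5))
```

Note that a cell with only some layers missing would also come out as NA here, whereas lm would drop the missing layers and fit the rest. That does not matter when the NA positions are identical across layers, as in the question.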
See ?raster::calc
Source: https://stackoverflow.com/questions/32975210/linear-regression-on-raster-images-lm-complains-about-nas