Getting a subset error I did not get two months ago when running logistic regression (svyglm) on survey data (SPSS dataset)

Submitted by 我怕爱的太早我们不能终老 on 2020-06-28 04:00:07

Question


I re-ran a script that worked with no errors about two months ago.

I used the haven package to import a (non-public and proprietary) SPSS dataset and the survey package to analyze complex survey data.

Now, however, when I run even a simple logistic regression where both variables are dummies (coded 0 for no and 1 for yes)...something like this...

library(haven)
library(survey)
f <- read_sav("~/data.sav")
fsd <- svydesign(ids=~1, data=f, weights=~weight)
model <- svyglm(exclhlth~male,design=fsd,family=quasibinomial())

...I get the following errors:

Error: Must subset elements with a valid subscript vector.
x Subscript has the wrong type `omit`.
ℹ It must be logical, numeric, or character.
Run `rlang::last_error()` to see where the error occurred.
> rlang::last_error()
<error/vctrs_error_subscript_type>
Must subset elements with a valid subscript vector.
x Subscript has the wrong type `omit`.
ℹ It must be logical, numeric, or character.
Backtrace:
 1. survey::svyglm(exclhlth ~ male, design = fsd, family = quasibinomial())
 2. survey:::svyglm.survey.design(...)
 4. survey:::`[.survey.design2`(design, -nas, )
 5. base::`[.data.frame`(x$variables, i, ..1, drop = FALSE)
 7. vctrs:::`[.vctrs_vctr`(xj, i)
 8. vctrs:::vec_index(x, i, ...)
 9. vctrs::vec_slice(x, i)
Run `rlang::last_trace()` to see the full context.

I've tried running it where I set male as a factor, and where both are set as factors. I get the same errors.

Since then, I have updated R, RStudio, and both the haven and survey packages. So I'm guessing that something changed, but I am not sure what to do.

I only started transitioning from SPSS to R late last year, so I thank you in advance for any guidance and apologize in advance for newbie mistakes!


Answer 1:


Ok, your problem seems to be that the RStudio data import functions are creating classes that hijack the subscript ([) operation. This has happened before, when RStudio switched from creating data.frame to tbl objects, but then it was sufficient to use as.data.frame() before calling svydesign().

Until a new version of the survey package is available, can you try using foreign::read.spss instead of haven::read_sav?
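A rough sketch of that workaround. Because the original file is confidential, the example first writes a tiny made-up dataset to a temporary .sav file with haven::write_sav; the values and the temporary path are assumptions for illustration only. Setting use.value.labels = FALSE is a choice made here so that 0/1 codes stay numeric rather than being converted to factors.

```r
library(foreign)  # read.spss()

# Stand-in for the confidential file: a tiny made-up dataset in a temp .sav
path <- tempfile(fileext = ".sav")
toy <- data.frame(exclhlth = c(1, 0, 1, 0),
                  male     = c(1, 1, 0, 0),
                  weight   = c(1.2, 0.8, 1.0, 1.1))
haven::write_sav(toy, path)

# to.data.frame = TRUE returns a data frame rather than a list;
# use.value.labels = FALSE keeps coded values numeric
f <- read.spss(path, to.data.frame = TRUE, use.value.labels = FALSE)

# Columns read this way are base R vectors, not haven_labelled
sapply(f, class)
```

With the real data, the path would be "~/data.sav" as in the question, and f could then be passed to svydesign() as before.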

(Also, if you could come up with a less-confidential example and send it to the maintainer, I'm fairly sure he'd be grateful.)

Update: the issue is that the output of na.omit has class omit, and some of the variables have class haven_labelled, and the subsetting operator for haven_labelled is very fussy about the class of its arguments: it has to be plain integer or logical, without a class.
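The first half of that diagnosis can be checked in base R. This is a minimal sketch on a made-up data frame (not the asker's data): the row index that na.omit records carries class omit, and negating it keeps that class, which matches the -nas index visible in the backtrace.

```r
# Made-up toy data with one missing value (an assumption for illustration)
d <- data.frame(y = c(1, 0, NA, 1), x = c(0, 1, 1, 0))

# na.omit() records the dropped rows in the "na.action" attribute
nas <- attr(na.omit(d), "na.action")
class(nas)    # "omit"
class(-nas)   # still "omit": unary minus keeps attributes

# Dropping the class leaves an ordinary integer index
plain <- as.integer(nas)
d[-plain, ]   # the complete rows, selected with a classless index
```

Plain data frame columns accept the classed index without complaint, which is consistent with the error only appearing once the columns are haven_labelled.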

The help for the labelled class suggests using haven::as_factor or haven::zap_labels to coerce these labelled vectors to a standard R class.
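A sketch of that coercion, assuming the haven and survey packages are installed. The data frame below is a made-up stand-in for the SPSS file, with variable names borrowed from the question; with the real data, f would come from read_sav("~/data.sav").

```r
library(haven)   # labelled(), zap_labels()
library(survey)  # svydesign(), svyglm()

# Toy stand-in for the imported file: made-up values, one missing outcome
f <- data.frame(weight = c(1.2, 0.8, 1.0, 1.1, 0.9, 1.3, 0.7, 1.0, 1.1, 0.9))
f$exclhlth <- labelled(c(1, 0, 1, NA, 0, 1, 0, 0, 1, 1), c(no = 0, yes = 1))
f$male     <- labelled(c(1, 1, 0, 0, 1, 0, 1, 0, 0, 1), c(no = 0, yes = 1))

# Strip the haven_labelled class from every column, leaving plain numerics
f_plain <- zap_labels(f)

fsd   <- svydesign(ids = ~1, data = f_plain, weights = ~weight)
model <- svyglm(exclhlth ~ male, design = fsd, family = quasibinomial())
summary(model)
```

haven::as_factor(f) is the alternative when the value labels should be kept as factor levels; zap_labels is used here so the 0/1 codes stay numeric.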

Further update: I filed a GitHub issue for the haven package, which was moved to the vctrs package, so this behaviour is likely to change.



Source: https://stackoverflow.com/questions/62485531/getting-a-subset-error-i-did-not-get-two-months-ago-when-running-logistic-regres
