Extract the labels attribute from “labeled” tibble columns from a haven import from Stata

长发绾君心 2021-02-02 16:24

Hadley Wickham's haven package, applied to a Stata file, returns a tibble with many columns of type "labeled". You can see these with str(), e.g.:



        
3 Answers
  •  面向向阳花
    2021-02-02 16:55

    I'll have a go at answering this one, though my code isn't very pretty.

    First I make a function to extract a named attribute from a single column.

    ColAttr <- function(x, attrC, ifIsNull) {
      # Returns the attribute of x named in attrC if present, else ifIsNull.
      # exact = TRUE stops "label" from partially matching "labels".
      atr <- attr(x, attrC, exact = TRUE)
      if (is.null(atr)) ifIsNull else atr
    }
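
    A quick sketch of why exact = TRUE matters here. The vector x and its values are made up for illustration and are not from the data above. Without exact = TRUE, attr() falls back to a unique partial match, so asking for "label" on a column that only has "labels" would hand back the value labels by mistake:

    ```r
    # Hypothetical column carrying only a "labels" attribute (no "label").
    x <- c(1, 2, 1)
    attr(x, "labels") <- c(Male = 1, Female = 2)

    # Default lookup: no exact match for "label", so the unique partial
    # match "labels" is returned instead.
    partial <- attr(x, "label")

    # Strict lookup: the missing "label" attribute comes back as NULL.
    strict <- attr(x, "label", exact = TRUE)
    ```

    So partial holds the value labels while strict is NULL, which is the case the ifIsNull default is there to handle.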
    

    Then a function to lapply it to all the columns:

    AtribLst <- function(df, attrC, isNullC) {
      # Returns a list with one element per column of df: the value of the
      # attribute attrC if the column has it, else isNullC.
      lapply(df, ColAttr, attrC = attrC, ifIsNull = isNullC)
    }
    

    Finally I run it for each attribute.

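    # Variable labels: one description per column.
    stub93 <- AtribLst(cps_00093.df, attrC = "label", isNullC = NA)

    # Value labels: only columns of type "labeled" have them.
    labels93 <- AtribLst(cps_00093.df, attrC = "labels", isNullC = NA)
    # is.na() on a list is TRUE only for length-one NA elements,
    # so this drops the columns that fell back to the NA default.
    labels93 <- labels93[!is.na(labels93)]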
    

    In this data set every column has a "label" attribute (a description of the variable), but only the columns of type "labeled" also have a "labels" attribute. The "labels" attribute is a named vector: its values match the codes stored in the data, and its names tell you what those codes signify.
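
    To make that concrete, here is a minimal sketch that uses only base R. The data frame df and its labels are invented stand-ins for an imported Stata file, not the real cps_00093.df, and NULL is used as the "missing" marker instead of NA:

    ```r
    # Hypothetical stand-in for a haven import; attributes are set by hand.
    df <- data.frame(sex = c(1, 2, 2, 1), age = c(34, 51, 27, 45))
    attr(df$sex, "label") <- "Sex of respondent"
    attr(df$sex, "labels") <- c(Male = 1, Female = 2)
    attr(df$age, "label") <- "Age in years"

    # Collect "labels" for every column; columns without it give NULL.
    all_labels <- lapply(df, attr, which = "labels", exact = TRUE)
    value_labels <- Filter(Negate(is.null), all_labels)

    # The values of the named vector match the codes in the data,
    # and the names say what each code means.
    sex_labels <- value_labels$sex
    sex_text <- names(sex_labels)[match(df$sex, sex_labels)]
    ```

    Here value_labels keeps only the sex column, and sex_text is the character vector "Male", "Female", "Female", "Male". If you just want factors, haven's as_factor() should do this conversion for you on a real import.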
