Extract elements by name from a nested list

Question

For a named, nested list, what is the best way to extract a specific element? If I have a list with known fields (e.g., from a YAML file), I want to extract an element (a list or otherwise) without searching through the names and indices or trying to keep track of the nesting levels in the str output.

For example, I know that lm returns a nested list which contains qr info.

fit <- lm(mpg ~ wt, mtcars)
fit$qr$qraux
# [1] 1.176777 1.046354

But if I don't know where the element sits in the nesting, I just want to specify the list along with the name of the element. Ideally, the result would give me the path of indices to the element, the path of names to the element, and the element itself.


Answer 1:

My first recursive version turned out to be buggier than I expected, so I took the easy way out and am basically grepping the captured output of utils:::print.ls_str (I think).

This has at least two disadvantages so far: it relies on captured output and on eval(parse(text = ...)). Still, it seems to work correctly for a deeply nested list such as the one returned by ggplot2::ggplotGrob.

These are just some helper functions:

unname2 <- function(l) {
  ## unname all lists
  ## str(unname2(lm(mpg ~ wt, mtcars)))
  l <- unname(l)
  if (inherits(l, 'list'))
    for (ii in seq_along(l))
      l[[ii]] <- Recall(l[[ii]])
  l
}

lnames <- function(l) {
  ## extract all list names
  ## lnames(lm(mpg ~ wt, mtcars))
  nn <- lpath(l, TRUE)
  gsub('\\[.*', '', sapply(strsplit(nn, '\\$'), tail, 1))
}

lpath <- function(l, use.names = TRUE) {
  ## return all list elements with path as character string
  ## l <- lm(mpg ~ wt, mtcars); lpath(l); lpath(l, FALSE)
  ln <- deparse(substitute(l))
  # class(l) <- NULL
  l <- rapply(l, unclass, how = 'list')
  L <- capture.output(if (use.names) l else unname2(l))
  L <- L[grep('^\\$|^[[]{2,}', L)]
  paste0(ln, L)
}

And this one returns the useful information:

lextract <- function(l, what, path.only = FALSE) {
  # stopifnot(what %in% lnames(l))
  ln1 <- eval(substitute(lpath(.l, TRUE), list(.l = substitute(l))))
  ln2 <- eval(substitute(lpath(.l, FALSE), list(.l = substitute(l))))
  cat(ln1[idx <- grep(what, ln1)], sep = '\n')
  cat('\n')
  cat(ln2[idx], sep = '\n')
  cat('\n')
  if (!path.only)
    setNames(lapply(idx, function(x) eval(parse(text = ln1[x]))), ln1[idx])
  else invisible()
}

fit <- lm(mpg ~ wt, mtcars)
lextract(fit, 'qraux')
# fit$qr$qraux
# 
# fit[[7]][[2]]
# 
# [1] 1.176777 1.046354

So I can use the return value directly, or use the index path that is now printed.

fit[[7]][[2]]
# [1] 1.176777 1.046354
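The printed paths can also be used without chaining `[[` calls, because `[[` indexes a list recursively when it is given a vector of length greater than one. A minimal sketch, assuming the same fit as above and that qr is the 7th component of the lm object (as the printed path indicates):

```r
fit <- lm(mpg ~ wt, mtcars)

# `[[` with a vector index descends one level per element of the vector
by_index <- fit[[c(7, 2)]]            # same as fit[[7]][[2]]
by_name  <- fit[[c("qr", "qraux")]]   # same as fit$qr$qraux

# compare both results with the direct extraction
identical(by_index, fit$qr$qraux)
identical(by_name, fit$qr$qraux)
```

Storing a path as a single vector makes it easier to pass around than a string that has to be parsed and evaluated.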


## etc
lextract(fit, 'qr', TRUE)

# fit$qr
# fit$qr$qr
# fit$qr$qraux
# fit$qr$pivot
# fit$qr$tol
# fit$qr$rank
# 
# fit[[7]]
# fit[[7]][[1]]
# fit[[7]][[2]]
# fit[[7]][[3]]
# fit[[7]][[4]]
# fit[[7]][[5]]

I would prefer a built-in or one-liner, however.

Answer 2:

Here is another recursive attempt. I'm not sure exactly how the output should be structured, but I think this gives sufficient information to extract the rest.

The return value is a vector holding the index path followed by the length of the matched element. So, for the fit example, it returns c(inds=7, len=5): the match is at the 7th position in fit, and the element there has length 5.

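rnames <- function(lst, item) {
  ## f walks the list; ll is the current (sub)list and inds is the
  ## index path from lst down to ll
  f <- function(ll, inds) {
    ## match() returns 0 (nomatch = FALSE) if item is not a name at this
    ## level; otherwise record the path and the length of the match.
    ## Return NULL when ll is not a list or has no sub-lists to search;
    ## in all other cases recurse into every element of ll.
    if ((ii <- match(item, names(ll), FALSE)))
      list(inds=c(inds, ii), len=length(ll[[ii]]))
    else if (all(is.atomic(unlist(ll, FALSE))) || !is.list(ll))
      NULL
    else
      lapply(seq_along(ll), function(i) f(ll[[i]], inds=c(inds, i)))
  }
  unlist(f(lst, NULL))
}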

rnames(fit, "qr")
# inds  len 
#    7    5

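The search stops descending as soon as the name is matched at some level, and match() reports only the first match among the names at that level. So if a list has several elements with the same name, not all of them are found. In the slightly more nested example below, only the top-level "d" (index 2) is returned; the "d" inside "a" is never reached.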

lst <- list(
  "a"=list("b"=1, "c"=2, "d"=list(1:5)), 
  "d"=list("f"=5),
  "g"=list("h"=list("i"=1:5), "k"=list(1:3, list(letters[1:4])))
)

rnames(lst, "d")
# inds  len 
#    2    1
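If every match is wanted rather than only the first, one option is to keep walking the whole list and collect each hit. The following is only a sketch: find_named is a hypothetical helper, not part of either answer, and it assumes that descending into anything for which is.list() is TRUE (data frames included) is acceptable. For each match it records the index path, the name path, and the element itself.

```r
lst <- list(
  "a"=list("b"=1, "c"=2, "d"=list(1:5)),
  "d"=list("f"=5),
  "g"=list("h"=list("i"=1:5), "k"=list(1:3, list(letters[1:4])))
)

# Hypothetical helper: collect every element named `item`, at any depth
find_named <- function(lst, item) {
  out <- list()
  walk <- function(x, inds, nms) {
    if (!is.list(x)) return(invisible(NULL))
    nn <- names(x)
    if (is.null(nn)) nn <- rep("", length(x))  # unnamed lists get empty names
    for (i in seq_along(x)) {
      inds_i <- c(inds, i)
      nms_i  <- c(nms, nn[i])
      # record the hit, then keep descending so nested matches are found too
      if (identical(nn[i], item))
        out[[length(out) + 1L]] <<- list(inds = inds_i, names = nms_i,
                                         value = x[[i]])
      walk(x[[i]], inds_i, nms_i)
    }
    invisible(NULL)
  }
  walk(lst, integer(0), character(0))
  out
}

res <- find_named(lst, "d")

# one name path per match, e.g. for printing
paths <- vapply(res, function(m) paste(m$names, collapse = "$"), character(1))

# an index path can be handed straight to `[[`
first_d <- lst[[res[[1]]$inds]]
```

Unlike rnames, this visits every branch, so it is slower on large objects, but it avoids both captured output and eval(parse(...)).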

And when there are multiple layers of nesting:

rnames(lst, "k")
# inds1 inds2   len 
#     3     2     2 

## So, that would correspond to 
lst[[3]][[2]][1:2]
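To go from that return value to the element programmatically, the trailing len entry can be dropped and the remaining indices passed to `[[` as one vector. A small sketch, with the result of rnames(lst, "k") typed in by hand so the block runs on its own:

```r
lst <- list(
  "a"=list("b"=1, "c"=2, "d"=list(1:5)),
  "d"=list("f"=5),
  "g"=list("h"=list("i"=1:5), "k"=list(1:3, list(letters[1:4])))
)

# value printed by rnames(lst, "k") above, entered manually (assumption)
res <- c(inds1 = 3, inds2 = 2, len = 2)

# keep only the index path by dropping the `len` entry
path <- unname(res[names(res) != "len"])

# `[[` with a vector index descends one level per element
k <- lst[[path]]

# the extracted element should have the reported length
length(k) == res[["len"]]
```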

Source: https://stackoverflow.com/questions/36044456/extract-elements-by-name-from-a-nested-list

Tags

list

nested-lists