Question
This may be a very basic beginner's question, but I am having trouble with it.
Suppose I have a single-item list:
v <- as.list("1, 2, 3,")
v
[[1]]
[1] "1, 2, 3,"
Now I want to split its contents into separate items (str_split is from the stringr package):
library(stringr)
v2 <- lapply(str_split(v, pattern = ","), trimws)
v2
[[1]]
[1] "1" "2" "3" ""
Now I want to remove this "" from the first and only item of this list without using []. How can I do that?
Answer 1:
Using nzchar:
lapply(v2, function(x) x[nzchar(x)])
# [[1]]
# [1] "1" "2" "3"
Or use base::strsplit in the first place, which handles this case more gracefully: it does not return a trailing empty string when the input ends with the separator.
lapply(strsplit(v[[1]], ","), trimws)
# [[1]]
# [1] "1" "2" "3"
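As a minimal base-R sketch of that difference: strsplit drops the empty piece after a trailing separator, but it still returns "" for a leading or doubled separator, so the nzchar filter remains the more general fix. The string ",1,, 2" is a made-up example, not from the question.

```r
# strsplit() drops the empty piece after a trailing separator
trailing <- trimws(strsplit("1, 2, 3,", ",")[[1]])

# A leading or doubled separator still produces "" (made-up input)
messy <- trimws(strsplit(",1,, 2", ",")[[1]])

# Filtering with nzchar() removes the empty strings in either case
clean <- messy[nzchar(messy)]
```

Here trailing has three elements, while messy still contains two empty strings until it is filtered.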
Answer 2:
You can use Filter with nchar:
v2 <- lapply(str_split(v, pattern = ",\\s?"), Filter, f = nchar)
which gives
> v2
[[1]]
[1] "1" "2" "3"
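A small sketch of why Filter works with nchar here, using a plain character vector in place of the str_split result (an assumption made to keep the example in base R). Filter coerces the function's return value to logical, so a length of 0 counts as FALSE; nzchar returns TRUE/FALSE directly.

```r
pieces <- c("1", "2", "3", "")

# nchar() returns 0 for "", which Filter() coerces to FALSE
by_nchar <- Filter(nchar, pieces)

# nzchar() returns a logical directly, so no coercion is needed
by_nzchar <- Filter(nzchar, pieces)
```

For ordinary strings like these, both calls keep the same elements.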
Answer 3:
Another option is setdiff to remove the "":
lapply(str_split(v, pattern = ",\\s*"), setdiff, "")
#[[1]]
#[1] "1" "2" "3"
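One caveat worth checking against your data: setdiff treats its arguments as sets, so it also removes repeated values. A short sketch with a made-up vector containing a duplicate:

```r
x <- c("1", "2", "2", "")   # made-up input with a repeated "2"

via_setdiff <- setdiff(x, "")   # drops "" but also the duplicate "2"
via_nzchar  <- x[nzchar(x)]     # drops only ""
```

If duplicates matter, the nzchar version is probably the safer choice.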
Answer 4:
Whether you have a list with just one item or more, I'd probably do something like:
str_split(gsub("^,|,$|\\s+", "", v), ",")
Or better yet:
strsplit(gsub("^,|,$|\\s+", "", v), ",", TRUE)
# [[1]]
# [1] "1" "2" "3"
(Or, maybe even strsplit(gsub("^,|,$|\\s", "", gsub(", ,", ",", v, fixed = TRUE)), ",", TRUE) depending on your actual data.)
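To make the two steps visible, here is a sketch that keeps the intermediate result. The pattern strips a leading comma, a trailing comma, and all whitespace before splitting on a fixed ",". The second list element is a made-up example showing that whitespace inside an item is removed too.

```r
v <- list("1, 2, 3,", ",a b, c")   # second element is hypothetical

# Step 1: remove leading/trailing commas and every run of whitespace
cleaned <- gsub("^,|,$|\\s+", "", v)

# Step 2: split on a literal comma (fixed = TRUE skips the regex engine)
result <- strsplit(cleaned, ",", fixed = TRUE)
```

Note that "a b" becomes "ab", so this approach assumes the items themselves contain no meaningful spaces.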
Here's an example with a list with multiple elements as opposed to a list with just one element.
v <- rep(v, 2500)
I've put the other answers into functions, modifying as appropriate to make them work on multiple list elements. Here are the functions I've tested:
fun_a5a <- function() str_split(gsub("^,|,$|\\s+", "", v), ",")
fun_a5b <- function() strsplit(gsub("^,|,$|\\s+", "", v), ",", TRUE)
fun_ak <- function() lapply(str_split(v, pattern = ",\\s*"), setdiff, "")
fun_des <- function() {
  v2 <- lapply(str_split(v, pattern = ","), trimws)
  lapply(v2, function(x) x[x != ""])
}
fun_hfa <- function() Map(function(x){trimws(unlist(strsplit(x, ",")))}, v)
fun_hfb <- function() sapply(v, strsplit, ",\\s*")
fun_jay <- function() lapply(unlist(lapply(v, strsplit, ","), recursive = FALSE), trimws)
fun_tica <- function() lapply(str_split(v, pattern = ",\\s?"), Filter, f = nchar)
fun_ticb <- function() lapply(str_split(v, pattern = ",\\s?"), Filter, f = nzchar)
Here are the results:
bench::mark(fun_a5a(), fun_a5b(),
fun_ak(),
fun_des(),
fun_hfa(), fun_hfb(),
fun_jay(),
fun_tica(), fun_ticb())
# # A tibble: 9 x 13
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time result
# <bch:expr> <bch:t> <bch:t> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm> <list>
# 1 fun_a5a() 2.47ms 2.63ms 372. 58.7KB 2.04 183 1 491.4ms <list…
# 2 fun_a5b() 1.85ms 1.9ms 517. 58.7KB 2.03 255 1 493.4ms <list…
# 3 fun_ak() 14.17ms 14.85ms 66.8 58.7KB 44.5 15 10 224.5ms <list…
# 4 fun_des() 62.86ms 62.86ms 15.9 78.3KB 111. 1 7 62.9ms <list…
# 5 fun_hfa() 82.17ms 82.17ms 12.2 19.6KB 73.0 1 6 82.2ms <list…
# 6 fun_hfb() 13.36ms 13.59ms 72.9 90.8KB 9.11 32 4 438.9ms <list…
# 7 fun_jay() 71.3ms 71.3ms 14.0 58.7KB 84.2 1 6 71.3ms <list…
# 8 fun_tica() 21.97ms 22.2ms 44.5 58.7KB 66.8 8 12 179.7ms <list…
# 9 fun_ticb() 13.12ms 13.59ms 73.5 58.7KB 44.1 20 12 272.2ms <list…
# # … with 3 more variables: memory <list>, time <list>, gc <list>
ggplot2::autoplot(.Last.value)
Answer 5:
An uglier version (just as an additional idea; the solution from jay.sf seems preferable):
Based on your v2 input:
delete <- which(v2[[1]] == "")
v2[[1]] <- v2[[1]][-delete]
# [[1]]
# [1] "1" "2" "3"
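One thing to watch with the which() plus negative-index pattern: if nothing matches, which() returns integer(0), and indexing with -integer(0) selects nothing at all. A sketch with a made-up vector that contains no "":

```r
x <- c("1", "2", "3")        # hypothetical input with nothing to delete

delete  <- which(x == "")    # integer(0)
emptied <- x[-delete]        # character(0): every element is lost

kept <- x[x != ""]           # logical indexing keeps all three
```

Guarding with if (length(delete)) or using logical indexing avoids the problem.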
Answer 6:
Using Map():
Map(function(x){trimws(unlist(strsplit(x, ",")))}, v)
Using sapply():
sapply(v, strsplit, ",\\s*")
Source: https://stackoverflow.com/questions/65408309/how-to-remove-an-unnamed-element-from-a-single-item-list