Create a Data Frame of Unequal Lengths

后端未结


Data frame columns must all have the same number of rows, but is there any way to create a data frame whose columns have unequal lengths? I'm not interested in saving them as separate elements.

5 Answers

离开以前

2020-11-29 04:00
Another approach to the padding:
```
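# Index past the end of x so that R fills the missing positions with NA.
na.pad <- function(x, len) {
    x[seq_len(len)]
}

# Pad every vector in the list l to the length of the longest one,
# then combine the padded vectors into a data frame.
makePaddedDataFrame <- function(l, ...) {
    maxlen <- max(sapply(l, length))
    data.frame(lapply(l, na.pad, len = maxlen), ...)
}

x <- rep("one", 2)
y <- rep("two", 10)
z <- rep("three", 5)

makePaddedDataFrame(list(x = x, y = y, z = z))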
```
The na.pad() function exploits the fact that R will automatically pad a vector with NAs if you try to index non-existent elements.

makePaddedDataFrame() just finds the longest one and pads the rest up to a matching length.
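
The padding behaviour that na.pad() relies on can be checked in isolation. This is only a minimal sketch: the vector `short` and the target lengths are made up for illustration. It also shows that the same indexing truncates when the requested length is smaller than the vector, which should not happen inside makePaddedDataFrame() because `len` is always the maximum length.

```r
short <- c("one", "one")

# Indexing past the end returns NA for the non-existent elements.
padded <- short[1:5]
print(padded)

# The same indexing truncates if the requested length is shorter.
truncated <- short[1:1]
print(truncated)
```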
感动是毒

2020-11-29 04:14

This is not possible. The closest you can get is filling the "empty" spaces with the value NA.
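
To illustrate the point, here is a small sketch (the vectors `x` and `y` are arbitrary examples, not from the question). It assumes lengths that are not multiples of each other, since data.frame() will otherwise recycle the shorter vector instead of failing.

```r
x <- 1:5
y <- 1:12

# data.frame() refuses columns whose lengths cannot be reconciled;
# capture the error message instead of stopping.
result <- tryCatch(
  data.frame(x = x, y = y),
  error = function(e) conditionMessage(e)
)
print(result)

# Padding the shorter vector with NA first makes the call succeed.
padded <- data.frame(x = c(x, rep(NA, length(y) - length(x))), y = y)
print(dim(padded))
```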


野性不改

2020-11-29 04:17

Similar problem:

```
coin <- c("Head", "Tail")
toss <- sample(coin, 50, replace = TRUE)

# Split the first len tosses into heads and tails, then pad the shorter
# vector with NA so that both columns have the same length.
categorize <- function(x, len) {
  count_heads <- 0
  count_tails <- 0
  tails <- character(0)
  heads <- character(0)
  for (i in seq_len(len)) {
    if (x[i] == "Head") {
      heads <- c(heads, x[i])
      count_heads <- count_heads + 1
    } else {
      tails <- c(tails, x[i])
      count_tails <- count_tails + 1
    }
  }
  # Names chosen so that base::head() and base::tail() are not masked.
  if (count_heads > count_tails) {
    head_col <- heads
    tail_col <- c(tails, rep(NA, count_heads - count_tails))
  } else {
    head_col <- c(heads, rep(NA, count_tails - count_heads))
    tail_col <- tails
  }
  data.frame(cbind("Heads" = head_col, "Tails" = tail_col))
}

categorize(toss, 50)
```

Output: the result depends on the random sample. In one run the toss gave 31 heads and 19 tails, so the Tails column was padded with 12 NA values to match the length of the Heads column and form a data frame.


感动是毒

2020-11-29 04:20

Sorry this isn't exactly what you asked, but I think there may be another way to get what you want.

First, if the vectors are different lengths, the data isn't really tabular, is it? How about just saving them to different CSV files? You might also try text formats that allow storing multiple objects (JSON, XML).
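
As a rough sketch of the separate-files idea, assuming the vectors are held in a named list: the list `vectors`, the column name `value`, and the use of tempdir() as the output location are all placeholders chosen for illustration.

```r
vectors <- list(x = 1:5, y = 1:12)
out_dir <- tempdir()  # placeholder output location

# Write each vector to its own single-column CSV file.
paths <- vapply(names(vectors), function(name) {
  path <- file.path(out_dir, paste0(name, ".csv"))
  write.csv(data.frame(value = vectors[[name]]), path, row.names = FALSE)
  path
}, character(1))

# Reading the files back recovers vectors of the original lengths.
restored <- lapply(paths, function(p) read.csv(p)$value)
print(sapply(restored, length))
```

Each file keeps its own length, so no NA padding is needed.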

If you feel the data really is tabular, you could pad with NAs:

```
> x = 1:5
> y = 1:12
> max.len = max(length(x), length(y))
> x = c(x, rep(NA, max.len - length(x)))
> y = c(y, rep(NA, max.len - length(y)))
> x
 [1]  1  2  3  4  5 NA NA NA NA NA NA NA
> y
 [1]  1  2  3  4  5  6  7  8  9 10 11 12
```

If you absolutely must make a data.frame with unequal columns you could subvert the check, at your own peril:

```
> x = 1:5
> y = 1:12
> df = list(x=x, y=y)
> attributes(df) = list(names = names(df),
    row.names=1:max(length(x), length(y)), class='data.frame')
> df
      x  y
1     1  1
2     2  2
3     3  3
4     4  4
5     5  5
6  <NA>  6
7  <NA>  7
 [ reached getOption("max.print") -- omitted 5 rows ]]
Warning message:
In format.data.frame(x, digits = digits, na.encode = FALSE) :
  corrupt data frame: columns will be truncated or padded with NAs
```
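
A short sketch of why this is risky, reusing the same construction: the object claims one row count while its columns disagree, so code that trusts nrow() may be misled. This is an illustration only, not a recommendation to build data frames this way.

```r
x <- 1:5
y <- 1:12
df <- list(x = x, y = y)
attributes(df) <- list(names = names(df),
                       row.names = 1:max(length(x), length(y)),
                       class = "data.frame")

# The row count is taken from the row.names attribute ...
print(nrow(df))
# ... but the underlying column x still has only five elements.
print(length(df$x))
```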


忘了有多久

2020-11-29 04:21

To amplify @goodside's answer, you can do something like

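```
# Name the list elements so the data frame gets sensible column names.
L <- list(x = x, y = y, z = z)
cfun <- function(L) {
  # Append NAs to x until it has len elements.
  pad.na <- function(x, len) {
    c(x, rep(NA, len - length(x)))
  }
  maxlen <- max(sapply(L, length))
  do.call(data.frame, lapply(L, pad.na, len = maxlen))
}
```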

(untested).
