input <- read.table(header=FALSE, text="abc 2
def 3
pq 2")
colnames(input) <- c("text","count")
I have input
by() should make quick work of this. Process the input row by row, then bind the pieces together afterwards:
output <- do.call(rbind, by(input, rownames(input), function(x) {
data.frame(text=rep(x$text, x$count), x$count)
}))
rownames(output) <- NULL
colnames(output) <- c("text","count")
print(output)
## text count
## 1 abc 2
## 2 abc 2
## 3 def 3
## 4 def 3
## 5 def 3
## 6 pq 2
## 7 pq 2
You could use rep:
with(input, {
  # rep(text, count) works whether text is a character vector or a factor
  data.frame(text = rep(text, count), count = rep(count, count))
})
Or use a helper function. Both return the following, where inp is the input data:
f <- function(j, k) rep(j, k)
data.frame(text = f(inp[,1], inp[,2]), count = f(inp[,2], inp[,2]))
# text count
#1 abc 2
#2 abc 2
#3 def 3
#4 def 3
#5 def 3
#6 pq 2
#7 pq 2
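Both variants rely on rep() accepting a vector for its times argument, in which case each element is repeated its own number of times. A small illustration with literal vectors, using the values from the example input (per_element and whole_vector are illustrative names, not part of the answer above):

```r
# times given as a vector: element i is repeated times[i] times
per_element <- rep(c("abc", "def", "pq"), times = c(2, 3, 2))
print(per_element)
# [1] "abc" "abc" "def" "def" "def" "pq"  "pq"

# times given as a single number: the whole vector is repeated instead
whole_vector <- rep(c("abc", "def", "pq"), times = 2)
print(whole_vector)
# [1] "abc" "def" "pq"  "abc" "def" "pq"
```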
Use row indexing:
input[rep(seq_len(nrow(input)),input$count),]
# or even
input[rep(rownames(input),input$count),]
# text count
#1 abc 2
#1.1 abc 2
#2 def 3
#2.1 def 3
#2.2 def 3
#3 pq 2
#3.1 pq 2
The second option works because you can index by the character vector in the rownames as well as the colnames, e.g.:
rownames(input)
#[1] "1" "2" "3"
input["1",]
# text count
#1 abc 2
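The 1.1, 2.1, ... row names in the output above appear because R makes the duplicated row names unique. If plain sequential row names are preferred, they can be reset after indexing. A minimal sketch, assuming the same example data (idx and expanded are illustrative names, not from the answer):

```r
input <- read.table(header = FALSE, text = "abc 2
def 3
pq 2", col.names = c("text", "count"))

# Row numbers, each repeated as many times as that row's count
idx <- rep(seq_len(nrow(input)), input$count)
print(idx)
# [1] 1 1 2 2 2 3 3

# Indexing with the repeated row numbers duplicates the rows
expanded <- input[idx, ]

# Replace the generated "1.1", "2.1", ... row names with 1..n
rownames(expanded) <- NULL
print(expanded)
```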
Using data.table
library(data.table)
setDT(input)[, .SD[rep(1:.N, count)]]
# text count
#1: abc 2
#2: abc 2
#3: def 3
#4: def 3
#5: def 3
#6: pq 2
#7: pq 2
Or
setDT(input)[input[,rep(1:.N, count)]]
Or
as.data.frame(lapply(input, function(x) rep(x, input$count)))
# text count
# 1 abc 2
# 2 abc 2
# 3 def 3
# 4 def 3
# 5 def 3
# 6 pq 2
# 7 pq 2