Question
I have the following table:
column1 column2
1 aaa^bbb
2 aaa^bbb|ccc^ddd
I would like to have an output file as follows:
column1 column2 column3
1 aaa bbb
2 aaa bbb
3 ccc ddd
Could you let me know if there is a smart way of doing this?
Update:
I am trying to do two things:
For ^, I want to split the content into column2 and column3.
For |, I want to split the content onto the next row while keeping the same value in column1 (column1 should be the same for rows 2 and 3; sorry, I made a mistake above).
To restate, the input is as follows:
column1 column2
x aaa^bbb
y aaa^bbb|ccc^ddd
The output is as follows:
column1 column2 column3
x aaa bbb
y aaa bbb
y ccc ddd
Answer 1:
The easiest way to do what you are after is to use strsplit. For example,
> x = c("aaa^bbb", "aaa^bbb|ccc^ddd")
> ## Split the vector on ^ OR |.
> ## Since ^ and | are special characters
> ## we need to escape them: \\^ and \\|
> ## Split on | to separate the rows.
> new_x = unlist(strsplit(x, "\\|"))
> ## Split on ^ to separate the columns.
> new_x = unlist(strsplit(new_x, "\\^"))
> new_x
[1] "aaa" "bbb" "aaa" "bbb" "ccc" "ddd"
> ## Change the vector back into a matrix
> dim(new_x) = c(2,3)
> ## Transpose to get correct shape
> t(new_x)
     [,1]  [,2]
[1,] "aaa" "bbb"
[2,] "aaa" "bbb"
[3,] "ccc" "ddd"
You could probably combine the splitting steps, but I don't know enough about your data format to be confident that it will always work.
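The matrix above does not carry column1 along, which the update to the question asks for. The following is a minimal sketch of one way to keep it, not a definitive solution. It assumes the table is held in a data frame with columns named column1 and column2, and that every piece between | separators contains exactly one ^. The names df, rows, key, parts and result are illustrative only.

```r
# Input table from the question (assumed to be a data frame).
df <- data.frame(
  column1 = c("x", "y"),
  column2 = c("aaa^bbb", "aaa^bbb|ccc^ddd"),
  stringsAsFactors = FALSE
)

# Split each entry on a literal "|" to get one piece per output row.
# fixed = TRUE treats the separator as plain text, so no escaping is needed.
rows <- strsplit(df$column2, "|", fixed = TRUE)

# Repeat each column1 value once per piece so it stays with its rows.
key <- rep(df$column1, times = lengths(rows))

# Split each piece on a literal "^" and stack the halves into a two-column
# matrix. This step assumes exactly one "^" per piece.
parts <- do.call(rbind, strsplit(unlist(rows), "^", fixed = TRUE))

result <- data.frame(
  column1 = key,
  column2 = parts[, 1],
  column3 = parts[, 2],
  stringsAsFactors = FALSE
)
print(result)
```

For the sample input, result has three rows: x aaa bbb, y aaa bbb and y ccc ddd. Writing it out with write.table(result, "out.txt", quote = FALSE, row.names = FALSE) would give a plain-text file; the file name here is only a placeholder.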
Source: https://stackoverflow.com/questions/5564292/replacing-and-sybmbols-in-a-matrix