Create a hash value for each row of data in a data frame in R

终归单人心

I am exploring how to compare two data frames in R more efficiently, and I came up with the idea of hashing.

My plan is to create a hash for each row of data in the two data frames with same

2 Answers

无人及你

2021-01-19 03:37
If I understand what you want correctly, `digest` will work directly with `apply`:
```
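library(digest)  # provides digest(); the default algorithm is MD5
# apply() hands each row (coerced to character) to digest(); uid is the row number
ssi.10q3.v1.hash <- data.frame(uid = seq_len(nrow(ssi.10q3.v1)), hash = apply(ssi.10q3.v1, 1, digest))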
```
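To show how row hashes could then be used to compare two data frames, here is a minimal sketch. The data frames `df1` and `df2` and the helper `row_hash` are hypothetical names, not from the question. One caveat, as far as I can tell: `apply()` first converts a data frame to a character matrix, which may pad numbers to a common width per column, so the same row can hash differently in two data frames. The sketch sidesteps that by pasting each row into a single string before hashing, which is a swapped-in variant rather than the exact call above.

```r
library(digest)

# Hypothetical example data (not from the original post).
df1 <- data.frame(id = c(1, 2, 3),   name = c("a", "b", "c"), stringsAsFactors = FALSE)
df2 <- data.frame(id = c(1, 2, 300), name = c("a", "x", "c"), stringsAsFactors = FALSE)

# Hypothetical helper: build one string per row, then hash each string.
# paste() converts every column with as.character(), so values are not
# padded to a common width. "\x1f" (unit separator) is an assumed delimiter
# that is unlikely to occur in the data itself.
row_hash <- function(df) {
  keys <- do.call(paste, c(unname(as.list(df)), sep = "\x1f"))
  vapply(keys, digest, character(1), USE.NAMES = FALSE)
}

h1 <- row_hash(df1)
h2 <- row_hash(df2)

# Row positions whose hashes differ between the two data frames.
changed <- which(h1 != h2)
print(changed)
```

This compares rows by position. If the rows may be in a different order, something like `setdiff(h1, h2)` or `h1 %in% h2` would compare them as sets instead.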
感情败类

2021-01-19 03:53
I know this answer doesn't match the title of the question, but if you just want to see when rows are different you can do it directly:
```
rowSums(df2 == df1) == ncol(df1)
```
Assuming both data.frames have the same dimensions, that will evaluate to FALSE for every row that is not identical. If you need to test rownames as well, that could be managed separately and combined with the test of contents, and similarly for colnames (and attributes, and strict tests on column types).
```
rowSums(df2 == df1) == ncol(df1) & rownames(df2) == rownames(df1)
```
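One thing to watch with the comparison above: if a row contains `NA`, `==` returns `NA` for that cell and `rowSums()` then gives `NA` for the whole row instead of TRUE or FALSE. A minimal sketch of one way to handle that, assuming you want two `NA`s in the same cell to count as equal (the helper name `rows_identical` and the sample data are hypothetical):

```r
# Hypothetical helper (not from the original answer): row-wise equality
# that treats NA vs. NA as equal and NA vs. a value as different.
rows_identical <- function(df1, df2) {
  stopifnot(identical(dim(df1), dim(df2)))  # same shape is assumed
  eq <- df1 == df2                          # logical matrix, NA where either side is NA
  both_na <- is.na(df1) & is.na(df2)        # cells that are NA in both data frames
  eq[is.na(eq)] <- FALSE                    # an NA comparison counts as "not equal"...
  rowSums(eq | both_na) == ncol(df1)        # ...unless both cells were NA
}

# Hypothetical example data.
a <- data.frame(x = c(1, 2, NA), y = c("a", "b", "c"), stringsAsFactors = FALSE)
b <- data.frame(x = c(1, 5, NA), y = c("a", "b", "c"), stringsAsFactors = FALSE)

res <- rows_identical(a, b)
print(res)
```

Like the original one-liner, this only compares cell contents by position; rownames, colnames and column types would still need their own checks.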