contingency

How to sum counts across tables that may contain partially different categories in R?

北战南征 posted on 2020-01-05 05:25:28
Question: How do I merge (add) contingency tables?

    > (t1 <- table(c("a","b","b","c")))
    a b c
    1 2 1
    > (t2 <- table(c("c","d","d","a")))
    a c d
    1 1 2

I want this:

    a b c d
    2 2 2 2

Answer 1: You can do it using split and sapply:

    > T <- c(t1, t2)
    > sapply(split(T, names(T)), sum)
    a b c d
    2 2 2 2

Or directly using tapply, as pointed out by @Arun:

    > tapply(T, names(T), sum)
    a b c d
    2 2 2 2

Answer 2: Here is what I was able to come up with:

    > (t1 <- table(c("a","b","b","c")))
    a b c
    1 2 1
    > (t2 <- table(c("c","d","d","a")))
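The tapply approach from Answer 1 can be run as a standalone script. This is a minimal sketch; the variable names `counts` and `merged` are chosen here for illustration (the answer itself uses `T`, which masks R's shorthand for `TRUE`).

```r
# Two one-way tables with partially different categories
t1 <- table(c("a", "b", "b", "c"))
t2 <- table(c("c", "d", "d", "a"))

# c() flattens both tables into one named vector of counts
counts <- c(t1, t2)

# Sum the counts that share a category name
merged <- tapply(counts, names(counts), sum)
print(merged)
```

Categories that appear in only one table keep their single count, so no level has to be declared in advance.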

How do I get a contingency table?

≯℡__Kan透↙ posted on 2019-12-16 19:03:26
Question: I am trying to create a contingency table from a particular type of data. This would be doable with loops, but because my final table would contain more than 10E5 cells, I am looking for a pre-existing function. My initial data are as follows:

    PLANT                  ANIMAL                         INTERACTIONS
    ---------------------- ------------------------------ ------------
    Tragopogon_pratensis   Propylea_quatuordecimpunctata  1
    Anthriscus_sylvestris  Rhagonycha_nigriventris        3
    Anthriscus_sylvestris  Sarcophaga_carnaria            2
    Heracleum
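One pre-existing function that fits data already stored as (row, column, count) triples is `xtabs` from the stats package. The sketch below uses only the three complete rows visible in the excerpt; the data frame name `obs` is an assumption made for illustration.

```r
# The three complete rows shown in the question (the fourth is cut off)
obs <- data.frame(
  PLANT = c("Tragopogon_pratensis", "Anthriscus_sylvestris",
            "Anthriscus_sylvestris"),
  ANIMAL = c("Propylea_quatuordecimpunctata", "Rhagonycha_nigriventris",
             "Sarcophaga_carnaria"),
  INTERACTIONS = c(1, 3, 2)
)

# The left-hand side of the formula supplies the counts to sum;
# the right-hand side supplies the row and column factors
tab <- xtabs(INTERACTIONS ~ PLANT + ANIMAL, data = obs)
print(tab)
```

Plant and animal pairs that never occur together are filled with zeros. For very large, mostly empty tables, `xtabs` also accepts `sparse = TRUE`, which returns a sparse matrix and requires the Matrix package.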

Find frequencies over 3rd quartile in table

帅比萌擦擦* posted on 2019-12-12 12:33:44
Question: I have a big data frame (239k+ observations of 57 variables) with sickness descriptions and the medicines administered for those sicknesses to people in different age ranges. I'd like to find the medicines in the top quartile of frequency of use for each sickness description. To make a reproducible example, I created a 1000-observation data frame:

    set.seed(1)
    sk <- as.factor(sample(c("sick A","sick B","sick C","sick D"), 1000, replace=T))
    md <- as.factor(sample(c("med 1","med 2","med 3","med 4",
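A possible approach is to cross-tabulate sickness against medicine and then, within each sickness, keep the medicines whose count exceeds that row's 75th percentile. This is only a sketch: the excerpt cuts off before the full list of medicines, so the eight levels `med 1` to `med 8` below are hypothetical, as are the names `tab` and `top`.

```r
set.seed(1)
sk <- factor(sample(c("sick A", "sick B", "sick C", "sick D"),
                    1000, replace = TRUE))
# Hypothetical medicine levels; the original list is truncated
md <- factor(sample(paste("med", 1:8), 1000, replace = TRUE))

# Rows are sicknesses, columns are medicines
tab <- table(sk, md)

# For each sickness, keep medicines used more often than the 3rd quartile
top <- sapply(rownames(tab), function(s) {
  cnt <- tab[s, ]
  names(cnt)[cnt > quantile(cnt, 0.75)]
}, simplify = FALSE)

print(top)
```

The result is a named list with one character vector per sickness. Using a strict `>` means medicines tied exactly at the quartile are excluded; switch to `>=` if ties should count.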

How can you force inclusion of a level in a table in R?

社会主义新天地 posted on 2019-12-11 11:12:03
Question: Is there a way to force R's table function to include rows or columns even when they never occur in the data? For example,

    data.1 <- c(1, 2, 1, 2, 1, 2, 4)
    data.2 <- c(1, 4, 3, 3, 3, 1, 1)
    table(data.1, data.2)

returns

          data.2
    data.1 1 3 4
         1 1 2 0
         2 1 1 1
         4 1 0 0

where there's a missing 3 in the rows and a missing 2 in the columns, because they don't appear in the data. Is there a simple way to force additional rows and columns of zeros to be inserted in the correct place, and instead return
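A common way to get the missing rows and columns is to convert each vector to a factor with explicit levels before tabulating, since `table` reports every factor level, including unused ones. The sketch below assumes the desired levels are 1 to 4 for both variables.

```r
data.1 <- c(1, 2, 1, 2, 1, 2, 4)
data.2 <- c(1, 4, 3, 3, 3, 1, 1)

# Assumed full set of levels for both variables
lv <- 1:4

# Naming the arguments keeps the original dimension labels
tab <- table(data.1 = factor(data.1, levels = lv),
             data.2 = factor(data.2, levels = lv))
print(tab)
```

The row for 3 and the column for 2 now appear, filled with zeros, in level order.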

R chi squared test (3x2 contingency table) for each row in a table

心不动则不痛 posted on 2019-12-11 10:27:12
Question: I have a data frame and want to perform a chi-squared test for each row, treating the row as a 3x2 contingency table.

    row 1  102 4998 105 3264 105 3636
    row 2  210 4890  22 3347  20 3721
    row 3  ...

So for the first row, the chi-squared test should be performed on the following contingency table:

    group A  102 4998
    group B  105 3264
    group C  105 3636

I use the following code, but it does not calculate the correct p-value (all p-values are equal to zero, which is not the case when I calculate the chi-squared test myself)
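The asker's code is not shown in the excerpt, so the cause of the zero p-values cannot be identified here. As a sketch of one way to run the test per row, each row can be reshaped into a 3x2 matrix, filled by row so that consecutive pairs of values become groups A, B and C. The column names `a1` to `c2` are hypothetical.

```r
# The two complete rows from the question; column names are hypothetical
counts <- data.frame(
  a1 = c(102, 210), a2 = c(4998, 4890),
  b1 = c(105, 22),  b2 = c(3264, 3347),
  c1 = c(105, 20),  c2 = c(3636, 3721)
)

# Reshape each row into a 3x2 table (byrow = TRUE keeps each pair together)
# and extract the p-value of the chi-squared test
pvals <- apply(counts, 1, function(r) {
  m <- matrix(r, nrow = 3, byrow = TRUE,
              dimnames = list(c("group A", "group B", "group C"), NULL))
  chisq.test(m)$p.value
})

print(pvals)
```

Filling the matrix by row matters: with the default column-wise fill, the six values would be paired differently and the test would describe another table.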

How to create a binary relation matrix of pair occurrences from a list of strings?

梦想的初衷 posted on 2019-12-11 06:04:54
Question: I have a list of files that contain specific genes, and I want to create a binary relation matrix in R that shows the presence of each gene in each file. For example, here are my files aaa, bbb, ccc, and ddd and the genes associated with them:

    aaa=c("HERC1")
    bbb=c("MYO9A", "PKHD1L1", "PQLC2", "SLC7A2")
    ccc=c("HERC1")
    ddd=c("MACC1","PKHD1L1")

I need to generate another table where, for each pair of genes, I assign the value 1 if both of them are present in the specific file, and 0 otherwise.
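The question can be read in more than one way. The sketch below takes one reading: first build a gene-by-file presence matrix, then derive a gene-by-gene matrix that is 1 when the two genes share at least one file. The names `files`, `membership` and `pairs` are chosen here for illustration.

```r
files <- list(
  aaa = c("HERC1"),
  bbb = c("MYO9A", "PKHD1L1", "PQLC2", "SLC7A2"),
  ccc = c("HERC1"),
  ddd = c("MACC1", "PKHD1L1")
)

# Gene-by-file presence matrix: 1 if the gene is listed in the file
membership <- unclass(table(
  gene = unlist(files, use.names = FALSE),
  file = rep(names(files), lengths(files))
))

# Gene-by-gene matrix: the product counts shared files,
# and the comparison reduces that count to 0/1
pairs <- (tcrossprod(membership) > 0) * 1L

print(membership)
print(pairs)
```

The diagonal of `pairs` is always 1 because every gene co-occurs with itself. Dropping the `> 0` comparison gives the number of shared files instead of a binary flag.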

How to convert data frame to contingency table in R?

孤街浪徒 posted on 2019-12-07 23:41:54
Question: I have a simple question. How do I convert a data frame into a contingency table for Fisher's Exact Test? My data has about 19000 rows:

    head(data)
            R_T1 R_T2 NR_T1 NR_T2
    GMNN      14   60    70   157
    GORASP2    7   67    39   188
    TTC34      5   69    41   186
    ZXDC       8   66    37   190
    ASAH2      9   65    46   181

I would like to transform each row into a contingency table to perform Fisher's Exact Test. For example, for GMNN:

         R  NR
    T1  14  70
    T2  60 157

    fisher.test(GMNN, alternative="two.sided") Fisher's Exact Test for Count Data data: GMNN p
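Because the columns are ordered R_T1, R_T2, NR_T1, NR_T2, filling a 2x2 matrix column by column reproduces the layout shown for GMNN. The sketch below applies that to every row and collects the p-values; the object name `dat` is an assumption, used in place of the question's `data`.

```r
# The five rows shown in the question
dat <- data.frame(
  R_T1  = c(14, 7, 5, 8, 9),
  R_T2  = c(60, 67, 69, 66, 65),
  NR_T1 = c(70, 39, 41, 37, 46),
  NR_T2 = c(157, 188, 186, 190, 181),
  row.names = c("GMNN", "GORASP2", "TTC34", "ZXDC", "ASAH2")
)

# Build the 2x2 table for one row: rows T1/T2, columns R/NR
to_table <- function(r) {
  matrix(r, nrow = 2,
         dimnames = list(c("T1", "T2"), c("R", "NR")))
}

# Run Fisher's Exact Test on every row and keep the p-value
pvals <- apply(as.matrix(dat), 1, function(r) {
  fisher.test(to_table(r), alternative = "two.sided")$p.value
})

print(to_table(unlist(dat["GMNN", ])))
print(pvals)
```

With about 19000 rows, the resulting p-values would normally be adjusted for multiple testing, for example with `p.adjust`.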

Creating a contingency table using multiple columns in a data frame in R

♀尐吖头ヾ posted on 2019-12-07 03:13:19
Question: I have a data frame which looks like this:

    structure(list(ab = c(0, 1, 1, 1, 1, 0, 0, 0, 1, 1),
                   bc = c(1, 1, 1, 1, 0, 0, 0, 1, 0, 1),
                   de = c(0, 0, 1, 1, 1, 0, 1, 1, 0, 1),
                   cl = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 2)),
              .Names = c("ab", "bc", "de", "cl"),
              row.names = c(NA, -10L), class = "data.frame")

The column cl indicates cluster membership, and the variables ab, bc and de carry binary answers, where 1 indicates yes and 0 indicates no. I am trying to create a table cross-tabulating cluster with all the
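The question is cut off before the desired output is shown, so the following is only one plausible reading: a table with one row per cluster and one column per binary variable, holding the number of "yes" answers. The names `df` and `yes_counts` are chosen here for illustration.

```r
df <- data.frame(
  ab = c(0, 1, 1, 1, 1, 0, 0, 0, 1, 1),
  bc = c(1, 1, 1, 1, 0, 0, 0, 1, 0, 1),
  de = c(0, 0, 1, 1, 1, 0, 1, 1, 0, 1),
  cl = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 2)
)

# For each binary column, sum the 1s within each cluster
yes_counts <- sapply(df[c("ab", "bc", "de")],
                     function(v) tapply(v, df$cl, sum))

print(yes_counts)
```

If the full yes/no breakdown is needed instead, `lapply(df[c("ab", "bc", "de")], function(v) table(df$cl, v))` returns a separate two-way table per variable.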