data-manipulation

How to remove columns and rows that sum to 0 while preserving non-numeric columns

你离开我真会死。 posted on 2020-04-10 04:58:30
Question: Below is a subset of my data. I am trying to remove columns AND rows that sum to 0 ... the catch is that I want to preserve columns 1 to 8 in the resulting output. Any ideas? I've tried quite a few. A tidy solution would be best.

    Site  Date    Mon Day Yr   Szn    SznYr       A B C D E F G
    B0001 7/29/97 7   29  1997 Summer 1997-Summer 0 0 0 0 0 0 0
    B0001 7/29/97 7   29  1997 Summer 1997-Summer 0 0 1 0 0 0 0
    B0001 7/29/97 7   29  1997 Summer 1997-Summer 0 0 0 3 0 0 0
    B0001 7/29/97 7   29  1997 Summer 1997-Summer 0 0 0 0
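One way to approach this, sketched in pandas rather than the R/tidyverse the poster asked for: list the identifier columns explicitly (Mon, Day and Yr are numeric, so selecting "non-numeric" columns alone would not protect them), then filter rows and columns on the sums of the remaining count columns. The toy frame below uses only the three complete rows from the excerpt, and it assumes the counts are non-negative, so that dropping zero-sum rows cannot change which columns sum to zero.

```python
import pandas as pd

# Toy data: the three complete rows shown in the question.
id_cols = ["Site", "Date", "Mon", "Day", "Yr", "Szn", "SznYr"]
count_cols = list("ABCDEFG")
ids = ["B0001", "7/29/97", 7, 29, 1997, "Summer", "1997-Summer"]
rows = [
    ids + [0, 0, 0, 0, 0, 0, 0],
    ids + [0, 0, 1, 0, 0, 0, 0],
    ids + [0, 0, 0, 3, 0, 0, 0],
]
df = pd.DataFrame(rows, columns=id_cols + count_cols)

counts = df[count_cols]

# Rows whose counts do not sum to zero.
keep_rows = counts.sum(axis=1) != 0

# Count columns whose values do not sum to zero.
col_sums = counts.sum(axis=0)
keep_cols = col_sums[col_sums != 0].index.tolist()

# Identifier columns are always kept; only count columns are filtered.
result = df.loc[keep_rows, id_cols + keep_cols]
print(result)
```

Here the all-zero first row is dropped and only count columns C and D survive, while every identifier column is retained.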

Transpose data by groups in R

谁说我不能喝 posted on 2020-03-21 11:45:08
Question: I have data in the following structure:

    x <- read.table(header=T, text="
    X Y D S
    a e 1 10
    a e 2 20
    a f 1 50
    b c 1 40
    b c 2 30
    b c 3 60
    b d 1 10
    b d 2 20")

And I want to get the following result:

    X Y 1  2  3
    a e 10 20
    a f 50
    b c 40 30 60
    b d 10 20

For every combination of columns X and Y, I would like to transpose the data in column S, ordered by column D. I thought xtabs() would work, but it does not seem to. My best attempt is:

    xtabs(formula=S~Y+D,data=x)

With result:

       D
    Y    1  2  3
      c 40 30 60
      d 10 20  0
      e
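The reshaping asked for here is a long-to-wide pivot keyed on the (X, Y) pair. As a hedged illustration in pandas (not the R answer the poster was looking for), the same data can be indexed by X, Y and D and then unstacked on D. This assumes each (X, Y, D) combination occurs at most once, which holds for the sample data.

```python
import pandas as pd

# The sample data from the question, rebuilt as a pandas DataFrame.
x = pd.DataFrame({
    "X": list("aaabbbbb"),
    "Y": list("eefcccdd"),
    "D": [1, 2, 1, 1, 2, 3, 1, 2],
    "S": [10, 20, 50, 40, 30, 60, 10, 20],
})

# Index by the grouping keys plus D, then spread D across the columns.
# Missing (X, Y, D) combinations become NaN instead of the 0 that xtabs() fills in.
wide = x.set_index(["X", "Y", "D"])["S"].unstack("D")
print(wide)
```

Unlike the xtabs() attempt, this keeps both X and Y as keys and leaves absent combinations empty rather than zero.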

Validation Check using controls with known genotype

本秂侑毒 posted on 2020-02-16 10:41:11
Question: Please help! I have the following dataframe (named Final_APOL1). I have code that runs through Bio-Rad PCR output files to generate a final APOL1 genotype. Controls with known genotype are placed in wells E01, E02, F01, F02, G01, G02, H01, and H02. I have a separate dataframe (named Validation_controls) containing the known/correct genotypes that should be found in these wells. I need code to validate and confirm that wells in both dataframes match and a way for this to be noted for the
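A check like this is usually a join on the well identifier followed by a comparison. The sketch below is in pandas, and everything concrete in it is an assumption: the post does not show either dataframe, so the column names `Well` and `Genotype`, the genotype labels, and the `Validation` status column are all hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins for Final_APOL1 and Validation_controls.
# Column names and genotype labels are made up for illustration.
final_apol1 = pd.DataFrame({
    "Well": ["E01", "E02", "F01", "A05"],
    "Genotype": ["G0/G0", "G1/G2", "G0/G1", "G0/G0"],
})
validation_controls = pd.DataFrame({
    "Well": ["E01", "E02", "F01"],
    "Genotype": ["G0/G0", "G1/G1", "G0/G1"],
})

# Left join keeps every sample well; control wells gain an expected genotype.
checked = final_apol1.merge(
    validation_controls, on="Well", how="left", suffixes=("", "_expected")
)

is_control = checked["Genotype_expected"].notna()
matches = checked["Genotype"] == checked["Genotype_expected"]

# Record the outcome per well in a new (hypothetical) status column.
checked["Validation"] = "not a control"
checked.loc[is_control & matches, "Validation"] = "PASS"
checked.loc[is_control & ~matches, "Validation"] = "FAIL"
print(checked)
```

A run-level verdict can then be derived from the status column, for example by checking that no control well is marked FAIL.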

Validation Check using controls with known genotype

跟風遠走 posted on 2020-02-16 10:40:11
Question: Please help! I have the following dataframe (named Final_APOL1). I have code that runs through Bio-Rad PCR output files to generate a final APOL1 genotype. Controls with known genotype are placed in wells E01, E02, F01, F02, G01, G02, H01, and H02. I have a separate dataframe (named Validation_controls) containing the known/correct genotypes that should be found in these wells. I need code to validate and confirm that wells in both dataframes match and a way for this to be noted for the

pandas drop consecutive duplicates selectively

六眼飞鱼酱① posted on 2020-02-14 10:47:51
Question: I have been looking at all the questions and answers about how to drop consecutive duplicates selectively in a pandas dataframe, but I still cannot figure out the following scenario:

    import pandas as pd
    import numpy as np

    def random_dates(start, end, n, freq, seed=None):
        if seed is not None:
            np.random.seed(seed)
        dr = pd.date_range(start, end, freq=freq)
        return pd.to_datetime(np.sort(np.random.choice(dr, n, replace=False)))

    date = random_dates('2018-01-01', '2018-01-12', 20, 'H', seed=[3, 1415])
    data = {

pandas drop consecutive duplicates selectively

风流意气都作罢 posted on 2020-02-14 10:46:53
Question: I have been looking at all the questions and answers about how to drop consecutive duplicates selectively in a pandas dataframe, but I still cannot figure out the following scenario:

    import pandas as pd
    import numpy as np

    def random_dates(start, end, n, freq, seed=None):
        if seed is not None:
            np.random.seed(seed)
        dr = pd.date_range(start, end, freq=freq)
        return pd.to_datetime(np.sort(np.random.choice(dr, n, replace=False)))

    date = random_dates('2018-01-01', '2018-01-12', 20, 'H', seed=[3, 1415])
    data = {

Generating summary table at bottom of dataframe

自闭症网瘾萝莉.ら posted on 2020-02-06 08:01:09
Question: Please help! I have the following dataframe (named Final_APOL1). I need to generate a summary table like the second dataframe shown. Once generated, is it possible to save this as a separate output csv in the same directory? The summary table runs through the risk allele count variables and places them into categories so population frequencies can be calculated for each mutation. Code for risk allele numbers 1, 2 or no "no APOL1 Risk Alleles" = ifelse(`Final genotype of

How to create multiple flag columns based on list values found in the dataframe column?

余生颓废 posted on 2020-02-04 05:33:21
Question: The table looks like this:

    ID |CITY
    ----------------------------------
    1  |London|Paris|Tokyo
    2  |Tokyo|Barcelona|Mumbai|London
    3  |Vienna|Paris|Seattle

The city column contains around 1000+ values, which are | delimited. I want to create a flag column to indicate if a person visited only the city of interest.

    city_of_interest=['Paris','Seattle','Tokyo']

There are 20 such values in the list. Output should look like this:

    ID |Paris | Seattle | Tokyo
    -------------------------------------------
    1  |1
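A minimal pandas sketch of one reading of this request: one 0/1 column per city of interest, set to 1 when that city appears in the row's pipe-delimited list. The expected output in the post is cut off after its first value, so this interpretation is an assumption, as is the variable name `flags`.

```python
import pandas as pd

# The three sample rows from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "CITY": [
        "London|Paris|Tokyo",
        "Tokyo|Barcelona|Mumbai|London",
        "Vienna|Paris|Seattle",
    ],
})
city_of_interest = ["Paris", "Seattle", "Tokyo"]

# Split each delimited string once into a set for fast membership tests.
visited = df["CITY"].apply(lambda s: set(s.split("|")))

# One flag column per city of interest: 1 if visited, else 0.
for city in city_of_interest:
    df[city] = visited.apply(lambda cities, c=city: int(c in cities))

flags = df[["ID"] + city_of_interest]
print(flags)
```

Only the cities in `city_of_interest` get a column, so the 1000+ other city values never widen the result.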

Tcl/Tk write in a specific line

孤街浪徒 posted on 2020-02-02 10:07:09
Question: I want to write to a specific line in a text document, but there's a problem with my code and I don't know where the bug is.

    set fp [open C:/Users/user/Desktop/tst/settings.txt w]
    set count 0
    while {[gets $fp line]!=-1} {
        incr count
        if {$count==28} {
            break
        }
    }
    puts $fp "TEST"
    close $fp

The file only contains TEST. Does anybody have an idea?

Answer 1: You are using 'w' as the access argument, which truncates the file, so you lose all data in the file as soon as it is opened. Read more about the open command. You can use 'r+'
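The truncation issue is not specific to Tcl: Python's `open(path, "w")` empties the file in the same way. A common way around it, sketched here in Python rather than Tcl, is to read all lines first, change the one you want in memory, and only then write the file back. The helper name `replace_line` is hypothetical, and the sketch assumes line 28 should be replaced rather than a new line inserted, which the post does not make clear.

```python
from pathlib import Path
import tempfile


def replace_line(path, line_number, new_text):
    """Replace the 1-based line `line_number` of the file at `path` with `new_text`."""
    # Read everything first; opening for writing would truncate the file.
    lines = Path(path).read_text().splitlines()
    if not 1 <= line_number <= len(lines):
        raise IndexError(
            f"file has {len(lines)} lines, cannot replace line {line_number}"
        )
    lines[line_number - 1] = new_text
    # Only now overwrite the file, with the full modified content.
    Path(path).write_text("\n".join(lines) + "\n")


# Demo on a throwaway 30-line file instead of the poster's settings.txt.
with tempfile.TemporaryDirectory() as tmp:
    settings = Path(tmp) / "settings.txt"
    settings.write_text("".join(f"line {i}\n" for i in range(1, 31)))
    replace_line(settings, 28, "TEST")
    demo_lines = settings.read_text().splitlines()

print(demo_lines[26:29])
```

Rewriting the whole file is the simple choice for a small settings file; overwriting in place with a read/write mode only works cleanly when the new line has exactly the same length as the old one.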

Using R to insert a value for missing data with a value from another data frame

南楼画角 posted on 2020-01-29 05:31:04
Question: All, I have a question that I fear might be too pedestrian to ask here, but searching for it elsewhere is leading me astray. I may not be using the right search terms. I have a panel data frame (country-year) in R with some missing values on a given variable. I'm trying to impute them with the value from another vector in another data frame. Here's an illustration of what I am trying to do. Assume Data is the data frame of interest, which has missing values on a given vector that I'm trying
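The general pattern for this kind of fill is to align the two data frames on the panel keys and take the second source only where the first is missing. The sketch below shows it in pandas rather than R, and all names in it are assumptions: the post is cut off before its illustration, so the frames `data` and `other` and the columns `country`, `year`, `gdp` and `gdp_alt` are hypothetical.

```python
import pandas as pd

nan = float("nan")

# Hypothetical country-year panel with gaps in one variable.
data = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": [2000, 2001, 2000, 2001],
    "gdp": [1.0, nan, nan, 4.0],
})

# Hypothetical second source holding an alternative measure of the same variable.
other = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": [2000, 2001, 2000, 2001],
    "gdp_alt": [9.0, 2.0, 3.0, 9.0],
})

# Align on the panel keys, then use the alternative only where gdp is missing.
merged = data.merge(other, on=["country", "year"], how="left")
merged["gdp"] = merged["gdp"].fillna(merged["gdp_alt"])
filled = merged.drop(columns="gdp_alt")
print(filled)
```

Observed values are left untouched; only the two missing cells are taken from the second frame.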