dataframe

Extract value of a particular column name in pandas as listed in another column

人盡茶涼 submitted on 2021-02-16 15:08:57
Question: The title wasn't too clear, but here's an example. Suppose I have:

    person  apple  orange  type
    Alice      11      23  apple
    Bob        14      20  orange

and I want to get this column:

    person  new_col
    Alice        11
    Bob          20

so we get the column 'apple' for row 'Alice' and 'orange' for row 'Bob'. I'm thinking iterrows, but that would be slow. Are there faster ways to do this?

Answer 1: Use DataFrame.lookup:

    df['new_col'] = df.lookup(df.index, df['type'])
    print (df)
      person  apple  orange    type  new_col
    0  Alice     11      23   apple       11
    1    Bob     14      20
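DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the answer above only runs on older versions. A minimal sketch of one possible replacement follows, using NumPy integer indexing. The frame is rebuilt from the question's example, and the list `value_cols` is an assumption about which columns the `type` column may name.

```python
import numpy as np
import pandas as pd

# Example frame rebuilt from the question.
df = pd.DataFrame({
    "person": ["Alice", "Bob"],
    "apple": [11, 14],
    "orange": [23, 20],
    "type": ["apple", "orange"],
})

# Assumption: these are the only columns that "type" can refer to.
value_cols = ["apple", "orange"]

# Position of each row's target column within value_cols.
col_pos = pd.Index(value_cols).get_indexer(df["type"])

# Pick one value per row: row i, column col_pos[i].
values = df[value_cols].to_numpy()
df["new_col"] = values[np.arange(len(df)), col_pos]

print(df[["person", "new_col"]])
```

If `type` could contain a name that is not in `value_cols`, `get_indexer` returns -1 for that row, so a real script would probably want to check for that before indexing.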

How to reorder a data frame based on mean of column groups

末鹿安然 submitted on 2021-02-16 15:08:42
Question: I am trying to reorder a data frame based on the median value associated with a column ID. I have a dataframe with a column of IDs and 2 columns of values.

    ID <- c("a","a","a","b","b","b","c","c","c","c")
    alpha <- c(3,4,5,9,11,13,1,1,1,0)
    beta <- c(2,3,4,3,4,5,4,5,6,7)
    df <- data.frame(ID,alpha,beta)

       ID alpha beta
    1   a     3    2
    2   a     4    3
    3   a     5    4
    4   b     9    3
    5   b    11    4
    6   b    13    5
    7   c     1    4
    8   c     1    5
    9   c     1    6
    10  c     0    7

I want to reorder this dataframe so that the column ID is in an order based on the descending means
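The question is written in R and is cut off before it says which column's mean should drive the ordering. As a rough pandas analogue, and assuming (this is a guess, not stated in the excerpt) that groups are ordered by the descending mean of `alpha`, the idea can be sketched as:

```python
import pandas as pd

# Same data as the R example in the question.
df = pd.DataFrame({
    "ID": list("aaabbbcccc"),
    "alpha": [3, 4, 5, 9, 11, 13, 1, 1, 1, 0],
    "beta": [2, 3, 4, 3, 4, 5, 4, 5, 6, 7],
})

# Assumption: order groups by the mean of "alpha" (the excerpt is truncated).
group_mean = df.groupby("ID")["alpha"].transform("mean")

# Sort rows by their group's mean, largest first, using a stable sort.
order = group_mean.sort_values(ascending=False, kind="mergesort").index
reordered = df.loc[order].reset_index(drop=True)

print(reordered)
```

Swapping `"mean"` for `"median"` in the `transform` call would give the median-based ordering that the first sentence of the question mentions.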

Add column to my data frame listing columns with the highest row value

前提是你 submitted on 2021-02-16 14:54:12
Question: I'm trying to tell R to read through the rows of my dataframe and add the name of the column with the highest value in each row to a new column called "MOST_COMMON_CANCER". I tried the following code but got an error.

    BASE_DF2 <- BASE_DF2 %>%
      mutate(MOST_COMMON_CANCER = colnames(BASE_DF2[8:26])[max.col(BASE_DF2[8:26], ties.method = "first")],
             .keep = "all", .after = c_INCS_RATE)

    Error: Problem with `mutate()` input `MOST_COMMON_CANCER`.
    x Input `MOST_COMMON_CANCER` can't be recycled to size 1.
    i Input `MOST
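This does not diagnose the R error above. For comparison only, the same row-wise "name of the largest column" operation in pandas can be sketched with `DataFrame.idxmax(axis=1)`, which, like `ties.method="first"`, returns the first column on ties. The column names and numbers below are made up for illustration and stand in for columns 8 to 26 of `BASE_DF2`.

```python
import pandas as pd

# Hypothetical data: the real BASE_DF2 columns are not shown in the question.
df = pd.DataFrame({
    "region": ["north", "south", "east"],
    "lung": [40.0, 55.0, 30.0],
    "breast": [60.0, 55.0, 20.0],
    "colon": [35.0, 10.0, 45.0],
})

# Hypothetical list of the rate columns to compare within each row.
rate_cols = ["lung", "breast", "colon"]

# For each row, the label of the column holding the row maximum
# (the first such column when several are tied).
df["MOST_COMMON_CANCER"] = df[rate_cols].idxmax(axis=1)

print(df)
```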

How to compare two dataframes ignoring column names?

孤街醉人 submitted on 2021-02-16 14:35:08
Question: Suppose I want to compare the content of two dataframes, but not the column names (or index names). Is it possible to achieve this without renaming the columns? For example:

    df = pd.DataFrame({'A': [1,2], 'B':[3,4]})
    df_equal = pd.DataFrame({'a': [1,2], 'b':[3,4]})
    df_diff = pd.DataFrame({'A': [1,2], 'B':[3,5]})

In this case, df is equal to df_equal but different from df_diff, because the values in df_equal have the same content, but the ones in df_diff do not. Notice that the column names in df_equal are
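One way to compare values while ignoring labels is to compare the underlying NumPy arrays. This is a minimal sketch, not necessarily the accepted answer; the helper name `same_values` is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df_equal = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
df_diff = pd.DataFrame({'A': [1, 2], 'B': [3, 5]})


def same_values(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Hypothetical helper: True if both frames hold the same values
    in the same positions, regardless of column or index labels."""
    # np.array_equal also returns False when the shapes differ.
    return bool(np.array_equal(left.to_numpy(), right.to_numpy()))


print(same_values(df, df_equal))  # column names differ, values match
print(same_values(df, df_diff))   # one value differs
```

Note that `np.array_equal` treats NaN as unequal to NaN by default, so frames with missing values in matching positions would compare as different with this sketch.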

Transpose dataframe based on column list

时光总嘲笑我的痴心妄想 submitted on 2021-02-16 10:06:15
Question: I have a dataframe in the following structure:

    cNames  | cValues    | number
    [a,b,c] | [1,2,3]    | 10
    [a,b,d] | [55,66,77] | 20

I would like to transpose it, creating columns from the names in cNames. But I can't manage to achieve this with transpose, because I want a column for each value in the list. The needed output:

    a  | b  | c   | d   | number
    1  | 2  | 3   | NaN | 10
    55 | 66 | NaN | 77  | 20

How can I achieve this result? Thanks! The code to create the DF:

    d = {'cNames': [['a','b','c'], ['a','b','d']],
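A minimal sketch of one possible approach, assuming this is a pandas DataFrame: pair each row's names with its values, build one dict per row, and let pandas align the keys into columns. The frame is rebuilt from the table in the question because the original construction code is cut off.

```python
import pandas as pd

# Rebuilt from the table shown in the question.
df = pd.DataFrame({
    "cNames": [["a", "b", "c"], ["a", "b", "d"]],
    "cValues": [[1, 2, 3], [55, 66, 77]],
    "number": [10, 20],
})

# One {name: value} dict per row; missing names become NaN after alignment.
records = [dict(zip(names, values))
           for names, values in zip(df["cNames"], df["cValues"])]
wide = pd.DataFrame(records, index=df.index)

# Re-attach the untouched "number" column.
result = wide.join(df["number"])

print(result)
```

Because missing entries are filled with NaN, the `c` and `d` columns end up as floats rather than integers.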

Find uniqueness in data frame with rows NA?

柔情痞子 submitted on 2021-02-16 08:54:51
Question: I have a data frame like the one below, and I would like to find its unique rows. But the data contains 'NA'. If all the non-NA values in a row with NA are the same as in other rows (like rows 1, 2, 5), I want to ignore that row; if they are not the same (like rows 2, 4), I want to keep it as a unique row. For example, in rows 1, 2 and 6 all values except NA are the same, so because NA can be the value '1 and 3', I would like to remove this row and just keep row 2. Also, in row 6 values 2 and 3 (excluding NA) are the same as

Transpose the data in a column every nth rows in PANDAS

心不动则不痛 submitted on 2021-02-16 07:52:31
Question: For a research project, I need to process every individual's information from a website into an Excel file. I have copied and pasted everything I need from the website into a single column of an Excel file, and I loaded that file using pandas. However, I need to present each individual's information horizontally instead of vertically, as it is now. For example, this is what I have right now. I only have one column of unorganized data.

    df= pd.read_csv("ior work.csv", encoding = "ISO-8859-1"
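If every individual occupies exactly the same number of consecutive rows, the single column can be reshaped directly. The sketch below assumes that regular layout; the sample values, the block size `n`, and the field names are all made up for illustration, since the question's data is not shown in this excerpt.

```python
import pandas as pd

# Hypothetical single-column data: three rows per individual.
raw = pd.DataFrame({"info": [
    "Ann", "Chemistry", "ann@example.org",
    "Ben", "Physics", "ben@example.org",
]})

n = 3                                # assumed rows per individual
fields = ["name", "field", "email"]  # hypothetical column names

# Reshape the flat column into one row per individual.
# This raises ValueError if the row count is not a multiple of n.
wide = pd.DataFrame(raw["info"].to_numpy().reshape(-1, n), columns=fields)

print(wide)
```

If the blocks are not all the same length, this reshape does not apply, and the rows would first need to be grouped by some marker that starts each individual's block.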
