dataframe

R returning partial matching of row names

独自空忆成欢 submitted on 2021-02-18 22:37:29
Question: I've run into the following issue:

    vec <- c("a11","b21","c31")
    df <- data.frame(a = c(0,0,0), b = c(1,1,1), row.names = vec)
    df["a",]

returns

        a b
    a11 0 1

However, "a" %in% vec and "a" %in% rownames(df) both return FALSE. R allows partial matching of the string when the row names are a letter followed by numbers. I have replicated this on R v3.2.2 and R v3.2.1. Even df[["a",1,exact=T]] returns 0. Is there anything I can set so that R does not allow this partial matching?

Answer 1:

Is there an easy way to group columns in a Pandas DataFrame?

青春壹個敷衍的年華 submitted on 2021-02-18 22:29:47
Question: I am trying to use Pandas to represent motion-capture data, which has T measurements of the (x, y, z) locations of each of N markers. For example, with T=3 and N=4, the raw CSV data looks like:

    T,Ax,Ay,Az,Bx,By,Bz,Cx,Cy,Cz,Dx,Dy,Dz
    0,1,2,1,3,2,1,4,2,1,5,2,1
    1,8,2,3,3,2,9,9,1,3,4,9,1
    2,4,5,7,7,7,1,8,3,6,9,2,3

This is really simple to load into a DataFrame, and I've learned a few tricks that are easy (converting marker data to z-scores, or computing velocities, for example). One thing I'd like

Is there an easy way to group columns in a Pandas DataFrame?

谁说胖子不能爱 submitted on 2021-02-18 22:27:08
Question: I am trying to use Pandas to represent motion-capture data, which has T measurements of the (x, y, z) locations of each of N markers. For example, with T=3 and N=4, the raw CSV data looks like:

    T,Ax,Ay,Az,Bx,By,Bz,Cx,Cy,Cz,Dx,Dy,Dz
    0,1,2,1,3,2,1,4,2,1,5,2,1
    1,8,2,3,3,2,9,9,1,3,4,9,1
    2,4,5,7,7,7,1,8,3,6,9,2,3

This is really simple to load into a DataFrame, and I've learned a few tricks that are easy (converting marker data to z-scores, or computing velocities, for example). One thing I'd like

Pandas: get the min value between 2 dataframe columns

会有一股神秘感。 submitted on 2021-02-18 21:11:38
Question: I have 2 columns and I want a 3rd column to be the minimum value between them. My data looks like this:

       A  B
    0  2  1
    1  2  1
    2  2  4
    3  2  4
    4  3  5
    5  3  5
    6  3  6
    7  3  6

And I want to get a column C in the following way:

       A  B  C
    0  2  1  1
    1  2  1  1
    2  2  4  2
    3  2  4  2
    4  3  5  3
    5  3  5  3
    6  3  6  3
    7  3  6  3

Some helping code:

    df = pd.DataFrame({'A': [2, 2, 2, 2, 3, 3, 3, 3], 'B': [1, 1, 4, 4, 5, 5, 6, 6]})

Thanks!

Answer 1: Use df.min(axis=1)

    df['c'] = df.min(axis=1)
    df
    Out[41]:
       A  B  c
    0  2  1  1
    1  2  1  1
    2  2  4  2
    3  2  4  2
    4  3  5  3
    5  3
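A minimal sketch of the row-wise minimum from the answer above, using the question's own sample data. The only change from the answer is an assumption on my part: selecting the two columns explicitly (`df[["A", "B"]]`) so that any other columns in the frame do not affect the result, and naming the new column `C` as the question asks.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"A": [2, 2, 2, 2, 3, 3, 3, 3],
                   "B": [1, 1, 4, 4, 5, 5, 6, 6]})

# axis=1 takes the minimum across columns, producing one value per row.
# Selecting ["A", "B"] first keeps unrelated columns out of the comparison.
df["C"] = df[["A", "B"]].min(axis=1)

print(df)
```

With only two columns in the frame, `df.min(axis=1)` from the answer gives the same values.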

Removing rows containing specific dates in R

你说的曾经没有我的故事 submitted on 2021-02-18 19:34:40
Question: Disclaimer: I am going to come out of this looking silly. I have a data frame containing a column which has a date of class POSIXct. I am trying to remove some of the rows containing specific dates (public holidays). I tried to do that using this:

    > modelset.nonholiday <- modelset[!modelset$date == as.POSIXct("2013-12-31") |
          !modelset$date == as.POSIXct("2013-07-04") |
          !modelset$date == as.POSIXct("2014-07-04") |
          !modelset$date == as.POSIXct("2013-11-28") |
          !modelset$date == as.POSIXct("2013
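The visible part of the attempt joins several "not equal" tests with OR. Since a date can never equal every holiday at once, at least one of those tests is true for every row, so nothing is removed. The sketch below shows the same logic in pandas rather than R (a swapped-in language, not the asker's code); the `modelset` contents are hypothetical, and the holiday list contains only the four dates that are visible in the excerpt.

```python
import pandas as pd

# Hypothetical stand-in for the question's `modelset` data frame.
modelset = pd.DataFrame({
    "date": pd.to_datetime(["2013-07-03", "2013-07-04",
                            "2013-11-28", "2013-12-30"]),
    "value": [10, 20, 30, 40],
})

# Only the holiday dates visible in the excerpt; the rest are cut off.
holidays = pd.to_datetime(["2013-12-31", "2013-07-04",
                           "2014-07-04", "2013-11-28"])

# OR of "not equal" tests: true for every row, so no row is dropped.
keep_or = (modelset["date"] != holidays[0]) | (modelset["date"] != holidays[1])

# Negated membership test: false exactly on the holiday rows.
keep_isin = ~modelset["date"].isin(holidays)
nonholiday = modelset[keep_isin]

print(keep_or.all())
print(nonholiday)
```

The R counterpart of the membership test is the `%in%` operator, negated with `!`; combining the original "not equal" tests with `&` instead of `|` would also work.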

Removing rows containing specific dates in R

我只是一个虾纸丫 submitted on 2021-02-18 19:33:47
Question: Disclaimer: I am going to come out of this looking silly. I have a data frame containing a column which has a date of class POSIXct. I am trying to remove some of the rows containing specific dates (public holidays). I tried to do that using this:

    > modelset.nonholiday <- modelset[!modelset$date == as.POSIXct("2013-12-31") |
          !modelset$date == as.POSIXct("2013-07-04") |
          !modelset$date == as.POSIXct("2014-07-04") |
          !modelset$date == as.POSIXct("2013-11-28") |
          !modelset$date == as.POSIXct("2013

Spark Dataframe: Select distinct rows

狂风中的少年 submitted on 2021-02-18 17:00:20
Question: I tried two ways to find distinct rows from parquet, but neither seems to work.

Attempt 1:

    Dataset<Row> df = sqlContext.read().parquet("location.parquet").distinct();

But it throws:

    Cannot have map type columns in DataFrame which calls set operations (intersect, except, etc.), but the type of column canvasHashes is map<string,string>;;

Attempt 2: Tried running SQL queries:

    Dataset<Row> df = sqlContext.read().parquet("location.parquet");
    rawLandingDS.createOrReplaceTempView("df");
    Dataset<Row>

Spark Dataframe: Select distinct rows

拜拜、爱过 submitted on 2021-02-18 17:00:14
Question: I tried two ways to find distinct rows from parquet, but neither seems to work.

Attempt 1:

    Dataset<Row> df = sqlContext.read().parquet("location.parquet").distinct();

But it throws:

    Cannot have map type columns in DataFrame which calls set operations (intersect, except, etc.), but the type of column canvasHashes is map<string,string>;;

Attempt 2: Tried running SQL queries:

    Dataset<Row> df = sqlContext.read().parquet("location.parquet");
    rawLandingDS.createOrReplaceTempView("df");
    Dataset<Row>

Filling na values with merge from another dataframe

扶醉桌前 submitted on 2021-02-18 12:11:07
Question: I have a column with NA values that I want to fill with values from another data frame, matched on a key. I was wondering if there is any simple way to do so. Example: I have a data frame of objects and their colors like this:

      object   color
    0  chair   black
    1   ball  yellow
    2   door   brown
    3   ball     NaN
    4  chair   white
    5  chair     NaN
    6   ball    grey

I want to fill NA values in the color column with the default color from the following data frame:

      object default_color
    0  chair         brown
    1   ball          blue
    2   door
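One possible approach, sketched here under stated assumptions rather than taken from an answer: left-merge the defaults onto the main frame by `object`, then use `fillna` to take the default only where `color` is missing. The frame names `df` and `defaults` are my own, and the defaults table holds only the two rows fully visible in the excerpt, since the default for `door` is cut off in the source.

```python
import numpy as np
import pandas as pd

# Main data copied from the question.
df = pd.DataFrame({
    "object": ["chair", "ball", "door", "ball", "chair", "chair", "ball"],
    "color": ["black", "yellow", "brown", np.nan, "white", np.nan, "grey"],
})

# Only the defaults visible in the excerpt; the "door" default is truncated.
defaults = pd.DataFrame({
    "object": ["chair", "ball"],
    "default_color": ["brown", "blue"],
})

# A left merge keeps every row of df, in order, and adds default_color.
merged = df.merge(defaults, on="object", how="left")

# Fill only the missing colors, then drop the helper column.
merged["color"] = merged["color"].fillna(merged["default_color"])
result = merged.drop(columns="default_color")

print(result)
```

Rows whose key has no entry in the defaults table simply keep their existing color, or stay missing if they had none.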

Filling na values with merge from another dataframe

一曲冷凌霜 submitted on 2021-02-18 12:10:29
Question: I have a column with NA values that I want to fill with values from another data frame, matched on a key. I was wondering if there is any simple way to do so. Example: I have a data frame of objects and their colors like this:

      object   color
    0  chair   black
    1   ball  yellow
    2   door   brown
    3   ball     NaN
    4  chair   white
    5  chair     NaN
    6   ball    grey

I want to fill NA values in the color column with the default color from the following data frame:

      object default_color
    0  chair         brown
    1   ball          blue
    2   door