Different results from transforms and transformFunc in rxDataStep

Submitted by 情到浓时终转凉″ on 2019-12-11 04:59:47

Question


I would like to add a new column to an xdf file. I tested both the transforms and transformFunc arguments of rxDataStep.

This line of code works fine for me:

rxDataStep(nyc_jan_xdf,transforms = list(newCol5=ifelse(payment_type==1,10,20)))

But if I use transformFunc:

CashVsCard<-function(x)
{
  if(x$payment_type==1){
    x$newCol13=10
  } else {
    x$newCol13=20
  }
  return(x)
}
rxDataStep(nyc_jan_xdf,transformFunc = CashVsCard)

it doesn't work and returns this error:

Error in doTryCatch(return(expr), name, parentenv, handler) : 
  The variable 'newCol13' has a different number of rows than other columns in the data: 1 vs. 10
In addition: Warning message:
In if (x$payment_type == 1) { :
  the condition has length > 1 and only the first element will be used

Why doesn't transformFunc work?

An example of my data:

structure(list(VendorID = c(2L, 2L, 2L, 1L, 1L, 1L), tpep_pickup_datetime = c("2016-01-01 00:00:00", 
"2016-01-01 00:00:00", "2016-01-01 00:00:03", "2016-01-01 00:00:04", 
"2016-01-01 00:00:05", "2016-01-01 00:00:06"), tpep_dropoff_datetime = c("2016-01-01 00:00:00", 
"2016-01-01 00:00:00", "2016-01-01 00:15:49", "2016-01-01 00:14:32", 
"2016-01-01 00:14:27", "2016-01-01 00:04:44"), passenger_count = c(5L, 
1L, 6L, 1L, 2L, 1L), trip_distance = c(4.90000009536743, 10.539999961853, 
2.4300000667572, 3.70000004768372, 2.20000004768372, 1.70000004768372
), pickup_longitude = c(-73.9807815551758, -73.9845504760742, 
-73.9693298339844, -74.0043029785156, -73.9919967651367, -73.9821014404297
), pickup_latitude = c(40.7299118041992, 40.6795654296875, 40.7635383605957, 
40.7422409057617, 40.718578338623, 40.7746963500977), RatecodeID = c(1L, 
1L, 1L, 1L, 1L, 1L), store_and_fwd_flag = c("N", "N", "N", "N", 
"N", "Y"), dropoff_longitude = c(-73.9444732666016, -73.9502716064453, 
-73.9956893920898, -74.0073623657227, -74.0051345825195, -73.9709396362305
), dropoff_latitude = c(40.7166786193848, 40.7889251708984, 40.7442512512207, 
40.7069358825684, 40.7399444580078, 40.7967071533203), payment_type = c(1L, 
1L, 1L, 1L, 1L, 1L), fare_amount = c(18, 33, 12, 14, 11, 7), 
    extra = c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5), mta_tax = c(0.5, 
    0.5, 0.5, 0.5, 0.5, 0.5), tip_amount = c(0, 0, 3.99000000953674, 
    3.04999995231628, 1.5, 1.64999997615814), tolls_amount = c(0, 
    0, 0, 0, 0, 0), improvement_surcharge = c(0.300000011920929, 
    0.300000011920929, 0.300000011920929, 0.300000011920929, 
    0.300000011920929, 0.300000011920929), total_amount = c(19.2999992370605, 
    34.2999992370605, 17.2900009155273, 18.3500003814697, 13.8000001907349, 
    9.94999980926514)), .Names = c("VendorID", "tpep_pickup_datetime", 
"tpep_dropoff_datetime", "passenger_count", "trip_distance", 
"pickup_longitude", "pickup_latitude", "RatecodeID", "store_and_fwd_flag", 
"dropoff_longitude", "dropoff_latitude", "payment_type", "fare_amount", 
"extra", "mta_tax", "tip_amount", "tolls_amount", "improvement_surcharge", 
"total_amount"), row.names = c(NA, 6L), class = "data.frame")

Answer 1:


I found a solution. It is not the best one, but it works. I only needed to change the function like this:

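CashVsCard <- function(x)
{
  # x holds the columns for a block of rows, so check each row in turn
  p <- length(x$payment_type)
  # Create the new column with the same length as the existing columns
  x$cash_vs_Card4 <- character(p)
  for (i in seq_len(p))
  {
    if (x$payment_type[i] == 1)
    {
      x$cash_vs_Card4[i] <- "Card"
    } else {
      x$cash_vs_Card4[i] <- "Others"
    }
  }
  return(x)
}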
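
The error message in the question ("1 vs. 10") suggests that the transform function receives a whole block of rows as a list of column vectors, not one row at a time. Under that assumption, a vectorised ifelse() inside the function should give the same result without an explicit loop. The sketch below is not part of the original answer: the names CashVsCardVec and cash_vs_card are made up for illustration, and the check at the end uses a plain base-R list with invented values as a stand-in for one block of the xdf file.

```r
# Vectorised variant: dataList is assumed to be a named list of
# equal-length column vectors (one block of rows).
CashVsCardVec <- function(dataList) {
  # ifelse() tests every element, so the new column has one value per row
  dataList$cash_vs_card <- ifelse(dataList$payment_type == 1, "Card", "Others")
  dataList
}

# Stand-in for one block of rows (hypothetical values, base R only)
chunk <- list(payment_type = c(1L, 2L, 1L, 3L))
result <- CashVsCardVec(chunk)
print(result$cash_vs_card)

# With RevoScaleR loaded, the call would mirror the one in the question:
# rxDataStep(nyc_jan_xdf, transformFunc = CashVsCardVec)
```

Because the new column is built from the whole payment_type vector, its length always matches the other columns in the block, which is the condition the error message complains about.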


Source: https://stackoverflow.com/questions/43975375/different-result-by-transforms-and-transformfunc-in-rxdatastep
