rolling joins data.table in R

前端 未结 1 1701
一个人的身影
一个人的身影 2020-11-30 01:47

I am trying to understand a little more about the way rolling joins work and am having some confusion, I was hoping somebody could clarify this for me. To take a concrete e

相关标签:
1条回答
  • 2020-11-30 02:13

    That quote from the documentation appears to be from FAQ 1.12 What is the difference between X[Y] and merge(X,Y). Did you find the following in ?data.table and does it help?

    roll Applies to the last join column, generally a date but can be any ordered variable, irregular and including gaps. If roll=TRUE and i's row matches to all but the last x join column, and its value in the last i join column falls in a gap (including after the last observation in x for that group), then the prevailing value in x is rolled forward. This operation is particularly fast using a modified binary search. The operation is also known as last observation carried forward (LOCF). Usually, there should be no duplicates in x's key, the last key column is a date (or time, or datetime) and all the columns of x's key are joined to. A common idiom is to select a contemporaneous regular time series (dts) across a set of identifiers (ids): DT[CJ(ids,dts),roll=TRUE] where DT has a 2-column key (id,date) and CJ stands for cross join.

    rolltolast Like roll but the data is not rolled forward past the last observation within each group defined by the join columns. The value of i must fall in a gap in x but not after the end of the data, for that group defined by all but the last join column. roll and rolltolast may not both be TRUE.

    In terms of left/right analogies to SQL joins, I prefer to think about that in the context of FAQ 2.14 Can you explain further why data.table is inspired by A[B] syntax in base. That's quite a long answer so I won't paste it here.

    0 讨论(0)
提交回复
热议问题