Formatting timestamps to avoid R/TraMineR crash?

Submitted by 荒凉一梦 on 2021-01-27 12:03:55

Question


I have a sequence dataset where the timestamp is in seconds since the epoch:

    id      event       time        end
1  723     opened 1356963741 1356963741
2  722     opened 1356931342 1356931342
3  721 referenced 1356988206 1356988206
4  721 referenced 1356988186 1356988186
5  721     closed 1356988186 1356988186
6  721     merged 1356988186 1356988186
7  721     closed 1356988186 1356988186
8  721     merged 1356988186 1356988186
9  721  discussed 1356966433 1356966433
10 721  discussed 1356963870 1356963870

I want to create an STS sequence object:

sequences.sts <- seqformat(data, from="SPELL", to="STS", 
     begin="time", end="end", id="id", status="event", limit=slmax)
sequences.sts <- seqdef(sequences.sts)
summary(sequences.sts)

However, when I do this, RStudio crashes and more or less freezes my entire computer. Comparing with other code that runs fine and uses single-digit numbers in the "time" column, I think I have traced the problem to the timestamp. Could it be that R/RStudio/TraMineR simply gets overloaded by the long timestamps?


Answer 1:


I cannot reproduce the problem, but the most probable reason is that it creates very long sequences. Sequence 721 lasts for 24'336 seconds, so a sequence of length 24'336 has to be created. Depending on the number of sequences and the length of the other sequences, this will take a very long time to compute.
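
To see where the 24'336 figure comes from, the span of each id can be checked in base R before calling seqformat. This is only a sketch: the data frame below is retyped from the sample in the question, and the name span is made up for illustration.

# Sample rows retyped from the question (end equals time in every row).
data <- data.frame(
  id    = c(723, 722, 721, 721, 721, 721, 721, 721, 721, 721),
  event = c("opened", "opened", "referenced", "referenced", "closed",
            "merged", "closed", "merged", "discussed", "discussed"),
  time  = c(1356963741, 1356931342, 1356988206, 1356988186, 1356988186,
            1356988186, 1356988186, 1356988186, 1356966433, 1356963870)
)
data$end <- data$time

# Span in seconds per id: latest end minus earliest start.
span <- tapply(data$end, data$id, max) - tapply(data$time, data$id, min)
print(span)

For id 721 this gives 24336, and 0 for the two ids that have a single event.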

The problem is that the time unit of your timestamp (seconds) is used as the sequence time unit. You can try a coarser time unit, possibly aggregating events that occur within the same time unit.
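
A rough sketch of that idea in base R, run before seqformat. The hourly unit, the choice to count time from each id's first event, and the names unit, start and binned are all assumptions made for illustration; they are not part of TraMineR.

# Sample rows retyped from the question (end equals time in every row).
data <- data.frame(
  id    = c(723, 722, 721, 721, 721, 721, 721, 721, 721, 721),
  event = c("opened", "opened", "referenced", "referenced", "closed",
            "merged", "closed", "merged", "discussed", "discussed"),
  time  = c(1356963741, 1356931342, 1356988206, 1356988186, 1356988186,
            1356988186, 1356988186, 1356988186, 1356966433, 1356963870)
)
data$end <- data$time

unit <- 3600  # assumption: one position per hour instead of per second

# Earliest timestamp of each id, repeated on every row of that id.
start <- ave(data$time, data$id, FUN = min)

# Convert seconds to 1-based positions in the coarser unit.
binned <- data
binned$time <- (data$time - start) %/% unit + 1
binned$end  <- (data$end  - start) %/% unit + 1

# Drop rows that became exact duplicates after binning.
binned <- unique(binned)
print(binned)

With this, id 721 spans 7 positions instead of 24'336. Note that unique only removes identical rows; several different events can still fall in the same hour, and which one should define the state for that position is a modelling decision left open here.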

Hope this helps.



Source: https://stackoverflow.com/questions/19683957/formatting-timestamps-to-avoid-r-traminer-crash
