Question
I have a sequence dataset where the timestamp is in seconds since the epoch:
    id      event       time        end
1  723     opened 1356963741 1356963741
2  722     opened 1356931342 1356931342
3  721 referenced 1356988206 1356988206
4  721 referenced 1356988186 1356988186
5  721     closed 1356988186 1356988186
6  721     merged 1356988186 1356988186
7  721     closed 1356988186 1356988186
8  721     merged 1356988186 1356988186
9  721  discussed 1356966433 1356966433
10 721  discussed 1356963870 1356963870
I want to create an STS sequence object:
sequences.sts <- seqformat(data, from="SPELL", to="STS",
begin="time", end="end", id="id", status="event", limit=slmax)
sequences.sts <- seqdef(sequences.sts)
summary(sequences.sts)
However, when I do this, RStudio crashes and more or less freezes my entire computer. By comparing with other code that runs fine and uses single-digit numbers in the "time" column, I think I have identified the timestamp as the problem. Could it be that R/RStudio/TraMineR simply gets overloaded by the long timestamps?
Answer 1:
I cannot reproduce the problem, but the most probable reason is that very long sequences are being created. Sequence 721 lasts for 24,336 seconds (1356988206 - 1356963870), so we would have to create a sequence of length 24,336. Depending on the number of sequences and the length of the others, this can take a very long time to compute.
The problem is that we use the time unit of your timestamp (seconds). You can try using another time unit, possibly aggregating events that occur within the same time unit.
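As a rough illustration of that suggestion, here is a minimal base-R sketch that rescales the timestamps before they are passed to seqformat. It rests on several assumptions that are not part of the question: one hour is taken as the new time unit, time is measured from each id's first event so that positions start at 1, and the names unit, first, time_h, end_h and data_h are made up for the example. Adapt them to your data.

```r
# Example data copied from the question (columns: id, event, time, end).
data <- data.frame(
  id    = c(723, 722, 721, 721, 721, 721, 721, 721, 721, 721),
  event = c("opened", "opened", "referenced", "referenced", "closed",
            "merged", "closed", "merged", "discussed", "discussed"),
  time  = c(1356963741, 1356931342, 1356988206, 1356988186, 1356988186,
            1356988186, 1356988186, 1356988186, 1356966433, 1356963870),
  end   = c(1356963741, 1356931342, 1356988206, 1356988186, 1356988186,
            1356988186, 1356988186, 1356988186, 1356966433, 1356963870),
  stringsAsFactors = FALSE
)

# Assumption: one hour (3600 seconds) is a coarse enough time unit.
unit <- 3600

# Assumption: each sequence is aligned on the first event of its id,
# so positions count hours elapsed since that first event, starting at 1.
first <- ave(data$time, data$id, FUN = min)
data$time_h <- (data$time - first) %/% unit + 1
data$end_h  <- (data$end  - first) %/% unit + 1

# Aggregate: identical events that fall into the same hour become one row.
data_h <- unique(data[, c("id", "event", "time_h", "end_h")])

# Longest sequence, now counted in hours instead of seconds.
max(data_h$end_h)
```

With time_h and end_h in place of time and end, the seqformat call from the question has far fewer positions to fill. Different events can still land in the same hour for one id, so you would have to decide which of them defines the state at that position.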
Hope this helps.
Source: https://stackoverflow.com/questions/19683957/formatting-timestamps-to-avoid-r-traminer-crash