Question
I am trying to use R to reproduce a sequence mining example from this post using my own data: https://blog.revolutionanalytics.com/2019/02/sequential-pattern-mining-in-r.html If anyone wants to reproduce the problem, here is my dataset: https://drive.google.com/file/d/1aqyldwfJm0w--E8VG5oOWHxPRMjPwapG/view?usp=sharing
THE INPUT
# Packages used below
library(dplyr)            # %>%, group_by(), summarize()
library(arulesSequences)  # read_baskets()
# Start time of data to be considered
start_month <- "2012-01-01"
# Create list of items (TC) by sequence ID (TNST) and date (Fdate)
trans_sequence <- transactions %>%
  group_by(TNST, Fdate) %>%
  summarize(
    SIZE = n(),
    TC = paste(as.character(TC), collapse = ';')
  )
# Compute eventID as the number of whole months elapsed since start_month
elapsed_months <- function(end_date, start_date) {
  ed <- as.POSIXlt(end_date)
  sd <- as.POSIXlt(start_date)
  12 * (ed$year - sd$year) + (ed$mon - sd$mon)
}
trans_sequence$eventID <- elapsed_months(trans_sequence$Fdate, start_month)
# Keep TNST, eventID, SIZE, TC (in that order) and rename them
trans_sequence <- trans_sequence[, c(1, 5, 3, 4)]
names(trans_sequence) <- c("sequenceID", "eventID", "SIZE", "items")
# Make event and sequence IDs into factors
trans_sequence <- data.frame(lapply(trans_sequence, as.factor))
trans_sequence <- trans_sequence[order(trans_sequence$sequenceID, trans_sequence$eventID), ]
# Convert to transaction matrix data type
write.table(trans_sequence, "mytxtout.csv", sep=";", row.names = FALSE, col.names = FALSE, quote = FALSE)
trans_matrix <- read_baskets("mytxtout.csv", sep = ";", info = c("sequenceID","eventID","SIZE"))
ERROR I get
Warning message in read_baskets("mytxtout.csv", sep = ";", info = c("sequenceID", :
“eventID not positive”
Although the CSV file is generated, I cannot actually apply SPADE, because the next step requires strictly positive eventIDs. I am a rookie in R; thanks in advance.
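One possible cause, sketched here under assumptions rather than verified against the linked dataset: elapsed_months() returns 0 for any date that falls in the same month as start_month, and a negative number for any earlier date, so a single Fdate in or before January 2012 would be enough to produce a non-positive eventID. The dates below are hypothetical and only illustrate the arithmetic; the shift by min() is one workaround among several (choosing an earlier start_month would be another).

```r
# Same helper as in the question: whole months between two dates.
elapsed_months <- function(end_date, start_date) {
  ed <- as.POSIXlt(end_date)
  sd <- as.POSIXlt(start_date)
  12 * (ed$year - sd$year) + (ed$mon - sd$mon)
}

start_month <- "2012-01-01"

# Hypothetical dates (NOT taken from the real dataset):
# one before start_month, one inside it, one after it.
example_dates <- as.Date(c("2011-11-15", "2012-01-20", "2012-03-05"))

raw_ids <- elapsed_months(example_dates, start_month)
print(raw_ids)      # -2  0  2  -> the first two are not positive

# Assumed workaround: shift so that the earliest month maps to 1.
shifted_ids <- raw_ids - min(raw_ids) + 1
print(shifted_ids)  # 1 3 5
```

Running range(elapsed_months(transactions$Fdate, start_month)) on the real data, before converting eventID to a factor, would show whether this is actually what happens.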
Source: https://stackoverflow.com/questions/60034239/warning-message-in-read-baskets-in-arulessequences-in-r