Question
I am trying to build a second-order Markov chain model, and I need to estimate the transition matrix from the following data:
dat<-data.frame(replicate(20,sample(c("A", "B", "C","D"), size = 100, replace=TRUE)))
I know how to fit the first-order transition matrix using the function markovchainFit(dat) in the markovchain package.
Is there any way to fit the second order transition matrix?
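How do I evaluate the Markov chain models? That is, should I choose the first-order model or the second-order model?
Answer 1:
The function below computes a transition matrix for any lag l you choose. It treats each row of X as one sequence and tabulates the state in column j against the state in column j + l.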
dat<-data.frame(replicate(20,sample(c("A", "B", "C","D"), size = 100, replace=TRUE)))
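Markovmatrix <- function(X, l = 1) {
  # Cross-tabulate the state in column j (table rows) against the state
  # in column j + l (table columns), pooled over all rows of X
  tt <- table(X[, -c((ncol(X) - l + 1):ncol(X))], c(X[, -c(1:l)]))
  # Convert counts to row-wise proportions so that each row sums to 1
  tt <- tt / rowSums(tt)
  return(tt)
}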
Markovmatrix(as.matrix(dat),1)
Markovmatrix(as.matrix(dat),2)
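where l is the lag.
For example, for the lag-2 matrix (l = 2) the output looks like this. Your numbers will differ, because the data are sampled at random without a fixed seed: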
          A         B         C         D
A 0.2422803 0.2185273 0.2446556 0.2945368
B 0.2426304 0.2108844 0.2766440 0.2698413
C 0.2146119 0.2716895 0.2123288 0.3013699
D 0.2480000 0.2560000 0.2320000 0.2640000
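Note that a lag-2 matrix relates the state at time t to the state at time t + 2 and is therefore still 4 x 4. A second-order transition matrix in the stricter sense conditions on the two preceding states, so with 4 codes it has 16 rows (one per ordered pair) and 4 columns. A minimal sketch of that computation follows. It assumes, like Markovmatrix() above, that each row of X is one sequence; second_order_matrix is a hypothetical helper name, not a function from the markovchain package.

```r
# Sketch: second-order transition matrix, P(state at t | states at t-2, t-1).
# Assumption: each ROW of X is one observed sequence.
# second_order_matrix is a hypothetical helper, not part of any package.
second_order_matrix <- function(X) {
  n <- ncol(X)
  stopifnot(n >= 3)
  states <- sort(unique(c(X)))

  prev2 <- c(X[, 1:(n - 2)])   # state two steps back
  prev1 <- c(X[, 2:(n - 1)])   # state one step back
  nxt   <- c(X[, 3:n])         # state being predicted

  # All ordered pairs of states, in the order A-A, A-B, ..., D-D
  pair_levels <- as.vector(t(outer(states, states, paste, sep = "-")))
  history <- factor(paste(prev2, prev1, sep = "-"), levels = pair_levels)

  tt <- table(history = history, next_state = factor(nxt, levels = states))
  # Row-wise proportions; a history that never occurs yields NaN in its row
  tt / rowSums(tt)
}

set.seed(1)
dat <- data.frame(replicate(20, sample(c("A", "B", "C", "D"),
                                       size = 100, replace = TRUE)))
P2 <- second_order_matrix(as.matrix(dat))
dim(P2)
```

Each row of P2 gives the estimated distribution of the next state given one specific pair of preceding states. With short sequences some pairs may be rare, so those rows are estimated from few observations.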
As for how to test which order of model to use, there are several suggestions. One, put forward by Gottman and Roy (1990) in their introductory book on sequential analysis, is to use the information value. There is a chapter on that, and most of the chapter is available online.
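You can also perform a likelihood-ratio chi-square test. It is very similar to Pearson's chi-square test in that you compare observed with expected frequencies of transitions. However, the statistic is computed as follows:
$$G^2 = 2 \sum_{i,j} O_{ij} \ln\left(\frac{O_{ij}}{E_{ij}}\right)$$
where O_{ij} is the observed number of transitions from code i to code j and E_{ij} is the number expected if the next code did not depend on the current one.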
The degrees of freedom are the number of codes minus one, squared. In your case you have 4 codes, so (4-1)^2 = 9. You can then look up the associated p-value in the chi-square distribution.
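The test can be sketched in R as follows. This is only an illustration under stated assumptions: each row of X is one sequence, expected counts come from the row and column totals of the transition table (the usual independence model), and the statistic is 2 * sum(O * log(O / E)). The name lr_test is hypothetical. Note that this checks whether the code l steps later depends on the current code; it is not by itself a direct comparison of a first-order against a second-order model.

```r
# Sketch: likelihood-ratio (G^2) test on a lag-l transition count table.
# Assumption: each ROW of X is one observed sequence.
# lr_test is a hypothetical helper, not part of any package.
lr_test <- function(X, l = 1) {
  n <- ncol(X)
  # Observed transition counts: state at position j vs. state at j + l
  observed <- unclass(table(c(X[, 1:(n - l)]), c(X[, (l + 1):n])))
  # Expected counts if the later state were independent of the earlier one
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  nz <- observed > 0                 # cells with O = 0 contribute 0 to G^2
  G2 <- 2 * sum(observed[nz] * log(observed[nz] / expected[nz]))
  df <- (nrow(observed) - 1) * (ncol(observed) - 1)
  list(G2 = G2, df = df, p.value = pchisq(G2, df, lower.tail = FALSE))
}

set.seed(1)
dat <- data.frame(replicate(20, sample(c("A", "B", "C", "D"),
                                       size = 100, replace = TRUE)))
res <- lr_test(as.matrix(dat), l = 1)
res$df
```

With 4 codes the table is 4 x 4, so df is (4-1)^2 = 9, matching the calculation above. Because the example data are drawn independently at random, a large p-value is the expected outcome here.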
I hope this helps.
Source: https://stackoverflow.com/questions/29728436/fit-and-evaluate-a-second-order-transition-matrix-markov-process-in-r