MHSMM package in R-Input Format?

南方客 2021-01-06 14:38

I'm trying to use the mhsmm package to estimate the parameters of a hidden Markov model from multiple observation sequences.

But for the function hmmfit(x), what would

1 answer
  • 2021-01-06 15:08

    I wrote this function to create the right data format:

    formatMhsmm <- function(data){
    
      # data: data frame with one sequence per row and one observation per column
      nb.sequences = nrow(data)
      nb.observations = ncol(data)
    
      #transform list to data frame
      data_df <- data.frame(matrix(unlist(data), nrow = nb.sequences, byrow=F))
    
    
      #iterate over these in loops
      rows <- 1:nb.sequences
      observations <- 0:(nb.observations-1)
    
      #build vector with id values
      id = numeric(length = nb.sequences*nb.observations ) 
    
      for(i in rows)
      {
        for (j in observations)
        {
          id[i+j+(i-1)*(nb.observations-1)] = i
        }
      }
    
      #build vector with observation values
      sequences = numeric(length = nb.sequences*nb.observations) 
    
      for(i in rows)
      {
        for (j in observations)
        {
          sequences[i+j+(i-1)*(nb.observations-1)] = data_df[i,j+1]
        }
      }
    
      data.df = data.frame(id, sequences)
    
      #creation of hsmm.data object needed for training
      N <- as.numeric(table(data.df$id))
      train <- list(x = data.df$sequences, N = N)
      class(train) <- "hsmm.data"
    
      return(train)
    }
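
    The two nested loops above lay the rows of the input out one after another. As a rough sketch of the same idea without explicit indexing (assuming, as the function does, a numeric data frame with one equal-length sequence per row), the layout could also be built like this. The helper name `format_mhsmm_vec` and the toy data are made up for illustration and use base R only:

    ```r
    # Hypothetical toy input: 3 sequences (rows) of 4 observations (columns) each.
    toy <- data.frame(t1 = c(1, 5, 9),
                      t2 = c(2, 6, 10),
                      t3 = c(3, 7, 11),
                      t4 = c(4, 8, 12))

    # Illustrative helper (not part of mhsmm): vectorized version of the loops.
    format_mhsmm_vec <- function(data) {
      m <- as.matrix(data)
      train <- list(
        x = as.vector(t(m)),        # row 1, then row 2, ... concatenated
        N = rep(ncol(m), nrow(m))   # length of each sequence
      )
      class(train) <- "hsmm.data"
      train
    }

    train_toy <- format_mhsmm_vec(toy)
    train_toy$x   # 1 2 3 4 5 6 7 8 9 10 11 12
    train_toy$N   # 4 4 4
    ```

    Transposing before flattening matters here: R stores matrices column by column, so `as.vector(m)` alone would interleave the sequences.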
    

    In short, the hsmm.data format needs two things: the observations of all sequences concatenated into a single vector x, and a vector N giving the length of each sequence (the function derives N by counting the entries per id). Both go in a list, which is then given the class "hsmm.data" so that hmmfit can recognize it.
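
    Since N stores one length per sequence, the same layout should also cover sequences of different lengths, which a rectangular data frame cannot hold. A minimal sketch, assuming the sequences are kept in a plain list (the values below are invented for illustration):

    ```r
    # Hypothetical sequences of unequal length, kept in a plain list.
    seqs <- list(c(0.1, 0.4, 0.3),
                 c(2.0, 2.2),
                 c(5.1, 4.9, 5.0, 5.3))

    train <- list(
      x = unlist(seqs),    # all observations, sequence after sequence
      N = lengths(seqs)    # number of observations in each sequence
    )
    class(train) <- "hsmm.data"

    # Sanity check: the lengths must add up to the number of observations.
    stopifnot(sum(train$N) == length(train$x))
    ```

    The sanity check is cheap and catches the most likely mistake, a mismatch between N and x.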

    You can then call it as follows. I gave some initial estimates for the HMM parameters, which you can adjust to your needs:

    library(mhsmm)
    
    dataset <- read.csv('file.csv',header=TRUE)
    train <- formatMhsmm(dataset)
    
    # 4 states HMM    
    J=4
    #init probabilities
    init <- rep(1/J, J)
    
    #transition matrix
    P <- matrix(rep(1/J, J*J), nrow = J)
    
    #emission matrix:  here I used a Gaussian distribution, replace muEst and sigmaEst by your initial estimates of mean and variance
    b <- list(mu = muEst, sigma = sigmaEst) 
    
    #starting model for EM
    startmodel <- hmmspec(init = init, trans = P, parms.emis = b, dens.emis = dnorm.hsmm)
    
    #EM algorithm fits an HMM to the data
    hmm <- hmmfit(train, startmodel, mstep = mstep.norm,maxit = 100)
    
    #print resulting HMM parameters
    summary(hmm)
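
    The code above leaves muEst and sigmaEst for you to fill in. One possible starting point, offered as an assumption rather than a recommendation from the package, is to spread the initial means over quantiles of the data and to give every state the overall sample variance. The simulated vector x below only stands in for train$x:

    ```r
    # Simulated observations standing in for train$x (purely illustrative data).
    set.seed(42)
    x <- c(rnorm(100, mean = 0), rnorm(100, mean = 3),
           rnorm(100, mean = 6), rnorm(100, mean = 9))

    J <- 4

    # Spread the initial means over evenly spaced quantiles of the data.
    muEst <- as.numeric(quantile(x, probs = (1:J) / (J + 1)))

    # Start every state with the overall sample variance
    # (the answer describes sigma as a variance, not a standard deviation).
    sigmaEst <- rep(var(x), J)

    b <- list(mu = muEst, sigma = sigmaEst)
    ```

    Distinct starting means give EM something to separate the states by. If all states started with identical emission parameters as well as uniform init and P, the states would be interchangeable at the start.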
    

    A paper where you can find some more information is: O’Connell, Jared, and Søren Højsgaard. "Hidden semi markov models for multiple observation sequences: The mhsmm package for R." Journal of Statistical Software 39.4 (2011): 1-22.

    It's a late answer, but hope it can help someone. Cheers
