I would like to know how to compute the optimal policy in an infinite Markov decision process (MDP). Also, can the linear programming approach used for MDPs be applied to infinite MDPs?