I have just recently gotten into reinforcement learning. I understand the gist of Q-learning and its update equation:
Q(s, a) = r + gamma * ma