Consider a function Q(s,a), and we are interested in a (very simple) task, which is to find:
Importantly, we want to do so by training a neural network that o