There are two types of feedback. One is evaluative that is used in reinforcement learning method and second is instructive that is used in supervised learning mostly used for classification problems.
When supervised learning is used, the weights of the neural network are adjusted based on the information of the correct labels provided in the training dataset. So, on selecting a wrong class, the loss increases and weights are adjusted, so that for the input of that kind, this wrong class is not chosen again.
However, in reinforcement learning, the system explores all the possible actions, class labels for various inputs in this case and by evaluating the reward it decides what is right and what is wrong. It may be the case too that until it gets the correct class label it may be giving wrong class name as it is the best possible output it has found till now. So, it doesn't make use of the specific knowledge we have about the class labels, hence slows the convergence rate significantly as compared to supervised learning.
You can use reinforcement learning for classification problems but it won't be giving you any added benefit and instead slow down your convergence rate.