Here is the way I've written my code. I know how to backpropagate with the chain rule, and I can hardcode the math for every layer, but I want to generalize it so that I can