Let\'s say we have a NN with multiple layers. Say a simple MLP (Multi Level Perceptron) that has GEMM1 -> Activation1 -> GEMM2 -> Activation2. Now, let\'s say we ar