I am encountering behavior that seems strange to me when trying to evaluate the derivatives of a result obtained from sparse tensor operations. If I blow up all sparse inputs to dense befo