I need to build a machine learning architecture that recognizes musical instruments from their audio signals. The audio signals are represented as mel spectrog