I have a decision tree and a gradient boosting model for interval data (not binary data!). How do I compare their performance? (Python/sklearn function, mathematical approach)
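To make the question concrete, here is a minimal sketch of one possible comparison. It assumes "interval data" means a continuous numeric target (a regression problem) and that both models are scikit-learn regressors. The synthetic data from `make_regression`, the hyperparameters, and the choice of RMSE, MAE, and R² are illustrative assumptions, not my actual setup. Both models are scored on the same cross-validation folds, so the numbers are directly comparable. Mathematically, RMSE is sqrt(mean((y - y_hat)^2)) and MAE is mean(|y - y_hat|).

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_validate
from sklearn.tree import DecisionTreeRegressor

# Assumption: synthetic stand-in data with a continuous target.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

# Assumption: illustrative hyperparameters, not tuned values.
models = {
    "decision_tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# Use the same folds for both models so the scores are comparable.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# scikit-learn scorers follow "higher is better", so error metrics are negated.
scoring = {
    "rmse": "neg_root_mean_squared_error",
    "mae": "neg_mean_absolute_error",
    "r2": "r2",
}

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    results[name] = {
        "rmse": -scores["test_rmse"].mean(),  # flip sign back to a positive error
        "mae": -scores["test_mae"].mean(),
        "r2": scores["test_r2"].mean(),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Lower RMSE and MAE and higher R² indicate the better model under these assumptions. Looking at the per-fold scores rather than only the means would show whether the difference is consistent across folds.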