I have one-dimensional intensity data that is labeled with strings of different lengths. Label encoding is done as follows:
from sklearn.model_selection import tr