I am trying to use torchvision’s video classification models (R3D, R(2+1)D, MC18) [1], but my data is single-channel (grayscale video) and these models expect 3-channel input.
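
To make the problem concrete, here is a minimal sketch of the two workarounds I am aware of: repeating the grayscale channel three times, or swapping the first convolution for a 1-channel one whose weights are the sum of the original weights over the input-channel axis. The helper name `to_single_channel` and the stand-in `stem_conv` are my own assumptions for illustration; the layer shape is only meant to resemble a typical first 3D convolution, not to reproduce any specific torchvision model.

```python
import torch
import torch.nn as nn


def to_single_channel(conv: nn.Conv3d) -> nn.Conv3d:
    """Hypothetical helper: build a 1-channel copy of a 3-channel Conv3d.

    Assumes groups == 1 and the default padding mode.
    """
    new_conv = nn.Conv3d(
        in_channels=1,
        out_channels=conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        dilation=conv.dilation,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        # Summing over the input-channel axis gives the same output as
        # feeding the same grayscale frame into all three channels.
        new_conv.weight.copy_(conv.weight.sum(dim=1, keepdim=True))
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


torch.manual_seed(0)

# Stand-in for a model's first 3-channel convolution (shape is an assumption).
stem_conv = nn.Conv3d(3, 64, kernel_size=(3, 7, 7), stride=(1, 2, 2),
                      padding=(1, 3, 3), bias=False)

# Grayscale clip: (batch, channels=1, frames, height, width).
gray = torch.randn(2, 1, 8, 32, 32)

# Option 1: repeat the single channel three times and keep the layer as-is.
out_rgb = stem_conv(gray.repeat(1, 3, 1, 1, 1))

# Option 2: replace the layer with a 1-channel version and feed the clip directly.
single = to_single_channel(stem_conv)
out_single = single(gray)
```

Both options should give the same activations up to floating-point error; the second avoids tripling the input tensor. With the torchvision models, I believe the first convolution is reachable as `model.stem[0]`, but I have not confirmed this for every variant.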