I have a question about extracting frames from videos in datasets such as UCF101 for human action recognition. Many papers extract frames at equal intervals across each video.
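To make concrete what I mean by "equal intervals", here is a minimal sketch of how I understand it. This is my own assumption, not code from any particular paper: the function name `uniform_frame_indices` is hypothetical, and taking the center frame of each segment is just one possible convention.

```python
from typing import List


def uniform_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Return num_samples frame indices spaced evenly across a clip.

    Assumption: the clip is split into equal segments and the center
    frame of each segment is taken. Papers may use other conventions.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    # Never request more frames than the clip contains.
    num_samples = min(num_samples, total_frames)
    # Length of each equal segment, in frames (may be fractional).
    segment = total_frames / num_samples
    # Center of segment i, truncated to an integer frame index.
    return [int(segment * i + segment / 2) for i in range(num_samples)]


if __name__ == "__main__":
    # Example: pick 4 frames from a 100-frame clip.
    print(uniform_frame_indices(100, 4))
```

The returned indices could then be used to decide which frames to keep while decoding the video with whatever reader you prefer.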