I have a dataset which contains huge amount of pictures (2 millions). A lot of pre-processing has been done and pictures are identified with id. Some of the ids do not exist