I have a load of 3-hour MP3 files, and roughly every 15 minutes a distinct 1-second sound effect is played, which signals the beginning of a new chapter.
Is it possible to
This is an Audio Event Detection problem. If the sound is always the same and no other sounds play at the same time, it can probably be solved with a Template Matching approach, at least as long as there are no similar-sounding sounds that carry a different meaning.
The simplest kind of template matching is to compute the cross-correlation between your input signal and the template.
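As a rough illustration of that idea, here is a minimal sketch using only NumPy. It assumes you have already decoded the MP3 and the 1-second sound effect into mono float arrays at the same sample rate (the decoding step is not shown). The function names, the `threshold` of 0.6 and the `min_gap_s` of 1 second are my own placeholders, not tuned values. The score is a normalized cross-correlation, so it stays between -1 and 1.

```python
import numpy as np

def cross_correlate(signal, template):
    """FFT-based cross-correlation, keeping only lags where the template fits fully."""
    n = len(signal) + len(template) - 1
    nfft = 1 << (n - 1).bit_length()  # next power of two, avoids circular wrap-around
    spectrum = np.fft.rfft(signal, nfft) * np.conj(np.fft.rfft(template, nfft))
    return np.fft.irfft(spectrum, nfft)[: len(signal) - len(template) + 1]

def find_events(signal, template, sr, threshold=0.6, min_gap_s=1.0):
    """Return start times (seconds) where the template matches the signal.

    threshold and min_gap_s are illustrative guesses; tune them on test files.
    """
    signal = np.asarray(signal, dtype=float)
    template = np.asarray(template, dtype=float)
    win = len(template)

    corr = cross_correlate(signal, template)

    # Energy of every signal window of template length, via a cumulative sum.
    csum = np.concatenate(([0.0], np.cumsum(signal ** 2)))
    local_energy = csum[win:] - csum[:-win]

    # Normalized score in [-1, 1]; the epsilon guards against silent windows.
    score = corr / (np.sqrt(local_energy * np.sum(template ** 2)) + 1e-12)

    # Greedy peak picking: take the strongest candidates first and
    # drop anything closer than min_gap_s to an already accepted event.
    min_gap = int(min_gap_s * sr)
    candidates = np.flatnonzero(score >= threshold)
    events = []
    for idx in candidates[np.argsort(score[candidates])[::-1]]:
        if all(abs(idx - e) >= min_gap for e in events):
            events.append(int(idx))
    return sorted(i / sr for i in events)
```

Calling `find_events(audio, effect, sr)` would give a list of candidate chapter start times in seconds. For multi-hour files the single large FFT needs a lot of memory, so it is probably worth downsampling first or processing the file in overlapping chunks.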
You should probably prepare some shorter test files that contain both examples of the sound to detect and other typical sounds, so that you can evaluate the detection quickly.
If the volume of the recordings is inconsistent, you'll want to normalize it before running detection.
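One simple way to do that is to scale each recording to the same RMS level. The sketch below assumes the audio is already a mono float array. The function name and the `target_rms` of 0.1 are arbitrary choices for illustration, and peak or loudness-based normalization would be reasonable alternatives.

```python
import numpy as np

def normalize_rms(samples, target_rms=0.1):
    """Scale samples so their overall RMS level equals target_rms."""
    samples = np.asarray(samples, dtype=np.float64)
    rms = np.sqrt(np.mean(samples ** 2))
    if rms == 0:
        return samples  # silent input: nothing to scale
    return samples * (target_rms / rms)
```

This applies a single gain to the whole file. If the volume drifts within a recording, the same function could be applied per chunk instead.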
If cross-correlation in the time domain does not work, you can compute mel spectrogram or MFCC features and cross-correlate those instead. If this does not yield acceptable results either, a machine learning model can be trained using supervised learning, but this requires labeling a bunch of data as event/not-event.
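To give an idea of what correlating on features looks like, here is a minimal NumPy-only sketch. Note that it uses a plain log-magnitude spectrogram as a stand-in for mel spectrogram or MFCC features, to avoid extra dependencies. An audio library can compute proper mel features, and the matching step stays the same. The function names, `frame_len` and `hop` are my own assumptions.

```python
import numpy as np

def log_spectrogram(samples, frame_len=512, hop=256):
    """Log-magnitude spectrogram, shape (n_frames, n_bins).

    Stand-in for mel/MFCC features; frame_len and hop are illustrative.
    """
    samples = np.asarray(samples, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(samples) - frame_len) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = samples[idx] * window
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))

def feature_match_scores(signal_feat, template_feat):
    """Correlation coefficient between the template features and every
    same-sized window of the signal features (one score per frame offset)."""
    t = template_feat - template_feat.mean()
    t_norm = np.linalg.norm(t)
    n_offsets = len(signal_feat) - len(template_feat) + 1
    scores = np.empty(n_offsets)
    for i in range(n_offsets):
        w = signal_feat[i:i + len(template_feat)]
        w = w - w.mean()
        scores[i] = np.sum(w * t) / (np.linalg.norm(w) * t_norm + 1e-12)
    return scores
```

A frame offset `i` with a high score corresponds to a time of roughly `i * hop / sr` seconds. The Python loop is slow on multi-hour files, so this is meant to show the idea rather than to be an efficient implementation.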