Question
I am writing a tool for inspecting MP4 files (aka ISO base media file format, ISO 14496 part 12).
I can interpret the majority of the boxes listed in ISO 14496-12 that are generated by open-source software (OSS), but I have yet to figure out how to extract individual video access units and audio access units.
I'm reasonably confident that the H.264 video in the 'mdat' box does not have the ISO 14496-10 Annex B "0x000001" prefix on the NAL units.
I have experimented with interpreting the SampleToChunkBox ('stsc'), SampleSizeBox ('stsz'), and ChunkOffsetBox ('stco') to locate media samples inside the 'mdat', but I can't seem to find anything that I can interpret as a nal_unit() (ISO 14496-10 section 7.3.1) or a slice_header() (section 7.3.3).
I am also curious where the SPS (7.3.2.1) and PPS (7.3.2.2) live. I have a suspicion these live somewhere inside the 'trak' box, but I haven't figured out where.
Pointers to applications or libraries are of limited utility. I'm writing an application, and external source code is harder to understand (being encumbered by its own framework) when compared to a mathematical explanation.
Answer 1:
After spending a couple of hours digging through other questions on Stack Overflow, I eventually stumbled upon brief responses that led me to a more comprehensive answer.
Parsing H264 in mdat MP4
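The encapsulation of H.264 within ISO media files is covered by ISO 14496 part 15. The SPS and PPS are stashed in the 'avcC' box documented in sections 5.3.4.1.2 and 5.2.4.1.1. That box sits inside the AVC sample entry (typically 'avc1') of the SampleDescriptionBox ('stsd'), which is in turn nested inside the 'trak' box. The 'avcC' box also tells you how many bytes the length fields occupy when interpreting the samples.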
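As a rough illustration of the 'avcC' layout, here is a minimal Python sketch that pulls the length-field size, SPS, and PPS out of the box payload. It assumes you have already located the box and stripped its 8-byte size/type header; the function name `parse_avcc` and the returned dictionary keys are made up for this example, and the optional trailing fields that some profiles append after the PPS list are ignored.

```python
import struct


def parse_avcc(payload: bytes) -> dict:
    """Parse the body of an 'avcC' box (AVCDecoderConfigurationRecord).

    `payload` is assumed to start right after the 8-byte box header.
    """
    # Six fixed bytes: configurationVersion, AVCProfileIndication,
    # profile_compatibility, AVCLevelIndication, then two packed bytes.
    version, profile, compat, level, b4, b5 = struct.unpack_from(">6B", payload, 0)
    if version != 1:
        raise ValueError("unexpected configurationVersion %d" % version)

    # Low 2 bits of byte 4 hold lengthSizeMinusOne (upper 6 bits are reserved).
    nal_length_size = (b4 & 0x03) + 1
    # Low 5 bits of byte 5 hold numOfSequenceParameterSets.
    num_sps = b5 & 0x1F

    pos = 6
    sps_list = []
    for _ in range(num_sps):
        (n,) = struct.unpack_from(">H", payload, pos)  # 16-bit big-endian length
        pos += 2
        sps_list.append(payload[pos:pos + n])
        pos += n

    num_pps = payload[pos]  # numOfPictureParameterSets is a full byte
    pos += 1
    pps_list = []
    for _ in range(num_pps):
        (n,) = struct.unpack_from(">H", payload, pos)
        pos += 2
        pps_list.append(payload[pos:pos + n])
        pos += n

    return {
        "profile": profile,
        "compatibility": compat,
        "level": level,
        "nal_length_size": nal_length_size,
        "sps": sps_list,
        "pps": pps_list,
    }
```

Each entry in `sps` and `pps` is a complete NAL unit (header byte included) without any start code or length prefix, so it can be fed to the same nal_unit() parser used for the samples.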
The samples are documented in section 5.2.3 and consist of a series of NAL units, each prefixed by its length rather than by an Annex B start code. An example MP4 from ffmpeg has one slice per sample, but the very first sample also includes an SEI containing text documenting the version of the H.264 codec and the encoding parameters.
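To make the sample layout concrete, here is a small Python sketch that splits one sample into its NAL units. It assumes the sample bytes have already been located via the sample tables and that the length-field size has been read from 'avcC' (4 bytes is common but not guaranteed); `split_nal_units` and `nal_unit_type` are names invented for this example.

```python
def split_nal_units(sample: bytes, nal_length_size: int = 4) -> list:
    """Split one MP4 sample into NAL units.

    Each NAL unit is preceded by a big-endian length field of
    `nal_length_size` bytes (taken from lengthSizeMinusOne + 1 in 'avcC').
    """
    units = []
    pos = 0
    while pos < len(sample):
        if pos + nal_length_size > len(sample):
            raise ValueError("truncated NAL length field at offset %d" % pos)
        n = int.from_bytes(sample[pos:pos + nal_length_size], "big")
        pos += nal_length_size
        if pos + n > len(sample):
            raise ValueError("NAL unit at offset %d overruns the sample" % pos)
        units.append(sample[pos:pos + n])
        pos += n
    return units


def nal_unit_type(nal: bytes) -> int:
    """Return nal_unit_type, the low 5 bits of the first NAL header byte."""
    return nal[0] & 0x1F
```

With this, a first sample like the ffmpeg one described above would show up as a unit of type 6 (SEI) followed by a unit of type 5 (IDR slice), while later samples would typically contain a single unit of type 1 (non-IDR slice).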
Source: https://stackoverflow.com/questions/8525824/mp4-iso-14496-12-how-do-you-find-the-video-and-audio-access-units