I have come across some sample codes where set of images are added to make a QTmovie.
I am targeting this for OS X platform without any QT frameworks. I have ague idea o
If you want to write a program to do this, you could use Xuggler in Java to do it. It will allow you to save your final video in a format playable by almost any media player.
Start out by gaining an understanding of how video files (e.g. MP4, Quicktime) actually represent audio and video with this Overly Simplistic Guide to Internet Video.
Then, play around with the MediaTool tutorials. You can write programs that make raw images into video files (see this sample code). Finally, to write a program that makes audio and video that are in sync, see this tutorial; it generates a set of images, and makes some audio noise that is timed to change when a ball hits the edge of a box.
Hope that helps.
Art