I'm writing a custom DirectShow source push filter which is supposed to receive RTP data from a video server and push it to the renderer. I wrote a CVideoPushPin class which …
The renderer will draw a frame when the graph's stream time reaches the timestamp on the sample object. If I read your code correctly, you are timestamping the samples with the stream time at arrival, so they will always be late at rendering. This is confused somewhat by the audio renderer: if the audio renderer is providing the graph's clock, then it will report the current stream time to be that of whatever sample it is currently playing, and that is going to cause some undesirable timing behaviour.
You want to set a time in the future, to allow for the latency through the graph and any buffering in your filter. Try setting a time perhaps 300ms into the future (stream time now + 300ms).
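A minimal sketch of that, assuming your filter derives from the DirectShow base classes (m_pClock and m_tStart are CBaseFilter members; the CRtpSourceFilter name and the Timestamp helper are illustrative):

```cpp
#include <streams.h>   // DirectShow base classes

// Stream time = reference clock time minus the start time passed to Run().
REFERENCE_TIME CRtpSourceFilter::StreamTimeNow()
{
    REFERENCE_TIME tNow = 0;
    if (m_pClock)                  // the graph clock, set via SetSyncSource
        m_pClock->GetTime(&tNow);
    return tNow - m_tStart;        // 100ns units
}

// Called for each outgoing sample; frameDuration is illustrative.
void CRtpSourceFilter::Timestamp(IMediaSample *pSample, REFERENCE_TIME frameDuration)
{
    const REFERENCE_TIME LATENCY = 300 * 10000;     // 300ms in 100ns units
    REFERENCE_TIME tStart = StreamTimeNow() + LATENCY;
    REFERENCE_TIME tStop  = tStart + frameDuration;
    pSample->SetTime(&tStart, &tStop);
}
```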
You want to be consistent between frames, so don't timestamp them based on the arrival time of each frame. Use the RTP timestamp for each frame, and set the baseline for the first one to be 300ms into the future; subsequent frames are then (rtp - rtp_at_baseline) + dshow baseline (with appropriate unit conversions).
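Continuing the sketch above, the baseline approach might look like this (member names are illustrative; video RTP timestamps tick at 90kHz, and DirectShow reference time is in 100ns units, so one RTP tick is 10,000,000/90,000 reference-time units):

```cpp
void CRtpSourceFilter::TimestampFromRtp(IMediaSample *pSample, DWORD rtpTime)
{
    if (!m_baselineSet)
    {
        // First frame establishes the baseline: now + 300ms.
        m_rtpBaseline   = rtpTime;
        m_dshowBaseline = StreamTimeNow() + 300 * 10000;
        m_baselineSet   = true;
    }
    // Unsigned subtraction handles RTP timestamp wraparound; convert
    // 90kHz ticks to 100ns units.
    DWORD rtpDelta = rtpTime - m_rtpBaseline;
    REFERENCE_TIME tStart = m_dshowBaseline +
        (REFERENCE_TIME)rtpDelta * 10000000 / 90000;
    pSample->SetTime(&tStart, NULL);   // stop time can be omitted for video
}
```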
You need to timestamp the audio and the video streams in the same way, using the same baseline. However, if I remember correctly, RTP timestamps have a different (random) baseline in each stream, so you need to use the RTCP sender reports to convert RTP timestamps to (absolute) NTP time, and then convert NTP to DirectShow using your initial baseline (baseline NTP = dshow stream time now + 300ms).
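One way to sketch that mapping (the struct and helpers are illustrative; for simplicity the NTP times are assumed to have already been converted from the sender report's 1900-epoch seconds/fraction format into 100ns units):

```cpp
// Each RTCP sender report pairs an RTP timestamp with an absolute NTP time.
struct RtcpSenderReport
{
    DWORD    rtpTimestamp;
    LONGLONG ntpTime100ns;
};

// Map an RTP timestamp onto the NTP timeline using the latest SR for that
// stream; clockRate is 90000 for video, typically 8000 or 48000 for audio.
LONGLONG RtpToNtp(DWORD rtp, const RtcpSenderReport &sr, DWORD clockRate)
{
    LONG delta = (LONG)(rtp - sr.rtpTimestamp);   // signed, wrap-safe
    return sr.ntpTime100ns + (LONGLONG)delta * 10000000 / clockRate;
}

// Both streams share one baseline pair, captured at the first sample:
// (NTP time of that sample, stream time then + 300ms).
REFERENCE_TIME NtpToDshow(LONGLONG ntp, LONGLONG ntpBaseline,
                          REFERENCE_TIME dshowBaseline)
{
    return dshowBaseline + (ntp - ntpBaseline);
}
```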