We need to capture a live video stream from WebRTC (or any other capturing mechanism from the client webcam, even if it is not supported on all browsers, but as a PoC).
Most IP cameras these days use H.264 or MJPEG encoding. You haven't said what sort of cameras are being used.
I think the real question is: what components are out there for authoring/editing video, and which video format do they require? Only once you know what format you need to end up in can you transcode/transform your video as necessary so you can handle it on the server side.
There are any number of media servers that can transform/transcode, and something like FFmpeg or Unreal Media Server can transform, decode, etc. on the server side to get the stream into a format you can work with. Most of the IP cameras I have seen just use a web-based H.264 player in the browser.
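As a rough illustration of the server-side step, here is a minimal Python sketch that builds an FFmpeg command line to re-encode a camera stream to H.264 in an MPEG-TS container. The function name, the URLs, and the choice of MPEG-TS as the target are all assumptions made for the example; substitute whatever format your editing component actually requires.

```python
import shlex


def build_transcode_command(source_url, output_url, video_codec="libx264"):
    """Return an ffmpeg argument list that converts a camera stream.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    cmd = ["ffmpeg"]
    if source_url.startswith("rtsp://"):
        # Input option, so it must come before -i. Pulls RTSP over TCP.
        cmd += ["-rtsp_transport", "tcp"]
    cmd += ["-i", source_url, "-an"]  # -an drops the audio track
    cmd += ["-c:v", video_codec]
    if video_codec == "libx264":
        # libx264 settings that trade compression efficiency for lower delay.
        cmd += ["-preset", "ultrafast", "-tune", "zerolatency"]
    cmd += ["-f", "mpegts", output_url]  # assumed target container
    return cmd


if __name__ == "__main__":
    # Placeholder URLs, not real endpoints.
    command = build_transcode_command(
        "rtsp://camera.example/stream", "udp://127.0.0.1:5000"
    )
    print(shlex.join(command))
```

The returned list can be handed to `subprocess.run` once FFmpeg is installed. Passing `video_codec="copy"` skips re-encoding altogether, which is usually the cheaper option when the camera already produces the codec you need.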
EDIT: Your biggest enemy is going to be delay. Keeping it down to 1-2 seconds is going to be difficult to achieve.