I mean how and why are realtime OSes able to meet deadlines without ever missing them? Or is this just a myth (that they do not miss deadlines)? How are they different from any
Meeting deadlines is a function of the application you write. The RTOS simply provides facilities that help you with meeting deadlines. You could also program on "bare metal" (w/o a RTOS) in a big main loop and meet you deadlines.
Also keep in mind that unlike a more general purpose OS, an RTOS has a very limited set of tasks and processes running.
Some of the facilities an RTOS provide:
Priority-based Scheduler
Most RTOS have between 32 and 256 possible priorities for individual tasks/processes. The scheduler will run the task with the highest priority. When a running task gives up the CPU, the next highest priority task runs, and so on...
The highest priority task in the system will have the CPU until:
As a developer, it is your job to assign the task priorities such that your deadlines will be met.
System Clock Interrupt routines
The RTOS will typically provide some sort of system clock (anywhere from 500 uS to 100ms) that allows you to perform time-sensitive operations. If you have a 1ms system clock, and you need to do a task every 50ms, there is usually an API that allows you to say "In 50ms, wake me up". At that point, the task would be sleeping until the RTOS wakes it up.
Note that just being woken up does not insure you will run exactly at that time. It depends on the priority. If a task with a higher priority is currently running, you could be delayed.
Deterministic Behavior
The RTOS goes to great length to ensure that whether you have 10 tasks, or 100 tasks, it does not take any longer to switch context, determine what the next highest priority task is, etc...
In general, the RTOS operation tries to be O(1).
One of the prime areas for deterministic behavior in an RTOS is the interrupt handling. When an interrupt line is signaled, the RTOS immediately switches to the correct Interrupt Service Routine and handles the interrupt without delay (regardless of the priority of any task currently running).
Note that most hardware-specific ISRs would be written by the developers on the project. The RTOS might already provide ISRs for serial ports, system clock, maybe networking hardware but anything specialized (pacemaker signals, actuators, etc...) would not be part of the RTOS.
This is a gross generalization and as with everything else, there is a large variety of RTOS implementations. Some RTOS do things differently, but the description above should be applicable to a large portion of existing RTOSes.
The textbook/interview answer is "deterministic pre-emption". The system is guaranteed to transfer control within a bounded period of time if a higher priority process is ready to run (in the ready queue) or an interrupt is asserted (typically input external to the CPU/MCU).
"Basically, you have to code each "task" in the RTOS such that they will terminate in a finite time."
This is actually correct. The RTOS will have a system tick defined by the architecture, say 10 millisec., with all tasks (threads) both designed and measured to complete within specific times. For example in processing real time audio data, where the audio sample rate is 48kHz, there is a known amount of time (in milliseconds) at which the prebuffer will become empty for any downstream task which is processing the data. Therefore using the RTOS requires correct sizing of the buffers, estimating and measuring how long this takes, and measuring the latencies between all software layers in the system. Then the deadlines can be met. Otherwise the applications will miss the deadlines. This requires analysis of the worst-case data processing throughout the entire stack, and once the worst-case is known, the system can be designed for, say, 95% processing time with 5% idle time (this processing may not ever occur in any real usage, because worst-case data processing may not be an allowed state within all layers at any single moment in time).
Example timing diagrams for the design of a real time operating system network app are in this article at EE Times, PRODUCT HOW-TO: Improving real-time voice quality in a VoIP-based telephony design http://www.eetimes.com/design/embedded/4007619/PRODUCT-HOW-TO-Improving-real-time-voice-quality-in-a-VoIP-based-telephony-design
Your RTOS is designed in such a way that it can guarantee timings for important events, like hardware interrupt handling and waking up sleeping processes exactly when they need to be.
This exact timing allows the programmer to be sure that his (say) pacemaker is going to output a pulse exactly when it needs to, not a few tens of milliseconds later because the OS was busy with another inefficient task.
It's usually a much simpler OS than a fully-fledged Linux or Windows, simply because it's easier to analyse and predict the behaviour of simple code. There is nothing stopping a fully-fledged OS like Linux being used in a RTOS environment, and it has RTOS extensions. Because of the complexity of the code base it will not be able to guarantee its timings down to as small-a scale as a smaller OS.
The RTOS scheduler is also more strict than a general purpose scheduler. It's important to know the scheduler isn't going to change your task priority because you've been running a long time and don't have any interactive users. Most OS would reduce internal the priority of this type of process to favour short-term interactive programs where the interface should not be seen to lag.
What is important is realtime applications, not realtime OS. Usually realtime applications are predictable: many tests, inspections, WCET analysis, proofs, ... have been performed which show that deadlines are met in any specified situations.
It happens that RTOSes help doing this work (building the application and verifying its RT constraints). But I've seen realtime applications running on standard Linux, relying more on hardware horsepower than on OS design.
Basically, you have to code each "task" in the RTOS such that they will terminate in a finite time.
Additionally your kernel would allocate specific amounts of time to each task, in an attempt to guarantee that certain things happened at certain times.
Note that this is not an easy task to do however. Imagine things like virtual function calls, in OO it's very difficult to determine these things. Also an RTOS must be carefully coded with regard to priority, it may require that a high priority task is given the CPU within x milliseconds, which may be difficult to do depending on how your scheduler works.