I would like to estimate the number of opcodes it takes an ARM Cortex A9 single core to handle an IRQ, assuming I work with the Linux kernel.
Your question is related to how to calculate the interrupt latency of Linux. At the least, you might be interested in how long it takes before your interrupt even starts. We will ignore this aspect of IRQs here.
A simple way is to toggle a GPIO and use a scope to measure the interrupt. You may even toggle the GPIO multiple times to see how long the different phases take. This Windows CE link shows an example of measuring latency. Some interrupt controllers (such as the one on the i.MX) have I/O multiplexing modes where an interrupt number will raise/lower a particular I/O line. Alternatively, you can add code to toggle the line (see below for routines).
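The line-toggling approach described above can be sketched roughly as follows. This is only an illustration: marker_high, marker_low and marker_pulse are hypothetical helper names, and reg stands in for a memory-mapped GPIO data register whose real address and bit position depend entirely on your SoC. In an actual driver the pointer would come from the kernel's I/O mapping, and the kernel's own GPIO accessors may be a better fit.

```c
#include <stdint.h>

/* Hypothetical helpers: 'reg' is assumed to point at a GPIO data register
 * and 'mask' selects the bit wired to the scope probe. */
static inline void marker_high(volatile uint32_t *reg, uint32_t mask)
{
    *reg |= mask;            /* drive the marker line high */
}

static inline void marker_low(volatile uint32_t *reg, uint32_t mask)
{
    *reg &= ~mask;           /* drive the marker line low */
}

/* Emit 'count' short pulses so that different phases of the handler
 * can be told apart on the scope (one pulse at entry, two before the
 * peripheral access, and so on). */
static inline void marker_pulse(volatile uint32_t *reg, uint32_t mask,
                                unsigned int count)
{
    for (unsigned int i = 0; i < count; i++) {
        marker_high(reg, mask);
        marker_low(reg, mask);
    }
}
```

Calling marker_high at the top of your handler and marker_low at the end gives one pulse whose width is the handler's run time; the delay from the IRQ source edge to the rising edge of the marker is the latency you are after.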
The source for the primary interrupt handling is in entry-armv.S. There are macros defined for the interrupt controller you use and these depend on the .config file. For instance, there are options for pre-emptive interrupts, multiple interrupt controllers, SMP, etc. The primary vectors are defined at the bottom of entry-armv.S. The general gist is that the current operating mode is inspected and then either __irq_usr or __irq_svc is taken. These routines have different preambles to store state, but they both end up calling the irq_handler macro. The __irq_usr path has some code about cmpxchg, but if you specify an ARM Cortex in your .config, this won't apply. The main difference will be the possible context switch after the IRQ occurs in user mode. Your machine defines mach/entry-macro.S, which contains assembler macros to access the interrupt controller and get an interrupt number. It then jumps to the generic irq handling code in the top level kernel directory.
So the second way would be to inspect the code and calculate it directly. This is probably easier if you look at the source, compile your kernel and then do an objdump --disassemble on the vmlinux image and look for these symbols. You will see the irq_handler macro expanded and it should jump to your IRQ code eventually.
As you can see from the source, there is also TRACE_IRQFLAGS. You can check to see if this is available on the Cortex A9 you are using with make menuconfig (and type /TRACE_IRQFLAGS). I don't know if it is available or not.
There are variations such as,
Measuring on a scope will show the jitter in IRQ servicing. Examining the instructions will generally show that the IRQ may never be serviced; for example, if higher priority interrupts constantly pre-empt/prevent the IRQ. Probably you need to do both to fully optimize for a hard deadline.
Often you don't care how long the whole IRQ takes, but rather about the time between the IRQ line being raised and writing/reading some peripheral register. For instance, a FIFO may have limited depth, and if the latency between the IRQ occurring and reading the FIFO register is greater than FIFO_Size / BPS (the time the FIFO takes to fill at the incoming data rate), then you have issues with the FIFO overflowing.
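As a rough illustration of the FIFO deadline, the check can be written out as below. This is a sketch under the assumption that the FIFO fills in its depth divided by the incoming data rate; fifo_fill_time_ns and fifo_may_overflow are hypothetical names, and the depth and rate have to come from your peripheral's data sheet.

```c
#include <stdint.h>

/* Time in nanoseconds for a FIFO of 'fifo_bytes' to fill when data
 * arrives at 'bytes_per_sec'. A rate of zero never fills the FIFO. */
static uint64_t fifo_fill_time_ns(uint32_t fifo_bytes, uint32_t bytes_per_sec)
{
    if (bytes_per_sec == 0)
        return UINT64_MAX;
    return ((uint64_t)fifo_bytes * 1000000000ULL) / bytes_per_sec;
}

/* Non-zero if a measured or estimated IRQ-to-register-read latency
 * exceeds the time the FIFO needs to fill, i.e. data may be lost. */
static int fifo_may_overflow(uint64_t latency_ns, uint32_t fifo_bytes,
                             uint32_t bytes_per_sec)
{
    return latency_ns > fifo_fill_time_ns(fifo_bytes, bytes_per_sec);
}
```

For example, a 64-byte FIFO fed at 1,000,000 bytes per second fills in 64,000 ns, so a worst-case latency of 100,000 ns would be too slow while 50,000 ns would leave some margin.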
The FIQ infrastructure is a lot faster, but the kernel facilities you can use are far fewer!
Edit: The Cortex A9 technical reference has instruction cycle timings in appendix B. Most ARM instructions take a single cycle on most architectures, except memory load/store, multiples and branches. Follow the 3rd and 4th paragraphs above to find the complete instruction path to handle a Linux interrupt for your configuration and just add it up; for an estimate (as the original question asks) you can just count the instructions, as they are generally a single cycle.
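The adding-up step can be sketched like this. It is a minimal illustration, not a model of the real pipeline: struct insn_class and estimate_cycles are hypothetical names, and the counts and per-class cycle costs must be filled in from your own disassembly and the technical reference manual.

```c
#include <stddef.h>
#include <stdint.h>

/* One class of instruction on the interrupt path: how many of them
 * were counted in the disassembly and the assumed cycles for each. */
struct insn_class {
    uint32_t count;
    uint32_t cycles_each;
};

/* Sum count * cycles over all classes to get a best-case estimate.
 * Cache misses, interlocks and memory stalls are not modelled. */
static uint64_t estimate_cycles(const struct insn_class *classes, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += (uint64_t)classes[i].count * classes[i].cycles_each;
    return total;
}
```

With made-up placeholder numbers, say 120 single-cycle instructions, 30 loads/stores at 2 cycles and 10 branches at 3 cycles, the estimate comes to 210 cycles. Treat the result as a lower bound on what the hardware will actually do.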
Whilst you can calculate the theoretical minimum number of core cycles by inspection of the source code, the number actually taken is far less certain due to the effects of caching, memory and memory controller performance, what the other core is doing at the time and various other factors dependent on the micro-architecture of the ARM processor in question.
I suspect you would be better off measuring the actual interrupt latency performance of your system, either using a digital 'scope or performance counters.
Of course, for hard real-time applications, you need to know the worst case interrupt latency - which includes the worst case of all of these factors.
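If you do go the measurement route, the captured latencies can be reduced to a few figures as sketched below. The names struct latency_stats and summarize_latency are hypothetical, and the samples are assumed to be latencies in nanoseconds that you have already collected from the scope or from a counter.

```c
#include <stddef.h>
#include <stdint.h>

/* Summary of a set of measured interrupt latencies. */
struct latency_stats {
    uint32_t min_ns;     /* best case observed */
    uint32_t max_ns;     /* worst case observed */
    uint32_t jitter_ns;  /* spread between worst and best case */
};

/* Scan the samples once and record the extremes. An empty sample set
 * yields all-zero statistics. */
static struct latency_stats summarize_latency(const uint32_t *samples, size_t n)
{
    struct latency_stats s = { 0, 0, 0 };

    if (n == 0)
        return s;
    s.min_ns = s.max_ns = samples[0];
    for (size_t i = 1; i < n; i++) {
        if (samples[i] < s.min_ns)
            s.min_ns = samples[i];
        if (samples[i] > s.max_ns)
            s.max_ns = samples[i];
    }
    s.jitter_ns = s.max_ns - s.min_ns;
    return s;
}
```

Keep in mind that the largest value you happen to observe only bounds the true worst case from below; a long capture under realistic load narrows the gap but cannot prove a hard limit.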