How to delay an ARM Cortex M0+ for n cycles, without a timer?

闹比i 2021-01-03 04:32

I want to delay an ARM Cortex M0+ for n cycles, without using a timer, with the smallest possible code size. (I think this mandates use of assembly.)

A delay of 0 c

2 answers
  • 2021-01-03 05:21

    The shortest ARM loop that I can think of goes like:

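    movs r0, #COUNT     @ load the iteration count (8-bit immediate in Thumb)
    L:
    subs r0, r0, #1     @ decrement and update the flags
    bne L               @ loop until r0 reaches zero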
    

    Since I don't have the device in question, I can't give cycle counts for it. Instruction timings are core dependent.

  • 2021-01-03 05:28

    The code is going to depend on exactly what n is, and whether it needs to be dynamically variable, but given the M0+ core's instruction timings, establishing bounds for a particular routine is pretty straightforward.

    For the smallest possible (6-byte) complete loop with a fixed 8-bit immediate counter:

       movs  r0, #NUM    ;1 cycle
    1: subs  r0, r0, #1  ;1 cycle
       bne   1b          ;2 if taken, 1 otherwise
    

    With NUM=1 we get a minimum of 3 cycles, plus 3 cycles for every extra iteration, up to 765 cycles at NUM=255 (of course, you could have 2^32 iterations from NUM=0, but that seems a bit silly). That puts the lower bound for a loop being practical at about 6 cycles. With a fixed loop it's easy to pad NOPs (or even nested loops) inside it to lengthen each iteration, and before/after it to reach a total that is not a multiple of the loop length.

    If you can arrange for the number of iterations to be ready in a register before you need to start waiting, then you can lose the initial movs and get pretty much any multiple of the loop length (3 or more cycles), minus one; for the bare loop that is 3*NUM - 1 cycles. If you need single-cycle resolution for a variable delay, the initial setup cost is going to be somewhat higher to correct for the remainder (a computed branch into a NOP sled is what I'd do for that).
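
    To double-check that arithmetic off-target, here is a minimal host-side sketch in C that counts cycles for the loop using only the timings quoted above (movs 1, subs 1, bne 2 if taken and 1 otherwise, no wait states). The names loop_cycles, plan_delay, planned_cycles and struct delay_plan are made up for this illustration; this is a model for picking NUM and the NOP padding, not code to run as the delay itself.

```c
#include <assert.h>
#include <stdint.h>

/* Cycle model of:  movs r0,#NUM / 1: subs r0,r0,#1 / bne 1b
 * Assumed timings (from the answer): movs = 1, subs = 1,
 * bne = 2 when taken, 1 when it falls through; no bus wait states.
 * Note: num == 0 wraps around and models 2^32 iterations, which is slow to simulate. */
static uint64_t loop_cycles(uint32_t num, int include_movs)
{
    uint64_t cycles = include_movs ? 1 : 0;   /* movs r0, #NUM */
    uint32_t r0 = num;

    do {
        r0 -= 1;                              /* subs r0, r0, #1 */
        cycles += 1;
        cycles += (r0 != 0) ? 2 : 1;          /* bne 1b: taken or not taken */
    } while (r0 != 0);

    return cycles;
}

/* Hypothetical helper type: loop count plus trailing 1-cycle NOPs. */
struct delay_plan {
    uint32_t num;   /* value for the 8-bit immediate, 1..255 */
    uint32_t nops;  /* NOPs placed after the loop, 0..2 */
};

/* Split a fixed delay of n cycles into a loop count and NOP padding.
 * Returns 0 on success, -1 if n is outside what one 8-bit loop covers. */
static int plan_delay(uint32_t n, struct delay_plan *out)
{
    if (n < 3u || n > 255u * 3u + 2u)
        return -1;
    out->num = n / 3u;    /* each iteration contributes 3 cycles in total */
    out->nops = n % 3u;   /* remainder is made up with NOPs */
    return 0;
}

/* Total cycles the planned sequence takes under the model above. */
static uint64_t planned_cycles(const struct delay_plan *p)
{
    assert(p->num >= 1u && p->num <= 255u);
    return loop_cycles(p->num, 1) + p->nops;
}
```

    For example, asking plan_delay for 100 cycles gives NUM=33 with one trailing NOP, and planned_cycles confirms the total under the assumed timings. Anything below 3 cycles is rejected, since at that size plain NOPs are the obvious choice anyway.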

    I'm assuming that if you're at the point of cycle-critical timing you've already got interrupts off (otherwise throw in another cycle somewhere for CPSID), and that you don't have any bus wait states adding extra cycles to instruction fetches.

    As for trying to do it in C: the fact that you have to hack in an empty asm to keep the "useless" loop from being optimised away is a tip-off. The abstract C machine has no notion of "instructions" or "cycles" so there is simply no way to reliably express this in the language. Trying to rely on particular C constructs to compile to suitable instructions is extremely fragile - change a compiler flag; upgrade the compiler; change some distant code which affects register allocation which affects instruction selection; etc. - pretty much anything could change the generated code unexpectedly, so I'd say hand-coded assembly is the only sensible approach for cycle-accurate code.
