Is it possible to detect the CPU architecture from machine code?

谎友^ 2021-01-14 02:41

Let's say that there are two possible architectures, ARM and x86. Is there a way to detect what system the code is running on, to achieve something like this from assembly/ma

3 Answers
  • 2021-01-14 03:09

    Assuming you have already taken care of all other differences [1] and are left with writing a small polyglot trampoline, you can use these opcodes:

    EB 02 00 EA
    

    When placed at address 0, these bytes decode on ARM (non-Thumb, little-endian) as:

    00000000: b 0xbb4
    00000004: ...
    

    On x86 (real mode), the same bytes decode as:

    0000:0000 jmp 04h
    0000:0002 add dl, ch
    0000:0004 ...
    

    You can then put more elaborate x86 code at address 04h and ARM code at address 0bb4h.

    Of course, when relocating the base address, make sure to relocate the jump targets too.
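
    The two decodings above can be checked by hand. The following is a minimal Python sketch, not part of the original answer, that decodes the four bytes both ways. The function names are made up for illustration, and it assumes little-endian ARM and a plain 8-bit x86 short jump.

```python
import struct

# The four polyglot bytes quoted in the answer.
POLYGLOT = bytes.fromhex("EB0200EA")


def arm_branch_target(code: bytes, address: int = 0) -> int:
    """Decode a little-endian ARM (A32) B instruction and return its target."""
    (word,) = struct.unpack_from("<I", code)
    # Bits 27..24 are 0b1010 for a plain B (no link).
    if (word >> 24) & 0x0F != 0x0A:
        raise ValueError("not an ARM B instruction")
    offset = word & 0x00FFFFFF
    if offset & 0x00800000:  # sign-extend the 24-bit field
        offset -= 1 << 24
    # PC reads as the instruction address + 8; the offset counts 4-byte words.
    return (address + 8 + (offset << 2)) & 0xFFFFFFFF


def x86_short_jump_target(code: bytes, address: int = 0) -> int:
    """Decode an x86 'jmp short rel8' (opcode EB) and return its target."""
    if code[0] != 0xEB:
        raise ValueError("not an x86 short jump")
    (rel8,) = struct.unpack_from("<b", code, 1)
    # The displacement is relative to the end of the 2-byte instruction.
    return (address + 2 + rel8) & 0xFFFF


print(hex(arm_branch_target(POLYGLOT)))      # ARM lands here
print(hex(x86_short_jump_target(POLYGLOT)))  # x86 lands here
```

    Passing a non-zero `address` shows that both jumps are relative, so their targets move together with the base address.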


    [1] For example, ARM starts executing at address 0 while x86 starts at address 0fffffff0h, so you need specific hardware/firmware support to abstract the boot address.

  • 2021-01-14 03:16

    http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0363g/Beijdcef.html

    https://electronics.stackexchange.com/a/232934

    How to setup ARM interrupt vector table branches in C or inline assembly?

    http://osnet.cs.nchu.edu.tw/powpoint/Embedded94_1/Chapter%207%20ARM%20Exceptions.pdf

    ARM Undefined Instruction error

    ARM assembly is not my area of expertise, but I have programmed a lot in x86 assembly. I remember having this same question as homework back in college. The solution I found was interrupt 06h (http://webpages.charter.net/danrollins/techhelp/0103.HTM , https://es.wikipedia.org/wiki/Llamada_de_interrupci%C3%B3n_del_BIOS#Tabla_de_interrupciones). This interrupt is raised every time the processor tries to execute an unknown instruction ("invalid opcode").

    The 8086 has no invalid-opcode interrupt: an undefined opcode is not trapped, so the program cannot detect it or recover from it.

    Starting with the 80286, interrupt 06h is raised, so the programmer can handle the invalid-opcode case. The return address saved for the handler points at the offending instruction itself, so the handler has to skip that instruction before returning; otherwise the same instruction faults again in an endless loop.

    Interrupt 06h helps to detect the CPU architecture: simply try to execute an x64 opcode. If interrupt 06h is raised, the CPU did not recognize the opcode, so it is x86; otherwise it is x64.

    This technique can also be used to detect the type of microprocessor:

    • Try to execute an 80286 instruction; if interrupt 06h is not raised, the CPU is at least an 80286.
    • Try to execute an 80386 instruction; if interrupt 06h is not raised, the CPU is at least an 80386.
    • And so on...
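
    The ladder of probes above can be sketched as a small simulation. This is only a model in Python, not real-mode code: `FakeCpu` and `InvalidOpcode` are hypothetical stand-ins for a processor and its interrupt 06h handler, and the probe instructions (SMSW, MOVZX, BSWAP) are my own picks of instructions introduced on the 80286, 80386 and 80486. The model assumes every unsupported probe traps, so real code would still need a separate check for the 8086, which does not raise the interrupt.

```python
class InvalidOpcode(Exception):
    """Stands in for the interrupt 06h (invalid opcode) handler being entered."""


# (generation, probe instruction first available on that generation)
PROBES = [
    ("80286", "smsw"),
    ("80386", "movzx"),
    ("80486", "bswap"),
]


class FakeCpu:
    """Toy processor that traps on instructions newer than its generation."""

    def __init__(self, generation: str):
        order = ["8086"] + [gen for gen, _ in PROBES]
        level = order.index(generation)
        # Everything introduced up to and including this generation is valid.
        self.supported = {probe for _, probe in PROBES[:level]}

    def execute(self, mnemonic: str) -> None:
        if mnemonic not in self.supported:
            raise InvalidOpcode(mnemonic)


def detect_generation(cpu: FakeCpu) -> str:
    """Run the probes in order and stop at the first one that traps."""
    detected = "8086"
    for generation, probe in PROBES:
        try:
            cpu.execute(probe)
        except InvalidOpcode:
            break  # the trap fired: the CPU is older than this generation
        detected = generation
    return detected


print(detect_generation(FakeCpu("80386")))
```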

    http://mtech.dk/thomsen/program/ioe.php

    https://software.intel.com/en-us/articles/introduction-to-x64-assembly

  • 2021-01-14 03:26

    It's not possible in assembly or machine code, because the machine code itself depends on the architecture. Your if statement must first be compiled into either ARM or x86 code. If it is compiled as ARM, it cannot run on x86 without an emulator, and if it is compiled as x86, it cannot run on ARM without an emulator.

    If you do run the code in an emulator, then the code is basically running in a virtual version of the CPU it was compiled for. Depending on the emulator, you may or may not be able to detect that you are running on an emulator. And even if the emulator lets your code detect that it is being emulated, you may not be able to detect the underlying CPU and/or OS (for example, you may not be able to tell whether the x86 emulator is running on x86 or ARM).

    Now, if you are very lucky, you may find two CPU architectures where the conditional branch (conditional goto) instruction of one architecture either does something useful or does nothing on the other architecture, and vice versa. If that is the case, you can construct a binary executable that runs on two different CPU architectures.


    How multi-architecture binaries work in real life

    In real life, a multi architecture binary is actually two complete programs with shared resources (icons, images etc.) and the program binary format includes a header or preamble to tell the OS what CPUs are supported and where to find the main() function for each CPU.

    One of the best historical examples I can think of is the Mac OS. The Mac changed CPUs twice: first from 68k to PowerPC, then from PowerPC to x86. At each stage they had to come up with a file format that contained the binary executables for two CPU architectures.
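
    One concrete example of such a header is the Mach-O "universal" (fat) header used on the Mac for the PowerPC-to-x86 transition. The Python sketch below assumes the layout from Apple's `<mach-o/fat.h>` as I understand it: a big-endian magic 0xCAFEBABE, a count, then one 20-byte record per architecture. The helper names are made up, and the offsets and sizes in the usage are invented for illustration.

```python
import struct

FAT_MAGIC = 0xCAFEBABE

# A few CPU type constants as defined in <mach/machine.h>.
CPU_NAMES = {
    7: "x86",
    0x01000007: "x86_64",
    12: "arm",
    0x0100000C: "arm64",
    18: "ppc",
}


def build_fat_header(slices):
    """Pack (cputype, cpusubtype, offset, size, align) tuples into a fat header."""
    header = struct.pack(">II", FAT_MAGIC, len(slices))
    for entry in slices:
        header += struct.pack(">5I", *entry)
    return header


def parse_fat_header(data: bytes):
    """Return one dict per architecture slice listed in the header."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a universal (fat) binary")
    entries = []
    for i in range(count):
        cputype, _subtype, offset, size, _align = struct.unpack_from(
            ">5I", data, 8 + 20 * i
        )
        entries.append(
            {
                "cpu": CPU_NAMES.get(cputype, hex(cputype)),
                "offset": offset,  # where this CPU's program starts in the file
                "size": size,
            }
        )
    return entries


# Invented example: a PowerPC slice followed by an x86 slice.
header = build_fat_header([(18, 0, 4096, 1000, 12), (7, 3, 8192, 2000, 12)])
for entry in parse_fat_header(header):
    print(entry)
```

    The loader reads this table, picks the slice matching the CPU it is running on, and ignores the rest, so no machine code has to be valid on both architectures.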


    Note on real-world executables

    Real-life programs are almost never raw binary executables. The binary code is almost always wrapped in another format that also contains metadata and resources. Windows, for example, uses the PE format and Linux uses ELF. Some OSes support more than one type of executable container (though the actual binary machine code can be the same). For example, Linux traditionally supports ELF, COFF and ECOFF.
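
    Because the container carries this metadata, the target CPU can usually be read from the file without executing anything. A minimal Python sketch, assuming the standard ELF layout (magic `\x7fELF`, byte order flag at offset 5, `e_machine` at offset 18) and the "MZ" signature that PE files begin with; `identify_executable` and `make_elf_prefix` are hypothetical helper names.

```python
import struct

# A few e_machine values from the ELF specification.
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def identify_executable(data: bytes) -> str:
    """Guess the container format, and for ELF the target CPU, from a file prefix."""
    if data[:4] == b"\x7fELF":
        if len(data) < 20:
            return "ELF (truncated)"
        # EI_DATA at offset 5: 1 = little-endian, 2 = big-endian.
        endian = "<" if data[5] == 1 else ">"
        (machine,) = struct.unpack_from(endian + "H", data, 18)
        return "ELF for " + ELF_MACHINES.get(machine, f"machine {machine}")
    if data[:2] == b"MZ":
        return "PE (MZ header)"
    return "unknown"


def make_elf_prefix(machine: int) -> bytes:
    """Build the first 20 bytes of a little-endian 32-bit ELF header for testing."""
    e_ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)  # class, data, version, padding
    return e_ident + struct.pack("<HH", 2, machine)     # e_type = ET_EXEC, e_machine


print(identify_executable(make_elf_prefix(40)))
print(identify_executable(make_elf_prefix(3)))
```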
