问题
My goal is to emulate the ARM A9 processor as found on the Zynq-7000, running baremetal software. I have tried 2 different approaches to this and run into road blocks on both. Any suggestions on how to proceed would be appreciated.
Current answers on StackOverflow:
- How to make bare metal ARM programs and run them on QEMU?
which links to
Linux Kernel Module Cheat (LKMC, using v3.0)
built using ./build --arch arm qemu-baremetal
The examples on the site of using the ARM virtual machine (-virt
flag) work fine. Trying to modify this to work with my setup is what's causing problems (details below).
- How to run a bare metal ELF file on QEMU?
I tried to copy the example command line invocation, but with the -cpu cortex-a9
option instead: qemu-system-arm: mach-virt: CPU cortex-a9 not supported
Then I changed the whole invocation to be qemu-system-arm -M xilinx-zynq-a9 -cpu cortex-a9 -nographic -kernel hello.elf -m 512M -serial mon:stdio -s -S
And it crashed with the error qemu: fatal: Trying to execute code outside RAM or ROM at 0x40000000
Which makes sense, because the application was built with the LKMC, and I was trying to run it outside of that framework.
So then I tried running my own application, which was compiled using a modified version of the Xilinx toolchain. I'm sure it won't work right away, as there will be some parts of the bootup sequence that I have to change. But I'm trying to figure out what those are and change them.
Running with qemu-system-arm -M xilinx-zynq-a9 -cpu cortex-a9 -nographic -kernel helloworld.elf -m 512M -serial mon:stdio -s -S
allows GDB to connect successfully, but it can't read the symbol table properly. Using arm-none-eabi-objdump -D helloworld.elf
tells me that main
is at 0x001004fc
, but GDB thinks it's at 0x40000324
(using the command info address main
).
My work so far
The PYNQ-Z1 (webpage, datasheet) has a 32-bit ARM Cortex-A9 processor, so that's why I'm using qemu-system-arm
instead of qemu-system-aarch64
. Someone can correct me if that's wrong.
As a note, I cannot simply switch to a different architecture; the code that I am using cannot tolerate changes other than small tweaks to make the Board Support Package (BSP) compatible with the simulator, without harming the validity of my research.
What I have been going off of for a while now is ./run --arch arm -m 512M --baremetal pynq/helloworld --wait-gdb
./run-gdb --arch arm --baremetal pynq/helloworld --no-continue -- main
and I step through using GDB to find where there are data aborts and figure out what kind of hardware is not supported by Qemu.
The software that I am running is built using a modified Xilinx toolchain, and so includes many of the Xilinx standard library functions. In modifying the code to work with the virtual machine, I have discovered a few changes so far, such as changing the address of the UART device and disabling some boot-up tasks such as invalidating the SCU or changing the cache controller configuration, presumably because these things are not emulated by Qemu.
When debugging bootup, the next problem I have run into booting up is the XTime functions (xtime_l.c). These functions are wrappers around reading the global system timer. The results of the command info mtree
in the Qemu interface seem to indicate that there is no global timer device with which to interact. Is there a way to add a timer device to the ARM virtual machine? It doesn't matter what the base address is, as long as it can be used in the same way as on the Zynq, using register reads and writes.
Then I tried to use the specific machine flag xilinx-zynq-a9
. LKMC generates the following command:
+ /home/$USER/$INSTALL_DIR/out/qemu/default/arm-softmmu/qemu-system-arm \
-machine xilinx-zynq-a9 \
-gdb tcp::45457 \
-kernel /home/$USER/$INSTALL_DIR/out/baremetal/arm/qemu/xilinx-zynq-a9/hello.elf \
-m 512M \
-monitor telnet::45454,server,nowait \
-netdev user,hostfwd=tcp::45455-:45455,hostfwd=tcp::45456-:22,id=net0 \
-no-reboot \
-smp 1 \
-virtfs local,path=/home/$USER/$INSTALL_DIR/data/9p,mount_tag=host_data,security_model=mapped,id=host_data \
-virtfs local,path=/home/$USER/$INSTALL_DIR/out,mount_tag=host_out,security_model=mapped,id=host_out \
-virtfs local,path=/home/$USER/$INSTALL_DIR/rootfs_overlay,mount_tag=host_rootfs_overlay,security_model=mapped,id=host_rootfs_overlay \
-serial mon:stdio \
-trace enable=load_file,file=/home/$USER/$INSTALL_DIR/out/run/qemu/arm/0/trace.bin \
-cpu cortex-a9 \
-device virtio-gpu-pci \
-nographic \
-serial tcp::45458,server,nowait \
-semihosting \
;
The only differences between this and the generic virtual machine are the lines that specify the machine and the cpu, which used to be -machine virt -machine highmem=off
and -cpu cortex-a15
respectively (I actually had to modify the LKMC code to get it to output the correct cpu name for the machine).
However, this fails with the error qemu-system-arm: -device rtl8139,netdev=net0: No 'PCI' bus found for device 'rtl8139'
This makes sense, because not all Zynq parts have PCI buses. So mostly I am wondering why LKMC would generate such a sequence of commands when the target is baremetal anyway.
The first option I think is the most likely to work, since it seems like the -virt
machine has better support than some of the specific targets. It is interesting that the version of Qemu that ships with the Xilinx SDK does not support baremetal with the Zynq (referred to as "standalone" in the Xilinx Docs).
Summary:
Is there a way to add a timer device to the ARM virtual machine?
Has anyone run baremetal code on Qemu Xilinx ARM A9?
I've tried to be as specific as possible, but feel free to ask clarifying questions.
回答1:
- This is probably a non-trivial task than to add support for a timer
device to an existing QEMU machine. More specifically, this may not
be needed since a fair amount of them either support an ARM
architectural timer or a specific timer hardware.
In the specific case of the xilinx-zynq-a9, it seem theGlobal Timer Counter
described from page 1448 of the Zynq-7000 Technical Reference Manual is supported. - After having reading your post a couple of times, I reached the conclusion that a lot of things may go wrong with the set of tools you are using (KMC, toolchain, QEMU). I therefore created what I hope is a Minimal, Reproducible Example of a bare-metal application working with a QEMU xilinx-zynq-a9 machine using an arm toolchain I do trust, and the latest version of QEMU, 4.2.0, built from scratch using a script I wrote.
Please note that I adapted an existing personal project I already had available, and I know is working, for the purpose of answering your question.
Building QEMU: execute build-qemu.sh
- this script works on 64 bits Ubuntu 18.04 and 19.10, you will have to set PERL_MODULES_VERSION
to 5.28
.
build-qemu.sh
:
#!/bin/bash
set -e
QEMU_VERSION=4.2.0
# xenial
PERL_MODULES_VERSION=5.22
# eoan
PERL_MODULES_VERSION=5.28
# bionic
PERL_MODULES_VERSION=5.26
PREFIX=/opt/qemu-${QEMU_VERSION}
do_install_prerequisites()
{
sudo apt-get install libglib2.0-dev libfdt-dev libpixman-1-dev zlib1g-dev libaio-dev libbluetooth-dev libbrlapi-dev libbz2-dev libcap-dev libcap-ng-dev libcurl4-gnutls-dev libgtk-3-dev libibverbs-dev \
libjpeg8-dev libncurses5-dev libnuma-dev librbd-dev librdmacm-dev libsasl2-dev libsdl2-dev libseccomp-dev libsnappy-dev libssh2-1-dev libvde-dev libvdeplug-dev libvte-2.91-dev libxen-dev liblzo2-dev \
valgrind xfslibs-dev liblzma-dev flex bison texinfo perl perl-modules-${PERL_MODULES_VERSION} python-sphinx gettext
}
do_download_qemu()
{
if [ ! -f qemu-${QEMU_VERSION}.tar.xz ]
then
wget https://download.qemu.org/qemu-${QEMU_VERSION}.tar.xz
fi
}
do_extract_qemu()
{
echo "extracting..."
rm -rf qemu-${QEMU_VERSION}
tar Jxf qemu-${QEMU_VERSION}.tar.xz
}
do_configure_qemu()
{
local TARGET_LIST="arm-softmmu"
pushd qemu-${QEMU_VERSION}
./configure --target-list="${TARGET_LIST}" --prefix=${PREFIX} --extra-cflags="-I$(pwd)/packages/include" --extra-ldflags="-L$(pwd)/packages/lib"
popd
}
do_build_qemu()
{
echo "building..."
pushd qemu-${QEMU_VERSION}
make all
popd
}
do_install_qemu()
{
echo "installing..."
pushd qemu-${QEMU_VERSION}
sudo make install
popd
}
do_build()
{
do_download_qemu
do_extract_qemu
do_configure_qemu
do_build_qemu
do_install_qemu
}
# main
do_install_prerequisites
do_build
Upon the script completion, you should have qemu-system-user 4.2.0 installed:
ls -gG /opt/qemu-4.2.0/bin/
total 22992
-rwxr-xr-x 1 22520 Feb 21 23:57 elf2dmp
-rwxr-xr-x 1 18424 Feb 21 23:57 ivshmem-client
-rwxr-xr-x 1 218264 Feb 21 23:57 ivshmem-server
-rwxr-xr-x 1 30864 Feb 21 23:57 qemu-edid
-rwxr-xr-x 1 374328 Feb 21 23:57 qemu-ga
-rwxr-xr-x 1 1767744 Feb 21 23:57 qemu-img
-rwxr-xr-x 1 1719104 Feb 21 23:57 qemu-io
-rwxr-xr-x 1 505016 Feb 21 23:57 qemu-keymap
-rwxr-xr-x 1 1727744 Feb 21 23:57 qemu-nbd
-rwxr-xr-x 1 599848 Feb 21 23:57 qemu-pr-helper
-rwxr-xr-x 1 16510840 Feb 21 23:57 qemu-system-arm
-rwxr-xr-x 1 26856 Feb 21 23:57 virtfs-proxy-helper
We now need to create the following files:
gcc_arm32_ram.ld
(adapted from the standard GCC CMSIS 5.60 linker script):
/******************************************************************************
* @file gcc_arm32.ld
* @brief GNU Linker Script for Cortex-M based device
* @version V2.0.0
* @date 21. May 2019
******************************************************************************/
/*
* Copyright (c) 2009-2019 Arm Limited. All rights reserved.
*
* SPDX-License-Identifier: Apache-2.0
*
* Licensed under the Apache License, Version 2.0 (the License); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an AS IS BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
MEMORY
{
RAM (rwx) : ORIGIN = __RAM_BASE, LENGTH = __RAM_SIZE
}
/* Linker script to place sections and symbol values. Should be used together
* with other linker script that defines memory regions FLASH and RAM.
* It references following symbols, which must be defined in code:
* Reset_Handler : Entry of reset handler
*
* It defines following symbols, which code can use without definition:
* __exidx_start
* __exidx_end
* __copy_table_start__
* __copy_table_end__
* __zero_table_start__
* __zero_table_end__
* __etext
* __data_start__
* __preinit_array_start
* __preinit_array_end
* __init_array_start
* __init_array_end
* __fini_array_start
* __fini_array_end
* __data_end__
* __bss_start__
* __bss_end__
* __end__
* end
* __HeapLimit
* __StackLimit
* __StackTop
* __stack
*/
ENTRY(Reset_Handler)
SECTIONS
{
.text :
{
KEEP(*(.vectors))
*(.text*)
KEEP(*(.init))
KEEP(*(.fini))
/* .ctors */
*crtbegin.o(.ctors)
*crtbegin?.o(.ctors)
*(EXCLUDE_FILE(*crtend?.o *crtend.o) .ctors)
*(SORT(.ctors.*))
*(.ctors)
/* .dtors */
*crtbegin.o(.dtors)
*crtbegin?.o(.dtors)
*(EXCLUDE_FILE(*crtend?.o *crtend.o) .dtors)
*(SORT(.dtors.*))
*(.dtors)
*(.rodata*)
KEEP(*(.eh_frame*))
} > RAM
/*
* SG veneers:
* All SG veneers are placed in the special output section .gnu.sgstubs. Its start address
* must be set, either with the command line option �--section-start� or in a linker script,
* to indicate where to place these veneers in memory.
*/
/*
.gnu.sgstubs :
{
. = ALIGN(32);
} > RAM
*/
.ARM.extab :
{
*(.ARM.extab* .gnu.linkonce.armextab.*)
} > RAM
__exidx_start = .;
.ARM.exidx :
{
*(.ARM.exidx* .gnu.linkonce.armexidx.*)
} > RAM
__exidx_end = .;
.copy.table :
{
. = ALIGN(4);
__copy_table_start__ = .;
LONG (__etext)
LONG (__data_start__)
LONG (__data_end__ - __data_start__)
/* Add each additional data section here */
/*
LONG (__etext2)
LONG (__data2_start__)
LONG (__data2_end__ - __data2_start__)
*/
__copy_table_end__ = .;
} > RAM
.zero.table :
{
. = ALIGN(4);
__zero_table_start__ = .;
/* Add each additional bss section here */
/*
LONG (__bss2_start__)
LONG (__bss2_end__ - __bss2_start__)
*/
__zero_table_end__ = .;
} > RAM
/**
* Location counter can end up 2byte aligned with narrow Thumb code but
* __etext is assumed by startup code to be the LMA of a section in RAM
* which must be 4byte aligned
*/
__etext = ALIGN (4);
.data : AT (__etext)
{
__data_start__ = .;
*(vtable)
*(.data)
*(.data.*)
. = ALIGN(4);
/* preinit data */
PROVIDE_HIDDEN (__preinit_array_start = .);
KEEP(*(.preinit_array))
PROVIDE_HIDDEN (__preinit_array_end = .);
. = ALIGN(4);
/* init data */
PROVIDE_HIDDEN (__init_array_start = .);
KEEP(*(SORT(.init_array.*)))
KEEP(*(.init_array))
PROVIDE_HIDDEN (__init_array_end = .);
. = ALIGN(4);
/* finit data */
PROVIDE_HIDDEN (__fini_array_start = .);
KEEP(*(SORT(.fini_array.*)))
KEEP(*(.fini_array))
PROVIDE_HIDDEN (__fini_array_end = .);
KEEP(*(.jcr*))
. = ALIGN(4);
/* All data end */
__data_end__ = .;
} > RAM
/*
* Secondary data section, optional
*
* Remember to add each additional data section
* to the .copy.table above to asure proper
* initialization during startup.
*/
/*
__etext2 = ALIGN (4);
.data2 : AT (__etext2)
{
. = ALIGN(4);
__data2_start__ = .;
*(.data2)
*(.data2.*)
. = ALIGN(4);
__data2_end__ = .;
} > RAM2
*/
.bss :
{
. = ALIGN(4);
__bss_start__ = .;
*(.bss)
*(.bss.*)
*(COMMON)
. = ALIGN(4);
__bss_end__ = .;
} > RAM AT > RAM
/*
* Secondary bss section, optional
*
* Remember to add each additional bss section
* to the .zero.table above to asure proper
* initialization during startup.
*/
/*
.bss2 :
{
. = ALIGN(4);
__bss2_start__ = .;
*(.bss2)
*(.bss2.*)
. = ALIGN(4);
__bss2_end__ = .;
} > RAM2 AT > RAM2
*/
.heap (COPY) :
{
. = ALIGN(8);
__end__ = .;
PROVIDE(end = .);
. = . + __HEAP_SIZE;
. = ALIGN(8);
__HeapLimit = .;
} > RAM
.stack (ORIGIN(RAM) + LENGTH(RAM) - __STACK_SIZE) (COPY) :
{
. = ALIGN(8);
__StackLimit = .;
. = . + __STACK_SIZE;
. = ALIGN(8);
__StackTop = .;
} > RAM
PROVIDE(__stack = __StackTop);
/* Check if data + heap + stack exceeds RAM limit */
ASSERT(__StackLimit >= __HeapLimit, "region RAM overflowed with stack")
}
Makefile.inc
# Shared Makefile
.PHONY: clean
all: $(MACHINE).elf
$(MACHINE).elf: $(SOURCES) $(MACHINE).c
$(CC) $(CFLAGS) $(LDFLAGS) -o $(MACHINE).elf $(MACHINE).c $(SOURCES)
$(OBJDUMP) -d $(MACHINE).elf > $(MACHINE).lst
qemu: $(MACHINE).elf
$(QEMU_SYSTEM) -m 513M -nographic -machine $(MACHINE) $(QEMU_DEBUG_OPTIONS) -cpu $(CPU) -kernel $(MACHINE).elf
gdb: $(MACHINE).elf
$(GDB) --quiet --command=$(GDB_COMMANDS) $(MACHINE).elf
clean:
rm -f $(MACHINE).elf $(MACHINE).lst
startup-aarch32.s
:
.title startup-aarch32.s
.arch armv7-a
.text
.section .text.startup,"ax"
.globl Reset_Handler
Reset_Handler:
ldr r0, =__StackTop
mov sp, r0
bl start
wait: wfe
b wait
.end
xilinx-zynq-a9.c
:
#include <stdint.h>
/* Reference: https://www.xilinx.com/support/documentation/user_guides/ug585-Zynq-7000-TRM.pdf - page 1449.
1. Read the upper 32-bit timer counter register
2. Read the lower 32-bit timer counter register
3. Read the upper 32-bit timer counter register again.
If the value is different to the32-bit upper value read previously, go back to step 2.
Otherwise the 64-bit timercounter value is correct.
*/
static const uintptr_t Global_Timer_Counter_Register0 = 0xF8F00200;
static const uintptr_t Global_Timer_Counter_Register1 = 0xF8F00204;
void start()
{
uint64_t global_timer_counter = 0;
uint32_t upper = 0;
uint32_t upper2 = 0;
uint32_t lower = 0;
for (;;) {
upper = *(volatile uint32_t*) Global_Timer_Counter_Register1;
lower = *(volatile uint32_t*) Global_Timer_Counter_Register0;
upper2 = *(volatile uint32_t*) Global_Timer_Counter_Register1;
if (upper != upper2) {
lower = *(volatile uint32_t*) Global_Timer_Counter_Register0;
}
global_timer_counter = (uint64_t) upper << 32 | lower;
}
}
xilinx-zynq-a9.gdb
:
target remote localhost:1234
monitor reset halt
load
break Reset_Handler
break start
xilinx-zynq-a9.ld
:
/*
*-------- <<< Use Configuration Wizard in Context Menu >>> -------------------
*/
/*--------------------- Embedded RAM Configuration ----------------------------
<h> RAM Configuration
<o0> RAM Base Address <0x0-0xFFFFFFFF:8>
<o1> RAM Size (in Bytes) <0x0-0xFFFFFFFF:8>
</h>
-----------------------------------------------------------------------------*/
__RAM_BASE = 0x00100000;
__RAM_SIZE = 0x20000000;
/*--------------------- Stack / Heap Configuration ----------------------------
<h> Stack / Heap Configuration
<o0> Stack Size (in Bytes) <0x0-0xFFFFFFFF:8>
<o1> Heap Size (in Bytes) <0x0-0xFFFFFFFF:8>
</h>
-----------------------------------------------------------------------------*/
__STACK_SIZE = 0x00020000;
__HEAP_SIZE = 0x00080000;
/*
*-------------------- <<< end of configuration section >>> -------------------
*/
INCLUDE gcc_arm32_ram.ld
I specified a 512MiB of DDR RAM area starting from 0x20000000 - see here for more information on the Zynq-7000 memory map.
xilinx-zynq-a9.mak
:
# Toolchain
CROSS_COMPILE=/opt/arm/9/gcc-arm-9.2-2019.12-x86_64-arm-none-eabi/bin/arm-none-eabi-
CC=$(CROSS_COMPILE)gcc
OBJDUMP=$(CROSS_COMPILE)objdump
OBJCOPY=$(CROSS_COMPILE)objcopy
# Target
CPU=cortex-a9
MACHINE=xilinx-zynq-a9
CFLAGS=-O0 -ggdb -mtune=$(CPU) -nostdlib -nostartfiles -ffreestanding
LDFLAGS=-L. -Wl,-T,$(MACHINE).ld
SOURCES=startup-aarch32.s
# qemu
QEMU_DEBUG_OPTIONS=-S -gdb tcp::1234,ipv4
QEMU_SYSTEM=/opt/qemu-4.2.0/bin/qemu-system-arm
# GDB
GDB=$(CROSS_COMPILE)gdb
GDB_COMMANDS=${MACHINE}.gdb
include Makefile.inc
You now need yo install the latest GCC toolchain provided by arm:
wget "https://developer.arm.com/-/media/Files/downloads/gnu-a/9.2-2019.12/binrel/gcc-arm-9.2-2019.12-x86_64-arm-none-eabi.tar.xz?revision=64186c5d-b471-4c97-a8f5-b1b300d6594a&la=en&hash=5E9204DA5AF0B055B5B0F50C53E185FAA10FF625" -o gcc-arm-9.2-2019.12-x86_64-arm-none-eabi.tar.xz
mkdir -p /opt/arm/9
tar Jxf gcc-arm-9.2-2019.12-x86_64-arm-none-eabi.tar.xz -C /opt/arm/9
You are now ready to compile/execute/debug the example:
make -f xilinx-zynq-a9.mak clean all
You should get the following output:
rm -f xilinx-zynq-a9.elf xilinx-zynq-a9.lst
/opt/arm/9/gcc-arm-9.2-2019.12-x86_64-arm-none-eabi/bin/arm-none-eabi-gcc -O0 -ggdb -mtune=cortex-a9 -nostdlib -nostartfiles -ffreestanding -L. -Wl,-T,xilinx-zynq-a9.ld -o xilinx-zynq-a9.elf xilinx-zynq-a9.c startup-aarch32.s
/opt/arm/9/gcc-arm-9.2-2019.12-x86_64-arm-none-eabi/bin/arm-none-eabi-objdump -d xilinx-zynq-a9.elf > xilinx-zynq-a9.lst
You can start QEMU:
make -f xilinx-zynq-a9.mak qemu
The QEMU command that was executed should be displayed:
/opt/qemu-4.2.0/bin/qemu-system-arm -m 513M -nographic -machine xilinx-zynq-a9 -S -gdb tcp::1234,ipv4 -cpu cortex-a9 -kernel xilinx-zynq-a9.elf
In an other shell, start GDB:
make -f xilinx-zynq-a9.mak gdb
You should see the following output:
/opt/arm/9/gcc-arm-9.2-2019.12-x86_64-arm-none-eabi/bin/arm-none-eabi-gdb --quiet --command=xilinx-zynq-a9.gdb xilinx-zynq-a9.elf
Reading symbols from xilinx-zynq-a9.elf...
Reset_Handler () at startup-aarch32.s:7
7 ldr r0, =__StackTop
unknown command: 'reset'
Loading section .text, size 0xd8 lma 0x100000
Loading section .copy.table, size 0xc lma 0x1000d8
Start address 0x1000b8, load size 228
Transfer rate: 222 KB/sec, 114 bytes/write.
Breakpoint 1 at 0x1000b8: file startup-aarch32.s, line 7.
Breakpoint 2 at 0x10000c: file xilinx-zynq-a9.c, line 17.
(gdb)
Execute continue
:
(gdb) continue
Continuing.
Breakpoint 2, start () at xilinx-zynq-a9.c:17
17 uint64_t global_timer_counter = 0;
Now, execute several step
commands, and display the global_timer_counter
variable everytime line:
31 global_timer_counter = (uint64_t) upper << 32 | lower;
is executed:
(gdb) p/x global_timer_counter
$2 = 0xa6f2a0a
(gdb) p/x global_timer_counter
$9 = 0xa84315b
(gdb) p/x global_timer_counter
$10 = 0xabe77cf
The 64 bits variable keeps incrementing ¸which is consistant with a working emulation of the Zynq Global Timer Counter by QEMU
, and we now have a working bare-metal Zynq-7000 example that can be debugged using GDB.
I hope I answered to the two questions you asked.
来源:https://stackoverflow.com/questions/60344994/linux-kernel-module-cheat-qemu-baremetal-xilinx-zynq-a9