Question
I am trying to get the offsets (or virtual addresses) and the strings in the .rodata and .rodata1 sections.
For example:
#include <cstdio>
void myprintf(const char* ptr) {
    printf("%p\n", ptr);
}
int main() {
    myprintf("hello world");
    myprintf("\0\0");
    myprintf("ab\0cde");
}
The above program has a .rodata section, per the output of readelf -a:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[16] .rodata PROGBITS 0000000000400600 00000600
And readelf -W -p .rodata gives me the offsets and the associated non-null strings:
String dump of section '.rodata':
[ 10] %p^J
[ 14] hello world
[ 23] ab
[ 26] cde
I would like to write C or C++ code to retrieve:
The offsets of all the string literals (e.g. 10, 14, 23 above and the missing one for "\0\0")
The string literals (e.g. "%p\n", "hello world", "\0\0" above)
The offset of .rodata in the file (e.g. 400600 above; is it guaranteed to be the virtual memory address? At least that is the case for all the string literals in my test code above.)
My end goal is to write C/C++ code that reads in an executable and accepts user input as an offset/virtual memory address; if the input matches the offset/virtual memory address of any string literal, it uses printf() to print it out. Otherwise, the input is ignored. (Thanks @Armali for helping me clarify)
I have read this article. I am able to access the entire string table in .rodata but not the "string table indexes". The article mentions "string table indexes" but it doesn't specify how to retrieve them.
Hints?
Also, I wonder why there could be a section called .rodata1. According to the elf manpage:
.rodata1
This section holds read-only data that typically contributes to a nonwritable segment in the process image. This section is of type SHT_PROGBITS. The attribute used is SHF_ALLOC.
It has exactly the same description as .rodata. Then why do we have .rodata1?
Thanks!
Answer 1:
I am trying to get offsets, strings and virtual addresses in .rodata and .rodata1 sections.
I would like to write C or C++ code to retrieve:
The offsets of all the string literals (e.g. 10, 14, 23 above and the missing one for "\0\0")
The string literals (e.g. "%p\n", "hello world", "\0\0" above)
A string literal is a sequence of characters enclosed in double quotes in the source code. In practice, we cannot tell which bytes in an ELF data section represent a string literal. Consider these lines added to your main():
static const int s = '\0fg\0';
myprintf((char *)&s);
Although there is no string literal, readelf -p .rodata … may output a line such as
[ 21] gf
So, to truly recognize representations of string literals in a data section, it would be necessary to correlate the data with source code tokens (difficult) or assembler code (perhaps easier).
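To make the limitation concrete, here is a minimal sketch of the kind of scan that readelf -p appears to perform: it reports every run of non-NUL bytes together with its offset from the start of the section. The names dump_strings and StringEntry are made up for this illustration, and the section bytes are assumed to have been read into memory already. Like readelf, it cannot report the "\0\0" literal, and it would just as happily report the bytes of an int as a "string".

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: (offset from section start, bytes of the run).
using StringEntry = std::pair<std::size_t, std::string>;

// Lists every run of non-NUL bytes in a section's contents.
// This is a heuristic only: it sees bytes, not source-level string literals.
std::vector<StringEntry> dump_strings(const unsigned char* data, std::size_t size) {
    std::vector<StringEntry> out;
    std::size_t i = 0;
    while (i < size) {
        if (data[i] == 0) {          // skip padding, terminators and empty strings
            ++i;
            continue;
        }
        const std::size_t start = i;
        while (i < size && data[i] != 0) {
            ++i;                     // extend the run up to its terminator
        }
        out.emplace_back(start,
                         std::string(reinterpret_cast<const char*>(data + start),
                                     i - start));
    }
    return out;
}
```

Adding the section's sh_addr to each reported offset gives the virtual address that the example program prints with %p, at least for a non-relocated executable like the one in the question.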
it would be an issue for me if a string literal doesn't exist in .rodata
This can easily happen. Consider:
static char hello[] = "Hi";
myprintf(hello);
Since the string literal is used to initialize a character array, which has to be modifiable, it can go into the .data section instead of the .rodata section, as readelf -p .data … may show.
if the ELF sections contain all the valid offsets, why not using them?
The valid offsets are not collected anywhere where they can conveniently be accessed, so for practical purposes we can say ELF sections don't contain offsets/indexes to the string literals.
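Since no such index exists, a lookup by address can only apply a heuristic. The sketch below, under stated assumptions, accepts an address if it lies inside the section, is either the first byte of the section or directly preceded by a NUL byte, and is followed by a terminator within the section. LoadedSection and string_at are hypothetical names for this illustration, not ELF types; the bytes are assumed to have been read from the file at sh_offset, with addr taken from sh_addr.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical container for one section that was already read from the file.
struct LoadedSection {
    std::uint64_t addr;                // sh_addr: virtual address of the first byte
    std::vector<unsigned char> bytes;  // sh_size bytes found at sh_offset in the file
};

// Heuristic: returns true and fills `out` if `vaddr` looks like the start of a
// NUL-terminated string in `s`. It cannot prove the bytes came from a literal.
bool string_at(const LoadedSection& s, std::uint64_t vaddr, std::string& out) {
    if (vaddr < s.addr || vaddr - s.addr >= s.bytes.size()) {
        return false;                  // address is outside this section
    }
    const std::size_t off = static_cast<std::size_t>(vaddr - s.addr);
    if (off > 0 && s.bytes[off - 1] != 0) {
        return false;                  // points into the middle of another run
    }
    std::size_t end = off;
    while (end < s.bytes.size() && s.bytes[end] != 0) {
        ++end;
    }
    if (end == s.bytes.size()) {
        return false;                  // no terminator inside the section
    }
    out.assign(reinterpret_cast<const char*>(s.bytes.data() + off), end - off);
    return true;
}
```

With the layout from the question, 0x400614 yields "hello world" and 0x400615 is rejected. Note that both 0x400620 (the real start of "\0\0") and 0x400621 are accepted as empty strings, which shows why this stays a guess rather than a reliable test.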
I am able to access the entire string table in .rodata but not the "string table indexes". The article mentions "string table indexes" but it doesn't specify how to retrieve them.
The string table indexes are not mentioned in connection with .rodata, but with the string table section .strtab:
This section holds strings, most commonly the strings that represent the names associated with symbol table entries.
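String table indexes also name the sections themselves: each section header's sh_name field is a byte offset into the section name string table, which is the section whose index is stored in e_shstrndx. The following sketch shows how .rodata (or .rodata1) could be located that way. It assumes Linux's <elf.h>, a 64-bit ELF file in host byte order that is already fully loaded into memory, and no extended section numbering; find_section is a hypothetical helper name.

```cpp
#include <elf.h>

#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Copies the header of the section called `name` into `out`.
// Returns false if the image is not a usable ELF64 file or has no such section.
bool find_section(const std::vector<unsigned char>& image,
                  const std::string& name, Elf64_Shdr& out) {
    Elf64_Ehdr eh;
    if (image.size() < sizeof eh) return false;
    std::memcpy(&eh, image.data(), sizeof eh);
    if (std::memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64) return false;
    if (eh.e_shentsize != sizeof(Elf64_Shdr) || eh.e_shstrndx >= eh.e_shnum) return false;
    if (eh.e_shoff > image.size() ||
        (image.size() - eh.e_shoff) / sizeof(Elf64_Shdr) < eh.e_shnum) return false;

    // Copy headers out with memcpy to avoid unaligned or aliasing reads.
    auto header_at = [&](std::size_t i) {
        Elf64_Shdr sh;
        std::memcpy(&sh, image.data() + eh.e_shoff + i * sizeof sh, sizeof sh);
        return sh;
    };

    // The section name string table, selected by e_shstrndx.
    const Elf64_Shdr names = header_at(eh.e_shstrndx);
    if (names.sh_offset > image.size() ||
        names.sh_size > image.size() - names.sh_offset) return false;
    const char* table = reinterpret_cast<const char*>(image.data() + names.sh_offset);

    for (std::size_t i = 0; i < eh.e_shnum; ++i) {
        const Elf64_Shdr sh = header_at(i);
        if (sh.sh_name >= names.sh_size) continue;
        // sh_name is a string table index: a byte offset into the name table.
        const char* first = table + sh.sh_name;
        const char* last = std::find(first, table + names.sh_size, '\0');
        if (std::string(first, last) == name) {
            out = sh;
            return true;
        }
    }
    return false;
}
```

On success, out.sh_addr is the section's virtual address (0x400600 in the question), out.sh_offset is its position in the file (0x600), and out.sh_size is its length. A virtual address inside the section maps to the file position vaddr - sh_addr + sh_offset.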
Answer 2:
Just a side but related question, do you know what the first 16 bytes are in .rodata? I noticed that it has one 0x1 and one 0x2 and then the rest is 0x0.
This is not always so; it simply depends on what read-only data the program uses. For example, if I compile your example program, the string %p\n starts at offset 4, and preceding that I also have 1 and 2 (as 16-bit words), but no zeros. Looking further at what symbol might be at the start of .rodata with
> readelf -s … | grep 400738
14: 0000000000400738 0 SECTION LOCAL DEFAULT 14
59: 0000000000400738 4 OBJECT GLOBAL DEFAULT 14 _IO_stdin_used
(400738 being the .rodata start address here), I get _IO_stdin_used, a global object of size 4, which sounds like something from the standard library.
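The same lookup could be done in code instead of with readelf -s and grep. The sketch below is one possible way, assuming Linux's <elf.h> and that the entries of .symtab and the bytes of .strtab have already been read from the file; symbols_at is a hypothetical helper name. It skips unnamed entries such as the SECTION symbol shown above.

```cpp
#include <elf.h>

#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns the names of all named symbols whose value equals `addr`.
// `symtab` holds the entries of .symtab, `strtab` the raw bytes of .strtab.
std::vector<std::string> symbols_at(const std::vector<Elf64_Sym>& symtab,
                                    const std::vector<char>& strtab,
                                    std::uint64_t addr) {
    std::vector<std::string> names;
    for (const Elf64_Sym& sym : symtab) {
        if (sym.st_value != addr) continue;
        // st_name is a string table index: a byte offset into .strtab.
        // Index 0 means "no name", which is the case for section symbols.
        if (sym.st_name == 0 || sym.st_name >= strtab.size()) continue;
        std::size_t end = sym.st_name;
        while (end < strtab.size() && strtab[end] != '\0') {
            ++end;
        }
        names.emplace_back(strtab.begin() + sym.st_name, strtab.begin() + end);
    }
    return names;
}
```

For the binary discussed above, calling it with the .rodata start address 0x400738 would be expected to return the single name _IO_stdin_used.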
Source: https://stackoverflow.com/questions/51919876/retrieving-offsets-strings-and-virtual-address-in-rodata-and-rodata1