Retrieving Offsets, Strings and Virtual Address in .rodata and .rodata1

Submitted by 雨燕双飞 on 2021-01-28 03:59:55

Question


I am trying to get the offsets/virtual addresses and the strings in the .rodata and .rodata1 sections.

For example:

#include <cstdio>

void myprintf(const char* ptr) {
        printf("%p\n", ptr);
}

int main() {
        myprintf("hello world");
        myprintf("\0\0");
        myprintf("ab\0cde");
}

According to the output of readelf -a, the above program has this .rodata section:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [16] .rodata           PROGBITS         0000000000400600  00000600

And readelf -W -p .rodata gives me the offsets and the associated non-null strings:

String dump of section '.rodata':
  [    10]  %p^J
  [    14]  hello world
  [    23]  ab
  [    26]  cde

I would like to write C or C++ code to retrieve:

  1. The offsets of all the string literals (e.g. 10, 14, 23 above and the missing one for "\0\0")

  2. The string literals (e.g. "%p\n", "hello world", "\0\0" above)

  3. The offset to the file for .rodata (e.g. 400600 above; is it guaranteed to be the virtual memory address? At least I see that this is the case for all the string literals in my test code above.)

My end goal is to write C/C++ code that reads in an executable and accepts an offset/virtual memory address as user input. If the input matches the offset/virtual memory address of any string literal, the program prints that literal with printf(); otherwise it ignores the input. (Thanks @Armali for helping me clarify.)

I have read this article. I am able to access the entire string table in .rodata but not "string table indexes". The article mentions "string table indexes" but it doesn't specify how to retrieve indexes.

Hints?

Also, I wonder why there could be a section called .rodata1. According to the elf manpage:

.rodata1

This section holds read-only data that typically contributes to a nonwritable segment in the process image. This section is of type SHT_PROGBITS. The attribute used is SHF_ALLOC.

It has exactly the same description as .rodata. Then, why do we have .rodata1?

Thanks!


Answer 1:


> I am trying to get offsets, strings and virtual addresses in .rodata and .rodata1 sections.
>
> I would like to write a C or C++ code to retrieve:
>
>   1. The offsets of all the string literals (e.g. 10, 14, 23 above and the missing one for "\0\0")
>
>   2. The string literals (e.g. "%p\n", "hello world", "\0\0" above)

A string literal is a sequence of characters enclosed in double quotes in the source code. In practice, we cannot tell which bytes in an ELF data section represent a string literal. Consider these lines added to your main():

        static const int s = '\0fg\0';
        myprintf((char *)&s);

Although there is no string literal, readelf -p .rodata … may output a line such as the following, because on a little-endian machine the bytes of s can be stored as \0, g, f, \0:

  [    21]  gf

So, to truly recognize representations of string literals in a data section, it would be necessary to correlate the data with source code tokens (difficult) or assembler code (perhaps easier).
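To illustrate why a byte-level scan cannot tell string literals apart from other data, here is a minimal sketch in C of the kind of scan a string dump performs. The function name scan_candidates and its signature are hypothetical. The heuristic (skip non-printable bytes, then read up to the next NUL) is an assumption about how such dumps behave, not readelf's actual code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: walk `len` bytes of section data, skip bytes that are
 * not printable, and treat everything from the first printable byte up to the
 * next NUL (or the end of the data) as one candidate string.
 * The start offsets of the first `max_offsets` candidates are stored in
 * `offsets`; the return value is the total number of candidates found. */
size_t scan_candidates(const unsigned char *data, size_t len,
                       size_t *offsets, size_t max_offsets)
{
    assert(data != NULL || len == 0);
    size_t count = 0;
    size_t i = 0;

    while (i < len) {
        /* Skip NULs and other non-printable bytes. */
        while (i < len && !isprint(data[i]))
            i++;
        if (i >= len)
            break;

        /* Remember where this candidate starts. */
        if (offsets != NULL && count < max_offsets)
            offsets[count] = i;
        count++;

        /* Advance to the terminating NUL (or the end of the data). */
        while (i < len && data[i] != '\0')
            i++;
    }
    return count;
}
```

Given the four bytes 00 67 66 00 (one possible layout of s above), this scan reports a candidate at offset 1 even though no string literal is involved. It also never reports the empty string "\0\0", because that contains no printable byte.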

> it would be an issue to me that if a string literal doesn't exist in .rodata

This can easily happen. Consider:

        static char hello[] = "Hi";
        myprintf(hello);

Since the string literal is used to initialize a character array, which has to be modifiable, it can go into the .data section instead of .rodata, as readelf -p .data … may show.

> if the ELF sections contain all the valid offsets, why not using them?

The valid offsets are not collected anywhere where they can conveniently be accessed, so for practical purposes we can say ELF sections don't contain offsets/indexes to the string literals.
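What the section header does provide is the section's placement. sh_addr (the Address column, 0x400600 in the question) is the virtual address and sh_offset (the Offset column, 0x600) is the position in the file, so the two are not the same number. A lookup by user-supplied address can therefore be sketched as below, assuming a 64-bit ELF file already read into memory and Elf64_Shdr from the Linux <elf.h> header. The function names are hypothetical, and the sketch accepts any address inside the section, since nothing marks where a literal begins.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: translate a virtual address into a file offset if the
 * address lies inside the section described by `sh`. Returns 1 on success. */
int vaddr_to_file_offset(const Elf64_Shdr *sh, uint64_t vaddr,
                         uint64_t *file_off)
{
    assert(sh != NULL && file_off != NULL);
    if (vaddr < sh->sh_addr || vaddr - sh->sh_addr >= sh->sh_size)
        return 0;
    *file_off = sh->sh_offset + (vaddr - sh->sh_addr);
    return 1;
}

/* Hypothetical helper: return a pointer to the NUL-terminated bytes stored at
 * `vaddr`, or NULL if the address is outside the section, the section does
 * not fit in the file image, or no NUL occurs before the section ends. */
const char *string_at_vaddr(const unsigned char *image, size_t image_size,
                            const Elf64_Shdr *sh, uint64_t vaddr)
{
    uint64_t off;
    if (!vaddr_to_file_offset(sh, vaddr, &off))
        return NULL;

    /* Bytes left between the requested address and the end of the section. */
    uint64_t remaining = sh->sh_size - (vaddr - sh->sh_addr);
    if (off > image_size || remaining > image_size - off)
        return NULL;

    if (memchr(image + off, '\0', (size_t)remaining) == NULL)
        return NULL;
    return (const char *)(image + off);
}
```

With the question's values, address 0x400614 maps to file offset 0x614, where "hello world" is stored. The caller still has to locate the .rodata section header and decide whether the bytes found there are worth printing.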


> I am able to access the entire string table in .rodata but not "string table indexes". The article mentions "string table indexes" but it doesn't specify how to retrieve indexes.

The string table indexes are not mentioned in connection with .rodata, but with the string table section .strtab:

> This section holds strings, most commonly the strings that represent the names associated with symbol table entries.




Answer 2:


> Just a side but related question, do you know what the first 16 bytes are in .rodata? I noticed that it has 1 0x1 and 1 0x2 and then the rest is 0x0.

This is not always so; it simply depends on what read-only data the program uses. For example, if I compile your example program, the string %p\n starts at offset 4, and preceding that I also have 1 and 2 (as 16-bit words), but no zeros. Looking further for the symbol that might be at the start of .rodata with

> readelf -s … | grep 400738
    14: 0000000000400738     0 SECTION LOCAL  DEFAULT   14
    59: 0000000000400738     4 OBJECT  GLOBAL DEFAULT   14 _IO_stdin_used

(400738 being the .rodata start address here), I get _IO_stdin_used, a global object of size 4, which sounds like something from the standard library.
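The "1 and 2 as 16-bit words" and the "object of size 4" describe the same four bytes. A small sketch in C, assuming little-endian byte order (as on x86-64) and the byte values 01 00 02 00 implied by the description above, with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 16-bit little-endian value from `p`. */
uint16_t read_le16(const unsigned char *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical helper: read a 32-bit little-endian value from `p`. */
uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Read as two 16-bit words, the bytes 01 00 02 00 give 1 and 2; read as one 32-bit value, they give 0x00020001, which fits a single 4-byte object at the start of the section.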



Source: https://stackoverflow.com/questions/51919876/retrieving-offsets-strings-and-virtual-address-in-rodata-and-rodata1
