Why do assembly programs load segments (.data/.bss and .text) into separate memory blocks instead of loading both the data and the code se
You can normally set attributes on a segment-by-segment basis. For example, a read-only segment lets you specify "read-only" once, and just put read-only data into that segment, rather than specifying read-only on a variable-by-variable basis.
This is not limited to assembly programs; it is how the executable format for your OS is laid out. Most OSes use a fairly extensive format for executables that separates the various parts of a program into sections ("segments").
Separating an executable into sections has several advantages. Taking the ones you mention:
.bss: Stores information about memory that needs to be zeroed at program startup. Memory that needs to be zeroed is common, and an OS typically has special services for handing out zeroed memory. If you allocate a global array of 1 MB, you don't need to embed 1 MB of zeros in the executable; only the size is recorded in the .bss section, and the OS allocates that 1 MB of zeroed memory at program startup.
.data: This is all your data that's initialized to something other than zero at program startup.
.text: This is the actual code.
There can be many more sections, e.g. special sections containing bootstrap code that needs to run to initialize the program but can be discarded once it has run, or sections containing debug information (which doesn't need to be loaded into memory unless you run the program in a debugger). Another common section is a read-only data section:
.rodata: contains non-writable data, e.g. all the strings or const data in your program.
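To make the list above concrete, here is a small C sketch of where a typical compiler and linker would place a few objects. The names (big_buffer, counter, greeting and the helper functions) are made up for illustration, and the placement described in the comments is the usual behavior, not something the C language guarantees; an optimizing compiler is free to do otherwise.

```c
#include <stddef.h>
#include <string.h>

/* Zero-initialized global: typically placed in .bss, so the 1 MB of
   zeros is not stored in the executable file, only its size. */
static unsigned char big_buffer[1024 * 1024];

/* Initialized to a non-zero value: typically placed in .data. */
static int counter = 42;

/* Constant data: typically placed in .rodata. */
static const char greeting[] = "hello";

/* The machine code of every function below typically goes into .text. */

/* Increments the .data variable and returns its new value. */
int bump_counter(void) {
    return ++counter;
}

/* Returns 1 if every byte of the .bss buffer is zero, 0 otherwise. */
int buffer_is_zeroed(void) {
    for (size_t i = 0; i < sizeof big_buffer; i++) {
        if (big_buffer[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Returns the length of the read-only string. */
size_t greeting_length(void) {
    return strlen(greeting);
}
```

If you have GNU binutils available, running size or objdump -h on the compiled object file is one way to check where things actually ended up on your system.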
Moreover, CPUs can apply protection to memory, marking it readable, writable and/or executable. Having separate sections makes it easy to apply these memory protections: the code needs to be executable, but making the data executable would be a bad idea. Read-only sections can also be shared more easily; the code and read-only data sections can be shared between multiple instances of the same program. If parts of the text section need to be swapped out, they can simply be discarded, since they already reside in the executable itself, whereas the data/bss sections cannot be discarded; they have to be written out to a dedicated swap area.
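The protection described above can be observed from user space. The following is a rough sketch, assuming a POSIX-like system (Linux, the BSDs, macOS) that provides mmap, mprotect and fork; the function name write_to_readonly_page_faults is made up for this example. It maps one page as read/write, drops the write permission the way a loader would for .rodata, and then lets a child process attempt a write to see whether the hardware stops it.

```c
#define _DEFAULT_SOURCE /* assumption: glibc; exposes MAP_ANONYMOUS in strict modes */
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a write to a read-only page is stopped by a fault signal
   (SIGSEGV, or SIGBUS on some systems) and the page is left unchanged,
   0 if the write was not stopped, and -1 if the setup failed. */
int write_to_readonly_page_faults(void) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        return -1;
    }
    size_t len = (size_t)page;

    /* Start with a readable and writable page, like .data or .bss. */
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }
    p[0] = 7;

    /* Drop write permission: the page is now read-only, like .rodata. */
    if (mprotect(p, len, PROT_READ) != 0) {
        munmap(p, len);
        return -1;
    }

    pid_t child = fork();
    if (child < 0) {
        munmap(p, len);
        return -1;
    }
    if (child == 0) {
        /* Child: make sure a fault terminates the process, then try to write. */
        signal(SIGSEGV, SIG_DFL);
        signal(SIGBUS, SIG_DFL);
        *(volatile unsigned char *)p = 9; /* expected to fault here */
        _exit(0);                         /* only reached if the write went through */
    }

    /* Parent: check how the child ended. */
    int status = 0;
    if (waitpid(child, &status, 0) < 0) {
        munmap(p, len);
        return -1;
    }
    int faulted = WIFSIGNALED(status) &&
                  (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS);
    int unchanged = (p[0] == 7); /* reading the page is still allowed */

    munmap(p, len);
    return (faulted && unchanged) ? 1 : 0;
}
```

Permissions like these are applied to whole pages, which is one reason it is convenient to group everything with the same permissions into one section rather than interleaving code and data.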
No, I think you have misunderstood. Segments are not used only in assembly; every high-level language that compiles to machine code (C, C++, Basic, Pascal, etc.) uses segments internally.
C and C++ programs are compiled to assembly language by the GNU compiler (in the case of Ubuntu). The compiler places each piece of data in the appropriate segment, and the GAS assembler then converts this assembly into object code bytes.