Sometimes it's handy to mock up something with a little C program that uses a big chunk of static memory. I noticed after changing to Fedora 15 the program took a long
I don't observe this behavior (with Debian/Sid/AMD64 on an 8 GB desktop, gcc 4.6.2, binutils gold ld (GNU Binutils for Debian 2.22) 1.11). Here is the changed program (displaying its memory map with pmap).
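#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>   /* getpid() */
#define M 1000000
#define GIANT_SIZE (2000*M)
size_t g_arr[GIANT_SIZE];   /* zero-initialized static storage */
int main(int argc, char **argv) {
    int i;
    char cmd[80];
    for (i = 0; i < 10; i++) {
        /* %zu is the matching conversion for size_t */
        printf("This should be zero: %zu\n", g_arr[i*1000]);
    }
    /* print this process's memory map */
    snprintf(cmd, sizeof cmd, "pmap %d", (int)getpid());
    system(cmd);
    exit(0);
}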
Here is its compilation:
% time gcc -v -O big.c -o big
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc-4.6.real
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.6.2-4' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.6.2 (Debian 4.6.2-4)
COLLECT_GCC_OPTIONS='-v' '-O' '-o' 'big' '-mtune=generic' '-march=x86-64'
/usr/lib/gcc/x86_64-linux-gnu/4.6/cc1 -quiet -v -imultilib . -imultiarch x86_64-linux-gnu big.c -quiet -dumpbase big.c -mtune=generic -march=x86-64 -auxbase big -O -version -o /tmp/ccWThBP5.s
GNU C (Debian 4.6.2-4) version 4.6.2 (x86_64-linux-gnu)
compiled by GNU C version 4.6.2, GMP version 5.0.2, MPFR version 3.1.0, MPC version 0.9
warning: MPFR header version 3.1.0 differs from library version 3.1.0-p3.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc/x86_64-linux-gnu/4.6/include
/usr/local/include
/usr/lib/gcc/x86_64-linux-gnu/4.6/include-fixed
/usr/include/x86_64-linux-gnu
/usr/include
End of search list.
GNU C (Debian 4.6.2-4) version 4.6.2 (x86_64-linux-gnu)
compiled by GNU C version 4.6.2, GMP version 5.0.2, MPFR version 3.1.0, MPC version 0.9
warning: MPFR header version 3.1.0 differs from library version 3.1.0-p3.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 4b128876859f8f310615c7040fa3cb67
COLLECT_GCC_OPTIONS='-v' '-O' '-o' 'big' '-mtune=generic' '-march=x86-64'
as --64 -o /tmp/ccm7905b.o /tmp/ccWThBP5.s
COMPILER_PATH=/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib/:/lib/x86_64-linux-gnu/:/lib/../lib/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-O' '-o' 'big' '-mtune=generic' '-march=x86-64'
/usr/lib/gcc/x86_64-linux-gnu/4.6/collect2 --build-id --no-add-needed --eh-frame-hdr -m elf_x86_64 --hash-style=both -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o big /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crt1.o /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/4.6/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/4.6 -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../.. /tmp/ccm7905b.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/4.6/crtend.o /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crtn.o
gcc -v -O big.c -o big 0.07s user 0.01s system 90% cpu 0.089 total
and its execution:
% time ./big
This should be zero: 0
This should be zero: 0
This should be zero: 0
This should be zero: 0
This should be zero: 0
This should be zero: 0
This should be zero: 0
This should be zero: 0
This should be zero: 0
This should be zero: 0
8835: ./big
0000000000400000 4K r-x-- /home/basile/tmp/big
0000000000401000 4K rw--- /home/basile/tmp/big
0000000000402000 15625000K rw--- [ anon ]
00007f2d15a44000 1512K r-x-- /lib/x86_64-linux-gnu/libc-2.13.so
00007f2d15bbe000 2048K ----- /lib/x86_64-linux-gnu/libc-2.13.so
00007f2d15dbe000 16K r---- /lib/x86_64-linux-gnu/libc-2.13.so
00007f2d15dc2000 4K rw--- /lib/x86_64-linux-gnu/libc-2.13.so
00007f2d15dc3000 20K rw--- [ anon ]
00007f2d15dc8000 124K r-x-- /lib/x86_64-linux-gnu/ld-2.13.so
00007f2d15fb4000 12K rw--- [ anon ]
00007f2d15fe4000 12K rw--- [ anon ]
00007f2d15fe7000 4K r---- /lib/x86_64-linux-gnu/ld-2.13.so
00007f2d15fe8000 4K rw--- /lib/x86_64-linux-gnu/ld-2.13.so
00007f2d15fe9000 4K rw--- [ anon ]
00007ffff5b5b000 132K rw--- [ stack ]
00007ffff5bff000 4K r-x-- [ anon ]
ffffffffff600000 4K r-x-- [ anon ]
total 15628908K
./big 0.00s user 0.00s system 0% cpu 0.004 total
I believe that using a recent GCC (e.g. GCC 4.6) together with the binutils gold linker makes a significant difference for such programs.
I don't hear any swapping involved.
I am able to reproduce this on an Ubuntu 10.10 system (GNU ld (GNU Binutils for Ubuntu) 2.20.51-system.20100908), and I think I have your answer. First, some methodology.
After confirming that this also happens in a small VM (512 MB RAM, 2 GB swap), I decided the easiest thing to do would be to strace gcc and see exactly what was going on when everything went to hell:
~# strace -f gcc swap.c
It illuminated the following:
vfork() = 3589
[pid 3589] execve("/usr/lib/gcc/x86_64-linux-gnu/4.4.5/collect2", ["/usr/lib/gcc/x86_64-linux-gnu/4."..., "--build-id", "--eh-frame-hdr", "-m", "elf_x86_64", "--hash-style=gnu", "-dynamic-linker", "/lib64/ld-linux-x86-64.so.2", "-o", "swap", "-z", "relro", "/usr/lib/gcc/x86_64-linux-gnu/4."..., "/usr/lib/gcc/x86_64-linux-gnu/4."..., "/usr/lib/gcc/x86_64-linux-gnu/4."..., "-L/usr/lib/gcc/x86_64-linux-gnu/"..., ...], [/* 26 vars */]) = 0
...
[pid 3589] vfork() = 3590
...
[pid 3590] execve("/usr/bin/ld", ["/usr/bin/ld", "--build-id", "--eh-frame-hdr", "-m", "elf_x86_64", "--hash-style=gnu", "-dynamic-linker", "/lib64/ld-linux-x86-64.so.2", "-o", "swap", "-z", "relro", "/usr/lib/gcc/x86_64-linux-gnu/4."..., "/usr/lib/gcc/x86_64-linux-gnu/4."..., "/usr/lib/gcc/x86_64-linux-gnu/4."..., "-L/usr/lib/gcc/x86_64-linux-gnu/"..., ...], [/* 27 vars */]) = 0
...
[pid 3590] lseek(13, 4096, SEEK_SET) = 4096
[pid 3590] read(13, ".\4@\0\0\0\0\0>\4@\0\0\0\0\0N\4@\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
[pid 3590] mmap(NULL, 1600004096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f1771931000
<system comes to screeching halt>
As we might have suspected, it looks like ld is actually trying to anonymously mmap the entire static memory space of this array (or possibly the entire program; it's hard to tell, since the rest of the program is so small that it might all fit in that extra 4096 bytes).
So that's all well and good, but why does it work when we exceed the available swap on the system? Let's run swapoff and then strace -f again...
[pid 3618] lseek(13, 4096, SEEK_SET) = 4096
[pid 3618] read(13, ".\4@\0\0\0\0\0>\4@\0\0\0\0\0N\4@\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
[pid 3618] mmap(NULL, 1600004096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
[pid 3618] brk(0x60638000) = 0x1046000
[pid 3618] mmap(NULL, 1600135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
[pid 3618] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7fd011864000
...
Unsurprisingly, ld seems to do the same thing it tried last time: mmap the entire space. But the system is no longer able to do that, so it fails! ld tries again, and it fails again. Then ld does something unexpected: it moves on with less memory.
Weird, I guess we'd better have a look at the ld code then. Drat, it doesn't do an explicit mmap. This must be coming from inside a plain old malloc. We'll have to build ld with some debug symbols to track this down. Unfortunately, when I built binutils 2.21.1 the problem went away. Perhaps it's been fixed in newer versions of binutils?
I torture-tested my openSUSE 11.4 (going for 12.1 in a week).
I have 4 GiB RAM + 2 GiB swap and did not notice a serious slowdown; the system might be thrashing at times, but the compile time was still short.
The longest was 6 seconds, while swapping heavily.
[tester@ulises ~]$ free -m
total used free shared buffers cached
Mem: 3456 3426 30 0 4 249
-/+ buffers/cache: 3172 284
Swap: 2055 1382 672
[tester@ulises ~]$ time cc -Wall -O test2.c
test2.c: In function ‘main’:
test2.c:13:2: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘size_t’
real 0m6.501s
user 0m0.101s
sys 0m0.078s
[tester@ulises ~]$ free -m
total used free shared buffers cached
Mem: 3456 3389 67 0 5 289
-/+ buffers/cache: 3094 362
Swap: 2055 1455 599
[tester@ulises ~]$ free -m
total used free shared buffers cached
Mem: 3456 3373 82 0 4 264
-/+ buffers/cache: 3104 352
Swap: 2055 1442 612
[tester@ulises ~]$ time cc -Wall -O test2.c
test2.c: In function ‘main’:
test2.c:13:2: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘size_t’
real 0m1.122s
user 0m0.086s
sys 0m0.045s
[tester@ulises ~]$ time cc -Wall -O test2.c
test2.c: In function ‘main’:
test2.c:13:2: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘size_t’
real 0m0.095s
user 0m0.047s
sys 0m0.032s
[tester@ulises ~]$ free -m
total used free shared buffers cached
Mem: 3456 3376 79 0 4 252
-/+ buffers/cache: 3119 336
Swap: 2055 1436 618
[tester@ulises ~]$ time cc -Wall -O test2.c
test2.c: In function ‘main’:
test2.c:13:2: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘size_t’
real 0m0.641s
user 0m0.054s
sys 0m0.040s
Between runs I loaded and unloaded VirtualBox VMs, Eclipse, and large PDF files, with Firefox alone using 800+ MiB. I did not go to the limit, otherwise many apps would be killed by the OS. It has a preference for killing Firefox. :-)
I also went to the extreme defining:
#define M 1048576
#define GIANT_SIZE (20000*M)
and even then nothing changed significantly.
[tester@ulises ~]$ time cc -Wall -O test2.c
test2.c:7:14: warning: integer overflow in expression
test2.c:7:8: error: size of array ‘g_arr’ is negative
test2.c:7:1: warning: variably modified ‘g_arr’ at file scope
test2.c: In function ‘main’:
test2.c:13:2: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘size_t’
real 0m0.661s
user 0m0.043s
sys 0m0.031s
Edit: I re-tested using Fedora 16 on a VM with 512 MiB RAM and 1.5 GiB swap, and things were similar except for an error message on my "maximum stress version", where the array was sized at 20000*M elements. The error says the array size is negative.
[ricardo@localhost ~]$ time gcc -Wall test2.c
test2.c:7:14: warning: integer overflow in expression [-Woverflow]
test2.c:7:8: error: size of array ‘g_arr’ is negative
test2.c:7:1: warning: variably modified ‘g_arr’ at file scope [enabled by default]
test2.c: In function ‘main’:
test2.c:13:2: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘size_t’ [-Wformat]
real 0m1.053s
user 0m0.050s
sys 0m0.137s
The same response happens in an openSUSE 12.1 VM. The Fedora 16 install seemed very slow and memory hungry (during install I had to use 800 MiB versus 512 MiB for openSUSE). I could not use swapoff on Fedora because it was using a lot of swap space. I had neither sluggishness nor memory problems on openSUSE 12.1. Both have essentially the same versions of kernel, gcc, etc., and both use stock installs with KDE as the desktop environment.
I could not reproduce your issue. Maybe it is a gcc-related issue. Try downloading an older version like 4.5 and see what happens.