Why does a segmentation fault happen in this OpenMP code?

生来不讨喜 2020-11-22 16:53

main program:

program main                                                                                                                                            


        
2 Answers
  • 2020-11-22 17:13

    The most probable cause for this behaviour is that your stack size limit is too small (for whatever reason). Since e_in is private to each OpenMP thread, one copy per thread is allocated on the thread stack (even if you have specified -heap-arrays!). 202000 elements of REAL(KIND=8) take 1616 kB (just over 1578 KiB).
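
    The size estimate above can be reproduced with shell arithmetic. This is only a sketch of the calculation; the variable names are illustrative and not part of the original program.

```shell
# Size of one private copy of e_in: 202000 elements of REAL(KIND=8), 8 bytes each.
elements=202000
bytes_per_element=8
bytes=$((elements * bytes_per_element))

kb=$((bytes / 1000))               # decimal kilobytes
kib=$(( (bytes + 1023) / 1024 ))   # KiB, rounded up so the whole array fits

echo "$bytes bytes = $kb kB; needs at least $kib KiB of stack"
# prints: 1616000 bytes = 1616 kB; needs at least 1579 KiB of stack
```

    Any thread whose stack is smaller than this cannot hold its private copy, before even counting other local variables and call frames.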

    The stack size limit can be controlled by several mechanisms:

    • In standard Unix shells the stack size limit is controlled by ulimit -s <stacksize in KiB>. This is also the stack size limit for the main OpenMP thread. The value of this limit is also used by the POSIX threads (pthreads) library as the default thread stack size when creating new threads.

    • OpenMP supports control over the stack size limit of all additional threads via the environment variable OMP_STACKSIZE. Its value is a number with an optional suffix k/K for KiB, m/M for MiB, or g/G for GiB. This value does not affect the stack size of the main thread.

    • The GNU OpenMP run-time (libgomp) recognises the non-standard environment variable GOMP_STACKSIZE. If set it overrides the value of OMP_STACKSIZE.

    • The Intel OpenMP run-time recognises the non-standard environment variable KMP_STACKSIZE. If set it overrides the value of OMP_STACKSIZE and also overrides the value of GOMP_STACKSIZE if the compatibility OpenMP run-time is used (which is the default as currently the only available Intel OpenMP run-time library is the compat one).

    • If none of the *_STACKSIZE variables are set, the default for the Intel OpenMP run-time is 2m on 32-bit architectures and 4m on 64-bit ones.

    • On Windows, the stack size of the main thread is part of the PE header and is embedded there by the linker. If using Microsoft's LINK to do the linking, the size is specified using the /STACK:reserve[,commit] option. The reserve argument specifies the maximum stack size in bytes while the optional commit argument specifies the initial commit size. Both can be specified as hexadecimal values using the 0x prefix. If re-linking the executable is not an option, the stack size can be modified by editing the PE header with EDITBIN. It takes the same stack-related argument as the linker. Programs compiled with MSVC's whole program optimisation enabled (/GL) cannot be edited.

    • The GNU linker for Win32 targets supports setting the stack size via the --stack argument. To pass the option directly from GCC, use -Wl,--stack,<size in bytes>.

    Note that thread stacks are actually allocated with the size set by *_STACKSIZE (or with the default value), unlike the stack of the main thread, which starts small and then grows on demand up to the set limit. So don't set *_STACKSIZE to an arbitrarily large value, otherwise you may hit the process virtual memory size limit.
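
    As a rough sketch of choosing a modest value instead of an arbitrarily large one: start from the size of one private copy of e_in and add some headroom. The 1024 KiB headroom is an assumption made for illustration, not a measured requirement.

```shell
array_kib=1579        # one private copy of e_in, rounded up to whole KiB
headroom_kib=1024     # assumed margin for other locals and call frames
thread_stack_kib=$((array_kib + headroom_kib))

# OMP_STACKSIZE is the standard variable; the k suffix means KiB.
export OMP_STACKSIZE="${thread_stack_kib}k"
echo "OMP_STACKSIZE=$OMP_STACKSIZE"
# prints: OMP_STACKSIZE=2603k

# The main thread is governed by the shell limit instead; raising the soft
# limit fails if the hard limit is lower, so it is left commented out here.
# ulimit -s "$thread_stack_kib"
```

    With the Intel or GNU run-times, remember that KMP_STACKSIZE or GOMP_STACKSIZE would override this value if they are set.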

    Here are some examples:

    $ ifort -openmp my_module.f90 main.f90
    

    Set the main stack size limit to 1 MiB, which is smaller than one copy of e_in (the additional OpenMP thread still gets the 4 MiB default):

    $ ulimit -s 1024
    $ ./a.out
    zsh: segmentation fault (core dumped)  ./a.out
    

    Set the main stack size limit to 1700 KiB:

    $ ulimit -s 1700
    $ ./a.out
      0.000000000000000E+000
     (0.000000000000000E+000,0.000000000000000E+000)
      0.000000000000000E+000
     (0.000000000000000E+000,0.000000000000000E+000)
    

    Set the main stack size limit to 2 MiB and the stack size of the additional thread to 1 MiB:

    $ ulimit -s 2048
    $ KMP_STACKSIZE=1m ./a.out
    zsh: segmentation fault (core dumped)  KMP_STACKSIZE=1m ./a.out
    

    On most Unix systems the stack size limit of the main thread is set by PAM or another login mechanism (see /etc/security/limits.conf). The default on Scientific Linux 6.3 is 10 MiB.
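
    To see what the login mechanism has actually given the current shell, both the soft and the hard stack limits can be queried. This is a minimal sketch; the variable names are illustrative, and the values depend entirely on the system.

```shell
soft=$(ulimit -S -s)   # soft limit: the value new processes inherit
hard=$(ulimit -H -s)   # hard limit: ceiling for raising the soft limit
echo "stack limit (KiB or unlimited): soft=$soft hard=$hard"
```

    A plain ulimit -s can raise the soft limit only up to the hard limit; going beyond that requires changing the system configuration.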

    Another scenario that can lead to an error is a virtual address space limit that is set too low. For example, if the virtual address space limit is 1 GiB and the thread stack size limit is set to 512 MiB, then the OpenMP run-time would try to allocate 512 MiB for each additional thread. With two threads one would have 1 GiB for the stacks only, and when the space for code, shared libraries, heap, etc. is added up, the virtual memory size would grow beyond 1 GiB and an error would occur.

    Set the virtual address space limit to 1 GiB and run with two additional threads with 512 MiB stacks (I have commented out the call to omp_set_num_threads()):

    $ ulimit -v 1048576
    $ KMP_STACKSIZE=512m OMP_NUM_THREADS=3 ./a.out
    OMP: Error #34: System unable to allocate necessary resources for OMP thread:
    OMP: System error #11: Resource temporarily unavailable
    OMP: Hint: Try decreasing the value of OMP_NUM_THREADS.
    forrtl: error (76): Abort trap signal
    ... trace omitted ...
    zsh: abort (core dumped)  OMP_NUM_THREADS=3 KMP_STACKSIZE=512m ./a.out
    

    In this case the OpenMP run-time library fails to create a new thread and notifies you before it aborts the program.
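
    The arithmetic behind this failure can be checked directly. The sketch below only restates the numbers from the example; the variable names are illustrative.

```shell
limit_kib=1048576     # ulimit -v value from the example: 1 GiB
stack_mib=512         # KMP_STACKSIZE=512m
extra_threads=2       # OMP_NUM_THREADS=3: the main thread plus two more

stacks_kib=$((extra_threads * stack_mib * 1024))
remaining_kib=$((limit_kib - stacks_kib))

echo "thread stacks: $stacks_kib KiB; left for everything else: $remaining_kib KiB"
# prints: thread stacks: 1048576 KiB; left for everything else: 0 KiB
```

    The two thread stacks alone consume the whole limit, so nothing is left for code, shared libraries, and the heap.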

  • 2020-11-22 17:17

    The segmentation fault is caused by the stack size limit when using OpenMP. The solutions from the previous answer did not solve the problem for me on Windows. Allocating the array on the heap rather than on the stack seems to work:

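    integer, parameter :: dp = kind(1.0d0)   ! assumption: dp is double precision (not shown in the original snippet)
    integer, parameter :: nmax = 202000
    real(dp), dimension(:), allocatable :: e_in   ! allocatable instead of fixed-size
    integer :: i

    allocate(e_in(nmax))   ! storage comes from the heap, not the stack

    e_in = 0.0_dp

    ! rest of code

    deallocate(e_in)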
    

    This also avoids changing any default environment settings.

    Credit goes to ohm314's solution; see: large array using heap memory allocation
