Segmentation Faults when Running MEX Files in Parallel

Posted by 烂漫一生 on 2019-12-24 00:54:57

Question


I am currently running repetitions of an experiment that uses MEX files in MATLAB 2012a, and I occasionally run into segmentation faults that I cannot explain.

Some information about the faults:

  • They occur randomly

  • They only occur when I run multiple repetitions of my experiment in parallel on a Linux machine using a parfor loop.

  • They do not occur when I run multiple repetitions of my experiment in parallel on Mac OSX 10.7 using a parfor loop.

  • They do not occur when I run the repetitions sequentially.

  • They seem to occur far less frequently when I run 2 experiments in parallel - as opposed to 12 experiments in parallel.

Some information about my MEX file:

  • It is written in C

  • It uses the IBM CPLEX 12.4 API (this is thread-safe)

  • It was compiled using GCC 4.6.3

My thoughts are that there may be some issue in accessing the MEX file in multiple cores. Can anyone shed any light on what might be going on or suggest a fix? I'd be happy to provide more information as necessary.


Answer 1:


I've recently sent a stack trace to the people at MATLAB and it turns out that the culprit is not my code but one of the functions from the CPLEX 12.4 API. It turns out that this function uses the putenv() function in C which is not necessarily thread-safe.
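To illustrate one reason putenv() is risky in a multithreaded process, here is a minimal sketch (not the CPLEX code, which I have not seen). POSIX does not require putenv() to be thread-safe, and it places the caller's string directly into the environment rather than copying it, so the environment and the caller's buffer share memory. The variable name MEX_DEMO_VAR and the function name are made up for this illustration, and the sketch assumes a POSIX system and GCC's default GNU dialect (stricter -std= modes may need _XOPEN_SOURCE defined before the includes).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable name, used only for this illustration. */
static char env_entry[64];

/* Returns 1 if editing the caller's buffer changed what getenv() reports,
   0 if it did not, and -1 if a call failed. */
int putenv_shares_buffer(void)
{
    strcpy(env_entry, "MEX_DEMO_VAR=first");

    /* putenv() stores a pointer to env_entry; it does not copy the string. */
    if (putenv(env_entry) != 0)
        return -1;

    const char *before = getenv("MEX_DEMO_VAR");
    if (before == NULL || strcmp(before, "first") != 0)
        return -1;

    /* No second putenv() call: only the buffer itself is overwritten. */
    strcpy(env_entry, "MEX_DEMO_VAR=second");

    const char *after = getenv("MEX_DEMO_VAR");
    if (after == NULL)
        return -1;

    return strcmp(after, "second") == 0;
}
```

In a single thread this is merely surprising. If another thread in the same process reads or walks the environment while the buffer or the environment array is being modified, the result is a data race, which is consistent with faults that show up only intermittently.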

Unfortunately, I have to keep using this function and the API so I've posted a follow-up thread that focuses on finding ways to avoid this fault.

Any advice would be appreciated.




Answer 2:


"My thoughts are that there may be some issue in accessing the MEX file in multiple cores."

It's much more likely that your MEX file has a bug. Various bugs that are very easy to make in C, such as accessing dangling memory, calling free() twice on the same pointer, or writing past the end of an allocated array, will cause an intermittent SIGSEGV.
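As a rough sketch of how these three bug classes can be guarded against, the wrapper below keeps the length next to the pointer, checks every index, and resets the pointer after freeing it. The type and function names (buffer_t, buffer_init, buffer_set, buffer_release) are hypothetical and are not taken from the asker's MEX file; a real MEX file might allocate with mxMalloc and mxFree instead of the C library calls used here.

```c
#include <stdlib.h>

/* Hypothetical helper type: the pointer and its length travel together. */
typedef struct {
    double *data;
    size_t  len;
} buffer_t;

/* Allocates len zero-initialized doubles. Returns 0 on success, -1 on failure. */
int buffer_init(buffer_t *b, size_t len)
{
    b->data = NULL;
    b->len = 0;
    if (len == 0)
        return -1;
    b->data = calloc(len, sizeof *b->data);
    if (b->data == NULL)
        return -1;
    b->len = len;
    return 0;
}

/* Bounds-checked write: rejects writes past the end or into a released buffer. */
int buffer_set(buffer_t *b, size_t i, double value)
{
    if (b->data == NULL || i >= b->len)
        return -1;
    b->data[i] = value;
    return 0;
}

/* Safe to call more than once: free(NULL) is defined to do nothing. */
void buffer_release(buffer_t *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
}
```

With this pattern, a second buffer_release() on the same buffer is harmless, and a write after release fails with -1 instead of touching freed memory. It only protects accesses that go through these functions, so it does not replace running under a debugger or a memory checker.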

Your best bet is to run MATLAB under a debugger and see where it crashes.



Source: https://stackoverflow.com/questions/10193451/segmentation-faults-when-running-mex-files-in-parallel
