Instruments Call Tree broken: mix of R, C++ and Fortran

Posted by ≯℡__Kan透↙ on 2019-12-10 16:38:00

Question


I am trying to profile the CPU time of a function in OpenMx, an R package containing C++ and Fortran code. My operating system is OS X 10.10. I have read the section on this topic in the R manual. That section and this post led me to try Instruments. Here is what I did:

  1. Opened Instruments
  2. Chose the Time Profiler Template
  3. Pressed Record
  4. Started my R script using RStudio

I get the following output: [screenshot of the Instruments call tree not preserved]. The command-line tool sample returns the same output.

The problem is that omxunsafedgemm_ appears to be called directly from the main thread. However, this is a low-level Fortran function that is always called by a C++ function named omxDGEMM. In this example, omxDGEMM is first called by omxCallRamExpection (so almost at the bottom of the call tree), yet the total time reported for omxDGEMM is 0. The time spent in the Fortran routine is therefore not attributed to its real callers, which makes the profiling information useless.

In the original version of the package, omxDGEMM is defined as inline. I changed this in the hope that it would resolve the issue, but it did not. omxDGEMM calls omxunsafedgemm like this:

F77_CALL(omxunsafedgemm)(&transa, &transb,
                        &(nrow), &(ncol), &(nmid),
                        &alpha, a->data, &(a->leading), 
                        b->data, &(b->leading),&beta, result->data, &(result->leading));

Any ideas how to obtain a sensible profiler output?


Answer 1:


This problem is caused by the -O2 flag of the gfortran compiler, which R uses by default. The -O2 flag turns on every optimization that -O1 enables, plus more (see gcc manual page 98). One of the optimizations that -O1 enables is -fomit-frame-pointer. Instruments needs the frame pointers to determine the parent of a call frame (see this talk), so without them it cannot attach the Fortran routine to its real caller.

Thus, changing

FFLAGS = -g -O2 $(LTO)

to

FFLAGS = -g -O2 -fno-omit-frame-pointer $(LTO)

in ${R_HOME}/etc/Makeconf resolves the issue. In my case, R_HOME is /Library/Frameworks/R.framework/Versions/3.2/Resources.
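The edit above can also be scripted. The following is only a sketch: it works on a scratch file that stands in for Makeconf, and it assumes the FFLAGS line contains -O2 as in the line quoted above. The names workdir, makeconf and patched are made up for this illustration.

```shell
#!/bin/sh
# Sketch: insert -fno-omit-frame-pointer into the FFLAGS line of a
# Makeconf-style file. This runs on a scratch copy, not on the real file.
set -eu

workdir=$(mktemp -d)
makeconf="$workdir/Makeconf"

# Stand-in for ${R_HOME}/etc/Makeconf (assumption: the real file has a
# comparable FFLAGS line; other lines must stay untouched).
printf '%s\n' 'CFLAGS = -g -O2' 'FFLAGS = -g -O2 $(LTO)' > "$makeconf"

# Rewrite only the line that starts with FFLAGS; write the result to a new
# file so the original stays available as a backup.
sed '/^FFLAGS *=/ s/-O2/-O2 -fno-omit-frame-pointer/' "$makeconf" > "$makeconf.new"

patched=$(grep '^FFLAGS' "$makeconf.new")
echo "$patched"
```

On a real installation, R RHOME prints the R home directory and R CMD config FFLAGS prints the Fortran flags currently in effect. Keep a backup of Makeconf before changing it, and reinstall OpenMx from source afterwards so that the Fortran code is actually recompiled with the new flag.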

Simply omitting -O2 also solves the issue, but it makes OpenMx considerably slower (200 seconds instead of 30 in my case).




Answer 2:


If the OpenMx binary came from the OpenMx website via getOpenMx.R, then it would have been compiled with gcc/gfortran. If it came from CRAN, it would have been compiled with the OS X compilers (LLVM etc.), but it would lack parallel computation because OpenMP is not compatible with LLVM. So you could try the other binary to see whether its profiling information is better. Please let us know which version you were using and whether changing versions helped.



Source: https://stackoverflow.com/questions/32378821/intruments-call-tree-broken-mix-of-r-c-and-fortran
