Interpretation of memory profiling output of `Rprof`

I am trying use profiling to see which part of my code is reposponsible for the maximum usage of 3GB of memory (as reported by gc() statistic on maximum used memory, see here how). I am running memory profiling like this:

Rprof(line.profiling = TRUE, memory.profiling = TRUE)
graf(...) # ... here I run the profiled code
Rprof(NULL)
summaryRprof(lines = "both", memory = "both")

And the output is the following:

$by.total
                       total.time total.pct mem.total self.time self.pct
"graf"                     299.12     99.69   50814.4      0.02     0.01
#2                         299.12     99.69   50814.4      0.00     0.00
"graf.fit.laplace"         299.06     99.67   50787.2      0.00     0.00
"doTryCatch"               103.42     34.47    4339.2      0.00     0.00
"chol"                     103.42     34.47    4339.2      0.00     0.00
"tryCatch"                 103.42     34.47    4339.2      0.00     0.00
"tryCatchList"             103.42     34.47    4339.2      0.00     0.00
"tryCatchOne"              103.42     34.47    4339.2      0.00     0.00
"chol.default"             101.62     33.87    1087.0    101.62    33.87
graf.fit.laplace.R#46       85.80     28.60    3633.2      0.00     0.00
"backsolve"                 78.82     26.27    1635.2     58.40    19.46

How shall I interpret mem.total? What is it and what is the unit of it? I tried to look at the documentation, namely ?Rprof and ?summaryRprof, but it seems it's not well documented :-/

EDIT: Here they say that Rprof "probes the total memory usage of R at regular time intervals". But that doesn't fit to the 50GB which are way beyond what my memory is able to fit! (8GB physical + 12 GB pagefile now).

Similarly, as pointed out by R Yoda, ?summaryRprof says that with memory = "both" it means "the change in total memory". But what is it exactly (is it a total memory or change of total memory), and how does it fit with the 50GB number?

EDIT: the same analysis done in profvis - when i hover ovew the 50812, it shows "Memory allocation (MB)", and hover over the black bar close to that vertical line "Percentage of peak memory allocation and deallocation". Not sure what that means... This is like 50 GB, which means like this may be perhaps the sum of all allocations (??) ... definitely not the peak memory usage:

?summaryRprof says:

If memory = "both" the same list but with memory consumption in Mb in addition to the timings.

So mem.total is in MB

With memory = "both" the change in total memory (truncated at zero) is reported [...]

You have 8 GB RAM + 12 GB swap but mem.total proclaims you have used 50 GB?

Because it is the aggregated delta between two subsequent probes (memory usage snapshots taken by Rprof in regular time intervals: If a probe is taken while the execution is in function f the memory usage delta to the last probe is added to the mem.total of f).

The memory usage delta could be negative but I have never seen negative mem.total values so I am guessing (!) only positive values are added to mem.total.

This would explain the 50 GB total usage you are seeing: It is not the amount of allocated memory during a single point of time but the aggregated memory delta during the complete execution time.

This also explains the fact that gc only shows 3 GB as "max used (Mb)": The memory is allocated and freed/deallocated many times so that you do not run into memory pressure but this costs a lot of time (moving so much data in RAM invalidates all caches and is slow therefore) on top of the calculation logic the CPU applies.

This summary (IMHO) also seems to hide the fact that the garbage collector (gc) is starting at non-deterministic points in time to clean-up freed memory.

Since the gc starts lazy (non-deterministically) it would IMHO be unfair to attribute the negative memory deltas to a single function just probed.

I would interpret mem.total as mem.total.used.during.runtime which would possibly be a better label for the column.

profvis has a more detailed memory usage summary (as you can see in your screen shot in your question): It also aggregates the negative memory usage deltas (freed memory) but the profvis documentation also warns about the short-comings:

The code panel also shows memory allocation and deallocation. Interpreting this information can be a little tricky, because it does not necessarily reflect memory allocated and deallcated at that line of code. The sampling profiler records information about memory allocations that happen between the previous sample and the current one. This means that the allocation/deallocation values on that line may have actually occurred in a previous line of code.

A more details answer would require more research time (I don't have) - to look into the C and R source - to understand (replicate) the aggregation logic of summaryRprof based on the data files created by Rprof

Rprof data files (Rprof.out) look like this:

:376447:6176258:30587312:152:1#2 "test" 1#1 "test2"

The first four numbers (separated by colons) mean (see ?summaryRprof) - R_SmallVallocSize: the vector memory [number of buckets] in small blocks on the R heap - R_LargeVallocSize: the vector memory [number of buckets] in large blocks (from malloc) - the memory in nodes on the R heap - the number of calls to the internal function duplicate in the time interval (used to duplicate vectors eg. in case of copy-on-first-write semantics of function arguments)

The strings are the function call stack.

Only the first two number are relevant to calculate the current memory usage (of vectors) in MB:

TotalBuckets = R_SmallVallocSize + R_LargeVallocSize
mem.used = TotalBuckets * 8 Bytes / 1024 / 1024
# 50 MB in the above `Rprof` probe line:
# (376447 + 6176258) * 8 / 1024 / 1024

For details about the Vcells see ?Memory.

BTW: I wanted to try out summaryRProf(memory = "stats", diff = F) to get a current memory summary but I get an error message with R3.4.4 64-bits on Ubuntu:

Error in tapply(seq_len(1L), list(index = c("1::#File", "\"test2\":1#1",  : 
  arguments must have same length

Can you reproduce this (looks like "stats" is broken)?

来源：https://stackoverflow.com/questions/58250126/interpretation-of-memory-profiling-output-of-rprof

标签

memory-profiling