Question
I ran Valgrind on an example provided in openmpi-1.4/example:
mpirun.openmpi --np 2 valgrind --log-file=output.dat --leak-check=full --tool=memcheck ./ring_c
I then found the following in output.dat:
==30450== Syscall param writev(vector[...]) points to uninitialised byte(s)
==30450== at 0x54DC150: __writev_nocancel (syscall-template.S:81)
==30450== by 0x7E3B312: mca_oob_tcp_msg_send_handler (in /usr/lib/openmpi/lib/openmpi/mca_oob_tcp.so)
==30450== by 0x7E3C50A: mca_oob_tcp_peer_send (in /usr/lib/openmpi/lib/openmpi/mca_oob_tcp.so)
==30450== by 0x7E40266: mca_oob_tcp_send_nb (in /usr/lib/openmpi/lib/openmpi/mca_oob_tcp.so)
==30450== by 0x7C2FFB7: orte_rml_oob_send (in /usr/lib/openmpi/lib/openmpi/mca_rml_oob.so)
==30450== by 0x7C30637: orte_rml_oob_send_buffer (in /usr/lib/openmpi/lib/openmpi/mca_rml_oob.so)
==30450== by 0x824CBAE: ??? (in /usr/lib/openmpi/lib/openmpi/mca_grpcomm_bad.so)
==30450== by 0x4E900FB: ompi_mpi_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450== by 0x4EA8499: PMPI_Init (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450== by 0x4009AD: main (ring_c.c:19)
==30450== Address 0x65c0321 is 161 bytes inside a block of size 256 alloc'd
==30450== at 0x4C2DEAE: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30450== by 0x4F1E619: opal_dss_buffer_extend (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450== by 0x4F1E9D0: opal_dss_copy_payload (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450== by 0x4EFA3DD: orte_grpcomm_base_pack_modex_entries (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450== by 0x824CA8F: ??? (in /usr/lib/openmpi/lib/openmpi/mca_grpcomm_bad.so)
==30450== by 0x4E900FB: ompi_mpi_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450== by 0x4EA8499: PMPI_Init (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450== by 0x4009AD: main (ring_c.c:19)
==30450== HEAP SUMMARY:
==30450== in use at exit: 298,974 bytes in 1,482 blocks
==30450== total heap usage: 7,740 allocs, 6,258 frees, 13,223,431 bytes allocated
... ... ...
==30450== LEAK SUMMARY:
==30450== definitely lost: 51,132 bytes in 69 blocks
==30450== indirectly lost: 14,378 bytes in 39 blocks
==30450== possibly lost: 0 bytes in 0 blocks
==30450== still reachable: 233,464 bytes in 1,374 blocks
==30450== suppressed: 0 bytes in 0 blocks
==30450== Reachable blocks (those to which a pointer was found) are not shown.
==30450== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==30450==
==30450== For counts of detected and suppressed errors, rerun with: -v
==30450== Use --track-origins=yes to see where uninitialized values come from
==30450== ERROR SUMMARY: 63 errors from 63 contexts (suppressed: 0 from 0)
According to the memcheck results, the program leaks memory. Since the example is provided by the openmpi-1.4 developers, does this mean that every program using openmpi-1.4 as a library will leak memory? Fred
Answer 1:
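For performance reasons, Open MPI is not Valgrind-clean, so reports that originate inside the library (such as the one under ompi_mpi_init above) do not by themselves point to a bug in your program. As described in the Open MPI FAQ, a suppression file is provided; pass it to valgrind: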
mpirun -np 2 valgrind --suppressions=$PREFIX/share/openmpi/openmpi-valgrind.supp ./ring_c
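A minimal sketch of how the suppression file might be combined with the options from the question. It assumes that PREFIX is the Open MPI installation prefix (the /usr default below is only a guess based on the library paths in the log), that the program is ./ring_c as in the question, and that one log file per process is wanted, which is what valgrind's %p (process ID) placeholder in --log-file gives. The script only prints the command so it can be checked before running.

```shell
# Assumption: PREFIX is where Open MPI is installed; /usr is a guess, adjust as needed.
PREFIX="${PREFIX:-/usr}"
SUPP="$PREFIX/share/openmpi/openmpi-valgrind.supp"

# Same valgrind options as in the question, plus the suppression file.
# %p is replaced by valgrind with the process ID, so the two ranks
# do not write into the same log file.
CMD="mpirun -np 2 valgrind --tool=memcheck --leak-check=full --suppressions=$SUPP --log-file=output.%p.dat ./ring_c"

# Print the command for inspection; run it by hand once the paths look right.
echo "$CMD"
```

With the suppressions applied, reports that remain in the logs are more likely to come from the application itself; stack traces that contain only Open MPI frames below main can usually be read as library-internal.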
Source: https://stackoverflow.com/questions/35846312/openmpi-and-vargrind