Valgrind 编译器依赖与内存泄漏类型再探

利用 Valgrind 检测自定义类中内存分配和释放情况.

Posted Jul 5, 2019 Updated Nov 16, 2024

By Min-Ye Zhang

14 min read

背景

最近考虑重构 GAP3 代码, 于是学习了有关 Fortran 面向对象编程的知识, 接触到了设计模式 (Design Pattern) 的概念. 其中使用自定义类和用委派关系实现继承是自己之前很少在 Fortran 中用的, 主要还是面向过程的编程思维. 事实上面向对象的思维也是在研究生后学 Python 过程中慢慢转过去的. 有关设计模式的学习内容以后有机会再整理上来.

这篇文章算是记录一点点在 Fortran 中进行面向对象编程的实践, 主要用的是 main.f90 和 mytypes.f90 这两段非常短的代码.

mytypes.f90 包含一个模块, 其中定义了 myarrays 类, 其数据包含两个可分配数组, 分别是一维整型数组和二维浮点数数组, 并定义了相关 constructor 和 destructor 例程.
main.f90 是主程序, 仅调用 constructor 和 destructor 方法, 因此原则上没有内存泄漏.

接下来就是用 Valgrind 作内存检测, 看一看. 编译用的 Makefile 在这里, 编译得到的可执行程序是 test. 测试平台是 Fedora 27.

依赖编译器的 Valgrind 报告

gfortran 编译

使用 gfortran (GCC 7.3.1) 编译得到的 test, Valgrind 检测没有报错, 但堆调用中的 alloc 数为 23, 比 new_my_array 例程中 allocate 语句 (2) 要多很多.

$ valgrind --leak-check=full --show-leak-kinds=all ./test
==10854== Memcheck, a memory error detector
==10854== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==10854== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==10854== Command: ./test
==10854==
==10854==
==10854== HEAP SUMMARY:
==10854==     in use at exit: 0 bytes in 0 blocks
==10854==   total heap usage: 23 allocs, 23 frees, 13,520 bytes allocated
==10854==
==10854== All heap blocks were freed -- no leaks are possible
==10854==
==10854== For counts of detected and suppressed errors, rerun with: -v
==10854== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Intel Fortran 编译

用 Intel Fortran (2018 update 1) 编译, 堆调用中的 alloc 数为 4, 虽然也大于 2 但比 gfortran 里的 23 要小. 此外, Valgrind 报告了 32 bytes 的 “still reachable” 泄漏, 这一泄漏和该版本 Fedora 中 glibc 的 bug 有关. 没有报错.

$ valgrind --leak-check=full --show-leak-kinds=all ./test
==13583== Memcheck, a memory error detector
==13583== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==13583== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==13583== Command: ./test
==13583==
==13583==
==13583== HEAP SUMMARY:
==13583==     in use at exit: 32 bytes in 1 blocks
==13583==   total heap usage: 4 allocs, 3 frees, 152 bytes allocated
==13583==
==13583== 32 bytes in 1 blocks are still reachable in loss record 1 of 1
==13583==    at 0x4C2F01A: calloc (vg_replace_malloc.c:752)
==13583==    by 0x5971714: _dlerror_run (in /usr/lib64/libdl-2.26.so)
==13583==    by 0x5971129: dlsym (in /usr/lib64/libdl-2.26.so)
==13583==    by 0x41165E: real_aio_init (in /home/stevezhang/codes/code-self-teaching/f90/oop/derived_types/test)
==13583==    by 0x40849B: for__once_private (in /home/stevezhang/codes/code-self-teaching/f90/oop/derived_types/test)
==13583==    by 0x4066B4: for_rtl_init_ (in /home/stevezhang/codes/code-self-teaching/f90/oop/derived_types/test)
==13583==    by 0x402948: main (in /home/stevezhang/codes/code-self-teaching/f90/oop/derived_types/test)
==13583==
==13583== LEAK SUMMARY:
==13583==    definitely lost: 0 bytes in 0 blocks
==13583==    indirectly lost: 0 bytes in 0 blocks
==13583==      possibly lost: 0 bytes in 0 blocks
==13583==    still reachable: 32 bytes in 1 blocks
==13583==         suppressed: 0 bytes in 0 blocks
==13583==
==13583== For counts of detected and suppressed errors, rerun with: -v
==13583== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

接下来做一些小的实验. 如果在主程序中特意省略掉 destructor, 会得到 104 bytes 的 “possibly lost”, 同时 Error Summary 中出现两个错误. 比较奇怪的是, 原则上当 rank 为 2 时, 2 个整型和 4 个浮点数对应的内存损失为 24 bytes.

进一步实验

将 rank 从 2 增加到 4, 损失增加到 160 bytes. 原则上应该是 80 (4 整型, 16 浮点数).
增加另一个 myarrays 对象, 损失增加到 208 bytes.
修改 destructor 方法 destroy_my_array, 跳过二维数组 rarr2d 的 deallocate, 在主程序中调用 destructor. 此时内存损失为 56 (rank=2)和 104 (rank 4) bytes.

这表明有 80 bytes 好像被”附着”在每个自定义类的对象上. 更具体的, 每个可分配数组”附着”了 40 bytes 的内存.

回看 gfortran

现在回到 gfortran 编译上, 也是有意地去掉 destructor, 看看 Valgrind 如何响应.

当 rank=2 时, Valgrind 报告了 24 bytes 的 “still reachable” 泄漏, 没有报错. 这个泄漏量和根据数据类型预计的量是一样的, 与此同时 Valgrind 类认为这一内存泄漏是不构成关键的性能问题.

$ valgrind --leak-check=full --show-leak-kinds=all ./test
==16808== Memcheck, a memory error detector
==16808== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16808== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==16808== Command: ./test
==16808==
==16808==
==16808== HEAP SUMMARY:
==16808==     in use at exit: 24 bytes in 2 blocks
==16808==   total heap usage: 23 allocs, 21 frees, 13,520 bytes allocated
==16808==
==16808== 8 bytes in 1 blocks are still reachable in loss record 1 of 2
==16808==    at 0x4C2CDCB: malloc (vg_replace_malloc.c:299)
==16808==    by 0x400F25: __mytypes_MOD_new_my_array (mytypes.f90:17)
==16808==    by 0x40116C: MAIN__ (main.f90:8)
==16808==    by 0x4011AF: main (main.f90:3)
==16808==
==16808== 16 bytes in 1 blocks are still reachable in loss record 2 of 2
==16808==    at 0x4C2CDCB: malloc (vg_replace_malloc.c:299)
==16808==    by 0x4010C1: __mytypes_MOD_new_my_array (mytypes.f90:20)
==16808==    by 0x40116C: MAIN__ (main.f90:8)
==16808==    by 0x4011AF: main (main.f90:3)
==16808==
==16808== LEAK SUMMARY:
==16808==    definitely lost: 0 bytes in 0 blocks
==16808==    indirectly lost: 0 bytes in 0 blocks
==16808==      possibly lost: 0 bytes in 0 blocks
==16808==    still reachable: 24 bytes in 2 blocks
==16808==         suppressed: 0 bytes in 0 blocks
==16808==
==16808== For counts of detected and suppressed errors, rerun with: -v
==16808== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

“内存泄漏”再探

在检索上面 still reachable leak 问题的时候, 发现了 SO 上关于的一个回答

There is more than one way to define “memory leak”. In particular, there are two primary definitions of “memory leak” that are in common usage among programmers.
The first commonly used definition of “memory leak” is, “Memory was allocated and was not subsequently freed before the program terminated.” However, many programmers (rightly) argue that certain types of memory leaks that fit this definition don’t actually pose any sort of problem, and therefore should not be considered true “memory leaks”.
An arguably stricter (and more useful) definition of “memory leak” is, “Memory was allocated and cannot be subsequently freed because the program no longer has any pointers to the allocated memory block.” In other words, you cannot free memory that you no longer have any pointers to. Such memory is therefore a “memory leak”. Valgrind uses this stricter definition of the term “memory leak”. This is the type of leak which can potentially cause significant heap depletion, especially for long lived processes.
The “still reachable” category within Valgrind’s leak report refers to allocations that fit only the first definition of “memory leak”. These blocks were not freed, but they could have been freed (if the programmer had wanted to) because the program still was keeping track of pointers to those memory blocks.
In general, there is no need to worry about “still reachable” blocks. They don’t pose the sort of problem that true memory leaks can cause. For instance, there is normally no potential for heap exhaustion from “still reachable” blocks. This is because these blocks are usually one-time allocations, references to which are kept throughout the duration of the process’s lifetime. While you could go through and ensure that your program frees all allocated memory, there is usually no practical benefit from doing so since the operating system will reclaim all of the process’s memory after the process terminates, anyway. Contrast this with true memory leaks which, if left unfixed, could cause a process to run out of memory if left running long enough, or will simply cause a process to consume far more memory than is necessary.

翻译如下

定义 “内存泄漏” 的方式不止一种. 特别的, 在程序员间常用的主要有两种 “内存泄漏” 的定义.
第一种常用的定义是, “内存被分配, 随后没有在程序结束前被释放”. 但是, 很多程序员 (正确地) 主张说符合这一定义的内存泄漏并不会造成问题, 因此并不被认为是真正的内存泄漏.
“内存泄漏”的一种可能更为严格(也更有用)的定义是, “内存被分配后, 由于程序失去了指向被分配内存块的指针而无法被释放”. 换句话说, 你无法释放没有指针指向的内存. 所以这样的内存属于”内存泄漏”. Valgrind 用的是这一更为严格的定义. 这类泄漏可能产生严重的堆损耗, 特别是在长期活动的进程中.
Valgrind 的泄漏报告中 “still reachable” 分类指的是只满足第一类定义的内存分配. 这些内存块没有被释放, 但他们是可以被释放的(只要程序员愿意), 因为程序仍然保有指向这些内存块的指针.
一般而言, 不必担心 “still reachable” 的内存块. 他们不会带来真正的内存泄漏会导致的问题. 比如说, “still reachable” 的内存块通常不会导致堆耗尽. 这是因为这些块都是单次分配, 程序在整个生命周期中都保留对他们的指向. 你当然可以梳理整个程序, 保证这些内存块都被释放, 但这实际并没什么好处, 因为操作系统会在进程结束后回收进程的全部内存. 与之相对, 如果真正的内存泄漏没有被修正, 那么就会导致一个进程在运行足够长时间后耗尽所有内存, 或者说消耗比它所必需的多得多的内存.

这是对之前 Memcheck 初探一文最后泄漏类型梳理的重要补充. 答主非常细心的区分了两种内存泄漏的类型. 我们重新来看当时的 abc 程序

  
program abc
    integer :: i
    integer, allocatable :: data(:)

    allocate(data(5))
    print*, rank(data), size(data), loc(data)
    do i = 1, 5
        data(i-1) = i
    end do
    print*, data(1)
    print*, rank(data), size(data), loc(data)
end program abc

并将 data 越界赋值语句注释. 用 gfortran 编译会得到 20 bytes 的 definite loss. 如果用 ifort, 则会得到 60 bytes 的 possibly lost. 令人摸不着头脑的是, 如果把这一段代码放到 main.f90 中, 注释掉原来的 myarrays 的部分, 同样用 gfortran 编译, 得到的是 20 bytes 的 still reachable leak. ifort 仍给出 60 bytes 的 possibly lost.

总结

从以上非常直接的例子里可以得到的两个结论, 首先是 do not oversmart your compiler. 跟人类语言互译一样, 不同编译器可能将一段高级语言翻译成风格不同的机器码, 这可能就是导致 Valgrind 检测结果不同的原因. 其次, 也是很自然的, 既然编译器存在这样的不确定性, 那么编程人员就应该写好内存分配和释放的语句, 从源头减少这样的不确定性.

tool

Valgrind Fortran

This post is licensed under CC BY 4.0 by the author.