my33_ mysqld killed after the host ran out of memory

Posted by 我的梦境 on 2020-11-28 09:30:31

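Monitoring raised an alert that one MGR node had failed. By the time I looked, LVS had already failed over to the new MGR write node. Investigating the cause: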

/var/log/messages

Mar 27 16:51:05 db10 kernel: crond invoked oom-killer: gfp_mask=0x3000d0, order=2, oom_score_adj=0
Mar 27 16:51:05 db10 kernel: crond cpuset=/ mems_allowed=0-1
Mar 27 16:51:05 db10 kernel: CPU: 35 PID: 12090 Comm: crond Tainted: G           OE  ------------   3.10.0-693.21.1.el7.x86_64 #1
Mar 27 16:51:05 db10 kernel: Hardware name: Inspur SA5212M4/YZMB-00370-109, BIOS 4.1.16 06/21/2018
Mar 27 16:51:05 db10 kernel: Call Trace:
Mar 27 16:51:05 db10 kernel: [<ffffffff816ae7c8>] dump_stack+0x19/0x1b
Mar 27 16:51:05 db10 kernel: [<ffffffff816a9b90>] dump_header+0x90/0x229
Mar 27 16:51:05 db10 kernel: [<ffffffff810ecec2>] ? ktime_get_ts64+0x52/0xf0
Mar 27 16:51:05 db10 kernel: [<ffffffff8114140f>] ? delayacct_end+0x8f/0xb0
Mar 27 16:51:05 db10 kernel: [<ffffffff8118a884>] oom_kill_process+0x254/0x3d0
Mar 27 16:51:05 db10 kernel: [<ffffffff8118a32d>] ? oom_unkillable_task+0xcd/0x120
Mar 27 16:51:05 db10 kernel: [<ffffffff8118a3d6>] ? find_lock_task_mm+0x56/0xc0
Mar 27 16:51:05 db10 kernel: [<ffffffff8118b0c6>] out_of_memory+0x4b6/0x4f0
Mar 27 16:51:05 db10 kernel: [<ffffffff816aa694>] __alloc_pages_slowpath+0x5d6/0x724
Mar 27 16:51:05 db10 kernel: [<ffffffff811912a5>] __alloc_pages_nodemask+0x405/0x420
Mar 27 16:51:05 db10 kernel: [<ffffffff8108859d>] copy_process+0x1dd/0x1970
Mar 27 16:51:05 db10 kernel: [<ffffffff81121930>] ? audit_filter_rules.isra.8+0x280/0xf90
Mar 27 16:51:05 db10 kernel: [<ffffffff81089ee1>] do_fork+0x91/0x320
Mar 27 16:51:05 db10 kernel: [<ffffffff8108a1f6>] SyS_clone+0x16/0x20
Mar 27 16:51:05 db10 kernel: [<ffffffff816c0ad4>] stub_clone+0x44/0x70
Mar 27 16:51:05 db10 kernel: [<ffffffff816c0715>] ? system_call_fastpath+0x1c/0x21
Mar 27 16:51:05 db10 kernel: Mem-Info:
Mar 27 16:51:05 db10 kernel: active_anon:32289123 inactive_anon:180550 isolated_anon:0#012 active_file:960 inactive_file:195 isolated_file:0#012 unevictable:0 dirty:48 writeback:0 unstable:0#012 slab_reclaimable:59079 slab_unreclaimable:32778#012 mapped:13096 shmem:534843 pagetables:66034 bounce:0#012 free:96590 free_pcp:105 free_cma:0
Mar 27 16:51:05 db10 kernel: Node 0 DMA free:13540kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 27 16:51:05 db10 kernel: lowmem_reserve[]: 0 1680 64143 64143
Mar 27 16:51:05 db10 kernel: Node 0 DMA32 free:250600kB min:1176kB low:1468kB high:1764kB active_anon:1442100kB inactive_anon:464kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1934208kB managed:1722948kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:1740kB slab_reclaimable:11840kB slab_unreclaimable:7640kB kernel_stack:368kB pagetables:1132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 27 16:51:05 db10 kernel: lowmem_reserve[]: 0 0 62462 62462
Mar 27 16:51:05 db10 kernel: Node 0 Normal free:54592kB min:43744kB low:54680kB high:65616kB active_anon:62871276kB inactive_anon:371740kB active_file:12kB inactive_file:24kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:65011712kB managed:63961888kB mlocked:0kB dirty:0kB writeback:0kB mapped:1028kB shmem:1190332kB slab_reclaimable:124084kB slab_unreclaimable:45492kB kernel_stack:4768kB pagetables:92984kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Mar 27 16:51:05 db10 kernel: lowmem_reserve[]: 0 0 0 0
Mar 27 16:51:05 db10 kernel: Node 1 Normal free:68040kB min:45176kB low:56468kB high:67764kB active_anon:64843172kB inactive_anon:349996kB active_file:0kB inactive_file:160kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:67108864kB managed:66056756kB mlocked:0kB dirty:192kB writeback:0kB mapped:50080kB shmem:947300kB slab_reclaimable:100392kB slab_unreclaimable:77980kB kernel_stack:28736kB pagetables:170020kB unstable:0kB bounce:0kB free_pcp:640kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:55 all_unreclaimable? no
Mar 27 16:51:05 db10 kernel: lowmem_reserve[]: 0 0 0 0
Mar 27 16:51:05 db10 kernel: Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 1*64kB (U) 1*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 13540kB
Mar 27 16:51:05 db10 kernel: Node 0 DMA32: 264*4kB (UEM) 403*8kB (UEM) 475*16kB (UEM) 342*32kB (UEM) 391*64kB (UEM) 300*128kB (UEM) 208*256kB (UEM) 107*512kB (UEM) 45*1024kB (EM) 5*2048kB (E) 0*4096kB = 250600kB
Mar 27 16:51:05 db10 kernel: Node 0 Normal: 13593*4kB (UEM) 22*8kB (UM) 9*16kB (M) 2*32kB (M) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 54756kB
Mar 27 16:51:05 db10 kernel: Node 1 Normal: 16649*4kB (UEM) 8*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 66660kB
Mar 27 16:51:05 db10 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Mar 27 16:51:05 db10 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Mar 27 16:51:05 db10 kernel: Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Mar 27 16:51:05 db10 kernel: Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Mar 27 16:51:05 db10 kernel: 535067 total pagecache pages
Mar 27 16:51:05 db10 kernel: 0 pages in swap cache
Mar 27 16:51:05 db10 kernel: Swap cache stats: add 0, delete 0, find 0/0
Mar 27 16:51:05 db10 kernel: Free swap  = 0kB
Mar 27 16:51:05 db10 kernel: Total swap = 0kB
Mar 27 16:51:05 db10 kernel: 33517692 pages RAM
Mar 27 16:51:05 db10 kernel: 0 pages HighMem/MovableOnly
Mar 27 16:51:05 db10 kernel: 578319 pages reserved
Mar 27 16:51:05 db10 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Mar 27 16:51:05 db10 kernel: [ 6050]     0  6050    35461    19476      75        0             0 systemd-journal
Mar 27 16:51:05 db10 kernel: [ 6075]     0  6075    30235       80      28        0             0 lvmetad
Mar 27 16:51:05 db10 kernel: [ 6094]     0  6094    10898      172      24        0         -1000 systemd-udevd
Mar 27 16:51:05 db10 kernel: [11985]     0 11985     4845      104      15        0             0 irqbalance
Mar 27 16:51:05 db10 kernel: [11988]   995 11988    25173       71      20        0             0 chronyd
Mar 27 16:51:06 db10 kernel: [11989]    81 11989     6709      161      21        0          -900 dbus-daemon
Mar 27 16:51:06 db10 kernel: [12004]     0 12004    31998      151      22        0             0 smartd
Mar 27 16:51:06 db10 kernel: [12006]   996 12006     2144       37      10        0             0 lsmd
Mar 27 16:51:06 db10 kernel: [12009]     0 12009   186971     9901     237        0             0 rsyslogd
Mar 27 16:51:06 db10 kernel: [12016]     0 12016     1105       39       8        0             0 rngd
Mar 27 16:51:06 db10 kernel: [12034]     0 12034     6620       99      19        0             0 systemd-logind
Mar 27 16:51:06 db10 kernel: [12068]     0 12068     5955       48      17        0             0 atd
Mar 27 16:51:06 db10 kernel: [12090]     0 12090    31058      165      19        0             0 crond
Mar 27 16:51:06 db10 kernel: [12242]     0 12242     1055       19       7        0             0 supervise
Mar 27 16:51:06 db10 kernel: [12243]     0 12243    28807       54      14        0             0 run
Mar 27 16:51:06 db10 kernel: [12260]     0 12260   139002     3217      93        0             0 tuned
Mar 27 16:51:06 db10 kernel: [12273]     0 12273    27021      242      54        0         -1000 sshd
Mar 27 16:51:06 db10 kernel: [12316]     0 12316    27523       33      10        0             0 agetty
Mar 27 16:51:06 db10 kernel: [12319]     0 12319    20378      199      38        0             0 hooagentd
Mar 27 16:51:06 db10 kernel: [12324]     0 12324    80468      586      57        0             0 hooagent
Mar 27 16:51:06 db10 kernel: [12804]     0 12804    22895      259      43        0             0 master
Mar 27 16:51:06 db10 kernel: [12831]    89 12831    22965      281      45        0             0 qmgr
Mar 27 16:51:06 db10 kernel: [13103]     0 13103   828994     4025     115        0             0 wonder-agent
Mar 27 16:51:06 db10 kernel: [20985]     0 20985   175106     1241      72        0         -1000 logmon
Mar 27 16:51:06 db10 kernel: [18570] 42583 18570    32515      159      19        0             0 screen
Mar 27 16:51:06 db10 kernel: [18571] 42583 18571    29229      485      15        0             0 bash
Mar 27 16:51:06 db10 kernel: [22385] 42583 22385    32515      153      19        0             0 screen
Mar 27 16:51:06 db10 kernel: [22386] 42583 22386    29230      485      16        0             0 bash
Mar 27 16:51:06 db10 kernel: [22416] 42583 22416    32515      154      20        0             0 screen
Mar 27 16:51:06 db10 kernel: [22417] 42583 22417    29230      485      13        0             0 bash
Mar 27 16:51:06 db10 kernel: [12032]     0 12032    28326      102      13        0             0 mysqld_safe
Mar 27 16:51:06 db10 kernel: [13363] 33173 13363 74431932 31903076   64367        0             0 mysqld
Mar 27 16:51:06 db10 kernel: [33949]     0 33949    14918     7466      33        0             0 mysqld_exporter
Mar 27 16:51:06 db10 kernel: [ 6287]     0  6287   663221     5068     121        0             0 bbmon
Mar 27 16:51:06 db10 kernel: [ 6621]    89  6621    22921      255      46        0             0 pickup
Mar 27 16:51:06 db10 kernel: [ 6957]    89  6957    22922      256      44        0             0 trivial-rewrite
Mar 27 16:51:06 db10 kernel: [ 7033]     0  7033    45072      238      45        0             0 crond
Mar 27 16:51:06 db10 kernel: [ 7045]     0  7045    28274       48      13        0             0 sh
Mar 27 16:51:06 db10 kernel: [ 7054]     0  7054   372238     1382      69        0             0 dbvip
Mar 27 16:51:06 db10 kernel: [ 7421]     0  7421    47770     1426      49        0             0 python
Mar 27 16:51:06 db10 kernel: [ 7422]     0  7422     4935      159      12        0             0 msval
Mar 27 16:51:06 db10 kernel: Out of memory: Kill process 5396 (mysqld) score 970 or sacrifice child
Mar 27 16:51:06 db10 kernel: Killed process 13363 (mysqld) total-vm:297727728kB, anon-rss:127612364kB, file-rss:0kB, shmem-rss:0kB

The direct cause is that the mysqld process below was killed:

Mar 27 16:51:06 db10 kernel: Killed process 13363 (mysqld) total-vm:297727728kB, anon-rss:127612364kB, file-rss:0kB, shmem-rss:0kB

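Looking further up at the process table: the total_vm and rss columns are counted in 4 kB pages, so mysqld had about 122G resident (rss 31903076 pages, consistent with anon-rss:127612364kB in the kill line) on a machine with 128G of physical memory.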

Mar 27 16:51:06 db10 kernel: [13363] 33173 13363 74431932 31903076   64367        0             0 mysqld
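
The page-to-size conversion for that process-table line can be checked with a few lines of Python. This is only a sketch: the function name `parse_oom_task_line` is made up for illustration, and the 4 KiB page size is an assumption (it is the x86_64 default, but the log itself does not state it).

```python
# Convert the total_vm and rss columns of an oom-killer process-table line to GiB.
# Assumption: 4 KiB pages; the kernel prints these two columns in pages.
PAGE_KIB = 4

def parse_oom_task_line(line: str) -> dict:
    """Parse '[pid] uid tgid total_vm rss nr_ptes swapents oom_score_adj name'."""
    # Drop the syslog prefix up to 'kernel:' and the brackets around the pid.
    payload = line.split("kernel:", 1)[1]
    fields = payload.replace("[", " ").replace("]", " ").split()
    pid, uid, tgid, total_vm, rss = (int(x) for x in fields[:5])
    return {
        "pid": pid,
        "name": fields[-1],
        "total_vm_gib": total_vm * PAGE_KIB / 1024**2,  # KiB -> GiB
        "rss_gib": rss * PAGE_KIB / 1024**2,
    }

line = ("Mar 27 16:51:06 db10 kernel: [13363] 33173 13363 74431932 31903076"
        "   64367        0             0 mysqld")
info = parse_oom_task_line(line)
print(f"{info['name']} pid={info['pid']} "
      f"virtual={info['total_vm_gib']:.1f} GiB resident={info['rss_gib']:.1f} GiB")
```

Under that assumption the resident size comes out at roughly 121.7 GiB and the virtual size at roughly 283.9 GiB, which lines up with the anon-rss and total-vm figures in the "Killed process" line.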

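Further up, the log mentions Node 0, Node 1, hugepages_total and swap, which relate mainly to NUMA and huge pages. I set those two topics aside for the moment. Since mysqld was killed once its resident memory had grown to nearly all of the physical RAM, I first lowered innodb_buffer_pool_size from 80G to 64G as a safeguard against a repeat, and then continued the investigation at leisure.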

mysql> show variables like '%pool_size%';
+-------------------------+-------------+
| Variable_name           | Value       |
+-------------------------+-------------+
| innodb_buffer_pool_size | 85899345920 |
+-------------------------+-------------+
1 row in set (0.00 sec)

mysql> select 64*1024*1024*1024;
+-------------------+
| 64*1024*1024*1024 |
+-------------------+
|       68719476736 |
+-------------------+
1 row in set (0.00 sec)

mysql> 
mysql> 
mysql> set global innodb_buffer_pool_size=68719476736;
Query OK, 0 rows affected (0.00 sec)

mysql> show global variables like '%pool_size%';
+-------------------------+-------------+
| Variable_name           | Value       |
+-------------------------+-------------+
| innodb_buffer_pool_size | 68719476736 |
+-------------------------+-------------+
1 row in set (0.00 sec)

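Note: the configuration file must be changed as well. After the change, memory is released back to the OS gradually; memory that is still in use is of course not released.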

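Back to the memory problem. The oom-killer dump walks through memory in the order NUMA nodes --> huge pages --> swap: both NUMA nodes had very little free memory, no huge pages were configured, and swap was 0. Setting NUMA and huge pages aside for now and looking at swap first: when the kernel needed to push pages out to swap, there was none, so it killed mysqld.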

Mar 27 16:51:05 db10 kernel: 535067 total pagecache pages
Mar 27 16:51:05 db10 kernel: 0 pages in swap cache
Mar 27 16:51:05 db10 kernel: Swap cache stats: add 0, delete 0, find 0/0
Mar 27 16:51:05 db10 kernel: Free swap  = 0kB
Mar 27 16:51:05 db10 kernel: Total swap = 0kB
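
To put the figures in this excerpt into familiar units, here is a small sketch that extracts the page-cache count and the swap totals from the dump. The helper name `swap_summary` is hypothetical, the 4 KiB page size is an assumption, and the syslog prefix is shortened in the sample text.

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB pages

def swap_summary(log_text: str) -> dict:
    """Extract page-cache and swap figures from an oom-killer memory dump."""
    pagecache = int(re.search(r"(\d+) total pagecache pages", log_text).group(1))
    free_swap = int(re.search(r"Free swap\s*=\s*(\d+)kB", log_text).group(1))
    total_swap = int(re.search(r"Total swap\s*=\s*(\d+)kB", log_text).group(1))
    return {
        "pagecache_gib": pagecache * PAGE_KIB / 1024**2,  # pages -> GiB
        "free_swap_kb": free_swap,
        "total_swap_kb": total_swap,
        "swap_configured": total_swap > 0,
    }

log_text = """\
kernel: 535067 total pagecache pages
kernel: 0 pages in swap cache
kernel: Swap cache stats: add 0, delete 0, find 0/0
kernel: Free swap  = 0kB
kernel: Total swap = 0kB
"""
summary = swap_summary(log_text)
print(summary)
```

With these inputs the page cache amounts to roughly 2 GiB and `swap_configured` is false, i.e. the host had no swap device at all.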

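The kernel needed memory and had no swap to fall back on (535067 is the number of page-cache pages present at that moment, about 2G, not the size of the request). It then printed the memory usage of every process, found that mysqld was using by far the most, and killed it to obtain reclaimable memory. That suggests one remedy: add swap. With 128G of physical memory, add 64G of swap. Swap is a region on disk, so accessing it is naturally slower than accessing memory, and paging between memory and swap when memory is short also costs OS performance, so more swap is not always better. Here I added 64G; the steps are omitted. What gets moved to swap is usually cold data, which can be reclaimed periodically.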

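Now the NUMA question. When the database was migrated, it was installed on CentOS 7.4, where NUMA is off by default, or more precisely, not configured by default. The configuration file was copied straight from the original system (CentOS 6.2), adjusted slightly, and the database was installed with it. So where does NUMA come in?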

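Checking the NUMA-related parameters on the running instance shows the parameter is ON. Since the configuration file was copied over, the copied file must have had it ON: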

mysql> show variables like '%numa%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| innodb_numa_interleave | ON    |
+------------------------+-------+
1 row in set (0.00 sec)

The configuration file contains:

loose_innodb_numa_interleave                                    = 1

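According to the official MySQL description, enabling this parameter means mysqld uses the NUMA MPOL_INTERLEAVE policy while the InnoDB buffer pool is allocated (the policy is set back to MPOL_DEFAULT afterwards), and it requires NUMA support on the system: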

Enables the NUMA interleave memory policy for allocation of the InnoDB buffer pool. When innodb_numa_interleave is enabled, the NUMA memory policy is set to MPOL_INTERLEAVE for the mysqld process. After the InnoDB buffer pool is allocated, the NUMA memory policy is set back to MPOL_DEFAULT. For the innodb_numa_interleave option to be available, MySQL must be compiled on a NUMA-enabled Linux system.

Checking the old system: the original CentOS 6.2 system did indeed have NUMA enabled:

$ numastat
                           node0           node1
numa_hit             48300952797     40465837517
numa_miss             8640293195        30793567
numa_foreign            30793567      8640293195
interleave_hit        1459753044      1206990706
local_node           47936393296     39695132194
other_node            9004852696       801498890
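
As a rough way to read that numastat output, the sketch below parses it and computes, per node, the share of pages that landed on the node although another node was preferred (numa_miss relative to numa_hit + numa_miss). The function names are hypothetical and the ratio is just one possible summary, not something the original analysis relied on.

```python
def parse_numastat(text: str) -> dict:
    """Return {node: {counter: value}} from plain `numastat` output."""
    rows = [ln.split() for ln in text.strip().splitlines()]
    nodes = rows[0]  # header row: node0 node1 ...
    stats = {node: {} for node in nodes}
    for counter, *values in rows[1:]:
        for node, value in zip(nodes, values):
            stats[node][counter] = int(value)
    return stats

def miss_ratio(node_stats: dict) -> float:
    """Fraction of pages allocated on this node that were meant for another node."""
    hit, miss = node_stats["numa_hit"], node_stats["numa_miss"]
    return miss / (hit + miss)

sample = """\
                           node0           node1
numa_hit             48300952797     40465837517
numa_miss             8640293195        30793567
numa_foreign            30793567      8640293195
interleave_hit        1459753044      1206990706
local_node           47936393296     39695132194
other_node            9004852696       801498890
"""
stats = parse_numastat(sample)
for node, counters in stats.items():
    print(f"{node}: miss ratio {miss_ratio(counters):.2%}")
```

Non-zero interleave_hit counters on both nodes are consistent with an interleave policy having been in use on the old host.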

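The new CentOS 7.4 system, however, does not have NUMA enabled by default, so the fix follows directly: turn off innodb_numa_interleave. The parameter is OFF by default as of MySQL 5.7.9 and the current version is MySQL 5.7.24, so deleting it from the configuration file is enough; the instance must be restarted for the change to take effect. Because memory had already been reduced and swap added, only the read nodes were restarted here (LVS traffic is switched to other nodes before a read node restarts), while on the write node only the configuration file was changed. With this many server nodes and a change this important, go back and verify every node once the changes are done, so that none is missed.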

Regarding huge pages: hugepages_total=0 means huge pages are off. MySQL has the following two huge-page parameters, which are also off by default:

 

mysql> show variables like 'large_page%';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| large_page_size | 0     |
| large_pages     | OFF   |
+-----------------+-------+
2 rows in set (0.00 sec)

 

State with huge pages off:

$ grep Huge /proc/meminfo
AnonHugePages:  39612416 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
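
One detail worth reading out of that output: HugePages_Total covers the explicitly reserved huge-page pool (the one large_pages would use), while AnonHugePages counts anonymous memory backed by transparent huge pages, which is a separate kernel mechanism. A minimal sketch that parses the text and separates the two (the helper name `parse_meminfo` is made up for illustration):

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style lines into {key: integer value}."""
    info = {}
    for line in text.strip().splitlines():
        key, _, rest = line.partition(":")
        info[key.strip()] = int(rest.split()[0])  # ignore the optional 'kB' unit
    return info

sample = """\
AnonHugePages:  39612416 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
"""
info = parse_meminfo(sample)
explicit_hugepages_on = info["HugePages_Total"] > 0
thp_gib = info["AnonHugePages"] / 1024**2  # kB -> GiB

print(f"explicit huge pages reserved: {explicit_hugepages_on}")
print(f"transparent huge pages in use: {thp_gib:.1f} GiB")
```

So the explicit pool is indeed empty, while a sizeable amount of anonymous memory is still backed by transparent huge pages on this host.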

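The log analysis has produced workarounds, but the real problem is still unsolved: why does MySQL use so much memory that 128G is not enough? What is consuming it?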

MySQL memory usage =
key_buffer_size    
+  query_cache_size    
+  tmp_table_size    
+  innodb_buffer_pool_size    
+  innodb_additional_mem_pool_size    
+  innodb_log_buffer_size    

+  max_connections    
×( sort_buffer_size    
+  read_buffer_size    
+  read_rnd_buffer_size    
+  join_buffer_size    
+  thread_stack    
+  binlog_cache_size)

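Those are the components of MySQL memory usage. Checking them one by one, I found that binlog_cache_size had recently been raised from 16M to 128M. Connections on this system can reach 5000 at peak, so it is easy to see how much memory the product requires. The fix: binlog_cache_size is allocated per session, so restore it to 16M. The official description of the parameter: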

The size of the cache to hold changes to the binary log during a transaction. A binary log cache is allocated for each client if the server supports any transactional storage engines and if the server has the binary log enabled (--log-bin option).

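While a transaction is running, the binlog cache buffers that transaction's changes before they are written to the binary log. This means the parameter has no effect for SELECT statements, but for DML statements the cache is allocated according to this parameter, so the higher the DML concurrency, the more memory is used.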
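
The effect of the binlog_cache_size change can be estimated with simple arithmetic. The sketch below is a worst-case bound only: it assumes every one of the 5000 connections holds a full binlog cache at the same time, which overstates real usage, and it ignores all other per-connection buffers. The function name `worst_case_gib` is hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

def worst_case_gib(per_connection_bytes: int, connections: int,
                   global_bytes: int = 0) -> float:
    """Upper bound in GiB: global buffers plus one full buffer per connection."""
    return (global_bytes + per_connection_bytes * connections) / GIB

connections = 5000          # peak connection count quoted in the text
buffer_pool = 85899345920   # innodb_buffer_pool_size before it was lowered (80 GiB)

binlog_16m = worst_case_gib(16 * MIB, connections)
binlog_128m = worst_case_gib(128 * MIB, connections)
total_128m = worst_case_gib(128 * MIB, connections, buffer_pool)

print(f"binlog cache at 16M : {binlog_16m:.1f} GiB")
print(f"binlog cache at 128M: {binlog_128m:.1f} GiB")
print(f"buffer pool + binlog cache at 128M: {total_128m:.1f} GiB")
```

Even as an upper bound, the jump from about 78 GiB to 625 GiB for the binlog cache alone shows why a per-connection buffer should not be sized like a global one on a 128G host.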

Aside:

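Not long ago a newly hired database expert proposed optimizations and changed about 10 parameters. Including this one, two of them have now been rolled back. His changes were based on experience at his previous company, which had little concurrency, unlike here, where a single instance can reach four to five thousand concurrent connections; he had also misunderstood what some of the parameters mean. Before changing a live production system, research the change thoroughly, and ideally reproduce the production workload in a test environment before rolling it out.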

 
