Chinaunix首页 | 论坛 | 博客
  • 博客访问: 273659
  • 博文数量: 67
  • 博客积分: 0
  • 博客等级: 民兵
  • 技术积分: 620
  • 用 户 组: 普通用户
  • 注册时间: 2015-07-12 19:56
文章分类

全部博文(67)

文章存档

2019年(1)

2018年(1)

2017年(4)

2016年(34)

2015年(27)

我的朋友

分类: LINUX

2015-08-21 14:51:39

  Teamtrack ID (Bug/RFE/Task):
           BUG86127:   klogd stuck consuming CPU

        Root Cause (Bug) or Purpose (RFE/Task):
           two points:
           (1). Currently, the klogd can’t process the full buffer that doesn’t include the end character ‘\n’.  In this case, it will lead the death cycle of klogd.  It means that the klogd sticks up CPU.
           (2). What logs are so large and not include the end character in the firewall sessions of kernel module.

           For #1, you can obtain the more answers from the CL480815.
           For #2, there is NO rate limit about plenty of repeat kernel logs. Sometimes some parts of the printk will produce a large number of output, resulting in the kernel ring buffer overflow and old information is quickly covered. Maybe it leads in that the kernel ring buffer does not include ‘\n’. At the same time, the klogd gets all the kernel logs by invoking syslog and will not parse the logs.  It leads to reproduce #1.
           The answer is not that a complete log, but many logs are covered in each other. Eventually forming a huge log which doesn't include the end character.

        Solution:
           For plenty of repeat kernel logs, there will be rate limit in the firewall sessions of kernel module. Be similar to the function "net_ratelimit" in the network module, we use the function "printk_ratelimit" instead and control the rate by operating the value of the files "/proc/sys/kernel/printk_ratelimit" and "/proc/sys/kernel/printk_ratelimit_burst".

           Note: This solution will cause some sessions delete/idle time logs lost when a lot of sessions are online at the same time.

        Unit test:
           (1). Replace the xt_session.ko in XTM2520 device.
           (2). Continually create plenty of 4100 authentication sessions and keep 30W sessions online at the same time.
           (3). Check sessions and the rate limit of printk. The following is a snapshot.

           wgsupport@Member2:/proc/sys/kernel# cat  printk_ratelimit
           5
           wgsupport@Member2:/proc/sys/kernel# cat  printk_ratelimit_burst
           10
           wgsupport@Member2:/proc/sys/kernel# wgcmd status /authentication/list type=5 | grep user_id | wc
           358252    358252  11491124
           wgsupport@Member2:/proc/sys/kernel# top

           top - 15:04:23 up 18:50,  0 users,  load average: 8.63, 9.05, 9.17
           Tasks: 185 total,  12 running, 173 sleeping,   0 stopped,   0 zombie
           Cpu(s): 21.2%us, 44.6%sy,  0.0%ni, 32.5%id,  0.1%wa,  0.0%hi,  1.6%si,  0.0%st
           Mem:  32828280k total,  7607828k used, 25220452k free,    31640k buffers
           Swap: 16383996k total,        0k used, 16383996k free,   285608k cached

             PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
            3117 nobody     5 -15 5852m 5.6g 2760 R   78 18.0   1085:01 sessiond
            3129 wgsuppor   5 -15  101m 4916 4016 R   75  0.0 646:36.27 l2tp-wrap
            3124 wgsuppor   5 -15 83952 4128 3356 R   71  0.0 653:37.15 fwagent
            3111 wgsuppor   5 -15 89348 4092 3304 R   70  0.0 634:09.75 sslvpn-client-w
            3128 wgsuppor   5 -15 84316 4104 3352 R   68  0.0 650:07.93 snmp-wrap
            2948 wgsuppor   5 -15 73524 3536 2856 R   68  0.0 640:03.23 pimd-wrap
            2759 wgsuppor   5 -15 93952 8072 4116 R   63  0.0 719:11.23 loggerd
               4 wgsuppor  20   0     0    0    0 R   57  0.0 182:37.68 kworker/0:0
              10 wgsuppor  20   0     0    0    0 R   52  0.0 230:30.60 rcuc/0
              15 wgsuppor  20   0     0    0    0 S    3  0.0   4:47.71 ksoftirqd/1

           (4) Check the kernel logs by the command "tail -f /proc/kmsg" and all logs are normal. The following is a snapshot.

           <7>[48323.430678] xt_session: session_entry_expiry: Queuing multicast for idle timeout of "session" list entry for 10.2.58.5
           <7>[48323.430681] xt_session: session_entry_expiry: Queuing multicast for idle timeout of "session" list entry for 10.2.58.6
           <7>[48323.430684] xt_session: session_entry_expiry: Queuing multicast for idle timeout of "session" list entry for 10.2.58.7
           <7>[48323.430686] xt_session: session_entry_expiry: Queuing multicast for idle timeout of "session" list entry for 10.2.58.18

           (5) Modify the rate limit to increase it and check the kernel logs that is abnormal. The following is a snapshot.

           wgsupport@Member2:/proc/sys/kernel# echo 1 > printk_ratelimit
           wgsupport@Member2:/proc/sys/kernel# echo 999 > printk_ratelimit_burst
           wgsupport@Member2:/proc/sys/kernel#
           wgsupport@Member2:/proc/sys/kernel# cat printk_ratelimit
           1
           wgsupport@Member2:/proc/sys/kernel# cat printk_ratelimit_burst
           999
           wgsupport@Member2:/proc/sys/kernel# top

           top - 15:13:22 up 18:59,  0 users,  load average: 9.26, 9.57, 9.45
           Tasks: 185 total,  10 running, 175 sleeping,   0 stopped,   0 zombie
           Cpu(s): 23.9%us, 46.6%sy,  0.0%ni, 28.3%id,  0.0%wa,  0.0%hi,  1.2%si,  0.0%st
           Mem:  32828280k total,  7683988k used, 25144292k free,    31668k buffers
           Swap: 16383996k total,        0k used, 16383996k free,   285776k cached

             PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
            3117 nobody     5 -15 5874m 5.7g 2760 S   94 18.1   1093:46 sessiond
            3124 wgsuppor   5 -15 83952 4128 3356 R   73  0.0 659:47.52 fwagent
            3128 wgsuppor   5 -15 84316 4104 3352 R   73  0.0 656:17.56 snmp-wrap
            3111 wgsuppor   5 -15 89348 4092 3304 R   72  0.0 640:08.83 sslvpn-client-w
            3129 wgsuppor   5 -15  101m 4916 4016 R   71  0.0 652:41.89 l2tp-wrap
            2948 wgsuppor   5 -15 73524 3536 2856 R   69  0.0 646:06.99 pimd-wrap
            2759 wgsuppor   5 -15 93952 8072 4116 R   66  0.0 725:59.86 loggerd
               4 wgsuppor  20   0     0    0    0 R   56  0.0 184:29.19 kworker/0:0
               3 wgsuppor  20   0     0    0    0 R   56  0.0 189:21.73 ksoftirqd/0
              26 wgsuppor  RT   0     0    0    0 S    6  0.0   1:28.74 migration/3
              14 wgsuppor  RT   0     0    0    0 S    3  0.0   1:36.88 migration/1

           wgsupport@Member2:/proc/sys/kernel# tail -f /proc/kmsg
         ...

        <7>[ 1102.564786] xt_session: Session for 10.2.47.249 with ID 253911 aged out from session list
        <7>[ 1102.564787] xt_session: session_entry_expiry: Queuing multicast for idle timeout of "session" list entry for 10.2.47.249
        <7>[ 1102.564789] xt_session: Sestry for 10.2.565298]try for 10.2.4.565302] ry for 10.2.5305for 10.2.48.152
        <7>[8] xt_sr 10.2.48.153] xt_s10.2.48.154
        <t_sessi2.48.155
        <7>ession8.156
        <7>[ 11ion: se157
        <7>[ 1102.565323] on: ses8
        <7>[ 1102.565 session7>[ 1102.5653ssion_[ 1102.565333]sion_en1102.565336] xtn_entr2.565340] xt_session: Sessiontry_565343] xt_session:try_ex5347] xt_session:_expi0] xt_session: piry:xt_session: Sey: Que_session: Session: Queuinsession: Sess Queuission: Sessieuiion: Sessionuing n: Session fog multSession for 1ulticsion for 10.icast fn for 10.2.48.st for ior 10.2.48.176 for id10.2.48.177 wit idle 2.48.178 with Ile time8.179 with IDtimeout 80 with ID 254095 aged out freout of "sesswith ID 254096 aged ot of "sessh ID 254097 aged outf "session" lisID 254098 aged out from  "sessio 254099 aged ession"54100 aged ossion"101 aged out froon" li aged out fro list enged out from selist end out from sesst entryout from session list
        <7>[ 1102.565431] xt_ entry ft from session try ffrom session liry foom session listfor 1session list
        < 10.2.48ssion list
        <7>10.2.48.sion list
        <7>0.2.48on list
        <7>[ 11.48.1list
        <7>[ 1102..198
        <7t
        <7>[ 1102.59
        <7>[7>[ 1102.565467>[ 1[ 1102.565468>[ 11 1102.565471][ 1102102.565474] x102.52.565477] xt_se02.5652.565481] xt_102.52.565484] xt02.5654.565487] xt_session: session_entry_expiry: Queuing multicast for idle 2.56548565491] xt_sessi.565495494] xt_sessio5496]7] xt_session:9] xt_xt_session: xt_sessisession: sesssession: Session for sion: session_esion: n: session_enn: Sess: session_enton: Sess: session_entry_expiryn: Sess session_entry_expiry: Queu: Sessi session_entry_expiry: Qn: Sessisession_entry_exSessiosion_entry_exsion fon_entry_expion f_entry_expiry for 1ntry_expiry: for 1ry_expiry: Q 10.2.48expiry: Queu.2.48.2iry: Queuing mu48.22y: Queuing mu8.228  Queuing multi29 wieuing multicast fwith ing multicasith ID g multicast for ID 2ulticast for  2541icast for id4149 st for idle timeout of "session" list entry for 10.2.48.234    


阅读(1228) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~