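Recently the Ganglia monitoring server has been behaving abnormally. In Ambari, both the Ganglia Server service and the Ganglia Client service on the same host show as stopped, and after being started they shut down again on their own a short while later.
The log shows: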
Sep 9 05:58:31 localhost kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
Sep 9 06:06:17 localhost nagios: Auto-save of retention data completed successfully.
Sep 9 06:09:49 localhost ntpd[11819]: synchronized to 129.6.15.28, stratum 1
Sep 9 06:45:56 localhost /usr/sbin/gmond[31416]: Unable to create tcp_accept_channel. Exiting.#012
Sep 9 07:32:35 localhost /usr/sbin/gmond[12673]: Unable to create tcp_accept_channel. Exiting.#012
Sep 9 08:41:00 localhost /usr/sbin/gmond[13840]: Unable to create tcp_accept_channel. Exiting.#012
Sep 9 08:44:19 localhost /usr/sbin/gmetad[9939]: data_thread() for [my cluster] failed to contact node 127.0.0.1
Sep 9 08:44:19 localhost /usr/sbin/gmetad[9939]: data_thread() got no answer from any [my cluster] datasource
Sep 9 08:44:33 localhost /usr/sbin/gmetad[22468]: data_thread() for [my cluster] failed to contact node 127.0.0.1
Sep 9 08:44:33 localhost /usr/sbin/gmetad[22468]: data_thread() got no answer from any [my cluster] datasource
Sep 9 08:44:43 localhost /usr/sbin/gmond[22863]: Unable to find the metric information for 'procs_blocked'. Possible that the module has not been loaded.#012
Sep 9 08:44:43 localhost /usr/sbin/gmond[22863]: Unable to find the metric information for 'procs_created'. Possible that the module has not been loaded.#012
Sep 9 08:44:43 localhost /usr/sbin/gmond[22863]: Unable to find any metric information for 'softirq_(.+)'. Possible that a module has not been loaded.#012
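To separate the Ganglia messages from the unrelated kernel, nagios and ntpd lines, the log can be filtered with grep. This is a minimal sketch: the helper names `ganglia_lines` and `count_gmond_bind_failures` are hypothetical, and the log file path is passed in as an argument because the post does not say which file the excerpt came from (on many systems it would be `/var/log/messages`, but that is an assumption).

```shell
# Print only the gmond/gmetad lines from a syslog file.
# $1: path to the log file (e.g. /var/log/messages; assumed, not from the post).
ganglia_lines() {
    grep -E 'gmond|gmetad' "$1"
}

# Count how many times gmond exited because it could not create
# its tcp_accept_channel.
# $1: path to the log file.
count_gmond_bind_failures() {
    grep -c 'gmond.*Unable to create tcp_accept_channel' "$1"
}
```

For example, `count_gmond_bind_failures /var/log/messages` prints the number of failed gmond starts recorded in that file.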
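gmetad cannot contact 127.0.0.1, even though the hosts file does contain an entry for it.
I then logged in to the Server host and stopped both daemons by hand: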
service gmetad stop
service gmond stop
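After that, I started the Server and the Client from Ambari again, and both came up successfully!
It was probably a communication problem on the Ambari side. Either way, it is resolved.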
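One possible reading of `Unable to create tcp_accept_channel` is that gmond could not bind its TCP port because an earlier gmond process was still holding it. That is an interpretation, not something confirmed in the troubleshooting above. Under that assumption, a minimal sketch for checking that no daemon is left over after the `service ... stop` commands and before starting from Ambari could look like this. It assumes `pgrep` is installed and that the processes are named `gmond` and `gmetad`. `is_running` and `report_leftovers` are hypothetical helper names.

```shell
# Return 0 if a process with exactly this name is running.
# Assumes pgrep (procps) is available on the host.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Report any Ganglia daemon that survived `service ... stop`.
# Returns 0 when nothing is left over, 1 otherwise.
report_leftovers() {
    leftovers=0
    for name in gmond gmetad; do
        if is_running "$name"; then
            echo "$name is still running"
            leftovers=1
        fi
    done
    return "$leftovers"
}
```

Running `report_leftovers` after the two stop commands should print nothing if both daemons really exited. Any line it prints names a process worth inspecting before the restart.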