nagios

Ansible send file to the first met destination

走远了吗. Posted on 2019-12-13 19:17:52
Question: I'm sending a config file to thousands of nodes. Because of some customisation there are maybe 5 or 6 possible paths to that file (there is only one file per host, but the path can vary), and there isn't an easy way to figure out the default location with facts. Based on this, I'm looking for some way to set the "dest" of the copy module the way we can set the "src", with a with_first_found loop. Something like this:
copy: src=/foo/{{ ansible_hostname }}/nrpe.cfg dest="{{item}}
with_items:
- "/etc/nagios/nrpe.cfg
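The selection logic the question asks for (use the first candidate path that already exists) can be sketched outside Ansible. This is a minimal Python illustration of that logic, not an Ansible feature. Only /etc/nagios/nrpe.cfg comes from the question; the other candidate paths and the function name are hypothetical placeholders.

```python
import os
from typing import Iterable, Optional


def first_existing(paths: Iterable[str]) -> Optional[str]:
    """Return the first path in `paths` that exists on this machine, else None."""
    for path in paths:
        if os.path.exists(path):
            return path
    return None


# Only the first entry appears in the question; the rest are hypothetical.
CANDIDATE_DESTS = [
    "/etc/nagios/nrpe.cfg",
    "/usr/local/nagios/etc/nrpe.cfg",
    "/opt/nagios/etc/nrpe.cfg",
]

if __name__ == "__main__":
    dest = first_existing(CANDIDATE_DESTS)
    print(dest if dest else "no known nrpe.cfg location found")
```

In a playbook the same check would have to run on each managed node rather than on the controller, since the path differs per host.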

How to implement custom multi-channel alerting with Nagios

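随声附和 Posted on 2019-12-12 17:23:51
Nagios is a plugin-based monitoring system that can monitor service status and network information, watch specified parameters and services on local or remote hosts, and send notifications when something goes wrong. Nagios supports client-side data collection: by writing client plugins you can gather all kinds of monitoring data, and a web management interface is provided for querying it. Its main focus is monitoring service availability and alerting against configured thresholds, although most of the alerting logic is implemented in the monitoring plugins.
At present Nagios alerts can be delivered by SMS and email. With only these two channels, high-priority alerts may not be seen and resolved in time, and during an alert storm they are even less able to meet users' varied notification needs. How to deliver alerts through multiple channels according to each user's needs, while effectively avoiding alert storms, has therefore become the most pressing problem.
Cloud Alert, an intelligent alerting platform developed in-house by 睿象云, is built to solve this problem. Users can assign multiple notification channels to alerts according to their needs; the channels currently supported include phone calls, SMS, WeChat, email, the app, and DingTalk. When an alert storm arrives, users can enable algorithm-based noise reduction or set alert compression rules so that alerts of the same type are compressed, which effectively avoids alert storms. Users can also connect different monitoring platforms to Cloud Alert for unified management.
Next, let's look at how to set it up. The setup is divided into three main parts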

Nagios (Return code of 255 is out of bounds)

半城伤御伤魂 Posted on 2019-12-12 15:33:55
Question: I am getting the error (Return code of 255 is out of bounds) at http://localhost/nagios, but when I run the command from the shell, it runs fine.
Answer 1: If you're using NRPE and receiving a 'NRPE: Unable to read output' error, it is because the NRPE daemon is not returning text data. The Nagios core depends on NRPE daemons to return a string that summarizes the results of the checks. For instance, for a check result that is returning a healthy status, it may look like the former whereas an
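For context, Nagios plugins are expected to print a line of status text and exit with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN); any other exit code, such as 255, is reported as out of bounds. The following is a minimal sketch of a wrapper that forces a check's result into that range. The function names, the 30-second timeout, and the choice to map unexpected codes and empty output to UNKNOWN are assumptions made for illustration, not part of the answer above.

```python
import subprocess

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
VALID_CODES = {OK, WARNING, CRITICAL, UNKNOWN}


def normalize_exit_code(code: int) -> int:
    """Map any exit code outside 0..3 to UNKNOWN (3)."""
    return code if code in VALID_CODES else UNKNOWN


def run_check(argv):
    """Run a check command; return (status text, normalized exit code)."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"UNKNOWN - could not run check: {exc}", UNKNOWN
    lines = proc.stdout.strip().splitlines()
    if not lines:
        # No text came back, so there is nothing to report to Nagios.
        return "UNKNOWN - check produced no output", UNKNOWN
    return lines[0], normalize_exit_code(proc.returncode)


def main(argv) -> int:
    """Print the status line and return the exit code to use."""
    if not argv:
        print("UNKNOWN - no check command given")
        return UNKNOWN
    message, status = run_check(argv)
    print(message)
    return status
```

To use it as a script, end the file with `import sys` and `sys.exit(main(sys.argv[1:]))`. Wrapping a check this way hides the original exit code, so it helps with diagnosis more than it fixes the underlying failure.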

pb nagios bash script null result in if

醉酒当歌 Posted on 2019-12-11 20:32:00
Question: I have a problem with a bash script in Nagios. It is a script that gets the disk space. When I set the variable inside an if block, Nagios displays (null); when I don't set the variable inside the if block, Nagios displays the variable correctly. Running the script as the nagios user gives the correct result. Example:
TOTAL=`/srv/eyesofnetwork/nagios/plugins/check_nt -H $2 -p 12489 -s "" -v USEDDISKSPACE -l $4 |awk -F"-" '{print $2}' |awk '{print $2}'`
if [ $TOTAL -gt 2 ] && [ $TOTAL -le 99 ];then
RUN=`/srv/eyesofnetwork
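One way to make a threshold test like this robust is to validate the parsed value before comparing it. The sketch below mirrors the question's two awk steps in Python. The sample output line is a hypothetical illustration of what check_nt might print, and the function names are assumptions; the bounds 2 and 99 come from the question's if condition.

```python
from typing import Optional


def extract_total(line: str) -> Optional[float]:
    """Mimic `awk -F"-" '{print $2}' | awk '{print $2}'` and parse a number.

    Returns None when the field is missing or not numeric.
    """
    dash_fields = line.split("-")
    if len(dash_fields) < 2:
        return None
    words = dash_fields[1].split()
    if len(words) < 2:
        return None
    try:
        return float(words[1])
    except ValueError:
        return None


def in_range(total: Optional[float]) -> bool:
    """Same bounds as the question: greater than 2 and at most 99."""
    return total is not None and 2 < total <= 99


# Hypothetical sample line; real check_nt output may differ.
SAMPLE = "c:\\ - total: 24.41 Gb - used: 2.80 Gb (11%) - free 21.60 Gb (89%)"

if __name__ == "__main__":
    total = extract_total(SAMPLE)
    print(total, in_range(total))
```

In bash, `[ $TOTAL -gt 2 ]` fails when TOTAL is empty or not an integer, which is one possible cause of the (null) result. That is only a guess, since the rest of the script is not shown.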

Changing permissions of added file to a Docker volume

谁都会走 Posted on 2019-12-11 07:49:30
Question: The Docker best practices guide states: "You are strongly encouraged to use VOLUME for any mutable and/or user-serviceable parts of your image." Looking at the source code of, e.g., the cpuguy83/nagios image, this is clearly done, as everything from the nagios to the apache config directories is made available as volumes. However, in the same image the apache service (and the CGI scripts for nagios) run as the nagios user by default. So now I'm in a pickle, as I can't seem

Unable to correctly parse out log lines containing particular keyword using nginx logwarn

♀尐吖头ヾ Posted on 2019-12-11 06:56:06
Question: With the following log line added after the last invocation of the check_logwarn command:
[Tue Nov 22 11:04:03 2016] [hphp] [10755:7f41af3ff700:6272:000001] [] SlowTimer [2086ms] at runtime/ext_mysql: slow query: SELECT b.bannerid, b.campaignid FROM ox_banners b, ox_campaigns c WHERE b.campaignid = c.campaignid AND (b.status = 0 OR b.`updated` >= now() - INTERVAL 7 DAY) AND (c.status = 0 OR c.`updated` >= now() - INTERVAL 7 DAY) AND b.updated >= '2016-11-22 11:03:01';
The following logwarn

Nagios service notifications for specific hosts

*爱你&永不变心* Posted on 2019-12-11 02:48:25
Question: Using Nagios, I want to notify a user when the services on a specific host fail, while the same services on another host should alert somebody else. For instance:
"HostA" is up, but "www" service on "HostA" is down --> Notify UserA
"HostB" is up, and "www" service on "HostB" is up --> UserB is not notified
In other words, if the WWW service on HostA is down, only UserA should be notified, but if the WWW service on HostB is down, then UserB is responsible for it. I can

Installing and configuring Nagios on Linux

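十年热恋 Posted on 2019-12-10 16:18:31
I. Introduction to Nagios
Nagios is an open-source tool for monitoring computer systems and networks. It can monitor the status of Windows, Linux, and Unix hosts, network equipment such as switches and routers, printers, and so on. When a system or service enters an abnormal state, it sends email or SMS alerts to notify the operations staff right away, and it sends a recovery notification by email or SMS once the state returns to normal.
Nagios was originally named NetSaint and has been developed and maintained by Ethan Galstad to this day. NAGIOS is an acronym: "Nagios Ain't Gonna Insist On Sainthood". "Sainthood" refers to being a saint, and "Agios" is the Greek word for "saint". Nagios was developed for use on Linux, but it also works very well on Unix.
Main features
- Network service monitoring (SMTP, POP3, HTTP, NNTP, ICMP, SNMP, FTP, SSH)
- Host resource monitoring (CPU load, disk usage, system logs), including Windows hosts (using the NSClient++ plugin)
- Custom plugins that collect data over the network to monitor anything (temperature, warnings, ...)
- Remote execution of plugins and scripts, configured through Nagios
- Remote monitoring over SSH or SSL-encrypted tunnels
- A simple plugin design that lets users easily develop the checks they need, with support for many languages (shell scripts, C++, Perl, ruby, Python, PHP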

Monitoring a remote Windows host with Nagios

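情到浓时终转凉″ Posted on 2019-12-10 16:02:58
Here an Ubuntu 12.04 system serves as the Nagios server, and a remote Windows host is used for a brief walkthrough.
Nagios monitoring server address: 192.168.0.20
Remote Windows host address (monitored side): 192.168.0.81
Note: change the IP addresses to match your actual hosts; they are not fixed.
How it works (this is active monitoring: the check is initiated by the Nagios server):
The Nagios monitoring server runs check_nt, which passes through the remote Windows host's firewall and communicates with the Windows host. The monitored client must have NSClient++ installed; once this service is started, it keeps listening on port 12489. Running netstat -an in a cmd prompt shows port 12489 in the LISTENING state.
Note: to really understand how it works, capture and analyze the traffic with wireshark.
The detailed configuration steps follow.
Nagios server configuration:
1. Install some packages on the server first
apt-get install -y apache2    # apache2 is the web server
apt-get install -y libapache2-mod-php5    # used to render web pages
apt-get install -y build-essential    # provides the C/C++ build environment
apt-get install -y libgd2-xpm
apt-get install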
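The article checks that NSClient++ is listening by running netstat -an on the Windows host. The same thing can be verified from the Nagios server side with a plain TCP connection attempt. This is a minimal sketch using only the Python standard library; the function name and the 3-second default timeout are assumptions, while the address 192.168.0.81 and port 12489 come from the article.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (address and port from the article; adjust to your environment):
#   port_is_open("192.168.0.81", 12489)
```

A True result only shows that something accepts connections on that port. It does not confirm that check_nt is allowed to query NSClient++, which also depends on the client's own configuration.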

Installing and configuring Nagios plugins

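て烟熏妆下的殇ゞ Posted on 2019-12-10 16:02:41
As mentioned in the article on setting up the basic Nagios platform, Nagios's host and service checks are actually carried out by its accompanying plugins; the Nagios system itself is just an empty shell. So the first thing to do after installing the Nagios platform is to install the plugins. The plugin source package used here is the latest official plugin release downloaded from the Nagios home page, version nagios-plugins-1.4.13.tar.gz.
1. Unpack:
tar zxvf nagios-plugins-1.4.13.tar.gz
cd nagios-plugins-1.4.13
2. Build and install:
./configure --prefix=/usr/local/nagios \
--with-cgiurl=/usr/local/nagios/cgi-bin \
--enable-ssl \
--enable-command-args
make
make install
3. Related configuration files:
## Based on actual usage, the configuration files are organized as follows to make future maintenance and management easier:
## Configuration file layout:
etc/
|-- cgi.cfg
|-- commands.cfg
|-- nagios.cfg
|-- resource.cfg
(the above are the main Nagios system configuration files)
etc/servers
|-- contacts.cfg    default initial settings for contacts and contact groups
|-- hostgroups.cfg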