Triggers¶
Warning
This is currently in Beta. DO NOT use in production.
Note
Not up to date. See the bottom part for something recent
A trigger object is something that can be called after a “change” on an object. It’s a bit like Zabbix trigger, and should be used only if you need it. In most cases, direct check is easier to setup :)
It’s defined like:
Here is an example that will raise a critical check if the CPU is too loaded:
Simple rule¶
define trigger{
trigger_name One_Cpu_too_high
matching_rule perf(self, 'cpu') >= 95
hit_action critical(self, 'Cpu is too loaded')
}
Rule with an OR¶
Another one that will look if at least one CPU is too loaded (> 90% load) or the overall CPU is too loaded too (total > 60%):
define trigger{
trigger_name One_or_more_cpu_too_high
matching_rule max([perf(self, 'cpu*')]) > 90 | avg([perf(self, 'cpu*')]) > 60
hit_action critical(self, 'Cpu is too loaded')
}
Advanced correlation: active/passive cluster check¶
It can be used for advanced correlation too:
If you want to do an active/passive check without a bp_rule here an example. This service will be the “cluster” service that show the overall state. It will have 2 custom macros: “master”, the master server and “slave” the slave one.
define trigger{
trigger_name Bad_active_passive
matching_rule (service(self.customs['master']).state == 'CRITICAL' & service(self.customs['slave']).state == 'CRITICAL') | (service(self.customs['master']).state == service(self.customs['slave']).state)
hit_action critical(self, 'Cluster got a problem')
}
And if you want you can define a degraded one you can define another trigger for this same “cluster” service:
define trigger{
trigger_name Degraded_service
matching_rule service(self.customs['master']).state == 'CRITICAL' & service(self.customs['slave']).state == 'OK'
hit_action warning(self, 'Cluster runs on slave!')
}
Statefull rules¶
Here an example with statefull rules.
I will read a regexp like PORTSCAN FROM (S+) TO S+:(d+) on a service, and create an “event” that got a 60min lifetime. It will be add on services on all hosts for example.
define trigger{
trigger_name Log_post_scan
matching_rule regexp(self.output, 'PORTSCAN FROM (?P<source>\S+) TO (?P<dest>\S+):(?P<port>\d+)')
hit_action create_event('HORIZONTAL SCAN FROM SOURCE IP %s' % source, 60)
}
And a aggregated one will raise the alert if need:
define trigger{
trigger_name Raise_too_much_scans
matching_rule sources=get_events_count_group_by('HORIZONTAL SCAN FROM SOURCE IP (?P<source>\S+)'))
hit_action [critical(self, 'The IP %s scan too much ips' % source) for (source, nb) in sources.iteritems() if nb > 10]
}
Compute KPI¶
You maybe want to compute a “KPI” (key point indicator) from various sources. You can also do it with triggers.
Let take an example, You got a cluster of N webservers. Each is returning in a check the number of active connections, but you want the overall. You just need to define a new service that will take it’s data from the N others.
define trigger{
trigger_name Count_active_connections
matching_rule True;total_connections=sum(perfs('web-srv*/Http', 'active_connections'))
hit_action set_perfdata(self, 'total_connections=%d' % total_connections)
}
Define and use triggers¶
Note
More or less up to date
Use the trigger_name directive to link a trigger to a service or host. Example :
define service{
use local-service ; Name of service template to use
host_name localhost
service_description Current Load trigger
check_command check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
trigger_name simple_cpu
}
Then define your trigger in etc/trigger.d/yourtrigger.trig. here the file is simple_cpu.trig
try:
load = perf(self, 'load1')
print "Founded load", load
if load >= 10:
critical(self, 'CRITICAL | load=%d' % load)
elif load >= 5:
warning(self, 'WARNING | load=%d' % load)
else:
ok(self, 'OK | load=%d' % load)
except:
unknown(self, 'UNKNOWN | load=%d' % load)
Finally, add the triggers_dir=trigger.d statement to your shinken.cfg