Shinken Manual
Advanced Topics
External Commands
Introduction
Enabling External Commands
When Does Shinken Check For External Commands?
Using External Commands
Command Format
Event Handlers
Introduction
When Are Event Handlers Executed?
Event Handler Types
Enabling Event Handlers
Event Handler Execution Order
Writing Event Handler Commands
Permissions For Event Handler Commands
Service Event Handler Example
Volatile Services
Introduction
What Are They Useful For?
What’s So Special About Volatile Services?
The Power Of Two
Shinken Configuration
PortSentry Configuration
Port Scan Script
Service and Host Freshness Checks
Introduction
How Does Freshness Checking Work?
Enabling Freshness Checking
Example
Distributed Monitoring
Introduction
Goals
The global architecture
Shinken Daemon roles
The smart and automatic load balancing
Creating independent packs
Aggregating packs into scheduler configurations
Sending configurations to satellites
The high availability
When a node dies
External commands dispatching
Different types of Pollers: poller_tag
Use cases
Different types of Reactionners: reactionner_tag
Advanced architectures: Realms
Realms in a few words
Realms are not poller_tags!
Sub realms
Example of realm usage
Redundant and Failover Network Monitoring
Introduction
Detection and Handling of State Flapping
Introduction
How Flap Detection Works
Example
Flap Detection for Services
Flap Detection for Hosts
Flap Detection Thresholds
States Used For Flap Detection
Flap Handling
Enabling Flap Detection
Notification Escalations
Introduction
When Are Notifications Escalated?
Contact Groups
Overlapping Escalation Ranges
Recovery Notifications
Notification Intervals
Escalations based on time
Escalations based on time short time
Time Period Restrictions
State Restrictions
On-Call Rotations
Introduction
Scenario 1: Holidays and Weekends
Scenario 2: Alternating Days
Scenario 3: Alternating Weeks
Scenario 4: Vacation Days
Other Scenarios
Monitoring Service and Host Clusters
Introduction
Host and Service Dependencies
Introduction
Service Dependencies Overview
Defining Service Dependencies
Example Service Dependencies
How Service Dependencies Are Tested
Execution Dependencies
Notification Dependencies
Dependency Inheritance
Host Dependencies
Example Host Dependencies
State Stalking
Introduction
How Does It Work?
Should I Enable Stalking?
How Do I Enable Stalking?
How Does Stalking Differ From Volatile Services?
Caveats
Performance Data
Introduction
Types of Performance Data
Plugin Performance Data
Processing Performance Data
Processing Performance Data Using Commands
Writing Performance Data To Files
Scheduled Downtime
Introduction
Scheduling Downtime
Fixed vs. Flexible Downtime
Triggered Downtime
How Scheduled Downtime Affects Notifications
Overlapping Scheduled Downtime
Adaptive Monitoring
Introduction
What Can Be Changed?
External Commands For Adaptive Monitoring
Predictive Dependency Checks
Introduction
How Do Predictive Checks Work?
Enabling Predictive Checks
Cached Checks
Cached Checks
Introduction
For On-Demand Checks Only
How Caching Works
What This Really Means
Configuration Variables
Optimizing Cache Effectiveness
Passive Host State Translation
Introduction
Service and Host Check Scheduling
The scheduling
Object Inheritance
Introduction
Basics
Local Variables vs. Inherited Variables
Inheritance Chaining
Using Incomplete Object Definitions as Templates
Custom Object Variables
Cancelling Inheritance of String Values
Additive Inheritance of String Values
Implied Inheritance
Implied/Additive Inheritance in Escalations
Multiple Inheritance Sources
Precedence With Multiple Inheritance Sources
Advanced tricks
Time-Saving Tricks For Object Definitions
Introduction
Service Definitions
Multiple Hosts:
All Hosts In Multiple Hostgroups:
All Hosts:
Excluding Hosts:
Service Escalation Definitions
Multiple Hosts:
All Hosts In Multiple Hostgroups:
All Hosts:
Excluding Hosts:
All Services On Same Host:
Multiple Services On Same Host:
All Services In Multiple Servicegroups:
Service Dependency Definitions
Multiple Hosts:
All Hosts In Multiple Hostgroups:
All Services On A Host:
Multiple Services On A Host:
All Services In Multiple Servicegroups:
Same Host Dependencies:
Host Escalation Definitions
Multiple Hosts:
All Hosts In Multiple Hostgroups:
All Hosts:
Excluding Hosts:
Host Dependency Definitions
Multiple Hosts:
All Hosts In Multiple Hostgroups:
Hostgroups
All Hosts:
Business rules
View your infrastructure from a business perspective
How to define Business Rules?
With “need at least X elements” clusters
The NOT rule
Manage degraded status
Sample 1
Sample 2
Sample 3
Sample 4
Sample 5
Sample 6
Classic cases
Migrating from Nagios to Shinken
How to import existing Nagios states
How to import Nagios reporting data
Multi layer discovery
Runners available
Filesystems
Pre-requisites
How it works
Macros mode
Tag mode
Cluster
Pre-requisites
How it works
Multiple action urls
Aggregation rule
Goal
Sample 1
Version 2 (tag-based aggregation)
Scaling Shinken for large deployments
Planning your deployment
How scalable is Shinken
Passive versus Active
Scaling the data acquisition
Scaling the broker
Web Interface
Dependency model
Scaling the acquisition daemons
Active acquisition methods
Scaling SNMP acquisition
Scaling NRPE acquisition
Passive acquisition methods
Scaling metric acquisition
Log management methods
SLA reporting methods
Practical optimization tips
Defining advanced service dependencies
Example Service Dependencies
How Service Dependencies Are Tested
Execution Dependencies
Notification Dependencies
Dependency Inheritance
Host Dependencies
Example Host Dependencies
Shinken’s distributed architecture
Shinken’s distributed architecture for load balancing
Set up a load-balancing architecture with some pollers
Install the poller on the new server
Declare the new poller in the main configuration file
Shinken’s distributed architecture with realms
Multi customers and/or sites: REALMS
Sub-realms
An example
Picture example
Configuration of the realms
Multi levels brokers
Macro modulations
How macro modulations work
How to define a macro_modulation
Shinken and Android
Sending SMS
Install Python on your phone
Install the Pyro lib on your phone
Install Shinken on your phone
Time to launch the Shinken app on the phone
Declare this daemon in the central configuration
Add SMS notification ways
Add SMS to your contacts
Receive SMS: acknowledge with a SMS
Pre-requisite
How to send ACK from SMS?
Send sms by gateway
1. Edit your contacts.cfg, which on Linux is in /usr/local/shinken/etc/contacts.cfg
2. Edit your commands.cfg, which is in /usr/local/shinken/etc/commands.cfg
3. Add the script
4. Test It
Triggers
Simple rule
Rule with an OR
Advanced correlation: active/passive cluster check
Stateful rules
Compute KPI
Define and use triggers
Unused Nagios parameters
External Command Check Interval (Unused)
External Command Buffer Slots (Not implemented)
Use Retained Program State Option (Not implemented)
Use Retained Scheduling Info Option (Not implemented)
Retained Host and Service Attribute Masks (Not implemented)
Retained Process Attribute Masks (Not implemented)
Retained Contact Attribute Masks (Not implemented)
Service Inter-Check Delay Method (Unused)
Inter-Check Sleep Time (Unused)
Service Interleave Factor (Unused)
Maximum Concurrent Service Checks (Unused)
Check Result Reaper Frequency (Unused)
Maximum Check Result Reaper Time
Check Result Path (Unused)
Max Check Result File Age (Unused)
Host Inter-Check Delay Method (Unused)
Auto-Rescheduling Option (Not implemented)
Auto-Rescheduling Interval (Not implemented)
Auto-Rescheduling Window (Not implemented)
Aggressive Host Checking Option (Unused)
Translate Passive Host Checks Option (Not implemented)
Child Process Memory Option (Unused)
Child Processes Fork Twice (Unused)
Event Broker Options (Unused)
Event Broker Modules (Unused)
Debug File (Unused)
Debug Level (Unused)
Debug Verbosity (Unused)
Maximum Debug File Size (Unused)
Advanced discovery with Shinken
How the discovery script works
Discovery scripts
Discovery rules
Host rule
Service rule
The ! (not) key
Add something instead of replacing
Delete something after adding
Discovery with Shinken
Simple use of the discovery tool
Set up nmap discovery
Set up the VMware part
Launch it!
Restart Shinken
More about discovery