SnmpBooster Troubleshooting

Check your config

  • Have you defined the poller module name?
  • Have you defined the correct path to the directory containing your Defaults*.ini files?
  • Have you added the SnmpBooster module to your arbiter, poller and scheduler?
  • Have you copied the genDevConfig templates.cfg into shinken/packs/network/SnmpBooster/?
  • Have you installed PySNMP, memcached and other dependencies?

Software version consistency

Shinken and SnmpBooster now require the same Python and Pyro version on all hosts running a Shinken daemon.

If you cannot use the packaged version of Python and its modules (Pyro, memcached, etc.), use virtualenv to select the Python version to use and install all required modules in that virtualenv.
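One way to compare hosts is to run the same small version report with the interpreter that Shinken uses on each of them and diff the output. The following is only a sketch using the Python standard library. The import names pysnmp, memcache and Pyro are assumptions about how the dependencies are installed, and the helper collect_versions is hypothetical, not part of Shinken.

```python
import importlib
import platform


def collect_versions(module_names):
    """Return {name: version string, or None if the module cannot be imported}."""
    versions = {"python": platform.python_version()}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Not every module exposes __version__; fall back to a placeholder.
        versions[name] = str(getattr(module, "__version__", "unknown"))
    return versions


if __name__ == "__main__":
    # Import names below are assumptions; adjust them to your installation.
    report = collect_versions(["pysnmp", "memcache", "Pyro"])
    for name, version in sorted(report.items()):
        print("%s: %s" % (name, version or "not installed"))
```

Run it inside the virtualenv (or with the system Python) on every host that runs a Shinken daemon; any difference between the reports is worth investigating first.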

Software version requirements

Have you verified that the version requirements are met for Python, PySNMP, Shinken, Pyro, memcached, etc.?

Validate your check command arguments

Use the check_plugin command and comment out the module to see the exact arguments sent by the poller. This lets you validate all the arguments, such as the SNMP community string, inheritance, template application, etc.

Validate connectivity

Take a packet trace using a tool like Wireshark to validate that the remote host is responding.
  • Has the host responded?
  • Is SnmpBooster repeating the request more often than the polling interval?
    • If you are seeing repeated requests, your device may have a compatibility issue.
    • Save an snmpwalk of the device, get a packet trace using Wireshark, set the poller to debug and save the poller.log file (/var/log/shinken/pollerd.log). Send all three to the SnmpBooster developers.

Note

It is normal to see one or more bulkGet requests if you are getting large amounts of data. For example, a 24-port switch will take 2-3 request packets.

Performance

Make sure the Poller has a low-latency connection to memcached. The memcached server can be replicated to all your poller hosts, each of which should then also run a memcached instance. Check that memcached is running:

netstat -a | grep memcached
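To get a rough idea of the latency between a poller host and memcached, you can time a plain TCP connection. This is a minimal sketch for Python 3, not an SnmpBooster tool: connect_latency is a hypothetical helper, it measures only the TCP handshake, and 127.0.0.1:11211 is used as the address because it is the one that appears in the SnmpBooster error message.

```python
import socket
import time


def connect_latency(host, port, timeout=1.0):
    """Return the TCP connect time in seconds, or None if the connection fails."""
    start = time.time()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return None
    sock.close()
    return time.time() - start


if __name__ == "__main__":
    # Replace the address with the memcached server your poller uses.
    latency = connect_latency("127.0.0.1", 11211)
    if latency is None:
        print("memcached is not reachable")
    else:
        print("connected in %.1f ms" % (latency * 1000.0))
```

Run it from each poller host; a value that is consistently high compared with the other pollers points at the network path rather than at SnmpBooster.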

Faulty Template

A bad snmp_template file was distributed in the genDevConfig sample-config directory; it contained two glaring errors.

This was fixed on 2012-10-16. Make sure you update your template, or use the data from the wiki.

Note that the template should be named SnmpBooster-template.cfg. That way it also shows up when you search your logs for SnmpBooster, which makes troubleshooting easier.

Log files

All warnings and errors generated by the SnmpBooster module start with the prefix “[SnmpBooster]”, followed by the error text, and are logged using the standard Shinken logger.

The Arbiter daemon can output error messages about the initial configuration and about the loading of host keys and intervals into memcached. The Scheduler daemon can output scheduling and alert related messages. The Poller daemon can output messages related to instance mapping, acquisition timeouts, invalid community strings, cache failures and more. These are available in the Web interface, as they are placed in the check results for the service.

You can simply do a grep SnmpBooster * in your shinken/var directory to see the latest messages related to the SnmpBooster module. You can also sort messages by timestamp to make it easy to find where and when errors occurred.

cd shinken/var
grep SnmpBooster *
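If you want the matching lines from several log files merged and ordered by time, a few lines of Python can do what grep followed by a sort would do. This is a sketch under one assumption: each log line begins with an epoch timestamp in square brackets. If your log format differs, adjust the regular expression. The function name snmpbooster_messages is hypothetical.

```python
import re

# Assumption: log lines begin with an epoch timestamp in brackets, e.g. "[1350400000] ...".
TIMESTAMP = re.compile(r"^\[(\d+)\]")


def snmpbooster_messages(lines):
    """Return the lines mentioning SnmpBooster, ordered by leading timestamp."""
    def sort_key(line):
        match = TIMESTAMP.match(line)
        # Lines without a recognisable timestamp sort first.
        return int(match.group(1)) if match else 0

    matching = [line.rstrip("\n") for line in lines if "SnmpBooster" in line]
    return sorted(matching, key=sort_key)
```

Feed it the lines read from the files in shinken/var and print the result; because the sort is stable, lines with equal timestamps keep the order in which they were read.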

Memcache persistence

If you restart your memcache server or memcache crashes, your poller will no longer be able to validate that a host exists in memcache prior to writing.

You should use memcachedb to achieve persistence in case of memcache failures.

Common errors returned by SnmpBooster in the log file

Errors should be fairly explicit and mean what they say, though there can be exceptions. Let's try to clear up some of them.

Arbiter log errors

Missing ds_oid

This means that a variable in your OID definitions is missing, your DATASOURCE is not named correctly, or your ds_oid variable is missing. It can also mean that there is a typo in your ds_oid variable (e.g. ds-oid, or ds_oid = $OidNameIncorrectFFFRA.%(instance)s).

Datasource not defined

Your DSTEMPLATE uses a DATASOURCE that doesn’t exist. Check the [DataSourceName] you are referring to: does it contain the expected OID variable, $OidName?

Missing ds_type

A DATASOURCE always needs a ds_type definition, which must be one of GAUGE, COUNTER, DERIVE, TEXT, TIMETICK, DERIVE64 or COUNTER64.
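The three Arbiter errors above can be caught before a restart with a small pre-check. The sketch below is not SnmpBooster's own validation: check_datasource is a hypothetical helper that takes one DATASOURCE as a plain dictionary (reading your Defaults*.ini files into dictionaries is left out), and the exact upper-case match on ds_type is an assumption.

```python
# ds_type values listed in this section.
VALID_DS_TYPES = {"GAUGE", "COUNTER", "DERIVE", "TEXT", "TIMETICK", "DERIVE64", "COUNTER64"}


def check_datasource(name, fields):
    """Return a list of human-readable problems for one DATASOURCE entry."""
    problems = []
    if not fields.get("ds_oid"):
        problems.append("%s: missing ds_oid" % name)
    ds_type = fields.get("ds_type")
    if not ds_type:
        problems.append("%s: missing ds_type" % name)
    elif ds_type not in VALID_DS_TYPES:
        problems.append("%s: unknown ds_type %r" % (name, ds_type))
    # Keys such as "ds-oid" are an easy typo for "ds_oid".
    for key in fields:
        if key.startswith("ds-"):
            suggestion = key.replace("-", "_")
            problems.append("%s: suspicious key %r (did you mean %r?)" % (name, key, suggestion))
    return problems


if __name__ == "__main__":
    # "ExampleSource" is a made-up DATASOURCE name used only for illustration.
    example = {"ds-oid": "$OidName.%(instance)s", "ds_type": "COUNTER"}
    for problem in check_datasource("ExampleSource", example):
        print(problem)
```

An empty list means the entry passes these basic checks; it does not prove that the OID variable referenced by ds_oid is actually defined.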

Poller log errors

Problems with calculations, repeated polling, hosts not responding, etc.

Memcached errors

memcachedb and memcached do not use the same default port. Configure the correct memcachedb port to match what is declared in your SnmpBooster module under shinken-specific.cfg.

On Ubuntu 12.04 the default memcachedb installation listens on port 21201 instead of 11211. This causes the error “[SnmpBooster] Memcache server (127.0.0.1:11211) is not reachable” when Shinken starts.

To change the port, edit the file /etc/memcachedb.conf.
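Before editing, it can help to confirm which port the file currently sets and whether it differs from the one declared for the SnmpBooster module. The sketch below assumes that /etc/memcachedb.conf lists one command-line option per line, with the port given as "-p <port>" and comments starting with "#". Check your own file, since this layout is an assumption. Both function names are hypothetical.

```python
def configured_port(conf_text):
    """Return the port given by a "-p <port>" line, or None if there is none."""
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        parts = line.split()
        if len(parts) == 2 and parts[0] == "-p" and parts[1].isdigit():
            return int(parts[1])
    return None


def port_mismatch(conf_text, module_port):
    """True when the configured port differs from the one SnmpBooster expects."""
    port = configured_port(conf_text)
    return port is not None and port != module_port
```

For instance, reading the file into a string and calling port_mismatch(text, 11211) tells you whether the two sides disagree. The compact form "-p21201" (without a space) is not recognised by this sketch.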