Nagios appears to have a couple of annoying quirks regarding DNS, though. It seems rather insistent on querying IP addresses for low-level services rather than monitoring at the system level.
- check_http appears to resolve the hostname into an IP address before checking it, unless the -H option is used. This breaks Google App Engine and anything else that uses a virtual-host-like mechanism to decide which page to serve.
- check_smtp does not seem to do an MX lookup for the domain. Instead, it resolves the name to our web server and tries to speak SMTP to the web server.
- mirrorrr has a 1-hour cache by default. The cache lifetime should be shortened, or the cache disabled, when mirrorrr is used for monitoring.
- TBD: The local mirrorrr install should probably get an IP range filter added to it so that it is harder to DoS.
- Nagios isn't too happy about passing messages between machines. My main options appear to involve choosing the least of three evils:
  - adding a private key to the Nagios machine and sshing everywhere
  - installing NRPE as a daemon on every machine and querying them (it's not that lightweight, and most of the servers badly need more memory)
  - calling send_nsca from some cron shell scripts, which talk to a relatively insecure listener (the NSCA daemon) on the Nagios machine

  I opted for send_nsca, but the port is only visible to the intranet, and there are a couple of machines in a DMZ that can't reach it easily.
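
For reference, here is a rough sketch of what one of those cron-driven submissions could look like, written in Python rather than shell. The function names and the `/etc/send_nsca.cfg` path are my own assumptions (the config location varies by distribution). The only parts taken from NSCA itself are the tab-delimited input format and the `-H` and `-c` flags of the client, which ships as `send_nsca`.

```python
import subprocess

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def format_passive_result(host, service, code, output):
    """Build one passive service check line for send_nsca's stdin.

    The layout is host, service description, return code, and plugin
    output, separated by tabs and ended by a newline.
    """
    # Tabs or newlines inside the message would split the record,
    # so collapse all whitespace runs into single spaces.
    clean = " ".join(output.split())
    return f"{host}\t{service}\t{code}\t{clean}\n"


def submit(line, nagios_host, config="/etc/send_nsca.cfg"):
    """Pipe one result line to send_nsca (hypothetical helper).

    Assumes send_nsca is on PATH; the config path is a guess.
    Raises CalledProcessError if send_nsca exits non-zero.
    """
    subprocess.run(
        ["send_nsca", "-H", nagios_host, "-c", config],
        input=line.encode(),
        check=True,
        timeout=30,
    )
```

A cron job would call something like `submit(format_passive_result("web01", "disk_root", OK, "DISK OK"), "nagios.example")`, where the host and service names are placeholders that must match a passive service defined on the Nagios side.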