Monitoring Parameters

IPHost Network Monitor provides constant monitoring of network services and resources that are critically important for your company. Unlike many other monitoring tools, IPHost Network Monitor not only checks resource availability but also checks its operability and performance characteristics.

The base element in IPHost Network Monitor is a monitor; it checks resource availability and requests the value of a certain parameter. A monitor has parameters defining:

  1. What should be checked and how (monitoring parameters),
  2. Acceptable and problem levels of the collected performance data (intended to define monitor state change conditions),
  3. Alerts to be triggered when a problem appears (alerting rule),
  4. Additional parameters such as a link to the basic report on the monitor, dependency on another monitor or agent availability, and comments.

Parameter Inheritance

Parameter inheritance offers a convenient way to manage system settings. By default, a monitor inherits most of its parameters from its direct parent (a host or an application), which in turn inherits them from its own parents (hosts or host groups), then from the remote network agent and, finally, from the root of the Main view tree (the “All Agents” node). For example, by default a monitor inherits credentials from its parent host, so if you change the credentials for the host, the change is applied to all of its monitors automatically. You can always break the inheritance and define custom parameters for any monitor (or any other tree node). Protocol-specific parameters such as port numbers and URLs are not inherited; instead, they are copied from the monitor type parameters when a new monitor of this type is created.
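The lookup order described above can be illustrated with a minimal Python sketch. It is not the product's implementation or API: the Node class, the parameter names (credentials, polling_interval_s) and the sample tree are assumptions made for this example only.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Node:
    """One node of the Main view tree (hypothetical model, not the product's API)."""
    name: str
    parent: Optional["Node"] = None
    # Only parameters set explicitly on this node; everything else is inherited.
    own: dict = field(default_factory=dict)

    def resolve(self, key: str) -> Any:
        """Return the nearest explicitly set value, walking up towards the root."""
        node = self
        while node is not None:
            if key in node.own:
                return node.own[key]
            node = node.parent
        raise KeyError(key)


# Hypothetical tree: root -> agent -> host group -> host -> monitors.
root = Node("All Agents", own={"credentials": "default", "polling_interval_s": 60})
agent = Node("Remote agent", parent=root)
group = Node("Web servers", parent=agent)
host = Node("web01", parent=group, own={"credentials": "web-admin"})
inheriting = Node("HTTP monitor", parent=host)
custom = Node("FTP monitor", parent=host, own={"credentials": "ftp-user"})

print(inheriting.resolve("credentials"))  # value set on the host
host.own["credentials"] = "web-admin-2"   # change the host credentials
print(inheriting.resolve("credentials"))  # the monitor follows the change
print(custom.resolve("credentials"))      # inheritance broken: own value kept
```

In this model, breaking the inheritance simply means storing a value in the node's own dictionary, which shadows whatever the parents define.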

Polling

A polling interval can be specified separately for each active monitor. By default, the polling interval is 1 minute. It is strongly recommended to keep the polling interval longer than 15 seconds: more frequent checks add little value and can degrade the performance of the network or the monitored resource, which defeats the purpose of real-time monitoring.

If a check succeeds, it returns a value (for example, a response time, a file size, or free disk space). This value is compared against the monitor's state condition sections. If a condition is violated, the monitor enters the Down or Warning state, depending on the type of the state condition section that failed. A timeout during the poll is also handled by a state condition section, so monitors for unavailable resources enter the Down state according to it.

Most state condition section types contain the Spike filter parameter. When this parameter is enabled and a check fails, several additional polls are performed to confirm the problem (to filter out random short-lived spikes in performance). The number of these additional polls is set in the Spike counter parameter. If all these polls fail as well, the section is applied and the monitor changes its state.
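The evaluation of a polled value, including the spike filter, can be sketched in Python as follows. This is only an illustration under stated assumptions: the Section class, the thresholds (500 and 5000), the use of None for a timed-out poll, and the choice to report OK when a spike is filtered out are all invented for the example and are not taken from the product.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Value = Optional[float]          # None stands for a poll that timed out


@dataclass
class Section:
    """One state condition section (hypothetical model)."""
    target_state: str                     # "Down" or "Warning"
    violated: Callable[[Value], bool]     # True when the value breaks the condition
    spike_filter: bool = False
    spike_counter: int = 0                # extra polls used to confirm a problem


def next_state(sections: List[Section], poll: Callable[[], Value]) -> str:
    """Poll once and return the resulting state: "OK", "Warning" or "Down"."""
    value = poll()
    for section in sections:              # list Down sections before Warning ones
        if not section.violated(value):
            continue
        if section.spike_filter:
            # all() stops polling at the first confirming poll that passes.
            confirmed = all(section.violated(poll())
                            for _ in range(section.spike_counter))
            if not confirmed:
                continue                  # treated as a random spike
        return section.target_state
    return "OK"


def scripted(values):
    """Return a poll function that replays the given values in order."""
    return iter(values).__next__


sections = [
    Section("Down", lambda v: v is None or v > 5000),
    Section("Warning", lambda v: v is not None and v > 500,
            spike_filter=True, spike_counter=2),
]

print(next_state(sections, scripted([900.0, 120.0])))         # spike filtered out
print(next_state(sections, scripted([900.0, 950.0, 990.0])))  # problem confirmed
print(next_state(sections, scripted([None])))                 # timeout
```

The scripted helper replaces real network polls with a fixed sequence of values so that the three cases (filtered spike, confirmed problem, timeout) can be tried without any monitored resource.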

If a monitor encounters an unrecoverable error during the poll, such as an authentication error, it immediately enters the Down state and produces an error message shown in the Logs Pane. In this case the monitor cannot determine its value, so the state condition sections cannot be used.

If a monitor changes its state to a problem state, IPHost Network Monitor executes the alert(s) assigned to this state change via an alerting rule. The alert and alerting system structure is described here.

The OK state means that the poll was successful and the returned value is acceptable.

Credentials

Certain resources require authentication and authorization before they can be accessed, so the monitoring service must be able to supply credentials in order to monitor them. You can either assign a specific named credential set to a given monitor or use the inherited set.

Passive Monitors

Unlike active monitors, passive ones don’t poll devices periodically but are called when something happens on a remote device. In response, the monitor can either change its state or trigger an event without changing the state. Alerts are assigned to events as well as to monitor state transitions via alerting rules, so an alert can be executed for any event that has occurred.

For passive monitors, the Response Timeout Down State condition section can be used to switch a monitor to the Down state automatically if no new events arrive within a specified time interval.

An example of a passive monitor is SNMP Generic Trap, described here.
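The response timeout behaviour of a passive monitor can be sketched in Python as follows. The PassiveMonitor class, its method names and the 300-second timeout are assumptions made for illustration; the product's internal logic is not documented here and may differ.

```python
import time
from typing import Callable, Optional


class PassiveMonitor:
    """Hypothetical passive monitor: it reacts to events instead of polling."""

    def __init__(self, response_timeout_s: float,
                 now: Callable[[], float] = time.monotonic):
        self.response_timeout_s = response_timeout_s
        self._now = now                   # injectable clock, handy for testing
        self.state = "OK"
        self.last_event_at = now()

    def on_event(self, new_state: Optional[str] = None) -> None:
        """Called when the remote device sends something (for example, a trap)."""
        self.last_event_at = self._now()
        if new_state is not None:         # an event may leave the state unchanged
            self.state = new_state

    def tick(self) -> str:
        """Called periodically; switches to Down when no events arrive in time."""
        if self._now() - self.last_event_at > self.response_timeout_s:
            self.state = "Down"
        return self.state


clock = [0.0]                             # fake clock, in seconds
monitor = PassiveMonitor(300, now=lambda: clock[0])

clock[0] = 100.0
monitor.on_event()                        # event received, state unchanged
clock[0] = 350.0
print(monitor.tick())                     # 250 s since the last event
clock[0] = 401.0
print(monitor.tick())                     # 301 s since the last event
```

The clock is passed in as a function so that the timeout can be exercised without actually waiting.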

Multilevel Checks

Most monitor types support multilevel checks. For example, the basic check for HTTP(S) type monitors is a call to a certain URL (GET or POST with optional parameters), while an additional check (second level) validates the received page by checking for the presence or absence of a specified string on it. An example of a monitor with three levels of checking is an SMTP monitor:

  1. Connecting to an SMTP server;
  2. Logging the user in;
  3. Sending a test message.
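The idea of running check levels in order and stopping at the first failure can be sketched in Python as follows. The run_levels function and the stand-in checks are hypothetical: no real HTTP or SMTP traffic is generated, and each level is reduced to a function that returns True or False.

```python
from typing import Callable, List, Optional, Tuple

Level = Tuple[str, Callable[[], bool]]    # (level name, check function)


def run_levels(levels: List[Level]) -> Tuple[bool, Optional[str]]:
    """Run the levels in order and stop at the first one that fails.

    Returns (True, None) on success or (False, name of the failed level).
    """
    for name, check in levels:
        try:
            if not check():
                return False, name
        except Exception:                 # an error in a level counts as a failure
            return False, name
    return True, None


# Stand-in for a page returned by an HTTP(S) request.
page = "<html><body>Welcome back</body></html>"

http_levels = [
    ("request", lambda: True),            # pretend the GET request succeeded
    ("content", lambda: "Welcome" in page and "Error" not in page),
]

smtp_levels = [
    ("connect", lambda: True),
    ("login", lambda: False),             # pretend the login was rejected
    ("send test message", lambda: True),  # never reached in this run
]

print(run_levels(http_levels))
print(run_levels(smtp_levels))
```

Reporting the name of the failed level makes it clear whether the resource is unreachable or merely misbehaving at a higher level.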

Monitor States

At any moment, a monitor can be in one of eight states. There are three states in which monitoring is performed, namely:

  1. OK
  2. Warning
  3. Down

There are five states in which monitoring is not performed (no periodic checks are made):

  1. Discovered (a monitor has been found in the discovery process and is waiting for review)
  2. Stopped (monitoring has been terminated by the user)
  3. Stopped by dependency (the monitor on which this monitor depends is either in the state selected as the stop state for this monitor or is itself Stopped by dependency)
  4. Unknown (means that the monitoring service is stopped or down)
  5. Maintenance (the user can select whether monitoring should be continued during maintenance or not)
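The two groups of states listed above can be modelled with a small Python enumeration. The MonitorState enum and the is_polled helper are names invented for this sketch; Maintenance is placed in the second group as in the list above, although the text notes that monitoring during maintenance is configurable.

```python
from enum import Enum


class MonitorState(Enum):
    """The eight monitor states (hypothetical model of the lists above)."""
    OK = "OK"
    WARNING = "Warning"
    DOWN = "Down"
    DISCOVERED = "Discovered"
    STOPPED = "Stopped"
    STOPPED_BY_DEPENDENCY = "Stopped by dependency"
    UNKNOWN = "Unknown"
    MAINTENANCE = "Maintenance"


# States in which periodic checks are performed.
POLLED_STATES = frozenset(
    {MonitorState.OK, MonitorState.WARNING, MonitorState.DOWN}
)


def is_polled(state: MonitorState) -> bool:
    """Return True if a monitor in this state is checked periodically."""
    return state in POLLED_STATES


for state in MonitorState:
    print(f"{state.value}: {'polled' if is_polled(state) else 'not polled'}")
```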

You can start or stop monitoring of any monitor, host or entire group, as well as perform a check at any moment, using the Monitor Control toolbar.

The keyboard shortcuts are: Ctrl+1/Ctrl+2/Ctrl+3 (Start/Stop/Poll now).

Dependencies

The Dependency settings section on the monitor's Main parameters tab lets you make the monitor's state depend on the state of another monitor or on remote agent availability.

  • When the Stop if selected monitor is in Down state option is selected, the current monitor is checked only if the monitor on which it depends is in a state other than Down or Stopped by Dependency. This is especially useful, for instance, for making the monitors on particular hosts depend on the state of the router through which those hosts are connected to the network, which prevents false alerts on monitor state changes when this router goes down.
  • When the Stop if selected monitor is in Warning state option is selected, the current monitor is checked only if the monitor on which it depends is in a state other than Warning or Stopped by Dependency.
  • When the Stop if selected monitor is in Ok state option is selected, the current monitor is checked only if the monitor on which it depends is in a state other than Ok or Stopped by Dependency.
  • When the Stop if selected monitor is not in Down state option is selected, the current monitor is checked only if the monitor on which it depends is in the Down or Stopped by Dependency state. This mode can be used, say, to check the availability of an alternative network connection only when the primary connection becomes unavailable.
  • When the Stop if selected monitor is not in Warning state option is selected, the current monitor is checked only if the monitor on which it depends is in the Warning state.
  • When the Stop if selected monitor is not in Ok state option is selected, the current monitor is checked only if the monitor on which it depends is in the Ok state.
  • When the Stop if connection with remote agent cannot be established option is selected, the current monitor is checked only if the remote agent containing this monitor is online. This helps to suppress a flood of alerts when the connection to an agent is lost. This option is visible only for monitors on remote agents.
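The six state-based options above can be summarised in a short Python sketch. The should_check function, the mode names ("stop_if", "stop_if_not") and the state strings are assumptions made for illustration; the sets are transcribed from the option descriptions above, and the remote agent option is left out because it depends only on whether the agent is online.

```python
SBD = "Stopped by Dependency"

# "Stop if selected monitor is in X state": the monitor is NOT checked
# while the dependency is in one of these states.
STOP_IF = {
    "Down": {"Down", SBD},
    "Warning": {"Warning", SBD},
    "Ok": {"Ok", SBD},
}

# "Stop if selected monitor is not in X state": the monitor is checked
# ONLY while the dependency is in one of these states.
STOP_IF_NOT = {
    "Down": {"Down", SBD},
    "Warning": {"Warning"},
    "Ok": {"Ok"},
}


def should_check(mode: str, selected_state: str, dependency_state: str) -> bool:
    """Decide whether the current monitor is checked (hypothetical helper)."""
    if mode == "stop_if":
        return dependency_state not in STOP_IF[selected_state]
    if mode == "stop_if_not":
        return dependency_state in STOP_IF_NOT[selected_state]
    raise ValueError(f"unknown mode: {mode}")


# Host monitors behind a router: skipped while the router monitor is Down.
print(should_check("stop_if", "Down", "Down"))       # False
print(should_check("stop_if", "Down", "Ok"))         # True

# Backup link monitor: checked only while the primary link is Down.
print(should_check("stop_if_not", "Down", "Down"))   # True
print(should_check("stop_if_not", "Down", "Ok"))     # False
```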

It is possible to indicate a PING monitor on the same host or any specific monitor on any host as a dependency using the Dependency editor dialog that opens once you click the Select… button.

By default, a monitor depends on the PING monitor on its parent host and agent link availability (for monitors on remote agents).

Available Monitor Types

Monitor Type | Functionality
PING | Sends a standard PING to the host
TCP | Checks whether the host accepts connection at a specified port number
UDP | UDP datagram send/receive on a specific port
SMTP | Checks an SMTP server with optional authentication and can send a test message
POP3 | Checks a POP3 server with optional authentication
IMAP | Checks an IMAP server with optional authentication
Mail route | Sends an e-mail message to a given address, checks that it has been delivered and then deletes that message
HTTP(S) | GET or POST HTTP/HTTPS request with optional response code and content validation
FTP | Checks an FTP server with optional authentication
Web Transaction Monitor | Checks a certain Web application (makes a sequence of related HTTP(S) requests forming a complete transaction) with optional content validation
DNS | Checks DNS server functionality
SNMP Custom | Monitors the SNMP v1 / v2c / v3 performance counters such as network traffic or system resources on SNMP-enabled devices such as routers
SNMP Generic Trap | Listens for the SNMP v1 / v2c / v3 traps sent by SNMP-enabled devices such as routers
SNMP CPU | Measures CPU usage parameters (total, user time, system time and other parameters) using data provided by the SNMP agent on the target host.
SNMP Memory | Measures memory usage parameters (free, used for either physical memory or swap space) using data provided by the SNMP agent on the target host.
SNMP Disk space | Measures free or used disk space for a specified filesystem using data provided by the SNMP agent on the target host.
SNMP Process | Shows various parameters (number of processes, CPU and memory usage) of a specified process using data provided by the SNMP agent on the target host.
SSH CPU | Measures CPU usage parameters using data provided by an SSH script running on the target host.
SSH Memory | Measures memory usage parameters using data provided by an SSH script running on the target host.
SSH Disk space | Measures free or used disk space for a specified filesystem using data provided by an SSH script running on the target host.
SSH Process | Shows various parameters of a specified process using data provided by an SSH script running on the target host.
Syslog | Listens to syslog messages (RFC 3164 and RFC 5424 are supported) sent by applications or devices.
Disk space | Monitors the free disk space of a local disk drive or a remote network share
File | Monitors a file on a local disk drive or a network share. It checks if the file exists and if the file size is in a given range
Windows Service | Monitors the presence of any Windows service on a local or remote computer. You can restart the service using the Run program alert
WMI Query (run WMI script) | Runs a custom WMI script to check some value on a local or remote computer
WMI Bytes Received/sec | Checks the current inbound throughput on a local or remote computer
WMI Bytes Send/sec | Checks the current outbound throughput on a local or remote computer
WMI Uptime | Shows the target host uptime in days according to data provided by the WMI service.
WMI CPU | Measures CPU usage parameters using data provided by the WMI service on the target host.
WMI Memory | Measures memory usage parameters using data provided by the WMI service on the target host.
WMI Disk space | Measures free or used disk space for a specified filesystem using data provided by the WMI service on the target host. Note: Unlike the generic Disk space monitor, this monitor does not require the monitored filesystem to be a network shared resource.
WMI Process | Shows various parameters (number of processes, CPU and memory usage) of a specified process using data provided by the WMI service on the target host.
Windows Event Log | Calculates the number of events added to the specified event log channel for a specified timeframe using the Windows API. The monitor is available for operating systems starting from Windows Vista and Windows Server 2008. Event Log monitoring for Windows Server 2003 is unsupported.
MS SQL Database | Checks an MS SQL database for availability with optional authentication and SQL expression execution
MySQL Database | Checks a MySQL database for availability with optional authentication and SQL expression execution
ODBC Database | Checks an ODBC data source for availability with optional authentication and SQL expression execution. You can use it to monitor MS Access, PostgreSQL, SQLite, Firebird, and other databases if they have ODBC drivers implemented
Oracle Database | Checks an Oracle database for availability with optional authentication and SQL expression execution
Python script | Runs a Python script and interprets the script return code and output as a performance value.
Script or Program | Makes it easy to create your own custom monitors. Sample monitors for directory size, file content, and HTTP response content are provided
SSH (Remote Script or Program) | Allows running commands on other computers over SSH and integrating IPHost with other systems deployed remotely.
WMI Traffic Speed | Calculates incoming, outgoing or total traffic speed (average for a polling interval) on the specified network interface using data provided by the target host WMI service
WMI Traffic Volume | Calculates incoming, outgoing or total traffic volume on the specified network interface for a specified timeframe using data provided by the target host WMI service
SNMP Traffic Speed | Calculates incoming, outgoing or total traffic speed (average for a polling interval) on the specified network interface using data provided by the target host SNMP service
SNMP Traffic Volume | Calculates incoming, outgoing or total traffic volume on the specified network interface for a specified timeframe using data provided by the target host SNMP service
Hyper-V Host | Measures CPU usage by Hyper-V guest operating systems and the Hyper-V hypervisor, deposited pages, and can show the total number of virtual machines in critical states
Hyper-V Virtual Machine | Measures Hyper-V Virtual Machine performance data using data provided by the WMI service on the target host
Hyper-V Virtual Storage | Measures error count and read and write speed on a Hyper-V virtual storage device using data provided by the WMI service on the target host
Hyper-V Network Traffic | Measures total, incoming or outgoing traffic speed or volume on the specified Hyper-V virtual network device
VMware Host | Checks the main characteristics of the hypervisor: CPU, memory, disk, network usage.
VMware Virtual Machine | Checks the main characteristics of a virtual machine: CPU, memory, disk, network, disk space usage.
VMware Datastore | Checks the main characteristics of the datastore: disk space, read/write latency.