Health diagnostics

BrickStor SP includes a health monitoring service that actively checks system vitals using probes. Probes represent the health of each component on the system. A component can have one or more probes, each representing a different aspect of the component's health.

For example, each storage Pool has a probe for the status of the Pool and a probe for the capacity of the Pool. When a probe detects an issue, it creates an alarm that can be viewed on the Health Diagnostics page. Each alarm has an associated severity. Currently, alarm severities include Warning, Error, and Critical.

The behavior of an alarm depends on the type of probe that generated it. There are currently two types of probes. The first (and most common) type is the sensor probe. Sensor probes generate alarms based on a measured value. Alarms raised by sensor probes clear automatically once the condition that caused the alarm has ceased.

The other type is the log probe. Log probes generate alarms based on values observed in a log file on the system. Unlike alarms from sensor probes, alarms generated by log probes do not resolve themselves. Instead, the operator must explicitly acknowledge or mute them.
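
The difference between the two probe types can be illustrated with a minimal Python sketch. This is not BrickStor SP code: the Probe class, its fields, and the observe and acknowledge methods are hypothetical names chosen only to model the behavior described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(Enum):
    WARNING = 1
    ERROR = 2
    CRITICAL = 3


@dataclass
class Probe:
    """Hypothetical model of a probe, for illustration only."""
    name: str
    kind: str  # "sensor" or "log" (assumed labels)
    alarm: Optional[Severity] = None

    def observe(self, severity: Optional[Severity]) -> None:
        """Record an observation; None means no problem was seen."""
        if severity is not None:
            self.alarm = severity   # both probe types raise alarms
        elif self.kind == "sensor":
            self.alarm = None       # sensor alarms clear themselves
        # log alarms stay raised until the operator acknowledges them

    def acknowledge(self) -> None:
        """Model the operator explicitly clearing the alarm."""
        self.alarm = None


capacity = Probe("pool capacity", "sensor")
log = Probe("system log", "log")
for probe in (capacity, log):
    probe.observe(Severity.WARNING)  # a problem is detected
    probe.observe(None)              # the problem goes away

print(capacity.alarm)  # None
print(log.alarm)       # Severity.WARNING
```

In this sketch the sensor alarm disappears as soon as the condition ends, while the log alarm remains until acknowledge is called.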

To view the health of the system, navigate to the Health Diagnostics page.

Acknowledging alarms

Unlike alarms from sensor probes, alarms from log probes remain active until they are acknowledged. To acknowledge an alarm, complete the following steps:

  1. Navigate to the Health Diagnostics page.

  2. Locate the probe corresponding to the alarm in the probe table.

  3. Click the gear icon beside the component name.

  4. Click Acknowledge, then click Acknowledge again on the confirmation prompt.

Muting alarms

Muting a probe prevents webhooks from being invoked and email alerts from being sent. To mute a probe, complete the following steps:

  1. Navigate to the Health Diagnostics page.

  2. Locate the probe to mute in the probe table.

  3. Click the gear icon beside the component name.

  4. Select Mute.

  5. Select how long to mute the probe using the Mute Until field. This can be forever (until explicitly unmuted) or for a duration of up to a year.

  6. By default, the probe will be unmuted if the severity of the probe changes. To disable this behavior, uncheck the Unmute if Severity Changes box.

  7. Click Mute.
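
The interaction between the Mute Until field and the Unmute if Severity Changes option can be sketched in Python as follows. This is only a model of the rules described in the steps above, not the product's implementation: the Mute class, its field names, and the should_notify function are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Mute:
    """Hypothetical mute record; field names are illustrative only."""
    until: Optional[datetime]   # None models "forever"
    severity_at_mute: str       # severity when the probe was muted
    unmute_if_severity_changes: bool = True


def should_notify(mute: Optional[Mute], severity: str, now: datetime) -> bool:
    """Return True if webhooks and email alerts would fire in this model."""
    if mute is None:
        return True   # probe is not muted
    if mute.until is not None and now >= mute.until:
        return True   # mute duration has expired
    if mute.unmute_if_severity_changes and severity != mute.severity_at_mute:
        return True   # severity changed, so the probe is unmuted
    return False


start = datetime(2024, 1, 1)
mute = Mute(until=start + timedelta(days=7), severity_at_mute="Warning")

print(should_notify(mute, "Warning", start + timedelta(days=1)))   # False
print(should_notify(mute, "Critical", start + timedelta(days=1)))  # True
print(should_notify(mute, "Warning", start + timedelta(days=8)))   # True
```

With the Unmute if Severity Changes box unchecked and Mute Until set to forever, the same model stays silent until the probe is explicitly unmuted.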

Pruning a stale probe

When a probe stops reporting data, it becomes stale. This is rare and should not normally be encountered. When it does occur, Prune # Stale appears, where # is the number of stale probes. Clicking Prune removes the stale probes.
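
As a rough illustration of staleness and pruning, the Python sketch below treats a probe as stale once it has not reported for longer than a threshold. The threshold value, the last_report mapping, and the function names are all assumptions; this document does not state how BrickStor SP decides that a probe has stopped reporting.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Assumed threshold for illustration; the real value is not documented here.
STALE_AFTER = timedelta(minutes=10)


def find_stale(last_report: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the names of probes that have not reported recently."""
    return sorted(
        name for name, seen in last_report.items()
        if now - seen > STALE_AFTER
    )


def prune(last_report: Dict[str, datetime], now: datetime) -> int:
    """Remove stale probes in place and return how many were removed."""
    stale = find_stale(last_report, now)
    for name in stale:
        del last_report[name]
    return len(stale)


now = datetime(2024, 1, 1, 12, 0)
last_report = {
    "pool-a status": now - timedelta(minutes=1),
    "old-disk temperature": now - timedelta(hours=2),
}

print(find_stale(last_report, now))  # ['old-disk temperature']
print(prune(last_report, now))       # 1
print(list(last_report))             # ['pool-a status']
```

The count returned by find_stale corresponds to the # shown in Prune # Stale, and prune models the effect of clicking Prune.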