Health Checks

Introduction

Identity Panel includes a Health Check framework for creating dynamic health monitoring of identity systems. Health checks may be displayed on the dashboard, and can be used to trigger workflows.

By default, health checks are evaluated every 60 seconds. Every health check result is saved for 24 hours, and the first health check of each hour is saved indefinitely. This makes it possible to use the reporting engine to create trend and summary reports of health check metrics.
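The retention behavior described above can be sketched in Python. This is only an illustration under stated assumptions, not Identity Panel's implementation: the function name results_to_keep is hypothetical, results are represented only by their timestamps, and "each hour" is assumed to mean each clock hour.

```python
from datetime import datetime, timedelta

def results_to_keep(timestamps, now):
    """Return the timestamps that survive the retention policy.

    Hypothetical helper: every result from the last 24 hours is kept;
    of the older results, only the first one in each clock hour is kept.
    """
    cutoff = now - timedelta(hours=24)
    kept = []
    seen_hours = set()
    for ts in sorted(timestamps):
        hour = ts.replace(minute=0, second=0, microsecond=0)
        first_of_hour = hour not in seen_hours
        seen_hours.add(hour)
        if ts >= cutoff or first_of_hour:
            kept.append(ts)
    return kept
```

With a 60 second interval, this keeps roughly 1440 recent results plus one result per hour of older history, which is what makes long-range trend reports practical.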

A health check consists of a named collection of health probes. Each probe tests a single aspect of a system's health and has Rule Engine logic to indicate whether the probe passed or failed.

Depending on which probes are used, it may be necessary to grant permissions to the SoftwareIDM Panel Service service account(s).

Settings

Health checks may be created and edited by adding a Health Check provider in the Provider Settings interface.

It is possible to adjust the health check interval, and to disable groups of health checks while systems are undergoing maintenance.

Health checks may be configured with a list of preferred servers on which to execute the check. Each health probe must be given a unique Name, and most health probes should have a Fail Rule that returns true when the probe fails.

Probe Types

Identity Panel includes a range of probe types, and it is possible to create new health probe types by adding DLL references to config.json.

Note: for any probe type, if an error occurs while executing the probe, then the Probe Result will have Error and StackTrace properties set, and the probe will be considered to have failed.
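The relationship between a probe result, its Fail Rule, and the error handling described in the note can be sketched as follows. This is a rough Python model, not the product's code: the Value, StringValue, Error, and StackTrace names come from this page, while run_probe, the Failed field, and the callable-based rule are assumptions made for illustration.

```python
import traceback
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class ProbeResult:
    Value: Any = None
    StringValue: str = ""
    Error: Optional[str] = None
    StackTrace: Optional[str] = None
    Failed: bool = False  # hypothetical field, not named in the source

def run_probe(execute, fail_rule=None):
    """Run a probe callable and apply an optional fail rule to its result.

    `execute` returns a (value, string_value) pair. Any exception marks the
    probe as failed and records the Error and StackTrace properties.
    """
    try:
        value, string_value = execute()
    except Exception as exc:
        return ProbeResult(Error=str(exc),
                           StackTrace=traceback.format_exc(),
                           Failed=True)
    result = ProbeResult(Value=value, StringValue=string_value)
    if fail_rule is not None:
        # The fail rule returns true when the probe should count as failed.
        result.Failed = bool(fail_rule(result))
    return result
```

For example, run_probe(lambda: (True, "Available"), lambda ctx: not ctx.Value) mirrors a probe whose rule is Not(context.Value) and which passes.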

For additional health probes, see the Provider documentation.

SQL Server Probes

Panel Platform includes four basic probes for monitoring database health.

SQL Available

Indicates whether a particular database could be attached successfully.

Probe Result

  • Value: true|false
  • StringValue: "Available" or "Connect Failed"

Settings

  • Instance Name
  • Database
  • User (opt)
  • Password (opt — may be overridden by Panel:Password appSetting)
  • Fail Rule — Default is Not(context.Value) , which returns true if the probe was unable to connect.

SQL Size

Measures the database size or transaction log size.

Probe Result

  • Value: integer number of KBytes
  • StringValue: "Failed" or formatted size e.g. "20.5 MB", "2.1 GB"

Settings

  • Instance Name
  • Database
  • Check: "Database" or "Log"
  • User (opt)
  • Password (opt)
  • Fail Rule — SQL Size does not specify a default fail rule, since it is impossible to anticipate the expected size.
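Because SQL Size has no default fail rule, a threshold has to be chosen per database. The Python sketch below illustrates the idea and shows one way a KByte value could be turned into a formatted string like the examples above. The function names, the 1024-based units, the one-decimal formatting, and the 10 GB limit are all assumptions, not documented behavior.

```python
def format_size(kbytes):
    """Format a size given in KBytes, assuming 1024-based units."""
    units = ["KB", "MB", "GB", "TB"]
    size = float(kbytes)
    for unit in units:
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024

def size_fail_rule(value_kb, limit_kb=10 * 1024 * 1024):
    """Hypothetical rule: fail when the database exceeds 10 GB."""
    return value_kb > limit_kb
```

A rule like this would normally be expressed in the Rule Engine against context.Value; the Python version only shows the comparison being made.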

SQL Backup

Measures how much time has elapsed since the last database backup.

Probe Result

  • Value: TimeSpan time elapsed since backup
  • StringValue: Formatted time, e.g. 3 days, 4 hours

Settings

  • Instance Name
  • Database
  • User (opt)
  • Password (opt — may be overridden by Panel:Password appSetting)
  • Fail Rule — Default rule is context.Value > Days(7) which returns true if more than a week has elapsed since the database was backed up.
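The default rule compares an elapsed time against a seven day limit. An equivalent comparison in Python, using timedelta in place of the TimeSpan value, might look like the sketch below. The function names are hypothetical, and the formatter is a simplification that ignores singular forms such as "1 day".

```python
from datetime import timedelta

def backup_fail_rule(elapsed, limit=timedelta(days=7)):
    """Return True when more than `limit` has passed since the last backup."""
    return elapsed > limit

def format_elapsed(elapsed):
    """Render a timedelta as days and whole hours."""
    hours = elapsed.seconds // 3600
    return f"{elapsed.days} days, {hours} hours"
```

Note that the comparison is strict: a backup taken exactly seven days ago does not yet fail the probe.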

SQL Disk Space

Measures how much free space remains on each drive attached to the SQL server.

Probe Result

  • Value: null
  • Disks: Dictionary<string, int> free space in MBytes { "C": n, "D": n ... }
  • StringValue: HTML un-ordered list showing free disk space

Settings

  • Instance Name
  • Database
  • User (opt)
  • Password (opt — may be overridden by Panel:Password appSetting)
  • Fail Rule: Default rule is MapOr(context.Disks, "context.Value <= 1000") which returns true if any disk has 1000 MB (roughly 1 GB) or less remaining.
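Assuming MapOr applies the quoted rule to each entry of the Disks dictionary and returns true if any entry matches, as the description suggests, the default check corresponds to the following Python sketch. The function names are hypothetical and the dictionary stands in for the Disks property.

```python
def any_disk_low(disks, threshold_mb=1000):
    """Return True if any drive has `threshold_mb` or less free space."""
    return any(free_mb <= threshold_mb for free_mb in disks.values())

def low_disks(disks, threshold_mb=1000):
    """List the drive letters at or below the threshold, sorted by name."""
    return sorted(name for name, free_mb in disks.items()
                  if free_mb <= threshold_mb)
```

A single low drive is enough to fail the probe, so the threshold should suit the smallest volume on the server.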

File Probes

Panel Platform includes probes for verifying updates to files, such as a regularly updated data feed consumed by the Sync Engine.

File Time Stamp

Verifies that a file has been created or modified within a certain time range.

Probe Result

  • Value: TimeSpan, time elapsed since LastWriteTimeUtc or CreationTimeUtc
  • StringValue: Formatted time string e.g. "2 hours, 20 minutes"

Settings

  • File Name: Path to file to verify
  • Time Stamp: "Created" or "Modified"
  • Fail Rule: Default rule is context.Value > Hours(24) which returns true if more than a day has elapsed.
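As an illustration of what this probe measures, the Python sketch below computes the time elapsed since a file was last modified and applies the 24 hour default. It is not the probe's implementation: the function names are hypothetical, and only the "Modified" case is shown because file creation time is reported differently across operating systems.

```python
import os
from datetime import datetime, timedelta, timezone

def time_since_modified(path, now=None):
    """Return the time elapsed since the file's last write, in UTC."""
    now = now or datetime.now(timezone.utc)
    modified = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    return now - modified

def timestamp_fail_rule(elapsed, limit=timedelta(hours=24)):
    """Return True when more than `limit` has elapsed."""
    return elapsed > limit
```

For a feed that is refreshed hourly, a shorter limit than the default would detect a stalled feed sooner.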

File Exists

Verifies that a file exists at the specified path.

Probe Result

  • Value: true|false
  • StringValue: "Exists" or "Does not Exist"

Settings

  • File Name: Path to file to verify
  • Fail Rule: Default rule is Not(context.Value) which returns true if the file does not exist.

File Size

Verifies that a file is in the expected size range and has not been truncated.

Probe Result

  • Value: integer, size of file in bytes
  • StringValue: Formatted size string e.g. "2.1 MB"

Settings

  • File Name: Path to file to verify.
  • Fail Rule: Default rule is context.Value < 1024 which returns true if the file is less than 1 KB in size.

Network Probes

Panel Platform includes probes for verifying that web pages are available and functioning as expected.

HTTP Request

Performs an HTTP GET request to a target URL and checks the HTTP Response status code.

Probe Result

  • Value: integer, numeric status code
  • StringValue: Value of status code e.g. "200"

Settings

  • Url
  • User (opt)
  • Password (opt — may be overridden by Panel:Password appSetting)
  • Fail Rule: Default rule is (context.Value >= 400) || (context.Value < 200) which returns true if the response is outside the OK/Found or Redirect range (status codes 200 to 399).
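The intent of this rule, failing whenever the status code falls outside the 200 to 399 range, can be checked with a short Python sketch. The function names are hypothetical. The two comparisons have to be joined with a logical OR; joined with AND, the rule could never fire, because no status code is both at least 400 and below 200.

```python
def http_fail_rule(status):
    """Return True when the status is outside the 200-399 range."""
    return status >= 400 or status < 200

def contradictory_rule(status):
    """The same comparisons joined with AND; this can never be True."""
    return status >= 400 and status < 200
```

Custom rules can be stricter, for example failing on anything other than 200 when redirects are not expected.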

Resolve DNS

Resolves a DNS name and verifies it maps to an IP address.

Probe Result

  • Value: List<string> of IP addresses returned
  • StringValue: HTML unordered list of IP Addresses, or "Not Found"

Settings

  • DNS Name
  • Fail Rule — Default rule is Not(context.Value.Count) which returns true if the host could not be resolved.
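A comparable lookup can be sketched in Python with the standard socket module. This is an approximation for illustration: the function names are hypothetical, and the actual probe may resolve names differently (for example in which record types it returns).

```python
import socket

def resolve(dns_name):
    """Return the sorted, de-duplicated IP addresses for a name, or []."""
    try:
        infos = socket.getaddrinfo(dns_name, None)
    except socket.gaierror:
        return []
    # The first element of each sockaddr tuple is the address string,
    # for both IPv4 and IPv6 entries.
    return sorted({info[4][0] for info in infos})

def dns_fail_rule(addresses):
    """Mirror Not(context.Value.Count): fail when no address was returned."""
    return not len(addresses)
```

An empty list has a count of zero, so the rule fails exactly when the name could not be resolved.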

EWS Available

Performs an HTTP GET request to /ews/exchange.asmx to verify that Exchange Web Service is available.

Probe Result

  • Value: true|false
  • StringValue: "Available" or "Connect Failed"

Settings

  • Mail Host — should be the FQDN of the mail host
  • Use SSL — Whether to use SSL to create the query to the Exchange Web Service
  • User
  • Password (opt — may be overridden by Panel:Password appSetting)
  • Fail Rule — Default rule is Not(context.Value) which returns true if the web service is not available.

Miscellaneous Probes

Performance Counter

Checks a Windows Performance counter, and applies a rule to the result.

The Performance Counter display has instructions for using the MMC Performance Monitoring snap-in to look up the appropriate Counter, Category, and Instance names.

Probe Result

  • Value: float value returned by PerformanceCounter.NextValue()
  • StringValue: Value.ToString()

Settings

  • Server: Host to check performance counters on
  • Counter Name
  • Category Name
  • Instance Name
  • Fail Rule — No default rule is set.

Service Status

Checks the run status of a Windows service.

Probe Result

  • Value: integer status code
  • StringValue: ServiceControllerStatus enum string: e.g. "Running", "Stopped", "Paused"

Settings

  • Server
  • Service Name
  • Fail Rule — Default rule is context.StringValue != "Running" which returns true if the service is in any state other than running.

Searches recent Identity Panel history to assert against time ranges. For example, it can be used to make sure that a particular schedule or MA has executed within a time window.

Probe Result

  • Value: value produced by Object Rule
  • StringValue: Value.ToString()

Settings

The first part of the settings constructs a history REST API query.

  • Provider
  • History
  • Argument
  • Result

The second part gives a time range to search within (e.g. 3 hours, 1 day). The results of the query get passed to Object Rule which converts from a list of history objects to a single value.

  • Object Rule – Default is Count which is the length of the list
  • Fail Rule – Default is Value == 0
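The flow described above (filter history to a time range, reduce the list to a single value with the Object Rule, then apply the Fail Rule) can be sketched in Python. Everything here is an assumption made for illustration: the function name, the representation of history objects as dictionaries with a "time" key, and the use of plain callables in place of Rule Engine expressions.

```python
from datetime import datetime, timedelta

def history_probe(history, now, window,
                  object_rule=len,
                  fail_rule=lambda value: value == 0):
    """Evaluate a history probe over the `window` ending at `now`.

    `object_rule` converts the list of matching history objects into a
    single value (the default counts them); `fail_rule` decides whether
    that value means failure (the default fails on a count of zero).
    """
    start = now - window
    recent = [item for item in history if start <= item["time"] <= now]
    value = object_rule(recent)
    return value, fail_rule(value)
```

With the defaults, the probe fails when nothing matching the query ran inside the window, which is how a missed schedule would show up.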

Copyright © SoftwareIDM
