Aggregated events and alerts based on KPI thresholds
Written by Team Celona
Updated over a week ago

Feature Overview

To improve network monitoring, Orchestrator introduces an alerting feature that measures KPIs (Key Performance Indicators) and regularly generates site-based events. Users can set customizable thresholds and be alerted when these events meet them.

Key Alerts

  1. SITEAVAILABILITYALERT: Monitors the site-level Access Point (AP) count and AP health status

  2. RECURRINGPODCRASHALERT: Monitors edge-level pods by measuring the recurring pod crash count

Support Guide

New events are generated automatically by the backend event generator batch process and appear in the Events section. Each event carries a severity level of INFO or WARN, depending on its criticality.

Site Availability Based Events

Pod Crash Based Events

To generate alarms for these new events, enable the threshold-based alerts from the Alerts section. Once SITEAVAILABILITYALERT and RECURRINGPODCRASHALERT are activated, the backend notification engine processes the corresponding events and checks them against the threshold. When the measured KPI meets the threshold condition, an alert is generated. The alert appears in the Alerts section and triggers notifications, for example on a Slack channel.
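The flow described above (only enabled alert types are evaluated, and a breach produces both an alert and a notification) can be sketched roughly as follows. This is an illustrative sketch only, not Orchestrator code: the `Event` class, its `kpi_value` field, the `process_events` function, and the `notify` callback are all hypothetical names, and the strict "greater than" comparison is an assumption based on the "more than" wording of the default thresholds.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Event:
    """Hypothetical shape of a site-based event."""
    alert_type: str   # e.g. "SITEAVAILABILITYALERT"
    site: str         # site the event was generated for
    kpi_value: float  # measured KPI carried by the event (assumed field)


def process_events(
    events: Iterable[Event],
    enabled_thresholds: Dict[str, float],
    notify: Callable[[str], None],
) -> List[Event]:
    """Return the events that become alerts.

    enabled_thresholds maps each *enabled* alert type to its threshold;
    alert types missing from the mapping are treated as disabled.
    """
    alerts: List[Event] = []
    for event in events:
        threshold = enabled_thresholds.get(event.alert_type)
        if threshold is None:
            continue  # alert type not enabled, so the event never becomes an alert
        if event.kpi_value > threshold:  # assumed comparison: strictly greater
            alerts.append(event)
            notify(f"{event.alert_type} at {event.site}: {event.kpi_value} > {threshold}")
    return alerts
```

In this sketch, passing `print` (or a function that posts to a chat webhook) as `notify` stands in for the notification step; how Orchestrator actually delivers notifications is not covered here.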

Default Thresholds

  1. SITEAVAILABILITYALERT: Triggers when AP availability drops by more than 10% in a given 5-minute time window.

  2. RECURRINGPODCRASHALERT: Activates when more than 3 pods crash within a minute.
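As a rough illustration of the two default thresholds, the checks could look like the sketch below. The function names are hypothetical, and the reading of "drops by more than 10%" as a relative drop in the number of available APs between the start and end of one 5-minute window is an assumption; this article does not specify how Orchestrator computes the KPI.

```python
# Hypothetical illustration of the default thresholds; not Orchestrator code.

SITE_AVAILABILITY_DROP_PCT = 10.0  # default: more than 10% drop in a 5-minute window
POD_CRASH_LIMIT = 3                # default: more than 3 crashes within a minute


def availability_drop_pct(aps_up_start: int, aps_up_end: int) -> float:
    """Percentage drop in available APs across one window (assumed definition)."""
    if aps_up_start <= 0:
        return 0.0  # nothing was available at the start, so no drop can be measured
    drop = (aps_up_start - aps_up_end) * 100.0 / aps_up_start
    return max(0.0, drop)  # an increase in available APs counts as no drop


def site_availability_breached(
    aps_up_start: int,
    aps_up_end: int,
    threshold_pct: float = SITE_AVAILABILITY_DROP_PCT,
) -> bool:
    """True when availability dropped by MORE than the threshold."""
    return availability_drop_pct(aps_up_start, aps_up_end) > threshold_pct


def pod_crash_breached(crashes_in_minute: int, limit: int = POD_CRASH_LIMIT) -> bool:
    """True when MORE than `limit` crashes were counted in one minute."""
    return crashes_in_minute > limit
```

Under these assumptions, a site going from 20 available APs to 18 (a 10% drop) would not trigger the alert, while a drop to 17 (15%) would. Likewise, exactly 3 crashes in a minute would not trigger RECURRINGPODCRASHALERT, but 4 would.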

Alerts View

The Alerts section in Orchestrator shows each new alert generated when a threshold is met. For example, if site availability degrades by more than 10% in a 5-minute time window, the corresponding SITEAVAILABILITYALERT alert is generated.

Similarly, if there are more than 3 recurring pod crashes in any given minute, a new RECURRINGPODCRASHALERT alert is generated.
