Understanding Alarm Levels
An alarm level (also called an alarm threshold, alarm limit, or alarm setpoint) is a predefined vibration value that, when exceeded, triggers an alert, notification, or automated action in a condition-monitoring system. Alarm levels define the boundary between acceptable and unacceptable operation, automatically flagging conditions that require investigation or intervention. They transform continuous streams of measurement data into actionable information by highlighting the exceptions that demand attention.
Setting alarm levels well is critical to a monitoring programme’s success. Limits that are too sensitive produce false alerts and, with them, alarm fatigue; limits that are too lenient miss real problems until they reach an advanced stage. Effective alarm levels balance early detection against practical response capability, drawing on equipment criticality, historical baseline data, and industry standards.
1. The Multi-Level Alarm Philosophy
Rather than a single pass/fail line, mature programmes use a tiered structure so that a rising trend is caught early and escalated progressively. A typical structure runs from a healthy normal range up to an automatic trip:
- Normal range: below the alert level, the equipment is healthy and routine monitoring continues. Typically below 1.5–2× baseline, or below the ISO 20816 Zone B limit.
- Alert (caution): around 2–3× baseline, or entering ISO Zone C. The condition is degrading and the cause should be investigated — increase monitoring frequency, plan an inspection, and establish the trend. Timeline: maintenance within weeks to months.
- Alarm (warning): around 4–6× baseline, or upper Zone C. A significant problem needing urgent attention: perform detailed diagnosis, monitor daily, and schedule maintenance promptly. Timeline: repair within 1–4 weeks. This middle tier is often labelled a warning level.
- Danger (critical): around 8–10× baseline, or entering ISO Zone D. A severe condition with imminent failure risk — plan an immediate shutdown and repair. Timeline: days, with continuous monitoring until repaired.
- Trip (shutdown): catastrophic failure is imminent and the equipment must be stopped to prevent damage. A trip level is a protective function, implemented through online monitoring with automatic shutdown capability.
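The tiers above can be sketched as a simple lookup on the ratio of a reading to its baseline. This is a minimal illustration, not a definitive implementation: the boundary multiples (2, 4 and 8) are assumptions taken from the low end of the ranges quoted in the list, the names `TIER_BOUNDS`, `TIER_NAMES` and `classify` are hypothetical, and the trip tier is left out because the text gives no multiple for it.

```python
from bisect import bisect_right

# Assumed lower bounds of the alert, alarm and danger tiers, expressed as
# multiples of baseline. Tune these per machine; they are not prescribed values.
TIER_BOUNDS = [2.0, 4.0, 8.0]
TIER_NAMES = ["normal", "alert", "alarm", "danger"]


def classify(reading: float, baseline: float) -> str:
    """Return the alarm tier for a reading, given the machine's baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    ratio = reading / baseline
    # bisect_right counts how many tier bounds the ratio has reached or passed,
    # which is exactly the index of the tier name.
    return TIER_NAMES[bisect_right(TIER_BOUNDS, ratio)]


if __name__ == "__main__":
    for value in (1.5, 3.0, 5.0, 9.0):
        print(value, classify(value, baseline=1.0))
```

A reading exactly on a boundary is assigned to the higher tier here; whether that is the right convention is a design choice for the programme.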
2. Alarm-Setting Methods
There are four widely used ways to arrive at a numeric limit, each with its own strengths.
Baseline-referenced alarms
Machine-specific limits derived from historical data — for example alert at 2× baseline, alarm at 4× baseline, danger at 8× baseline. The advantage is that they are customised to each machine’s normal operation; the requirement is good baseline data captured when the machine was known to be healthy.
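As a rough sketch of the baseline-referenced approach, the limits can be derived directly from readings taken while the machine was known to be healthy. The 2×, 4× and 8× factors come from the example above; using the median of the healthy readings as the baseline is an assumption made here for illustration, and the function name `baseline_limits` is hypothetical.

```python
from statistics import median

# Factors from the example in the text: alert 2x, alarm 4x, danger 8x baseline.
FACTORS = {"alert": 2.0, "alarm": 4.0, "danger": 8.0}


def baseline_limits(healthy_readings, factors=FACTORS):
    """Derive alarm limits from readings captured on a healthy machine.

    The baseline is taken as the median of the readings (an assumption; a
    mean or another robust estimate could be used instead).
    """
    if not healthy_readings:
        raise ValueError("at least one healthy reading is required")
    baseline = median(healthy_readings)
    return {tier: baseline * factor for tier, factor in factors.items()}


if __name__ == "__main__":
    # Hypothetical overall velocity readings in mm/s RMS.
    print(baseline_limits([1.0, 1.2, 1.1, 0.9, 1.0]))
```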
Standards-based alarms
Limits taken from ISO 20816 or other industry standards, where zone boundaries define the alarm levels according to machine type and size. The advantage is that they are standardised and widely accepted; the limitation is that they may not match a specific machine’s individual characteristics. You can locate a machine against the relevant zones with an ISO 20816 vibration-zone tool.
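A zone lookup of this kind might look like the sketch below. The function `iso_zone` is hypothetical, and the boundary values in `EXAMPLE_BOUNDARIES` are placeholders for illustration only: the real A/B, B/C and C/D boundaries depend on machine type, size and mounting, and must be taken from the applicable part of ISO 20816.

```python
# Placeholder zone boundaries (A/B, B/C, C/D) in mm/s RMS. These are assumed
# example numbers, NOT a substitute for the values in the standard.
EXAMPLE_BOUNDARIES = (2.3, 4.5, 7.1)


def iso_zone(v_rms: float, boundaries=EXAMPLE_BOUNDARIES) -> str:
    """Return the zone letter (A to D) for an overall velocity reading."""
    ab, bc, cd = boundaries
    if not ab < bc < cd:
        raise ValueError("boundaries must be strictly increasing")
    if v_rms <= ab:
        return "A"
    if v_rms <= bc:
        return "B"
    if v_rms <= cd:
        return "C"
    return "D"


if __name__ == "__main__":
    for reading in (1.0, 3.0, 5.0, 8.0):
        print(reading, iso_zone(reading))
```

Treating a reading exactly on a boundary as belonging to the lower zone is a convention chosen for this sketch.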
Statistical alarms
Limits based on the mean and standard deviation of historical data — alert at mean + 2σ, alarm at mean + 3σ. The advantage is that they adapt to each machine’s natural variability; the requirement, again, is sufficient historical data to make the statistics meaningful.
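The mean-plus-sigma rule can be sketched with Python's standard `statistics` module. The function name `statistical_limits` is hypothetical, and the use of the sample standard deviation (`statistics.stdev`) rather than the population form is an assumption; the text does not say which is intended.

```python
from statistics import mean, stdev


def statistical_limits(history):
    """Compute alert (mean + 2 sigma) and alarm (mean + 3 sigma) limits.

    Uses the sample standard deviation, which needs at least two readings;
    in practice far more history is needed for the limits to be meaningful.
    """
    if len(history) < 2:
        raise ValueError("at least two historical readings are required")
    mu = mean(history)
    sigma = stdev(history)
    return {"alert": mu + 2 * sigma, "alarm": mu + 3 * sigma}


if __name__ == "__main__":
    # Hypothetical trend of overall readings in mm/s RMS.
    print(statistical_limits([1.0, 2.0, 3.0]))
```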
Component-specific alarms
Separate limits for different spectrum components: a 1× alarm for unbalance, dedicated bearing-frequency alarms, and gear-mesh alarms. The advantage is specific fault detection: a band alarm reacts to the fault it is tuned for well before the overall level moves.
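One way to picture a band alarm is as a frequency window with its own limit, checked against the peak amplitude inside that window. The sketch below assumes the spectrum is already available as (frequency, amplitude) pairs; the `Band` class, the `band_alarms` function, and every frequency and limit in the example are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Band:
    name: str
    f_low: float   # lower edge of the band, Hz
    f_high: float  # upper edge of the band, Hz
    limit: float   # alarm limit for the peak amplitude in the band


def band_alarms(spectrum: Sequence[tuple[float, float]],
                bands: Sequence[Band]) -> list[str]:
    """Return the names of bands whose peak amplitude exceeds their limit."""
    tripped = []
    for band in bands:
        # Peak amplitude among the spectral lines that fall inside the band;
        # an empty band is treated as having zero amplitude.
        peak = max(
            (amp for freq, amp in spectrum if band.f_low <= freq <= band.f_high),
            default=0.0,
        )
        if peak > band.limit:
            tripped.append(band.name)
    return tripped


if __name__ == "__main__":
    # Hypothetical machine running at 25 Hz (1500 rpm).
    bands = [
        Band("1x unbalance", 22.5, 27.5, 2.0),
        Band("bearing frequency", 80.0, 95.0, 0.5),
    ]
    spectrum = [(25.0, 3.0), (89.0, 0.2), (500.0, 0.1)]
    print(band_alarms(spectrum, bands))
```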
3. Alarm-Response Procedures
An alarm level is only useful if a defined response follows it. Each tier carries its own actions:
- Alert level: review the trend to confirm it is not a false alarm, increase monitoring frequency, check for recent maintenance or operating changes, plan a more detailed analysis, and continue operation under closer watch.
- Alarm level: carry out detailed analysis (FFT and envelope analysis), identify the specific fault, raise a work order, schedule maintenance within 1–4 weeks, and monitor daily or continuously until repair.
- Danger / trip level: make an immediate engineering assessment, plan a rapid shutdown and repair, prepare spare parts and resources, judge whether continued operation is safe, and execute the repair at the first opportunity.
4. Common Alarm-Setting Mistakes
Three failure modes recur across monitoring programmes:
- Too sensitive: frequent false alarms, alarm fatigue (operators begin to ignore alarms), wasted investigation time, and loss of credibility for the whole programme.
- Too lenient: problems reach advanced stages before detection, lead time for planning shrinks, repair costs rise, and the risk of in-service failure grows.
- One-size-fits-all: the same alarm applied to every equipment type ignores real differences between machines, producing either too many false alarms or missed problems. Machine-specific alarms are strongly preferred.
5. Optimisation and Tuning
Alarm levels are not set once and forgotten; they are refined as experience accumulates:
- Initial settings: start conservative (tighter), based on standards or baseline multiples, watch the false-alarm rate closely, and adjust as familiarity grows.
- Refinement: track alarm performance (true versus false), adjust limits to push the false-alarm rate down, and document every change and its rationale. A practical target is a false-alarm rate below roughly 5–10%.
- Continuous improvement: learn from missed failures (alarms set too lenient) and from false alarms (set too sensitive), incorporate new data and experience, and review alarm levels periodically, typically once a year.
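Tracking alarm performance can be as simple as logging whether each alarm was later confirmed as a real fault and computing the false-alarm rate from that log. The sketch below is one possible way to do it: the function names are hypothetical, and the default target of 10% is an assumption taken from the upper end of the 5–10% range mentioned above.

```python
def false_alarm_rate(outcomes):
    """Fraction of alarms that turned out to be false.

    `outcomes` is a list of booleans, one per alarm: True if investigation
    confirmed a real fault, False if the alarm was a false one.
    """
    if not outcomes:
        raise ValueError("no alarms recorded")
    false_count = sum(1 for confirmed in outcomes if not confirmed)
    return false_count / len(outcomes)


def needs_retuning(outcomes, target=0.10):
    """True if the false-alarm rate exceeds the target (assumed 10% here)."""
    return false_alarm_rate(outcomes) > target


if __name__ == "__main__":
    # Hypothetical review period: 17 confirmed faults, 3 false alarms.
    log = [True] * 17 + [False] * 3
    print(false_alarm_rate(log), needs_retuning(log))
```

The rate only measures false alarms; missed failures have to be reviewed separately, since they never appear in an alarm log.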
6. Alarm Levels in Field Practice
For machines outside a permanent online system, alarm levels are applied during route-based or ad-hoc field measurement. A portable two-channel analyser such as the Балансет-1А lets an engineer capture amplitude, phase and a full spectrum on the spot and compare each reading against the chosen ISO 20816 zone or baseline multiple, turning a quick site visit into a clear go/no-go decision. When a 1× alarm flags rising unbalance, the same instrument is used to correct it by field balancing without removing the rotor.
In short, alarm levels are the decision boundaries that convert condition-monitoring measurements into action. Set well (balancing sensitivity against specificity, matched to equipment criticality and deterioration rate, and continuously refined), they catch real problems early while keeping false alerts to a minimum.