Understanding Thresholds in Condition Monitoring
A threshold (also called a limit, setpoint, or trigger value) is a predefined level that separates normal from abnormal behaviour in a condition monitoring system. When a measured parameter such as vibration, temperature, or pressure crosses its threshold, the crossing triggers an action: an alarm notification, an automatic data capture, a work order, or in the most serious case an equipment shutdown. Thresholds are the decision boundaries that turn a continuous stream of measurement data into discrete, actionable events, letting an automated system flag the few exceptions that genuinely need human attention.
Choosing those boundaries well is fundamental to a monitoring programme’s success. Every threshold is a compromise between sensitivity (catching problems early) and specificity (not crying wolf), and it should reflect the equipment’s criticality, the failure modes you expect, and how much operational risk the site is willing to carry.
1. Types of Threshold
Thresholds differ in what they compare a reading against. Four families cover almost every monitoring scheme, and mature programmes often run several in parallel on the same machine.
Absolute thresholds
- Fixed values in engineering units (mm/s, °C, bar) — for example, alarm if velocity exceeds 7.1 mm/s.
- Drawn from standards such as ISO 20816 (the modern successor to ISO 10816), equipment specifications, or accumulated experience.
- The same limit applies regardless of the machine’s individual history — simple to understand, document and defend.
Relative thresholds
- Defined as a multiple of a baseline or reference reading — for example, alarm if vibration exceeds 3× baseline.
- Adapt to each machine’s own healthy signature, so they are more sensitive to genuine change.
- Only as good as the baseline behind them: a baseline recorded on an already-degraded machine, or under unrepresentative operating conditions, shifts every alarm with it.
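A relative threshold can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names, the use of the median as the baseline statistic, and the sample readings are all assumptions made for the example; only the 3× factor comes from the text above.

```python
from statistics import median


def relative_limit(baseline_readings, factor=3.0):
    """Alarm limit as a multiple of the median healthy reading.

    The median is an assumed choice; it is less affected by a single
    stray reading than the mean would be.
    """
    if not baseline_readings:
        raise ValueError("need at least one baseline reading")
    return factor * median(baseline_readings)


def exceeds_relative(reading, baseline_readings, factor=3.0):
    """True when the reading is above factor x baseline."""
    return reading > relative_limit(baseline_readings, factor)


# Hypothetical healthy readings for one machine, in mm/s.
baseline = [1.1, 1.0, 1.2, 0.9, 1.0]
limit = relative_limit(baseline)
alarm = exceeds_relative(3.5, baseline)
```

Because the limit is computed per machine, two machines of the same model but with different healthy levels end up with different alarm values.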
Rate-of-change thresholds
- Watch how fast a value is moving rather than its level — for example, alarm if vibration rises more than 50% in a week.
- Catch accelerating deterioration early and are independent of the absolute reading, complementing classic trend analysis.
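The weekly rate-of-change check described above could look roughly like the following sketch. The function names and the idea of comparing exactly two readings taken a week apart are assumptions for illustration; a real system might fit a trend line through many points instead.

```python
def percent_change(previous, current):
    """Percentage change from the previous reading to the current one."""
    if previous <= 0:
        raise ValueError("previous reading must be positive")
    return 100.0 * (current - previous) / previous


def rate_of_change_alarm(week_ago, today, max_rise_pct=50.0):
    """True when the reading has risen by more than max_rise_pct in a week."""
    return percent_change(week_ago, today) > max_rise_pct


# Hypothetical readings in mm/s: a 60% rise trips the 50% limit even
# though 3.2 mm/s might still sit below an absolute limit.
tripped = rate_of_change_alarm(2.0, 3.2)
```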
Statistical thresholds
- Derived from the statistics of historical data — for example, alarm if a value exceeds the mean plus three standard deviations.
- Account for normal scatter and adapt to process variation, but need enough history to be trustworthy.
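As a rough sketch, the mean-plus-three-standard-deviations rule can be computed with Python's standard library. The function name and the `min_samples` guard of 20 readings are assumptions chosen for the example, not values taken from any standard.

```python
from statistics import mean, stdev


def statistical_limit(history, k=3.0, min_samples=20):
    """Alarm limit at mean + k sample standard deviations of the history.

    min_samples is an assumed guard against trusting too little data.
    """
    if len(history) < min_samples:
        raise ValueError(
            f"need at least {min_samples} readings, got {len(history)}"
        )
    return mean(history) + k * stdev(history)


# Hypothetical history alternating between 2.0 and 2.2 mm/s.
history = [2.0, 2.2] * 10
limit = statistical_limit(history)
```

The guard makes the dependence on history explicit: with too few readings the function refuses to produce a limit instead of returning an unreliable one.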
2. Approaches to Setting Thresholds
Where the numbers themselves come from is a separate question from the type. Three approaches are common, and they are frequently blended.
Standards-based
- Uses published zone boundaries — ISO 20816 vibration zones, or industry codes such as API and NEMA.
- Advantages: proven, documented and easy to justify to an auditor.
- Limitations: generic by design; a broad standard cannot fit every machine and mounting.
Experience-based
- Built from a site’s own record of failures and good runs, refined over years.
- Advantages: genuinely specific to the equipment and its duty.
- Limitations: takes time and expertise to accumulate.
Risk-based
- Sets the limit according to the consequence of failure: tighter thresholds on high-consequence assets, looser ones where a stoppage is cheap.
- Aligns the monitoring effort with critical machinery priorities and optimises total programme cost against risk.
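One simple way to express a risk-based adjustment is to scale a base limit by a factor tied to the asset's criticality class. The sketch below is purely illustrative: the class names and the factors 0.8, 1.0 and 1.25 are hypothetical and would need to come from a site's own risk assessment.

```python
# Hypothetical scaling factors: tighter limits on high-consequence assets.
CRITICALITY_FACTOR = {"high": 0.8, "medium": 1.0, "low": 1.25}


def risk_adjusted_limit(base_limit, criticality):
    """Scale a base limit according to the asset's criticality class."""
    try:
        factor = CRITICALITY_FACTOR[criticality]
    except KeyError:
        raise ValueError(f"unknown criticality class: {criticality!r}") from None
    return base_limit * factor


# A 7.1 mm/s base limit becomes tighter for a high-criticality machine.
tight = risk_adjusted_limit(7.1, "high")
loose = risk_adjusted_limit(7.1, "low")
```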
3. The Alarm Hierarchy
A single threshold rarely suffices. Most systems stack several so that a developing fault is met with a graduated response rather than a single binary trip:
- A first warning level flags an early, watch-this deviation.
- An alarm level calls for planned intervention at the next available maintenance opportunity.
- A trip level drives an automatic shutdown to prevent catastrophic damage.
This layering buys decision time: the gap between the first warning and the trip is the lead time the maintenance team has to act, which is the whole point of early-warning monitoring.
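The three-level hierarchy maps naturally onto an ordered classification. The following is a minimal sketch: the `Severity` enum and `classify` function are names invented for this example, and the warning and trip values of 4.5 and 11.0 mm/s are hypothetical (only the 7.1 mm/s alarm figure appears earlier in this article).

```python
from enum import IntEnum


class Severity(IntEnum):
    NORMAL = 0
    WARNING = 1
    ALARM = 2
    TRIP = 3


def classify(reading, warning, alarm, trip):
    """Return the highest severity level whose limit the reading has reached."""
    if not warning < alarm < trip:
        raise ValueError("limits must satisfy warning < alarm < trip")
    if reading >= trip:
        return Severity.TRIP
    if reading >= alarm:
        return Severity.ALARM
    if reading >= warning:
        return Severity.WARNING
    return Severity.NORMAL


# Hypothetical limits in mm/s for one machine.
LIMITS = {"warning": 4.5, "alarm": 7.1, "trip": 11.0}
level = classify(8.0, **LIMITS)
```

Using an ordered enum means severities can be compared directly, for example `level >= Severity.ALARM`, when deciding who to notify.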
4. Common Pitfalls
Too tight (over-sensitive)
- Result: a flood of false alarms.
- Effect: alarm fatigue and wasted investigation time — and the real danger that a genuine alarm is ignored in the noise.
- Fix: relax the limit guided by the measured false-alarm rate.
Too loose (over-lenient)
- Result: problems caught late.
- Effect: shorter lead time, higher repair cost, sometimes failure before any alarm at all.
- Fix: tighten the limit and increase monitoring frequency.
One-size-fits-all
- Applying the same limit to dissimilar machines ignores their differences — it is simultaneously too tight for some and too loose for others. Equipment-specific thresholds are almost always preferable.
5. Optimisation, Validation and Documentation
A threshold is not set once and forgotten; it is tuned over the life of the programme.
- Initial setting: start from a standard or a conservative estimate, record the rationale, and plan to refine it.
- Tuning: count true versus false alarms, aiming for fewer than about 10% false alarms while still catching more than about 90% of real problems; tighten if you are missing faults, loosen if you are drowning in nuisance trips, and document every change.
- Validation: after each real failure, ask whether the threshold gave adequate warning and whether any false alarms wasted resources, then adjust against those outcomes.
- Governance: keep a threshold database holding current values, change history and rationale for every machine, under a formal change-control process with engineering review and communication to operations.
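The tuning targets above can be turned into a small bookkeeping routine. This sketch makes several assumptions that the text does not fix: the "10% false alarms" figure is read as the share of all raised alarms that turned out to be false, missed faults take priority over nuisance alarms when both targets are violated, and the function names are invented for the example.

```python
def alarm_metrics(true_alarms, false_alarms, missed_faults):
    """Return (false_alarm_fraction, detection_rate) from review counts."""
    total_alarms = true_alarms + false_alarms
    total_faults = true_alarms + missed_faults
    false_alarm_fraction = false_alarms / total_alarms if total_alarms else 0.0
    detection_rate = true_alarms / total_faults if total_faults else 1.0
    return false_alarm_fraction, detection_rate


def tuning_advice(true_alarms, false_alarms, missed_faults,
                  max_false_fraction=0.10, min_detection=0.90):
    """Suggest 'tighten', 'loosen' or 'hold' for a threshold."""
    false_fraction, detection = alarm_metrics(
        true_alarms, false_alarms, missed_faults
    )
    if detection < min_detection:
        return "tighten"  # assumed priority: missed faults come first
    if false_fraction > max_false_fraction:
        return "loosen"
    return "hold"


# Hypothetical review period: 45 confirmed alarms, 15 nuisance alarms,
# 2 faults that were found without any alarm.
advice = tuning_advice(45, 15, 2)
```

Whatever the routine suggests, the change itself still belongs in the threshold database with its rationale, as the governance point above requires.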
Thresholds can be applied to more than the overall level, too: dedicated limits on specific spectral lines such as bearing fault frequencies or 1× and 2× running speed give earlier, more specific fault detection, and limits on derived metrics such as crest factor and kurtosis can flag impacting bearing damage long before the broadband level moves.
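Crest factor and kurtosis are both straightforward to compute from a block of time-waveform samples. The sketch below uses the common definitions (peak divided by RMS, and the fourth central moment divided by the squared variance, so that a Gaussian signal gives about 3); the synthetic signals are invented purely to show how a single impact moves both metrics.

```python
import math


def crest_factor(samples):
    """Peak absolute value divided by the RMS value."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    if rms == 0:
        raise ValueError("signal has zero RMS")
    return max(abs(x) for x in samples) / rms


def kurtosis(samples):
    """Non-excess (Pearson) kurtosis: about 3 for Gaussian noise."""
    n = len(samples)
    m = sum(samples) / n
    m2 = sum((x - m) ** 2 for x in samples) / n
    m4 = sum((x - m) ** 4 for x in samples) / n
    if m2 == 0:
        raise ValueError("signal has zero variance")
    return m4 / (m2 * m2)


# One full period of a pure sine: crest factor sqrt(2), kurtosis 1.5.
sine = [math.sin(2 * math.pi * k / 1000) for k in range(1000)]

# Same signal with one hypothetical impact spike added.
impulsive = list(sine)
impulsive[100] = 8.0

smooth_cf, smooth_k = crest_factor(sine), kurtosis(sine)
impact_cf, impact_k = crest_factor(impulsive), kurtosis(impulsive)
```

The single spike barely changes the RMS level, yet both derived metrics rise sharply, which is why limits on them can react before a broadband limit does.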
6. Thresholds in Field Balancing
The threshold concept reaches beyond surveillance into corrective work. When a rotor is balanced, the acceptance limit is itself a threshold: the job is finished only when the measured residual unbalance — or the resulting vibration — drops below the chosen tolerance. A portable two-channel analyser such as the Balanset-1A measures the 1× amplitude before and after correction and confirms the final reading sits inside the target balancing tolerance band, which is exactly a pass/fail threshold applied to a single repair rather than to continuous monitoring.
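As an illustration of a balancing acceptance check, the sketch below assumes the tolerance is expressed as a balance quality grade G in the style of ISO 21940-11, where the permissible residual unbalance in g·mm is 9549 × G × rotor mass (kg) / service speed (rpm). That choice of tolerance scheme, the function names and the example rotor are assumptions; the text above does not say which tolerance the job uses, and this is not a description of how any particular instrument computes its result.

```python
def permissible_unbalance_gmm(grade_g, rotor_mass_kg, speed_rpm):
    """Total permissible residual unbalance in g*mm for a quality grade G.

    Follows U_per = 9549 * G * m / n, with G in mm/s, m in kg, n in rpm.
    """
    if rotor_mass_kg <= 0 or speed_rpm <= 0:
        raise ValueError("mass and speed must be positive")
    return 9549.0 * grade_g * rotor_mass_kg / speed_rpm


def balance_accepted(residual_gmm, grade_g, rotor_mass_kg, speed_rpm):
    """Pass/fail: True when the measured residual is within tolerance."""
    return residual_gmm <= permissible_unbalance_gmm(
        grade_g, rotor_mass_kg, speed_rpm
    )


# Hypothetical rotor: 50 kg at 3000 rpm, balanced to grade G 6.3.
tolerance = permissible_unbalance_gmm(6.3, 50.0, 3000.0)
passed = balance_accepted(800.0, 6.3, 50.0, 3000.0)
```

The value returned is the total for the rotor; dividing it between correction planes is a separate step that this sketch does not cover.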