Understanding Fault Detection

Fault detection is the process of recognising that a defect or abnormal condition exists in a machine by analysing monitored parameters: most commonly vibration, but also temperature, performance metrics, oil debris, or acoustic signals. It answers a single question, "Is there a problem?", before the analyst moves on to diagnosis (identifying the specific fault) and prognosis (predicting how long the machine has left). As the first and most fundamental step in condition-based maintenance, it separates healthy machines from those that are deteriorating or have already failed, and that separation is what triggers every downstream activity.

The value of doing this well is lead time. Effective fault detection raises a flag months before functional failure, creating the window needed for planned maintenance, parts procurement, and scheduled downtime — the core promise of any predictive maintenance programme. Detect too late and you are back to reactive, run-to-failure repairs; detect too eagerly and you drown in false alarms. The art lies in striking that balance, and the sections below break down how it is done in practice.

1. The Five Core Detection Methods

No single technique suits every machine or fault. Mature programmes layer several methods, each with its own strengths and data requirements.

Threshold Exceedance

The simplest and most widely used approach compares a measurement against a predefined threshold: if the value crosses the line, a fault is declared. A classic rule of thumb is an overall vibration level above roughly 7.1 mm/s RMS triggering an alert, a boundary drawn directly from severity charts such as ISO 20816-1 (the modern successor to ISO 10816). Many programmes stage these limits into warning, alarm, and trip tiers.

  • Advantages: simple to automate, clear pass/fail criteria, easy to communicate to non-specialists.
  • Limitations: the threshold must be set correctly, and the fault must grow large enough to cross it — which introduces lag and can miss faults that produce small but distinctive signatures.
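
A staged threshold check can be sketched in a few lines. This is a minimal illustration, not a recommended set of limits: only the 7.1 mm/s figure comes from the text above, while the other two limits, the tier names, and the function name `classify` are assumptions made for the example.

```python
from bisect import bisect_right

# Hypothetical limits in mm/s RMS. Only 7.1 appears in the text above;
# real limits depend on machine class and the applicable severity chart.
LIMITS = [4.5, 7.1, 11.0]
TIERS = ["normal", "warning", "alarm", "trip"]

def classify(velocity_rms: float) -> str:
    """Return the tier for an overall vibration reading.

    A reading equal to a limit is treated as having reached that tier.
    """
    return TIERS[bisect_right(LIMITS, velocity_rms)]

for reading in (3.0, 5.0, 8.0, 12.0):
    print(reading, classify(reading))
```

Keeping the limits in one sorted list makes it easy to review them per machine, which matters because a wrongly set threshold is the main weakness of this method.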

Trend Deviation

Rather than waiting for an absolute limit, this method watches the shape of the trend itself. A steadily rising level — or, more tellingly, a sudden change in the rate of rise — signals a developing fault long before any fixed threshold is reached. Because the reference is the machine’s own history, the technique is inherently machine-specific and catches problems early. Its only real requirement is a body of baseline data against which deviation can be judged.
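
One way to express "a sudden change in the rate of rise" is to compare the least-squares slope of the most recent readings with the slope of the earlier history. The sketch below covers only that case; the window length, the multiplier, the minimum slope, and the function names are all assumptions chosen for illustration.

```python
def slope(values):
    """Least-squares slope of values against their sample index."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def trend_deviates(history, window=5, factor=3.0, min_slope=0.05):
    """Flag when the recent rate of rise clearly exceeds the baseline rate.

    history: readings in time order, equally spaced (an assumption).
    Returns False when there is too little baseline data to judge.
    """
    if window < 2 or len(history) < 2 * window:
        return False
    baseline = slope(history[:-window])
    recent = slope(history[-window:])
    # min_slope stops tiny fluctuations on a flat baseline from firing.
    return recent > max(factor * abs(baseline), min_slope)

readings = [2.0, 2.05, 1.95, 2.0, 2.1, 2.0, 2.05, 2.0, 2.4, 2.9, 3.5, 4.2, 5.0]
print(trend_deviates(readings))
```

Because the baseline is the machine's own earlier data, the same code gives different verdicts for different machines without any per-machine limit being entered by hand.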

Spectral Anomaly Detection

Examining the frequency spectrum reveals not just that something changed but what. New peaks at bearing fault frequencies, existing peaks growing in amplitude, or the emergence of sidebands and harmonics each point to a specific fault type. This specificity is its great advantage, though it demands genuine spectral-analysis capability and a set of trustworthy baseline spectra for comparison.
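
The comparison against a baseline spectrum can be illustrated with a small, dependency-free sketch. It uses a naive discrete Fourier transform, which is only practical for short records; a real analyser would use an FFT with windowing and averaging. The ratio and noise-floor values, the synthetic signals, and the function names are assumptions for the example.

```python
import cmath
import math

def amplitude_spectrum(signal):
    """Single-sided amplitude spectrum via a naive DFT (short records only)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        scale = 1 if k == 0 else 2  # the DC bin is not doubled
        spectrum.append(scale * abs(acc) / n)
    return spectrum

def new_peaks(baseline, current, ratio=3.0, floor=0.05):
    """Bins where the current amplitude exceeds the floor and is at least
    `ratio` times the baseline amplitude in the same bin."""
    return [k for k, (b, c) in enumerate(zip(baseline, current))
            if c > floor and c > ratio * max(b, 1e-12)]

# Synthetic one-second records sampled at 256 Hz, so each bin is 1 Hz wide.
n = 256
healthy = [math.sin(2 * math.pi * 25 * i / n) for i in range(n)]
faulty = [h + 0.3 * math.sin(2 * math.pi * 87 * i / n)
          for i, h in enumerate(healthy)]
print(new_peaks(amplitude_spectrum(healthy), amplitude_spectrum(faulty)))
```

In this synthetic case the 25 Hz component is unchanged and is not reported, while the component added at 87 Hz is. Mapping a reported bin to a bearing fault frequency still requires the bearing geometry and shaft speed, which this sketch does not attempt.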

Statistical Methods

Statistical detection flags values that fall outside the range expected from healthy operation, for example any reading beyond the mean plus three standard deviations, or a violation of control-chart limits. By accounting for the inherent scatter in real measurements, these methods reduce nuisance alarms, but they need an adequate sample size to characterise "normal" reliably.
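
The mean-plus-three-standard-deviations rule mentioned above can be written directly with Python's standard library. The readings are invented for illustration, and ten samples are used only to keep the example short; the function names are assumptions.

```python
from statistics import mean, stdev

def control_limit(healthy_readings, k=3.0):
    """Upper limit: mean plus k sample standard deviations of healthy data."""
    return mean(healthy_readings) + k * stdev(healthy_readings)

def is_outlier(reading, healthy_readings, k=3.0):
    """True when a reading lies above the upper control limit."""
    return reading > control_limit(healthy_readings, k)

# Invented healthy-state readings in mm/s RMS.
healthy = [2.0, 2.1, 1.9, 2.2, 2.0, 1.8, 2.1, 2.0, 1.9, 2.0]
print(round(control_limit(healthy), 3))
print(is_outlier(2.3, healthy), is_outlier(2.5, healthy))
```

With so few samples the estimated standard deviation is itself uncertain, which is the practical reason the method needs an adequate body of healthy data.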

Pattern Recognition and Machine Learning

The most sophisticated tier trains algorithms — including neural networks — on healthy versus faulty signatures, allowing automated detection of subtle patterns that fixed rules miss. The trade-off is the need for substantial labelled training data and the computational resources to run the models.
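
To show the idea of learning from labelled healthy and faulty signatures without any external library, the sketch below uses a nearest-centroid classifier. This is a deliberately simple stand-in, not a neural network, and the two features per sample (for instance an overall level and a crest-type indicator) as well as all numbers are hypothetical.

```python
import math

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def train(healthy, faulty):
    """Learn one centroid per class from labelled feature vectors."""
    return {"healthy": centroid(healthy), "faulty": centroid(faulty)}

def predict(model, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], features))

# Hypothetical labelled training data: [feature_1, feature_2] per measurement.
healthy = [[1.0, 3.0], [1.2, 3.1], [0.9, 2.9]]
faulty = [[2.5, 5.0], [3.0, 6.2], [2.8, 5.5]]
model = train(healthy, faulty)
print(predict(model, [1.1, 3.0]), predict(model, [2.7, 5.8]))
```

Even this toy version shows the dependency on labelled examples: without recorded faulty signatures there is no second centroid to compare against.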

2. Measuring Detection Performance

A detection system is only as good as its hit rate and its false-alarm rate. Four metrics quantify how well it performs; the first three are borrowed from classification theory.

  • Sensitivity (true-positive rate): the fraction of real faults actually caught — True Positives / (True Positives + False Negatives). A well-tuned programme targets above 90–95%; higher sensitivity means fewer missed faults.
  • Specificity (true-negative rate): the fraction of healthy machines correctly cleared — True Negatives / (True Negatives + False Positives). Again, 90–95% is the goal; higher specificity means fewer false alarms.
  • False-alarm rate: the share of alerts that turn out to be nothing, ideally held below 5–10%. A high rate breeds alarm fatigue, the slow erosion of trust that leads technicians to ignore warnings — and it trades off directly against sensitivity.
  • Detection lead time: the interval between first detection and functional failure. Longer is better, because it buys planning time. For bearing faults caught by vibration the typical lead is weeks to months, and method matters: envelope analysis routinely detects incipient bearing damage far earlier than overall-level monitoring alone.
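
The first three metrics follow directly from the counts of true and false positives and negatives. The sketch below uses the definitions given in the list, with the false-alarm figure computed as the share of alerts that were false; the counts are invented and the function name is an assumption.

```python
def detection_metrics(tp, fn, tn, fp):
    """Compute detection metrics from confusion counts.

    tp: real faults flagged      fn: real faults missed
    tn: healthy machines cleared fp: healthy machines flagged
    Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_alarm_share": fp / (tp + fp),
    }

# Invented audit of 994 machine assessments.
metrics = detection_metrics(tp=46, fn=4, tn=940, fp=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Detection lead time is not derivable from these counts; it has to be recorded per fault as the interval between the first alert and the functional failure or repair.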

3. Practical Challenges

Real machines rarely behave as neatly as a textbook, and three situations regularly complicate detection.

  • Early-versus-false-detection balance: pushing for the earliest possible warning inevitably raises false alarms, while waiting for an unmistakable signal sacrifices lead time. The usual remedy is multi-stage alarming and corroboration across several parameters before committing to a call.
  • Intermittent faults: problems that appear and vanish may sit below threshold during a periodic route measurement. Catching them calls for continuous monitoring or a peak-hold capture that preserves the worst moment.
  • Multiple simultaneous faults: when several defects develop at once they can mask one another in the vibration signal, so a comprehensive, multi-method analysis is needed to untangle them.

4. Confirming Faults with Multiple Parameters

Cross-checking two or more independent indicators dramatically reduces false detections, because a genuine fault tends to show up in more than one place at once.

  • Vibration and temperature together: both rising confirms a bearing problem; vibration alone points to a mechanical cause such as unbalance or misalignment; temperature alone suggests a lubrication or friction issue.
  • Multiple vibration parameters: an overall-level increase combined with the emergence of a specific bearing frequency confirms a bearing fault far more confidently than either symptom on its own.
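
The vibration-and-temperature rule above amounts to a small decision table. The sketch below encodes it as stated; how "rising" is decided for each parameter is left out, and the function name and return strings are assumptions for illustration.

```python
def interpret(vibration_rising: bool, temperature_rising: bool) -> str:
    """Map two independent indicators to the reading suggested in the text."""
    if vibration_rising and temperature_rising:
        return "bearing problem (corroborated)"
    if vibration_rising:
        return "mechanical cause such as unbalance or misalignment"
    if temperature_rising:
        return "lubrication or friction issue"
    return "no indication"

for vib, temp in [(True, True), (True, False), (False, True), (False, False)]:
    print(vib, temp, "->", interpret(vib, temp))
```

Treating the two inputs as separate flags keeps the corroboration explicit: only the case where both indicators agree is reported as confirmed.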

5. Automated, Manual, and Hybrid Detection

Detection can be done by software, by an expert, or — best of all — by both working together.

  • Automated detection is fast, consistent, and works around the clock, using threshold checks, statistical algorithms, and machine learning. Its weakness is that it can overlook subtle problems and occasionally fire on noise.
  • Manual (expert) detection brings human judgement, context awareness, and trained pattern recognition to spectrum review and waveform inspection. It is, however, time-consuming, hard to scale, and dependent on scarce expertise — the kind certified under ISO 18436-2.
  • The hybrid approach — automated screening of the whole fleet with expert review of the flagged exceptions — combines efficiency with accuracy and is the standard in mature programmes.

Where field instruments fit

Once a screening tool raises a flag, the next move is usually to take a richer measurement at the machine. A portable two-channel analyser such as the Balanset-1A lets a technician walk up to the suspect machine, capture a high-resolution spectrum and time waveform, and confirm whether the alarm reflects a real defect — and if that defect is unbalance, correct it on the spot through field balancing without dismantling the machine. That tight loop from detection to confirmation to correction is exactly what a hybrid programme is built to deliver.

Fault detection is the foundational capability that makes predictive maintenance possible, surfacing developing problems early enough to plan around them. Done well — with the right mix of detection methods, carefully set thresholds, and a deliberate balance between sensitivity and specificity — it delivers the early warnings that keep equipment running while holding down both maintenance cost and the risk of catastrophic failure.

