Understanding Fault Detection
Definition: What is Fault Detection?
Fault detection is the process of identifying that a defect or abnormal condition exists in equipment through analysis of monitored parameters such as vibration, temperature, performance metrics, or other indicators. Fault detection answers the binary question “Is there a problem?” before proceeding to fault diagnosis (identifying the specific problem) and prognosis (predicting remaining life). It is the first and most fundamental step in condition-based maintenance, distinguishing normal operation from deteriorating or faulty conditions.
Effective fault detection provides early warning, often weeks to months before functional failure. That lead time makes planned maintenance, parts procurement, and scheduled downtime possible, and these are the core value propositions of predictive maintenance programs.
Detection Methods
1. Threshold Exceedance
Simplest and most common:
- Compare measurement to predefined threshold
- If measurement > threshold → fault detected
- Example: Overall vibration > 7.1 mm/s triggers alert
- Advantages: Simple, automated, clear criteria
- Limitations: Depends on a well-chosen threshold; detection is delayed until the fault has progressed far enough to cross it
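The comparison above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended alarm design: the function name `exceeds_threshold` and the sample readings are assumptions, and only the 7.1 mm/s figure comes from the example above.

```python
def exceeds_threshold(measurement_mm_s, threshold_mm_s=7.1):
    """Return True when the measurement is strictly above the threshold."""
    return measurement_mm_s > threshold_mm_s


# Hypothetical overall vibration readings in mm/s, oldest first.
readings = [2.8, 3.1, 4.6, 7.4]
flags = [exceeds_threshold(r) for r in readings]
print(flags)  # [False, False, False, True]
```

Only the last reading raises a detection, which illustrates the lag noted above: the fault may have been developing well before the level crossed 7.1 mm/s.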
2. Trend Deviation
Detects changes from normal pattern:
- Increasing trend indicates developing fault
- Can flag a problem before the absolute threshold is exceeded
- Rate-of-change alarming catches rapid increases
- Advantages: Earlier detection, machine-specific
- Requirements: Historical trending data needed
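One way to express rate-of-change alarming is to fit a straight line to recent readings and compare its slope with a limit. The sketch below assumes a least-squares fit; the function names, the readings, and the 0.03 mm/s per day limit are all hypothetical values chosen for illustration.

```python
def trend_slope(times, values):
    """Least-squares slope of values against times."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need two or more paired points")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    if den == 0:
        raise ValueError("all time stamps are identical")
    return num / den


def trend_alarm(times, values, max_slope):
    """True when the fitted slope exceeds the allowed rate of change."""
    return trend_slope(times, values) > max_slope


# Hypothetical weekly readings: day number and overall vibration in mm/s.
days = [0, 7, 14, 21, 28]
vibration = [2.0, 2.1, 2.4, 2.9, 3.6]

slope = trend_slope(days, vibration)  # about 0.057 mm/s per day
print(trend_alarm(days, vibration, max_slope=0.03))  # True
```

Every reading here is well below the 7.1 mm/s level used in the threshold example, yet the trend check already flags the machine.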
3. Spectral Anomaly Detection
Identifying abnormal frequency components:
- New peaks appearing in spectrum (bearing frequencies, harmonics)
- Existing peaks growing in amplitude
- Pattern changes (sidebands developing)
- Advantages: Specific fault type indication
- Requirements: Spectral analysis capability, baseline spectra
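The baseline comparison can be sketched as a bin-by-bin check of two amplitude spectra. This is a simplified illustration that assumes both spectra are already computed on the same frequency bins; the function name, the noise floor, the growth factor, and all amplitudes are hypothetical.

```python
def spectral_changes(baseline, current, noise_floor=0.05, growth_factor=2.0):
    """Compare two spectra given as {frequency_hz: amplitude} dicts.

    Returns (frequency, kind) tuples sorted by frequency, where kind is
    "new" (peak absent from the baseline) or "growing" (amplitude has
    reached growth_factor times its baseline value).
    """
    findings = []
    for freq in sorted(current):
        amp = current[freq]
        base = baseline.get(freq, 0.0)
        if amp <= noise_floor:
            continue  # nothing significant in this bin
        if base <= noise_floor:
            findings.append((freq, "new"))
        elif amp >= growth_factor * base:
            findings.append((freq, "growing"))
    return findings


# Hypothetical amplitudes in mm/s at three frequency bins.
baseline = {25.0: 1.20, 50.0: 0.30, 89.5: 0.02}
current = {25.0: 1.25, 50.0: 0.75, 89.5: 0.40}

print(spectral_changes(baseline, current))
# [(50.0, 'growing'), (89.5, 'new')]
```

A real system would also need to tolerate small speed changes that shift peaks between bins, which this sketch ignores.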
4. Statistical Methods
- Values outside normal statistical distribution
- Outlier detection (for example, a value > baseline mean + 3σ)
- Control chart violations
- Advantages: Accounts for normal variability
- Requirements: Adequate statistical sample size
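The mean + 3σ rule can be sketched with Python's standard `statistics` module. The baseline readings and function names below are hypothetical, and the sketch assumes the baseline was recorded while the machine was known to be healthy.

```python
import statistics


def three_sigma_limit(baseline):
    """Upper limit: baseline mean plus three sample standard deviations."""
    return statistics.fmean(baseline) + 3 * statistics.stdev(baseline)


def is_outlier(value, baseline):
    """True when the value lies above the mean + 3 sigma limit."""
    return value > three_sigma_limit(baseline)


# Hypothetical healthy-state readings in mm/s.
baseline = [2.0, 2.1, 1.9, 2.2, 2.0, 1.8, 2.1, 2.0, 1.9, 2.0]

limit = three_sigma_limit(baseline)  # roughly 2.35 mm/s
print(is_outlier(2.3, baseline))  # False
print(is_outlier(2.5, baseline))  # True
```

With only ten baseline points the estimate of σ is itself uncertain, which is why an adequate sample size is listed as a requirement.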
5. Pattern Recognition
- Machine learning algorithms
- Neural networks trained on normal vs. faulty signatures
- Automated anomaly detection
- Advantages: Can detect subtle patterns
- Requirements: Training data, computational resources
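As a deliberately small stand-in for the learning methods listed above, the sketch below uses a nearest-centroid classifier rather than a neural network. The idea of training on labeled normal and faulty signatures is the same, but everything here is an assumption for illustration: the two features (overall level and a kurtosis-like value), the training data, and the function names.

```python
import math


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def train(normal, faulty):
    """Learn one centroid per class from labeled feature vectors."""
    return {"normal": centroid(normal), "faulty": centroid(faulty)}


def classify(model, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical training signatures: [overall level, kurtosis-like feature].
normal = [[2.0, 3.0], [2.2, 3.1], [1.9, 2.9]]
faulty = [[4.5, 6.0], [5.0, 7.2], [4.8, 6.5]]

model = train(normal, faulty)
print(classify(model, [2.1, 3.2]))  # normal
print(classify(model, [4.6, 6.8]))  # faulty
```

In practice the features would usually be scaled to comparable ranges before computing distances, and far more training data would be needed.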
Detection Performance Metrics
Sensitivity (True Positive Rate)
- Percentage of actual faults detected
- Target: at least 90-95% of real problems detected
- Higher sensitivity = fewer missed faults
- Measure: (True Positives) / (True Positives + False Negatives)
Specificity (True Negative Rate)
- Percentage of healthy equipment correctly identified as healthy
- Target: at least 90-95% of healthy equipment not falsely alarmed
- Higher specificity = fewer false alarms
- Measure: (True Negatives) / (True Negatives + False Positives)
False Alarm Rate
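- Percentage of alarms that were false (no actual fault)
- Measure: (False Positives) / (True Positives + False Positives); note that this differs from the false positive rate, (False Positives) / (False Positives + True Negatives), which equals 1 − specificity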
- Target: < 5-10% false alarms
- High false alarm rate causes alarm fatigue
- Balance with sensitivity (trade-off)
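The three metrics above can be computed from the four confusion-matrix counts. The sketch below takes the false alarm rate as the share of alarms that were false, as defined above; the function name, the key `false_alarm_share`, and the counts are hypothetical.

```python
def detection_metrics(tp, fn, tn, fp):
    """Compute detection metrics from confusion-matrix counts.

    tp: faults detected          fn: faults missed
    tn: healthy, no alarm        fp: healthy, alarmed
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_alarm_share": fp / (tp + fp),
    }


# Hypothetical fleet of 200 machines, 20 of which have a real fault.
m = detection_metrics(tp=18, fn=2, tn=171, fp=9)
print(round(m["sensitivity"], 3))        # 0.9
print(round(m["specificity"], 3))        # 0.95
print(round(m["false_alarm_share"], 3))  # 0.333
```

With these made-up counts, 90% sensitivity and 95% specificity still leave one alarm in three false, because healthy machines far outnumber faulty ones. This is one reason the metrics have to be balanced against each other.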
Detection Lead Time
- Time from fault detection to functional failure
- Longer lead time = more value (time for planning)
- Typical: Weeks to months for vibration-detected bearing faults
- Method-dependent: Envelope analysis detects earlier than overall levels
Challenges in Fault Detection
Early vs. False Detection Balance
- Very early detection increases false alarms
- Waiting for clear signals reduces lead time
- Optimize through multi-stage alarming
- Use multiple parameters for confirmation
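Multi-stage alarming can be sketched as two levels plus a persistence requirement, so that a single noisy reading does not raise an alarm. The stage names, the 4.5 and 7.1 mm/s levels, and the two-reading confirmation rule are assumptions made for this example.

```python
def alarm_stage(readings, alert_level, alarm_level, confirmations=2):
    """Return "normal", "alert", or "alarm" for the latest readings.

    A stage is reported only when the most recent `confirmations`
    readings are all above that stage's level.
    """
    if confirmations < 1:
        raise ValueError("confirmations must be at least 1")
    recent = readings[-confirmations:]
    if len(recent) < confirmations:
        return "normal"  # not enough history to confirm anything
    if all(r > alarm_level for r in recent):
        return "alarm"
    if all(r > alert_level for r in recent):
        return "alert"
    return "normal"


# Hypothetical overall vibration histories in mm/s, oldest first.
print(alarm_stage([3.0, 4.8], 4.5, 7.1))       # normal (one reading only)
print(alarm_stage([3.0, 4.8, 5.2], 4.5, 7.1))  # alert
print(alarm_stage([5.0, 7.5, 7.9], 4.5, 7.1))  # alarm
```

Raising the number of confirmations reduces false alarms at the cost of lead time, which is the trade-off described above.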
Intermittent Faults
- Problems that come and go
- May be below threshold during periodic measurements
- Requires continuous monitoring or peak hold
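The difference between periodic sampling and peak hold can be shown with a short sketch. The sample values, the window length, and the function name `peak_hold` are hypothetical; the point is only that a brief excursion between scheduled readings survives in the peak-hold output.

```python
def peak_hold(samples, window):
    """Maximum of each consecutive, non-overlapping window of samples."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [max(samples[i:i + window]) for i in range(0, len(samples), window)]


# Hypothetical continuous samples in mm/s with one short burst.
samples = [2.0, 2.1, 8.3, 2.0, 2.2, 2.1, 1.9, 2.0]

periodic = samples[::4]        # one spot reading per window
held = peak_hold(samples, 4)   # worst value seen in each window

print(periodic)  # [2.0, 2.2]
print(held)      # [8.3, 2.2]
```

The spot readings miss the 8.3 mm/s burst entirely, while the peak-hold values retain it.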
Multiple Simultaneous Faults
- Several problems developing concurrently
- May mask each other in the vibration signal
- Requires comprehensive analysis
- Multiple detection methods help
Multi-Parameter Fault Detection
Vibration + Temperature
- Both increasing: Strongly suggests a bearing problem
- Vibration only: Suggests a mechanical issue (unbalance, misalignment)
- Temperature only: Suggests a lubrication or friction issue
- Combined confirmation reduces false detections
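The vibration and temperature combinations above amount to a small decision table, sketched here in Python. The function name and the wording of the returned labels are assumptions, and the result is an indication to follow up on rather than a diagnosis.

```python
def interpret(vibration_rising, temperature_rising):
    """Map the two trend flags to the indication listed above."""
    if vibration_rising and temperature_rising:
        return "likely bearing problem"
    if vibration_rising:
        return "possible mechanical issue (unbalance, misalignment)"
    if temperature_rising:
        return "possible lubrication or friction issue"
    return "no fault indicated"


print(interpret(True, True))    # likely bearing problem
print(interpret(False, True))   # possible lubrication or friction issue
```

The two boolean inputs could come from any of the detection methods described earlier, for example a trend check on each parameter.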
Multiple Vibration Parameters
- Overall level increase + bearing frequency emergence
- Confirms bearing fault specifically
- More confident detection than single parameter
Automation vs. Manual Detection
Automated Detection
- Advantages: Fast, consistent, 24/7 capability
- Methods: Threshold checking, statistical algorithms, machine learning
- Limitations: Can miss subtle problems, may generate false alarms
Manual (Expert) Detection
- Advantages: Human judgment, context awareness, pattern recognition
- Methods: Spectrum review, waveform inspection, multi-parameter correlation
- Limitations: Time-consuming, not scalable, expertise required
Hybrid Approach (Best Practice)
- Automated detection for screening
- Expert review of exceptions
- Combines efficiency with accuracy
- Standard in mature programs
Fault detection is the foundational capability of predictive maintenance: it identifies developing problems early enough to plan interventions. Effective fault detection combines appropriate detection methods, properly set thresholds, and a deliberate balance between sensitivity and specificity. Together these provide the early warnings that maximize equipment utilization while minimizing maintenance costs and failure risk.