From “Fix It When It Breaks” to “Don’t Let It Break”: The Evolution of Maintenance Thinking
By Andre Tomaz de Carvalho
Innovation Leader & SME in Condition Monitoring | High-Voltage APM Solutions
The way we maintain assets has been quietly revolutionizing industries for decades — and the next shift is already here.
I've spoken with plant managers who still run on gut instinct, and with reliability engineers deploying machine learning models to predict failures weeks in advance. Both approaches "work" — until they don't. The difference between them isn't just technology. It's a fundamentally different philosophy about failure, risk, and value.
Here's how maintenance thinking has evolved — and where it's heading.
More Than Two Centuries of Maintenance: The Historical Arc
To understand where we are, it helps to understand when each paradigm was born — and what forced it into existence.
The Industrial Revolution through WWI (1760s–1918): The Age of the Blacksmith
Early industrial machines were mechanical marvels, but also relatively simple — steam engines, looms, early locomotives. When something broke, you called the equivalent of a blacksmith. Maintenance was craft knowledge, passed down through apprenticeship. There were no manuals, no schedules, no frameworks. Failure was just a fact of industrial life, and factories were designed around it with massive redundancy and buffer stock.
The Interwar Period & WWII (1920s–1945): The Birth of Preventive Thinking
As production lines grew more complex and interdependent — pioneered by Ford's moving assembly line in 1913 — a single failure could halt an entire factory. The military, facing catastrophic consequences from aircraft and naval vessel failures, began formalizing the idea of scheduled maintenance. WWII was a crucible: the logistics of maintaining tens of thousands of aircraft, tanks, and ships at scale forced the U.S. and Allied militaries to develop the first systematic maintenance programs. The concept of the "overhaul interval" was born from wartime necessity.
The Post-War Industrial Boom (1950s–1970s): Reliability Engineering Emerges
The 1950s and 60s saw the formalization of maintenance as a discipline. The U.S. Department of Defense published its first reliability standards. The nuclear and aviation industries, where failures were catastrophic and public, drove the adoption of Failure Mode and Effects Analysis (FMEA) and laid the groundwork for Reliability-Centered Maintenance (RCM). The airline industry's 1968 MSG-1 handbook, written for the Boeing 747 maintenance program and later evolving into MSG-3, was a turning point: it made the systematic case that scheduled overhauls were often unnecessary and sometimes harmful. United Airlines engineers were central to that work, and their 1978 report for the Department of Defense, Reliability-Centered Maintenance by Nowlan and Heap, gave the approach its name. This was the intellectual foundation for everything that followed.
The Oil Crisis & Computerization (1970s–1980s): CBM Takes Root
The 1973 oil embargo forced industries to squeeze every ounce of efficiency from their assets. Wasting maintenance effort on healthy equipment became economically unacceptable. Simultaneously, the miniaturization of electronics made portable vibration analyzers and oil analysis kits commercially viable for the first time. Condition monitoring became practical on the plant floor, not just in laboratories. The first Computerized Maintenance Management Systems (CMMS) appeared, bringing data discipline to work order management.
The Digital Revolution (1990s–2000s): Data Arrives, Insights Don't
The 1990s brought networked sensors, SCADA systems, and relational databases into maintenance operations. Companies were suddenly drowning in data — vibration trends, temperature logs, pressure histories — but largely lacked the analytical tools to turn it into foresight. This era was characterized by an irony: unprecedented data richness, yet maintenance decisions were still made largely on intuition and spreadsheets. The gap between data collected and value extracted was enormous.
The AI & IoT Era (2010s–present): Prediction Becomes Possible
Cloud computing, machine learning, and affordable IIoT sensors converged around 2012–2015 to make true predictive maintenance economically viable at scale, not just for aerospace giants but for mid-sized manufacturers and utilities. The COVID-19 pandemic of 2020 sharply accelerated adoption, as remote monitoring became essential when technicians couldn't access sites. Today, the frontier is moving from prediction to prescription: not just knowing what will fail, but deciding what to do about it in the context of business risk, safety, and operational reality.
Do you feel you still live in the past? Do you need to get back to the future?
In practice, an organization's own journey to the state of the art rarely retraces this chronology. Let's look at the stages it usually does follow.
From Reactive to Intentional: The Five Stages of Maintenance Maturity
Stage 1: Corrective Maintenance — "If It Ain't Broke, Don't Fix It"
This is where every industry began — and where it remained for most of human history. A machine fails. You fix it. Simple, intuitive, and still practiced widely today.
Corrective maintenance (CM) made perfect sense in the early industrial era, when assets were simple, redundant, and cheap to repair. James Watt's steam engines of the 1780s were maintained this way. So were the telegraph lines of the 1850s and the early automobiles of the 1900s. The logic was straightforward: why spend money maintaining something that's still running?
The hidden costs, however, are brutal:
- Unplanned downtime cascades across production lines;
- Secondary damage — a failed bearing destroys a shaft, which damages a housing;
- Safety incidents spike when failures are sudden and uncontrolled;
- Inventory chaos — you never know what spare parts you'll need, or when.
For non-critical, low-consequence assets, corrective maintenance remains entirely rational. But as industrial complexity grew, its limitations became impossible to ignore.
Stage 2: Preventive Maintenance — "Change It Before It Fails"
The aerospace and automotive industries pioneered the shift in the 1940s and 50s. If you couldn't predict when something would fail, you replaced it on a schedule, before failure had a chance to occur. The U.S. Air Force's early jet engine programs mandated fixed overhaul intervals, and the approach spread across industries through military reliability requirements and handbooks such as MIL-HDBK-217, which covers reliability prediction for electronic equipment.
Time-based preventive maintenance (PM) brought discipline and predictability. Lubrication intervals, filter changes, overhaul schedules — all driven by manufacturer recommendations and statistical averages.
It was a genuine leap forward. Unplanned failures dropped. Safety improved. Asset life extended.
But a new problem emerged: over-maintenance.
The United Airlines data published by Nowlan and Heap famously showed that only about 11% of components had a failure pattern tied to age. The remaining 89% failed at rates unrelated to time in service, and 68% were most likely to fail early in life. Replacing a component every 6 months because the manual says so, when it has another 18 months of life left, is pure waste. Worse, every maintenance intervention is itself a risk: improperly reinstalled components, introduced contaminants, and human error cause a well-documented spike in failures immediately after scheduled maintenance.
We were solving one problem and creating another.
Stage 3: Condition-Based Maintenance — "Maintain It When the Asset Tells You To"
The insight was elegant: instead of guessing when something might fail based on time, why not listen to the asset itself?
Condition-based maintenance (CBM) emerged in the 1970s and 80s from advances in sensing and diagnostics — driven partly by the oil industry's need to keep offshore platforms running without the luxury of easy access. Vibration analysis, oil sampling, thermography, ultrasonic testing, motor current analysis — these techniques allowed maintenance teams to assess the actual health of equipment in real time.
The shift was profound. Maintenance was no longer calendar-driven. It was evidence-driven.
An oil analysis revealing abnormal metal particles triggers a gearbox inspection — not because it's been 90 days, but because the data says something is wrong. A vibration signature trending upward on a pump bearing gives you a window to intervene before failure.
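To make the idea concrete, here is a minimal sketch of an evidence-driven trigger. The limits, the readings, and the function name cbm_action are all hypothetical and chosen only for illustration; real alert and alarm levels depend on the machine class and on your own site standards.

```python
# Hypothetical limits for overall vibration velocity (mm/s RMS).
# Real limits depend on the machine class and the site's own standards.
ALERT_LIMIT = 4.5
ALARM_LIMIT = 7.1


def cbm_action(readings):
    """Return a maintenance action based on measured condition, not the calendar."""
    latest = readings[-1]
    # "Rising" here simply means the last three readings each increased.
    rising = len(readings) >= 3 and readings[-3] < readings[-2] < readings[-1]
    if latest >= ALARM_LIMIT:
        return "intervene now"
    if latest >= ALERT_LIMIT or rising:
        return "inspect"
    return "keep running"


weekly_readings = [2.1, 2.2, 2.6, 3.4, 4.8]  # made-up pump bearing data
print(cbm_action(weekly_readings))  # prints "inspect"
```

Notice that nothing in the function knows how many days have passed since the last service. The decision depends only on what the asset is reporting.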
CBM dramatically reduced unnecessary maintenance while catching real degradation early. For rotating equipment, electrical systems, and fluid-power assets, it became the gold standard.
The remaining gap? CBM tells you the current condition. It doesn't tell you what comes next.
Stage 4: Predictive Maintenance — "Know the Failure Before It Happens"
The convergence of IoT sensors, cloud computing, and machine learning — roughly between 2012 and 2018 — unlocked something previously impossible: predicting failures days, weeks, or months before they occur. GE's famous "Industrial Internet" initiative in 2012 put predictive maintenance on the boardroom agenda. By 2016, it had moved from pilot programs to mainstream industrial practice.
Predictive maintenance (PdM) goes beyond monitoring current condition to modeling the trajectory of degradation. Algorithms trained on historical failure data detect subtle patterns invisible to human analysts — micro-changes in vibration frequency, thermal gradients, acoustic emissions — and project when a component will cross the threshold into failure.
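The core idea of projecting a trend to a failure threshold can be sketched in a few lines. This is a deliberately simple stand-in: a straight-line least-squares fit, not the machine learning models described above. The function name days_until_threshold, the readings, and the threshold are assumptions made up for the example.

```python
def days_until_threshold(days, values, threshold):
    """Fit a straight line to the trend and project when it crosses the threshold.

    Returns the days remaining after the last measurement, or None if the
    fitted trend is flat or improving.
    """
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - days[-1])


# Made-up weekly vibration readings (mm/s) and a hypothetical failure threshold.
days = [0, 7, 14, 21, 28]
vibration = [2.0, 2.5, 3.0, 3.5, 4.0]
remaining = days_until_threshold(days, vibration, threshold=7.0)
print(f"Projected days to threshold: {remaining:.0f}")  # roughly 42 days
```

Real degradation is rarely linear, which is exactly why production systems lean on richer models and more signals. The output is the same kind of answer, though: an estimate of how much time you have left.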
The business case is compelling:
- Failures caught in early-stage degradation cost a fraction of run-to-failure repairs;
- Maintenance can be planned during scheduled windows, not emergency shutdowns;
- Parts and labor can be pre-positioned — no scrambling.
Industries with high asset criticality — power generation, aviation, oil & gas, manufacturing — have seen extraordinary ROI. Predictive maintenance programs routinely deliver 10–25% reductions in maintenance costs and dramatic improvements in availability.
But here's the challenge predictive maintenance alone doesn't solve: not all failures are equally important.
A predicted bearing failure on a primary compressor in a gas processing facility and the same failure on a secondary ventilation fan are not equivalent events — not in consequence, not in urgency, not in the right response.
Stage 5: Risk-Based Prescriptive Maintenance — "Do the Right Thing, at the Right Time, for the Right Reason"
This is the frontier of the 2020s. And it represents the most complete rethinking of what maintenance is for. Emerging from the limitations of pure predictive approaches — and accelerated by post-pandemic pressure on supply chains and operational resilience — risk-based prescriptive maintenance is redefining the role of the reliability function entirely.
Risk-based prescriptive maintenance (RBP) combines predictive analytics with consequence modeling, risk quantification, and decision optimization to answer not just what will fail — but what should I do about it, and when?
The framework integrates:
- Probability of failure (from predictive models);
- Consequence of failure (safety, environment, production, cost);
- Risk = Probability × Consequence, prioritized dynamically across an asset portfolio;
- Prescriptive actions — not just alerts, but recommended interventions with cost-benefit analysis.
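The risk calculation in the list above can be sketched as follows. The asset names, probabilities, and dollar figures are invented for illustration, and the Asset class and rank_by_risk function are hypothetical names, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    p_failure: float        # probability of failure in the planning horizon
    consequence_usd: float  # estimated total consequence if it fails

    @property
    def risk_usd(self):
        # Risk = Probability x Consequence
        return self.p_failure * self.consequence_usd


def rank_by_risk(assets):
    """Sort assets from highest to lowest risk."""
    return sorted(assets, key=lambda a: a.risk_usd, reverse=True)


# Made-up portfolio: the same failure probability on two very different assets.
portfolio = [
    Asset("Ventilation fan", 0.30, 15_000),
    Asset("Feed pump", 0.05, 400_000),
    Asset("Primary compressor", 0.30, 2_000_000),
]

for asset in rank_by_risk(portfolio):
    print(f"{asset.name}: ${asset.risk_usd:,.0f}")
```

The compressor and the fan share the same predicted failure probability, yet they land at opposite ends of the list. Consequence is what separates them.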
A prescriptive system might tell you: "Compressor C-204 has a 73% probability of bearing failure within 18 days. Given its criticality to Train 2 throughput and current production economics, the optimal intervention window is this weekend's planned outage. Delaying 30 days increases expected cost by $340,000."
This is no longer just maintenance intelligence. It's business decision support.
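A recommendation like the one quoted above comes down to comparing the expected cost of each intervention window. Here is a minimal sketch under simple assumptions: the asset either fails before the window (you pay the unplanned failure cost) or survives to it (you pay the planned cost). Every number below is made up and is not meant to reproduce the $340,000 figure in the example, and expected_cost is a hypothetical helper.

```python
def expected_cost(p_fail_before_window, planned_cost, failure_cost):
    """Expected cost of waiting for a given maintenance window."""
    return (p_fail_before_window * failure_cost
            + (1 - p_fail_before_window) * planned_cost)


PLANNED_COST = 60_000    # hypothetical cost of a planned bearing replacement
FAILURE_COST = 500_000   # hypothetical cost of an unplanned failure

# Hypothetical probabilities of failing before each window opens.
options = {
    "this weekend's outage": expected_cost(0.05, PLANNED_COST, FAILURE_COST),
    "delay 30 days": expected_cost(0.80, PLANNED_COST, FAILURE_COST),
}

best = min(options, key=options.get)
for name, cost in options.items():
    print(f"{name}: expected cost ${cost:,.0f}")
print(f"Recommended window: {best}")
```

A real prescriptive engine layers in production economics, safety, and parts availability, but the logic is the same: the recommendation is the option with the lowest expected cost, not simply the earliest one.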
Risk-based prescriptive maintenance also incorporates:
- Regulatory and safety risk: flagging assets where failure consequences trigger compliance or HSE obligations;
- Supply chain constraints: adjusting recommendations based on spare parts availability;
- Operational context: accounting for production schedules, asset redundancy, and current market conditions.
The human is still in the loop — but the decision is now informed by a level of analytical rigor that was simply impossible a decade ago.
Each paradigm shift in maintenance has followed the same arc: from reacting to events, toward intentionally shaping outcomes.
The organizations winning on reliability today aren't just deploying better sensors or better algorithms. They've made a philosophical commitment: maintenance is not a cost center managing failures — it's a value center managing risk.
Where Are You on the Journey?
Most organizations I work with are somewhere between stages 2 and 4 — and that's completely normal. The journey isn't about leaping to the most advanced paradigm overnight. It's about understanding where you are, what the next step looks like, and building the capabilities — data, talent, culture, and tools — to get there.
The assets that power our world deserve more than reactive repairs and calendar-based guesswork. They deserve intelligence.
That's the value we bring to our customers. That's our commitment. That's Cutsforth's mission.