When Products Fail

When products fail, we start to wonder

Spectacular failures attract a lot of attention. We are simply curious about why they happened and why nothing was done to prevent them. One such event was the engine fire on a Boeing 777 flying over Denver in February 2021. It was documented in a video shot from inside the plane: the front of the engine is shattered, the back of the engine is on fire, and later reports showed parts of the engine scattered over somebody’s backyard. Early reports attributed the fire to metal fatigue in one of the engine’s blades, which caused it to break off inside the engine.

 

Designing for the one-off failure

There are two key questions to ask: how was it possible for this to happen, and how was it possible for the plane to land safely? I asked a friend who is an experienced jet engine designer. He said the answer is simple: engineers did predict this type of failure, did study the possible consequences, and did design the rest of the plane so that it could respond appropriately, and the pilots were trained on how to react. He knows that because the plane landed safely and nobody was injured. In other words, the system worked as designed.

He then explained that a failure like this is always considered in two ways: how to prevent the part from failing, and how the rest of the design needs to respond if it does. He said one needs to think of a system instead of a product and work through the following steps:

  • The structural integrity and lifespan of the part (the blade) are studied using simulations and physical tests to make sure that failures cannot occur within 3 sigma of the operational space.
  • The structure surrounding the part (the engine) is purposefully designed to carry the part’s physical and functional load in case it fails.
  • The system within which this structure exists (the plane) is designed to keep functioning in case of a total failure of that structure.
  • Planning for human intervention (depot maintenance procedures and pilot training) is made part of each design decision.
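One way to read the four layers above is as a chain of conditional probabilities: a catastrophic outcome requires every layer to fail in turn. The sketch below only illustrates that arithmetic and is not the method my friend described. The layer names mirror the list, every probability is a made-up placeholder, the 3-sigma figure assumes normally distributed operating conditions, and the layers are assumed to fail independently, which a real safety analysis would not take for granted.

```python
import math


def fraction_outside_sigma(k: float) -> float:
    """Two-sided tail probability of a normal distribution beyond +/- k sigma."""
    return 1.0 - math.erf(k / math.sqrt(2.0))


def chained_failure_probability(layer_failure_probs) -> float:
    """Probability that every layer fails, ASSUMING the layers fail independently."""
    return math.prod(layer_failure_probs)


# All numbers below are hypothetical placeholders, not real engine or aircraft data.
layers = {
    "part (blade) fails outside 3 sigma": fraction_outside_sigma(3.0),
    "engine cannot carry the failed part's load": 1e-2,
    "plane cannot function without the engine": 1e-3,
    "maintenance and crew cannot recover": 1e-2,
}

p_catastrophe = chained_failure_probability(layers.values())

for name, p in layers.items():
    print(f"{name:<45} {p:.2e}")
print(f"{'all layers fail together':<45} {p_catastrophe:.2e}")
```

The point of the sketch is the shape of the calculation: each added layer multiplies the chance of a catastrophic outcome by a small number, which is why the system can succeed even when the part fails.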

He also pointed out that this entire decision process is based on the Bayesian probability of a failure (look it up, I had to…) because one cannot crash a plane to find out what that probability really is. He added that the safety factors and margins are huge by design; for example, he said the Boeing 777 is designed to fly half of its scheduled journey even if both engines fail, since that half is the furthest possible distance to the closest airport.
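For anyone else who has to look it up: the Bayesian idea is to start from a prior belief about the failure probability and update it as evidence (tests, simulations, service history) accumulates, rather than waiting for real failures to count. Below is a minimal sketch using a Beta prior with binomial evidence. This is the textbook conjugate update, not necessarily what engine designers use, and the prior and trial counts are hypothetical.

```python
def beta_binomial_update(prior_a: float, prior_b: float, failures: int, trials: int):
    """Return the posterior Beta(a, b) parameters after observing the trials."""
    if not 0 <= failures <= trials:
        raise ValueError("failures must be between 0 and trials")
    return prior_a + failures, prior_b + (trials - failures)


def beta_mean(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution, used here as the failure-probability estimate."""
    return a / (a + b)


# Hypothetical prior: Beta(1, 999), i.e. a prior mean failure probability of 1 in 1000.
prior_a, prior_b = 1.0, 999.0

# Hypothetical evidence: 9000 test cycles with no failures observed.
post_a, post_b = beta_binomial_update(prior_a, prior_b, failures=0, trials=9000)

print(f"prior mean:     {beta_mean(prior_a, prior_b):.6f}")
print(f"posterior mean: {beta_mean(post_a, post_b):.6f}")
```

Even with zero observed failures the estimate never drops to zero; it shrinks as failure-free evidence accumulates, which is what makes the approach usable when the real event cannot be tested directly.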

So, while some failures may look like spectacular failures of engineering, taking a system approach to the design turns that interpretation on its head: that engine failure was in fact a great success of system design. What apparently fell short was the frequency of blade inspections, which was not sufficient to compensate for the shortcomings of the predictive simulations used in designing the part and of the Bayesian probability approach.

 

Designing for the certain failure

There are situations in which the product is expected to encounter a catastrophic event. This came up during a discussion I had about fighter jets and the threat of heat-seeking missiles. The plane cannot outrun the missile and the engine cannot be shut off to hide its heat. And even if the missile does not destroy the plane at the instant of the explosion, it will likely cause catastrophic damage to the engine and the plane’s hydraulic and electronic controls. It is a no-win situation. But is it?

Apparently, this was debated without resolution until somebody stepped back and looked at the problem as a system of systems: a missile, an engine, and the interaction between the two. If something could be done to neutralize the negative effects of the explosion, then it would not matter if the explosion occurred. And there are only two such negative effects: explosion shock wave and missile debris.

The result was a conceptual design of a jet engine whose hypersonic thrust was sufficient to eject the debris and the shock wave before they could reach the internal parts of the engine. Brilliant! I cannot tell how far this concept has been taken in practice, but I do not recall many US fighter jets being lost in war for a long time. Again, a system approach to design transformed the certainty of a failure into a success.

 

Conclusion

Not all product failures are avoidable, but most are predictable, and Systems Thinking has proven to be very effective at identifying them and mitigating their impact. The key is to start with a system-of-systems model that accounts for the emergent behaviors between the systems (with a known or a Bayesian probability) and to make that model the start of a digital thread. That thread is key to tracing failure patterns against the history of the design (requirements, simulations, implementation domains, changes, etc.) and to correcting the related assumptions in the system definition or in any other part of what the thread connects. It is also critical for relating input from the field (e.g., IoT) to the proper digital twin of a serialized asset, one that accounts for all maintenance and modification activities after the asset left manufacturing.
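To make the idea of a digital thread a little more concrete, here is a minimal sketch of the data relationships involved: design records, a twin per serialized asset, and a lookup that traces a field event back to the design assumptions worth revisiting. All class names, fields, and identifiers are hypothetical and chosen for illustration; this does not represent any particular product or standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DesignRecord:
    """One item in the design history (requirement, simulation, change, ...)."""
    record_id: str
    kind: str
    assumption: str


@dataclass
class DigitalTwin:
    """Twin of one serialized asset, including its history after manufacturing."""
    serial_number: str
    design_record_ids: List[str]
    maintenance_log: List[str] = field(default_factory=list)
    field_events: List[str] = field(default_factory=list)


class DigitalThread:
    """Links field input for a serialized asset back to its design history."""

    def __init__(self) -> None:
        self.records: Dict[str, DesignRecord] = {}
        self.twins: Dict[str, DigitalTwin] = {}

    def add_record(self, record: DesignRecord) -> None:
        self.records[record.record_id] = record

    def add_twin(self, twin: DigitalTwin) -> None:
        self.twins[twin.serial_number] = twin

    def report_field_event(self, serial_number: str, event: str) -> List[DesignRecord]:
        """Store the event on the twin and return the design records to review."""
        twin = self.twins[serial_number]
        twin.field_events.append(event)
        return [self.records[r] for r in twin.design_record_ids if r in self.records]


# Hypothetical usage: identifiers and assumptions are invented for illustration.
thread = DigitalThread()
thread.add_record(DesignRecord("REQ-001", "requirement", "engine contains a released blade"))
thread.add_record(DesignRecord("SIM-001", "simulation", "blade fatigue life exceeds inspection interval"))
thread.add_twin(DigitalTwin("ENG-0001", ["REQ-001", "SIM-001"]))

to_review = thread.report_field_event("ENG-0001", "crack indication found at inspection")
for record in to_review:
    print(f"{record.record_id} ({record.kind}): {record.assumption}")
```

The design choice worth noting is that the field event is stored on the twin of the specific serialized asset, while the assumptions to revisit live in the shared design history; the thread is simply the link between the two.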
