Fig. 1: The bathtub curve illustrates the likelihood of failure for foundry equipment over the course of its service life.

An Intelligent Approach to Foundry Maintenance

Time-based maintenance | Documenting failure | Company-wide commitment | 10 steps to reliability | 15 benefits

Most companies use time-based maintenance as their primary method of discovering failures and addressing issues. Time-based maintenance is a program of inspections or repairs performed according to a time schedule or cycle: once a week, once a month, or once a year. This approach also involves costly semiannual or annual shutdowns for repairs and overhauls.

In many cases, time-based maintenance is not the correct approach to eliminating component failures. Often, equipment fails shortly after preventative maintenance has been completed. This is not the fault of the operators or the maintenance professionals; rather, it can be blamed on traditional mindsets in the foundry business. For many decades, there has been a belief that all failures are age-related. In other words, it is commonly assumed that every piece of equipment has a useful life, and that once the end of this life has been reached, the equipment needs to be overhauled or replaced. This assumption still exists, and Figure 1, known as a “bathtub curve,” shows the failure pattern it implies.

The bathtub curve indicates that at the beginning of its service life, the equipment has a high chance of failure; this can be understood as “start-up issues.” After that, the chance of failure stays roughly constant for some time. Once the equipment reaches the end of its useful life, the likelihood of failure increases, at which point the equipment needs to be either replaced or overhauled.
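For readers who want to see the shape in numbers, the following is a minimal sketch of a bathtub curve, built here by adding three Weibull failure rates (falling, constant, and rising). The Weibull form and every parameter value are assumptions chosen for illustration; they do not come from Figure 1 or from any foundry data.

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate of a Weibull distribution at age t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three failure rates that together give a bathtub shape.

    All parameters are hypothetical; t is the equipment age in years.
    """
    start_up = weibull_hazard(t, shape=0.5, scale=2.0)    # rate falls with age
    constant = weibull_hazard(t, shape=1.0, scale=10.0)   # rate stays at 0.1 per year
    wear_out = weibull_hazard(t, shape=5.0, scale=12.0)   # rate rises with age
    return start_up + constant + wear_out


for years in (0.1, 1, 5, 10, 15):
    print(f"age {years:>4} yr: {bathtub_hazard(years):.3f} failures per year")
```

With these assumed numbers the rate is high for new equipment, lowest in mid-life, and high again late in life, which is the pattern the traditional mindset takes for granted.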

In the early 1980s a comprehensive study of this question was completed, but many industries never adopted its findings. The study showed that 89% of all failures in any industry are random incidents and do not follow the bathtub curve. This was a revelation in many industries, which then realized that traditional preventative maintenance is not effective at preventing random failures.
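One way to see why a calendar-based overhaul does little against random failures is to compare the chance of failure in the coming month for components of different ages. The sketch below assumes Weibull lifetimes with hypothetical parameters: a shape of 1.0 stands in for random failures and a shape of 3.0 for age-related failures. These values are illustrative and are not taken from the study.

```python
import math


def weibull_survival(t, shape, scale):
    """Probability that a component is still running at age t."""
    return math.exp(-((t / scale) ** shape))


def prob_fail_next(age, window, shape, scale):
    """Chance of failing within `window`, given survival up to `age`."""
    still_running_now = weibull_survival(age, shape, scale)
    still_running_later = weibull_survival(age + window, shape, scale)
    return 1.0 - still_running_later / still_running_now


# Hypothetical parameters: ages and scale in months, window of one month.
for age in (0, 12, 24, 36):
    random_p = prob_fail_next(age, 1, shape=1.0, scale=24.0)  # random failures
    aging_p = prob_fail_next(age, 1, shape=3.0, scale=24.0)   # age-related failures
    print(f"age {age:>2} mo: random {random_p:.3f}, age-related {aging_p:.3f}")
```

Under these assumptions, the random-failure column is the same at every age, so replacing the part on a schedule returns it to the same risk it already had. Only the age-related column grows with age, and that is the only case a time-based replacement can help.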

This finding holds true for foundries, too. However, most foundries still rely on traditional time-based maintenance, which is why failures keep occurring even when the PM programs are completed.

Every time I ask operators how we should eliminate downtime in foundries, I receive similar answers: better preventative maintenance programs; more inspections; keeping spare parts on hand; implementing Total Productive Maintenance; better operating procedures; more operator training; more maintenance training; buying better parts; and dedicating more time to the required maintenance.

None of these replies really answers the basic question: How do we eliminate equipment failures? In fact, all these answers are premised on the traditional understanding of equipment maintenance.

The only effective answer to this question is that, in order to eliminate equipment failures, we must make the equipment 100% reliable.

The State of Reliability

More realistically, to begin making equipment more reliable, you must start by improving the existing state of reliability. However, improving reliability requires discipline and support from all management levels; it cannot be accomplished by one individual or one department in the organization. In addition, it has to become a company-wide program and not just another flavor of the month. The mindset has to change so that reliability becomes a way of life in the company.

There are 10 steps that must be followed in order to implement a successful reliability program. Furthermore, none of these steps can be bypassed.

1.  Maintenance workers and operators shall work together as a team to identify the components that cause the most failures. This approach will dissolve the customer-supplier relationship between the maintenance and operation functions, and create a more cooperative environment.

2.  Maintenance and operation teams shall work together to identify the type of failure. The correct type of failure must be identified in order to select an accurate maintenance option. This is a crucial step in making the equipment more reliable.

3.  Maintenance and operation teams must determine the risk of the failure. What happens if a piece of equipment fails? Does the failure cause environmental or safety issues, or does it cause major downtime or product rework?

4.  Maintenance and operation teams must determine the frequency of the failure, meaning how often the failure occurs.

5.  What method is currently being used to detect the initial issue, prior to the final failure? In other words, is there a way to detect this failure before it occurs?

6.  What kind of countermeasure should be used to stop this failure?

7.  Is the countermeasure feasible? Is it possible to implement the “fix,” and is it financially viable? In many cases, the first option might not be practical, and we should choose another option.

8.  How does our countermeasure affect Steps 3, 4, and 5? Our countermeasure should be a permanent fix for the failure … or we need to reexamine Steps 2 and 7.

9.  The team shall develop the right metrics to measure success, such as OEE, availability, cost, backlog, etc.

10.  The program must be implemented and sustained throughout the plant, and made part of the culture.
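As a rough illustration of how Steps 1 through 5 might be recorded and compared, the sketch below scores each failure on risk, frequency, and difficulty of early detection, then multiplies the three scores to rank the failures. This scoring resembles the risk priority number used in failure mode and effects analysis; it is an assumption made for illustration and not necessarily how any particular reliability program does it. The 1-to-10 scales, the component names, and the scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    component: str      # Step 1: component that causes failures
    failure_type: str   # Step 2: type of failure
    risk: int           # Step 3: 1 (minor) to 10 (safety or environmental issue)
    frequency: int      # Step 4: 1 (rare) to 10 (very frequent)
    detection: int      # Step 5: 1 (always caught early) to 10 (no warning)

    @property
    def priority(self) -> int:
        """Product of the three scores; a higher value means address it sooner."""
        return self.risk * self.frequency * self.detection


def rank(modes):
    """Return failure modes ordered from highest to lowest priority."""
    return sorted(modes, key=lambda m: m.priority, reverse=True)


# Hypothetical entries a maintenance and operations team might record.
modes = [
    FailureMode("sand mixer", "blade wear", risk=3, frequency=8, detection=2),
    FailureMode("induction furnace", "coil coolant leak", risk=9, frequency=2, detection=6),
    FailureMode("shakeout conveyor", "bearing seizure", risk=6, frequency=7, detection=4),
]

for m in rank(modes):
    print(f"{m.priority:>4}  {m.component}: {m.failure_type}")
```

For Step 8, the same record can be re-scored after a countermeasure is in place; if the priority does not drop, the team goes back to Steps 2 and 7.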

Totaling the Results

If the foundry operation can develop and maintain this type of culture, one in which everyone, maintenance workers and operators alike, is “reliability minded,” the program can produce 15 significant benefits for the organization. These are:

1.  Eliminating PM efforts that have no value.

2.  Eliminating downtime.

3.  Reducing emergency work-orders.

4.  Reducing maintenance overtime.

5.  Increasing plant productivity.

6.  Reducing annual shutdown duration and cost.

7.  Improving maintenance planning.

8.  Identifying training requirements for maintenance and operations teams.

9.  Truly identifying operations and maintenance responsibilities.

10.  Reducing total maintenance cost.

11.  Creating a cooperative environment between operation and maintenance.

12.  Identifying the true capital projects that need to be completed to reduce failures.

13.  Identifying design issues that need to be addressed.

14.  Identifying weaknesses in operating procedures.

15.  Identifying the correct spare parts that must be kept in inventory.

Kaveh Golestaneh is the developer of Simplified Maintenance Reliability for Foundries (SMRF). He is an electrical engineer with a degree in Physics, and he notes he has spent most of his professional career in the metalcasting industry, having held various positions ranging from plant engineer and maintenance manager to plant manager, with companies that include General Electric, Gunite, EMI, and Hitachi Metals.
