Sunday, December 11, 2011

The Time is Right for Service

In the current economic climate, companies are looking at every way to minimize costs, including delaying the purchase of new equipment and squeezing as much as possible from existing equipment. While this may help an organization avoid capital costs in the short run, it may result in increased costs from data center outages if preventive and proactive maintenance is not performed regularly by a certified, factory-trained technician.

In fact, according to a recent Emerson Network Power/Ponemon Institute report, “Calculating the Cost of Data Center Outages (PDF),” the average cost of a data center outage to an organization can exceed half a million dollars.

Maintenance Planning

While data center and IT managers have been working feverishly to avoid expensive outages and support companies that have been on the go for months, year’s end may be an opportune time to review data center infrastructure service strategies and perform much-needed maintenance. In many organizations, the IT infrastructure has evolved into an interdependent, business-critical network that includes data, applications, storage, servers and networking. A power failure at any point along the network can affect the entire operation and have serious consequences for the business.

A proactive view of service and maintenance in the data center enables a data center manager to maximize availability, capacity and efficiency of critical infrastructure. Performing regular preventive maintenance significantly reduces the chances of downtime.

Power Equipment Maintenance

Regular service activities on critical power equipment should include:

A complete visual inspection of the equipment. This should include sub-assemblies, wiring harnesses, contacts, cables and major components.

Visual inspection of all breakers, including checking temperature, connections and associated controls.

Checking air filters for cleanliness.

Reviewing AC and DC power capacitors for swelling and/or leaking.

Recording all voltage and current meter readings on the module control cabinet or the system control cabinet.

Measuring and recording harmonic trap filter currents.

Inspecting all electronics, recording their readings and bringing them to system specifications as needed.

Installing or performing any Engineering Field Change Notices (FCN) as needed.

Determining and recording all low-voltage power supply levels.

Calculating and recording phase-to-phase input voltage and currents.

At the end of this service, perform an operational test of the system, including unit transfer and battery discharge. In addition to the mission-critical power and distribution equipment, all mechanical systems require preventive maintenance to ensure optimum performance.
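As one illustration of what can be done with the recorded phase-to-phase input voltages, the sketch below computes percent voltage unbalance as the largest deviation from the average voltage divided by that average. The record layout, the function names and any limit you pass in are assumptions made for this example; they do not come from a manufacturer's procedure, so use the limits in your equipment's specifications.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class PhaseReadings:
    """Phase-to-phase input voltages in volts (hypothetical record layout)."""
    v_ab: float
    v_bc: float
    v_ca: float

    def average(self) -> float:
        # Average of the three phase-to-phase voltages.
        return mean([self.v_ab, self.v_bc, self.v_ca])

    def percent_unbalance(self) -> float:
        # Largest deviation from the average, as a percentage of the average.
        avg = self.average()
        worst = max(abs(v - avg) for v in (self.v_ab, self.v_bc, self.v_ca))
        return 100.0 * worst / avg


def out_of_spec(readings: PhaseReadings, limit_percent: float) -> bool:
    # limit_percent is an assumed input; take the real value from the
    # equipment's system specifications.
    return readings.percent_unbalance() > limit_percent
```

For example, readings of 470 V, 480 V and 490 V average 480 V with a worst deviation of 10 V, so `PhaseReadings(470.0, 480.0, 490.0).percent_unbalance()` is about 2.08 percent. Logging this figure at every visit makes a drift between visits easier to spot than three raw voltages would.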

Service for Cooling Products

Cooling modules have moving parts that eventually wear out. The purpose of maintaining this equipment is to make those components last as long as possible, to keep them performing within their originally designed operating parameters and to replace parts before they fail. This is especially crucial in today’s data center environment, where downtime can have catastrophic effects on a business.

Maintenance tasks can vary from model to model. Users should collaborate with their local authorized service representative and consult their user manual for a complete list of applicable tasks for their equipment.

Common preventive maintenance activities for cooling infrastructure should include:

Inspection and replacement of air filters. Clogged air filters reduce the airflow through the systems and increase the load on the blower drive system. This may result in reduced system cooling performance, higher operating costs, reduced component life of the blower drive systems, and higher operating temperatures of the equipment in the data center.

Blower drive system inspection and maintenance. Wear or damage to blower belts, bearings, motors and wheels may result in loss of airflow or reduced cooling performance.

Steam generating and infrared humidifiers. Humidifiers are connected with valves and hoses that may leak, and drains may become clogged over time. Infrared humidifier bulbs may burn out. These components should be inspected regularly.

Condensate drains and pumps. Confirm proper pump function and verify drains are not clogged. The combination of a clogged drain and a failed level sensor results in pan overflow.

Inspect and clean reheat elements – review and tighten the supporting hardware.

Examine the oil level of compressors and check for leaks. Compressors running with too much or too little oil will see diminished service life. Always use the same type of oil supplied with the compressor from the OEM.

Evaporator coils should be checked periodically to verify they are clean and free of debris. As you might imagine, dirty coils are less efficient at removing heat.

Condenser coils should be checked periodically to verify they are clean and free of debris. Motor mounts should be tight, and bearings should turn freely and be in good condition.

To minimize unit-related failures, comprehensive maintenance programs with OEM-trained and certified technicians are recommended. When correctly implemented, maintenance programs maximize the reliability of data center equipment by providing systematic inspections that can detect and correct incipient failures, either before they occur or before they develop into major defects that can result in costly downtime. Typical PM programs include inspections, tests, measurements, adjustments, parts replacement and housekeeping practices.

An Emerson Network Power study of the impact of preventive maintenance (PM) on UPS reliability revealed that the Mean Time Between Failures (MTBF) for units that received at least two preventive maintenance service visits a year is 23 times higher than for a UPS with no preventive maintenance visits. According to the study, reliability continued to increase steadily with additional visits when conducted by highly trained engineers.

The outcome of the model can be seen in the figure below, which depicts the expected MTBF figures projected up to six PM events per year. The mathematical model incorporated real-world data to arrive at the result. The MTBF estimate for the “no PM” group is substantially lower than the observed MTBF for units with emergency-service-only contracts, but is in line with the lifespan of components that must be replaced. There is a substantial increase in MTBF from zero to six PM visits per year. When projected out further than six PM visits, the MTBF begins to level off at 19 PM visits per year and then declines at higher levels of maintenance. This decline can be attributed to the fact that every service event introduces the possibility of service-related human error.

At least two PM visits per year are recommended, but additional maintenance visits may be needed for facilities where downtime is unacceptable. Depending on the cost of downtime for a particular application, a high return on investment can be realized in many cases by increasing PM frequency.
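To make the return-on-investment argument concrete, here is a rough sketch of the arithmetic. It is not the model used in the study. It assumes a constant failure rate, so that expected failures per year equal hours per year divided by MTBF, and it assumes every failure causes an outage. The function names, the baseline MTBF and the cost per visit are hypothetical inputs that you would replace with your own figures.

```python
HOURS_PER_YEAR = 8760


def expected_failures_per_year(mtbf_hours: float) -> float:
    # Constant-failure-rate assumption: failures per year = hours / MTBF.
    return HOURS_PER_YEAR / mtbf_hours


def annual_net_benefit(
    baseline_mtbf_hours: float,  # MTBF with no PM visits (your own estimate)
    mtbf_multiplier: float,      # MTBF improvement factor attributed to PM
    cost_per_outage: float,      # cost of one outage
    visits_per_year: int,        # number of PM visits per year
    cost_per_visit: float,       # hypothetical price of one PM visit
) -> float:
    # Expected yearly outage cost without and with preventive maintenance.
    without_pm = expected_failures_per_year(baseline_mtbf_hours) * cost_per_outage
    with_pm = (
        expected_failures_per_year(baseline_mtbf_hours * mtbf_multiplier)
        * cost_per_outage
    )
    # Avoided outage cost minus what the visits themselves cost.
    return (without_pm - with_pm) - visits_per_year * cost_per_visit
```

As a usage example, `annual_net_benefit(87600, 23, 500000, 2, 2000)` combines the 23-times MTBF improvement and the half-million-dollar outage cost cited above with an invented ten-year baseline MTBF and an invented visit price of 2,000 dollars. A positive result means the visits pay for themselves under those assumptions; the answer is only as good as the baseline MTBF and outage cost you supply.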

A formal service strategy that includes regular preventive maintenance visits will increase the availability and reliability of your UPS, PDU, and batteries. Note, however, that this is only a first step. You must also protect your entire electrical infrastructure, from the service entrance switchgear down to the rack-mounted PDU, to ensure and maintain Business Critical Continuity.
