
The Increasing Challenge of Data Center Design and Management: Is CFD a Must?

Data centers have been cooled for many years by delivering cool air to the IT equipment via the room. One of the key advantages of this approach is the flexibility that it provides the owner/operator in terms of equipment deployment. In principle all that is necessary is to determine the maximum power consumption of the equipment and provide an equivalent amount of cooling to the data center. Why then, since we have been building and operating data centers for decades using air cooling, do data centers experience hot spots and fail to reach their design expectations for capacity?

This article explains some of the challenges faced in the search for the perfect data center and why, given the variability of equipment design and the time-varying nature of the data center load, CFD, while not the only tool to be used in the design and/or management of a data center, is an essential one for achieving maximum performance.

Figure 1. Example showing poor airflow balancing with over-supply waste and under-supply and recirculation potentially causing overheating.

IT Equipment Airflow Is Important: Design For Airflow As Well As kW

Traditional design simply ensured that the cooling system provided sufficient cooling in terms of kW per unit area or per cabinet. When power densities were low, this was adequate, since any recirculation of air only resulted in moderately warmed air entering the equipment. Increasing heat densities in electronics now often result in higher-temperature recirculation, putting equipment at risk.

The room cooling performance will be affected by the ratio of air supplied to the room compared with IT equipment airflow demand. Too much cooling air (Figure 1, left) results in wasted energy, with cool air returning unused to the cooling system. Too little cooling air (Figure 1, right) will be compensated for by drawing potentially warm 'used' air in from the surrounding environment, risking overheating. The designer must therefore consider airflow balance, as well as cooling power, and incorporate a strategy to address the changing needs likely during the life of the data center.
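The supply/demand balance described above can be sketched numerically. The functions and airflow figures below are illustrative assumptions, not measured values; in practice these quantities come from measurement or CFD.

```python
# Sketch: airflow supply/demand balance for a data center zone.
# All numbers are hypothetical illustrations.

def balance_ratio(supply_cfm: float, it_demand_cfm: float) -> float:
    """Ratio of cooling air supplied to IT equipment airflow demand."""
    return supply_cfm / it_demand_cfm

def recirculated_fraction(supply_cfm: float, it_demand_cfm: float) -> float:
    """Fraction of IT intake drawn from warm exhaust when under-supplied."""
    shortfall = max(0.0, it_demand_cfm - supply_cfm)
    return shortfall / it_demand_cfm

# Over-supply: 20% more air than the IT equipment uses bypasses it unused.
print(balance_ratio(12_000, 10_000))         # 1.2 -> wasted cooling energy
# Under-supply: 20% of intake is recirculated warm exhaust.
print(recirculated_fraction(8_000, 10_000))  # 0.2 -> overheating risk
```

The ratio alone does not locate the problem in the room, which is why the article goes on to argue for 3-D modeling, but it captures the over/under-supply trade-off of Figure 1.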

Figure 2. Example showing several different equipment airflow regimes: front-to-back, upward outflow, and mixed side-to-side and front-to-back.

Impact Of Real Equipment On Cooling Performance

There are no generally accepted standards for equipment cooling. This provides a blank canvas for the equipment designer who can freely choose how to package the equipment and cool it. The equipment design will change the equipment’s demands on its environment and how best to configure the rack and the data center.

This variability in design affects the room cooling requirement: equipment draws air in from differing locations/faces and exhausts it from different locations and in different directions. Historically, while servers have generally been designed with front-to-back airflow, other equipment has often varied. Figure 2 shows some typical examples.

The choice of flow rate and temperature rise for the equipment also affects room airflow patterns and temperatures. A low flow (Figure 3, left) results in slow moving air that rises due to buoyancy. At higher flows (right) velocity dominates and the air shoots away without rising appreciably.

Flow rate depends not only on equipment design but also on operational details such as equipment utilization and environmental conditions. The designer and operator must recognize the need for the design to be flexible so that it can accommodate change over time. Because the choice of IT equipment, and how it is utilized, affects the cooling performance of the data center, airflow must be considered throughout the data center's life.
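The link between heat load, temperature rise, and flow rate discussed above follows from the sensible-heat equation Q = P / (ρ · cp · ΔT). The sketch below uses nominal air properties and a hypothetical 10 kW server; the specific numbers are assumptions for illustration.

```python
# Sketch: equipment airflow implied by heat load and design temperature
# rise, from Q = P / (rho * cp * dT). Air properties are nominal values.

RHO_AIR = 1.2    # kg/m^3, air density near sea level
CP_AIR = 1005.0  # J/(kg*K), specific heat of air

def airflow_m3s(power_w: float, delta_t_k: float) -> float:
    """Volumetric airflow (m^3/s) needed to carry power_w at delta_t_k rise."""
    return power_w / (RHO_AIR * CP_AIR * delta_t_k)

# A hypothetical 10 kW server with a 15 K design rise moves ~0.55 m^3/s;
# halving the design rise to 7.5 K doubles the required airflow.
q_low = airflow_m3s(10_000, 15.0)
q_high = airflow_m3s(10_000, 7.5)
print(round(q_low, 3), round(q_high, 3))
```

This is also why utilization matters: fan-speed control varies the flow with load, so the room airflow pattern of Figure 3 shifts over the operating day.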

Figure 3. Example showing the impact of air volume and temperature rise on the flow from a server.

Configuring The Data Center

Rack Configuration

The data center is an evolving entity in which deployments are easily made but often not easily reversed. In practice, the consequences of these deployments are often not seen until later in the data center's lifetime. Failure to plan for the consequences of deployments on airflow and cooling can, and commonly does, result in a gradual deterioration of cooling performance, and hence capacity, of the data center. It is not unusual for a data center to experience cooling difficulties and hotspots at only two-thirds of design capacity in terms of kW of equipment deployed. This apparently unusable capacity is often called 'stranded capacity': without further assessment and reconfiguration, it may not be possible to use the full design capacity without creating hotspots.
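The two-thirds figure above translates into a substantial unusable fraction. A minimal sketch, with a hypothetical design capacity:

```python
# Sketch: 'stranded capacity' as the gap between design capacity and the
# load at which hotspots first appear. Values are hypothetical.
design_kw = 1500.0    # assumed design cooling capacity
deployed_kw = 1000.0  # load at which hotspots appear (~2/3 of design)

stranded_kw = design_kw - deployed_kw
stranded_frac = stranded_kw / design_kw
print(stranded_kw, round(stranded_frac, 2))  # 500 kW, ~1/3 of design unusable
```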

This is not just an issue at room level alone; it is affected by deployment decisions within the cabinet too. It is common to consider issues such as blanking when placing equipment in a cabinet, but less common to consider equipment interaction. Figure 4 shows the air circulation and temperatures in a cabinet where only two IT deployments have been made: first a blade system was deployed, then three 1U servers were added in the slots immediately above. In this instance the hot air from the blade system is confined to the lower part of the cabinet and recirculates under the cabinet, causing high inlet temperatures and reduced resilience.

Figure 4. Blade system deployed in cabinet below 1U servers.

If the servers are deployed in the opposite sequence (Figure 5), with the 1U servers in the lower 3 slots and the blade system above, then the recirculation is dramatically reduced resulting in lower inlet temperatures and greater resilience.

The configuration of equipment inside the cabinet can equally affect room conditions. Figure 6 shows that, with no modification to rack cooling strategy, the room conditions and equipment temperatures are dramatically changed when front to back ventilated servers are replaced by routers with mixed airflow of the same power.

Room Configuration

The primary challenges for the owner/operator are:

a. How to deploy, side by side, different types of equipment with different demands for airflow and cooling in a data center that was intended to provide a relatively uniform cooling capability;

b. How to be prepared for and accommodate future generations of equipment with characteristics yet unknown.

It is unlikely that the cooling system will deliver airflow and cooling uniformly throughout the data center. This means that the owner/operator must be aware of the impact of deployment location in the context of the actual performance of the data center. Consider a scenario where two new Sun M5000s are to be deployed. Figure 7 shows that the impact differs depending on the choice of installation location, even though in principle all the candidate locations have space, power, and cooling.

Figure 5. Blade system deployed in cabinet above 1U servers.

The different results are not easily predicted using simple rules, since they arise from the combination of equipment heat load and airflow with the configuration, in particular the interaction between equipment and room airflows.

The increase in power density and soaring energy costs, combined with growing awareness of the need for environmental responsibility, have prompted the fundamental design of data center cooling to be revisited. New approaches, such as aisle containment, are being implemented to increase efficiency, but the increased efficiency will only be achieved if equipment airflows are appropriately controlled/balanced. If this is not done, potentially damaging high-temperature recirculation can still occur (Figure 8).

Matters affecting cooling performance are not limited to equipment deployment alone; they also include infrastructure such as ACU and PDU deployment and the location and size of cable routes.

So should CFD be used?

Figure 6. Example showing front to back servers cool while routers are hot in the same configuration.

Alternative approaches to design and analysis, such as rules of thumb and hand calculations, or more sophisticated approaches such as potential-flow modeling, provide some insight into cooling performance. However, they all, to a greater or lesser degree, fail to fully account for the important issue of airflow and its momentum.

Computational Fluid Dynamics (CFD) provides a unique tool capable of modeling the data center and equipment installation from conceptual design right through to detailed modeling for operation. By providing a 3-dimensional model of the facility, it can account for almost any feature and combination of features, in a manner very similar to its use in electronic equipment design. However, modeling a data center can be somewhat more challenging, since the data center is a dynamic, changing 'electronics design.' The above illustrations show the potential for CFD to predict the significant impact of these small variations.

So, what are the barriers that must be overcome for successful use of CFD? From an academic point of view people may point to the more theoretical aspects of CFD such as turbulence modeling, gridding, solution time and indeed including the full physics, but, in practice, these difficulties are normally far outweighed by the difficulty of capturing the true configuration (such as the features described above) in sufficient detail to predict the resulting environment.

Figure 7. Example showing 2 different spatial deployments of 2 items of equipment – only option 2 works.

CFD can relatively easily be used for concept design decisions, but even here there is a need for education about the significant risk of ignoring equipment type and the resulting airflow, temperature effects, and likely deployment locations. For real facilities it is even more challenging, and it is essential that measurements are made alongside the modeling process to ascertain that the model reflects reality. Why? Because some details can never be represented precisely (e.g., unstructured cabling, damper settings …) and others may depend on operational factors such as equipment utilization. For the latter, it is hard to gather the airflow and heat-dissipation data needed to characterize equipment fully as deployed. Here the electronics industry can make an important contribution by publishing data more openly and indicating likely trends for planning purposes.

Figure 8. Recirculation in a poorly contained scenario.

Once this careful data-gathering exercise has been completed then, with the right tools, it is relatively straightforward to build a CFD model of the data center and check that it reflects reality. Once a 'calibrated' model is achieved, it can be used with reasonable certainty to make deployment decisions and to undertake other tests, such as failure scenarios, with confidence. In the view of the authors, if the potential for stranded capacity is to be minimized, it is imperative that simulations be undertaken frequently to understand the implications of deployment decisions, and that monitoring and comparison with measured data be performed regularly to ensure the model continues to reflect reality. With the increasing availability of (live) measured data, maturing CFD tools, and growing pressure to improve energy efficiency while maintaining availability, CFD used with an appropriate methodology can be a critical tool for design and management.
