Ten stupid things engineers do to mess up their cooling

January 1st, 2000

Editor’s Note: On the eve of the new millennium, the editors of ElectronicsCooling have decided to depart from our normal serious tone just this once, unless, of course, we can convince Tony to contribute another story or two for us sometime [soon!] in the next thousand years. Herewith, for your amusement and edification, is a humorous yet insightful retrospective of famous – or shall we say infamous – missteps in thermal management.

I was amazed that the hospital just let my wife and me take home our precious, fragile, newborn baby, even though we had no idea how to take care of her. I couldn’t think of another field where they just hand you the job without any proper training, and – even when you admit you know nothing about it – they smile and say, “You’ll do just fine,” until I thought of my own adopted field – electronics cooling.

Compared to other engineering disciplines, electronics cooling has, I think, more than its share of “amateurs.” They are like my buddy Herbie, an otherwise well-meaning and hard-working engineer, who got stuck doing thermal design part-time without any training. It has been this way for decades, so it’s no surprise that a whole folklore of myths and just plain bone-head ideas has been passed down from one generation of electronics engineers to the next.

If you’ve witnessed these, you may chuckle. If you have fallen for one or two yourself, blame them on your previous boss.

Stupid Thing No. 1. The Magic Heat Sink.

Herbie thinks that aluminum has the mysterious power to absorb heat like a sponge. If a component is too hot, stick a slab of aluminum on it. It sucks out heat without increasing in temperature.

I try to explain that a heat sink works by increasing the surface area in contact with the air.

But he still asks, “If you put grease between the device and the sink, it gets cooler. How much will it go down if I just put on the grease by itself?”

He thinks about aluminum the way that people believe copper bracelets cure arthritis.
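For anyone who wants numbers to show Herbie, here is a minimal sketch of the surface-area argument. It assumes plain convection with resistance R = 1/(h·A), ignores fin efficiency and radiation, and uses made-up values for the heat transfer coefficient, the areas, the power, and the joint resistances. None of these figures come from a real part.

```python
def convection_resistance(h_w_per_m2k, area_m2):
    """Surface-to-air thermal resistance in degC/W, R = 1 / (h * A)."""
    return 1.0 / (h_w_per_m2k * area_m2)


def case_temp_c(air_c, power_w, r_joint, r_surface_to_air):
    """Case temperature from a simple series chain of thermal resistances."""
    return air_c + power_w * (r_joint + r_surface_to_air)


# Hypothetical values, chosen only for illustration.
H = 10.0                 # W/(m^2 K), gentle air movement (assumed)
BARE_CASE_AREA = 4e-4    # m^2, a 2 cm x 2 cm package top (assumed)
SINK_AREA = 2e-2         # m^2, total fin surface (assumed)
POWER = 0.5              # W (assumed)
AIR = 25.0               # degC

r_bare = convection_resistance(H, BARE_CASE_AREA)
r_sink = convection_resistance(H, SINK_AREA)

bare = case_temp_c(AIR, POWER, 0.0, r_bare)          # no sink, no joint
dry_sink = case_temp_c(AIR, POWER, 1.0, r_sink)      # sink on a dry joint
greased_sink = case_temp_c(AIR, POWER, 0.2, r_sink)  # sink on a greased joint

print(f"bare case:       {bare:.1f} degC")
print(f"sink, dry joint: {dry_sink:.1f} degC")
print(f"sink + grease:   {greased_sink:.1f} degC")
```

Grease only shrinks the joint term, and a bare case has no joint to fill. In this model, grease by itself changes nothing.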

Stupid Thing No. 2. Sizing Fans.

An electrical design manager calls. Two weeks from production release, his new circuit board is too hot. After a few measurements, I tell him, “With this power, to get the temperature you want, you’ll need about 300 cubic feet per minute (CFM) of air. I measure only about 100.”

“Impossible!” the manager erupts, “It has a 300 CFM fan!”

I diplomatically explain that the 300 CFM rating is the maximum flow when there is absolutely no obstruction. Any real chassis has some flow obstruction, so the actual delivery is much less.

“Guess we’ll have to slap on another fan or two,” he concludes.
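The gap between the nameplate and the measurement can be sketched as the point where the fan curve crosses the chassis resistance curve. The sketch below assumes a straight-line fan curve and a pressure drop that grows with the square of the flow. The shutoff pressure and the chassis coefficient are hypothetical, picked so the answer lands near the 100 CFM measured in the story.

```python
import math


def operating_flow_cfm(free_delivery_cfm, shutoff_pressure, system_k):
    """Flow where an assumed linear fan curve meets a quadratic system curve.

    Fan (assumed linear):   dp = shutoff_pressure * (1 - Q / free_delivery_cfm)
    Chassis (assumed):      dp = system_k * Q**2
    Pressure units only need to be consistent between the two curves.
    """
    if system_k == 0:
        # No obstruction at all: the fan delivers its rated free-air flow.
        return free_delivery_cfm
    b = shutoff_pressure / free_delivery_cfm
    disc = b * b + 4.0 * system_k * shutoff_pressure
    # Positive root of system_k*Q^2 + b*Q - shutoff_pressure = 0.
    return (-b + math.sqrt(disc)) / (2.0 * system_k)


# Hypothetical 300 CFM fan with 0.6 in. H2O shutoff pressure,
# in a chassis with k = 4e-5 in. H2O per CFM^2.
print(operating_flow_cfm(300.0, 0.6, 4e-5))
```

With these assumed numbers the 300 CFM fan delivers about 100 CFM. Only the zero-obstruction case returns the rated figure.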

Stupid Thing No. 3. Adding fans as an afterthought.

You can’t just “slap on” a fan. Unless a system is designed from the beginning with fans in mind, adding a fan at the end isn’t likely to improve component temperatures much. Fans need defined flow paths, room for plenums, filters, alarms, and power supplies.

“Slapped on” fans have the inlet vent right next to the outlet vent, so that the hot air gets sucked back into the chassis.

“Slapped on” fans are the ones that blow in the customer’s face when she is operating the controls.

“Slapped on” fans are like the one in my PC that keeps the CPU from burning up – but doesn’t have any alarm to tell me when the fan seizes up. So my $300 CPU may burn up anyway.

Stupid Thing No. 4. The 20°C Rule.

I have seen an old rule of thumb written into design requirements:

The air temperature rise from the inlet to the outlet of the chassis shall not exceed 20°C.

That was the ONLY thermal requirement!

It was quite popular, because it was much easier to do this test than to actually measure the temperature of all the components inside. If the total power was low, it would have been easy for some high power parts to be way over their temperature limits inside the box, and you would never know.

It’s like trying to find out if anybody in the hospital is running a fever by waving a thermometer in the air near the emergency room exit.
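A rough sketch of how a box can pass this test while a part inside cooks: the bulk air rise depends only on total power and flow, while a component's temperature also depends on its own power and its resistance to the air. The air properties are approximate sea-level values, and the chassis power, flow, part power, and case-to-air resistance are all hypothetical.

```python
RHO_AIR = 1.2            # kg/m^3, approximate sea-level air density (assumed)
CP_AIR = 1005.0          # J/(kg K), specific heat of air
CFM_TO_M3S = 4.71947e-4  # cubic feet per minute to cubic meters per second


def air_rise_c(total_power_w, flow_cfm):
    """Bulk inlet-to-outlet air temperature rise: dT = P / (rho * cp * Q)."""
    return total_power_w / (RHO_AIR * CP_AIR * flow_cfm * CFM_TO_M3S)


def component_temp_c(local_air_c, part_power_w, r_case_to_air):
    """Part temperature above its local air: T = T_air + P * R."""
    return local_air_c + part_power_w * r_case_to_air


# Hypothetical chassis: 50 W total, 20 CFM, 25 degC inlet.
rise = air_rise_c(50.0, 20.0)
# Hypothetical hot part: 5 W at 20 degC/W, sitting in outlet-temperature air.
hot_part = component_temp_c(25.0 + rise, 5.0, 20.0)

print(f"air rise: {rise:.1f} degC, hot part: {hot_part:.1f} degC")
```

Under these assumptions the air rise is well under 20°C while the one hot part sits far above it, which is exactly what the outlet thermometer cannot see.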

Stupid Thing No. 5. Thermocouples – Getting the wrong temperature was never so easy.

A thermocouple seems to work by magic – a voltage is produced by a temperature difference, which, to some folks, is the same as thin air. No wonder nobody knows how they work, or can tell when they don’t work.

Common errors, like reversing the polarity, should be easy to avoid, but keep happening anyway. They happen because Herbie doesn’t know that a thermocouple only measures temperature difference.

To get the actual temperature, the meter has a built-in temperature sensor to measure the temperature of the meter itself.

The meter adds the signals from the thermocouple and the internal sensor, and displays the sum.

Herbie merrily hooks up J-type wire to a T-type meter. He turns on the meter and is happy because the display gives the correct room temperature. It could be hooked up backwards and give the same reading.

As long as the two ends of the thermocouple are at the same temperature, it produces zero voltage, and the only thing he has tested is the internal sensor. His little “reality-check” at room temperature convinces him all his bogus data after that is good.
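Herbie's reality check can be sketched with a toy meter model. It assumes the thermocouple voltage is linear in the temperature difference, which real thermocouples are not, and it uses rough room-temperature sensitivities for J and T wire. Treat the numbers as illustrative rather than as reference-table values.

```python
# Rough sensitivities near room temperature, in microvolts per degC (assumed).
SEEBECK_UV_PER_C = {"J": 52.0, "T": 41.0}


def displayed_temp_c(tip_c, meter_c, wire_type, meter_type, reversed_leads=False):
    """Reading from a toy meter with linear cold-junction compensation.

    The wire produces a voltage proportional to (tip - meter) temperature.
    The meter converts that voltage using the type it is set to, then adds
    its own internally measured temperature.
    """
    sign = -1.0 if reversed_leads else 1.0
    volts_uv = sign * SEEBECK_UV_PER_C[wire_type] * (tip_c - meter_c)
    return meter_c + volts_uv / SEEBECK_UV_PER_C[meter_type]


# Room-temperature "check": wrong wire, reversed leads, still reads 25.
print(displayed_temp_c(25.0, 25.0, "J", "T", reversed_leads=True))

# A real 100 degC measurement with a 25 degC meter.
print(displayed_temp_c(100.0, 25.0, "T", "T"))                       # correct setup
print(displayed_temp_c(100.0, 25.0, "T", "T", reversed_leads=True))  # reversed leads
print(displayed_temp_c(100.0, 25.0, "J", "T"))                       # wrong wire type
```

Every miswired case agrees with the correct one at room temperature and disagrees once the tip gets hot.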

Stupid Thing No. 6. Not discussing work enough with the family.

Even I have trouble explaining the difference between heat and temperature to my mother-in-law.

No wonder that the Reliability Department issues requirements such as:

“No component temperature shall exceed 80% of its operating temperature limit.”

They mean that, if a vendor rates a part to work up to 150°C, we should leave a safety margin below that, so our limit should be 120°C.

But derating by a straight 80% is ridiculous, because the Celsius scale has an arbitrary zero, so the value you get depends on the units you use. Taking 80% on the Kelvin scale would give us a limit of only 65°C for the same component.
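The arithmetic is easy to check. This small sketch applies the same 80% factor to a 150°C rating on three scales and converts each answer back to Celsius. The function name is made up for the illustration.

```python
def derated_limit_c(rated_c, fraction, scale):
    """Apply a percentage derating on the given scale; return the limit in degC."""
    if scale == "C":
        return fraction * rated_c
    if scale == "K":
        return fraction * (rated_c + 273.15) - 273.15
    if scale == "F":
        rated_f = rated_c * 9.0 / 5.0 + 32.0
        return (fraction * rated_f - 32.0) * 5.0 / 9.0
    raise ValueError(f"unknown scale: {scale}")


for scale in ("C", "K", "F"):
    print(scale, round(derated_limit_c(150.0, 0.8, scale), 1))
```

Same part, same 80%, three different limits: roughly 120°C, 65°C, and 116°C. A margin stated in degrees does not have this problem.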

Stupid Thing No. 7. Ignoring the cold end.

It is easy to think that the “worst case” for any electronic assembly is at the maximum ambient. (I have been guilty of this myself.) But maybe your product has to operate outdoors or in an unheated underground vault.

Many commercial components don’t work below 0°C. Things like batteries, capacitors and crystal oscillators behave very strangely.

You can’t rely on your high ambient instincts. At high ambient, you test the fully equipped system at maximum power. The worst case at the low end may be with a partially equipped system at minimum power. And slapping on a heat sink to fix a problem at the high ambient may make the cold end worse.

If, as I did, you think this challenge is trivial, just talk to an electronics engineer in the auto industry.

Stupid Thing No. 8. Expecting accurate results from guesswork inputs.

I use a fancy CFD program to predict component temperature in electronic assemblies long before they are built. My customers, the electrical designers of those assemblies, drool over the color isotherm plots. But before long they ask, “How good are these predictions? Within 5°?”

I answer, “How good were the power numbers you gave me?”

Then they respond, “When I said each device was 1 W, I meant as low as 0.25 W, or as high as 2.1 W. Does that matter? I’m still hoping for 5° accuracy. Isn’t that why you bought that big computer?”
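To put a number on that conversation, here is a back-of-the-envelope sketch. It assumes the simplest possible model, a temperature rise equal to power times a fixed junction-to-air resistance. The 30°C/W resistance and the 40°C air temperature are hypothetical.

```python
def junction_temp_c(air_c, power_w, r_junction_to_air):
    """Junction temperature from a single thermal resistance: T = T_air + P * R."""
    return air_c + power_w * r_junction_to_air


def temp_band_c(air_c, power_low_w, power_high_w, r_junction_to_air):
    """Range of predicted temperatures across the stated power uncertainty."""
    low = junction_temp_c(air_c, power_low_w, r_junction_to_air)
    high = junction_temp_c(air_c, power_high_w, r_junction_to_air)
    return low, high


# The designer's "1 W" part: anywhere from 0.25 W to 2.1 W.
low, high = temp_band_c(40.0, 0.25, 2.1, 30.0)
print(f"predicted range: {low:.1f} to {high:.1f} degC")
```

With these assumed values, the power uncertainty alone spreads the prediction over more than 50°C, no matter how big the computer is.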

Stupid Thing No. 9. Reducing temperature because “every 10°C drop doubles the life.”

This is still the gospel of the land. It started with the U.S. Department of Defense Military Handbook 217, which became the standard for electronics reliability. The 10°C rule was part of it.

Too bad it’s not true. Not even the military uses 217 anymore. But like your mom’s rule about not swimming for one hour after eating, this rule lives on.

The alternative is very messy. There are temperature limits that improve the reliability of electronics. But to apply them, you have to understand the physical processes that cause failures in each type of component. That’s hard to boil down to a slogan.
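One way to see why the slogan cannot cover every part is to assume, purely for illustration, that a failure mechanism follows the Arrhenius model. The life ratio for a 10°C drop then depends on the activation energy and on where you start. The activation energies below are hypothetical, and many real failure mechanisms do not follow this model at all.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def arrhenius_life_ratio(activation_energy_ev, cool_c, hot_c):
    """Ratio of life at cool_c to life at hot_c under an assumed Arrhenius model."""
    t_cool_k = cool_c + 273.15
    t_hot_k = hot_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_cool_k - 1.0 / t_hot_k
    )
    return math.exp(exponent)


# Same 10 degC drop, three hypothetical activation energies.
for ea in (0.3, 0.7, 1.0):
    print(ea, round(arrhenius_life_ratio(ea, 50.0, 60.0), 2))

# Same activation energy, a hotter starting point.
print(round(arrhenius_life_ratio(0.7, 100.0, 110.0), 2))
```

Even inside this one simple model, the ratio is 2 only for a particular combination of activation energy and temperature.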

Stupid Thing No. 10. Not getting the proper training.

I once worked for a company that replaced their drafting boards with a CAD system. But not a dime was spent to train their people. They struggled, secretly drawing on paper, and copying that into the computer. Then the managers complained that they weren’t seeing the productivity improvements promised by the CAD vendor.

It’s fun to laugh at semi-fictional Herbie. But the people on whom he is based make stupid mistakes not because they’re stupid, but because they are working with the thermal myths they found lying around the lab when they got there.

The trend in electronics is that size is going down, power is going up, and temperature problems are getting more severe. Myths won’t cut it in the 21st Century.

There is good news, though. Unlike parenting a newborn, plenty of thermal design training is available. Sign up for some, so you don’t recognize yourself in next year’s “Ten Stupid Things” list.

One Response to “Ten stupid things engineers do to mess up their cooling”

  1. It interested me to see that stupid thing No. 6 was precisely what I was arguing against in a recent program.
    Training is essential for some of these: a tool may not be adequate until you understand it. I saw one recently in which, if I increased the thickness of a plate, the thermal resistance in degC/W doubled!!
