Sleuthing the Wily Thermals

Yesterday, we were hit with a thermal shutdown on the big laptop.  Installing psensor and the coretemp module helped get a handle on the issue, which centered on the Nvidia GeForce 540M GPU.  Hardware drivers have always been an issue for Linux, since the Open Source software model conflicts with the desire of peripheral vendors to keep the internals of their hardware secret, which they do by not releasing the source code for the software that links the hardware with the operating system.  That’s fine for a closed, proprietary system like Microsoft Windows, the primary market.  As Linux users, not in the business of redistributing systems, we would be happy with an add-on driver that works.  But, since Linux is a small portion of the market, there is little incentive for hardware vendors to write Linux-specific driver software.  And the software that is available is not always optimized for the Linux kernel, with the result that it either is buggy or skimps on reliability features.

Nvidia has, in the past year, incurred the ire of Linux founder Linus Torvalds for just these issues.  The Ubuntu Linux distribution that we run on our systems comes with a more or less generic Nvidia driver.  While users can download an updated driver from Nvidia, installation is a bit daunting, requiring the system to be reconfigured for text-mode login in order to rebuild the X11 graphics links.  Those of us who have been around long enough to remember hand-tuning the X Window System configuration files, fingers poised over the “kill” keyboard sequence, ready to shut down X11 in an instant to avoid burning the monitor if the settings were wrong, have grave misgivings about tinkering with the graphics.  Plus, the forums on the ‘Net seemed to show that users were having overheating problems no matter what combination of driver and distribution versions they used.  Custom configurations seem to be less than desirable for production machines, so we elected to look for another solution.

A bit more searching on the ‘Net turned up the fwts (Firmware Test Suite) utility package.  Once installed and run, it pointed out compatibility issues between the computer BIOS and the kernel/driver configuration.  One of the automatic corrective actions performed by fwts was to switch the operating mode from “performance” to “normal,” which immediately lowered the operating temperature of all the components.  The GPU still shows a temperature increase under load, but the fan hardly runs anymore, whereas earlier in the week it was running on high most of the time.

The take-away message here is that updating kernels and/or drivers can and sometimes will result in conflicts with your hardware.  Linux has come a long way toward a plug-and-play, run “out of the box” installation, but it still pays to test and evaluate hardware configurations, just like the old days of Unix.  Actually, in the “bad old days” of a few commercial Unix systems, the range of hardware combinations was often very limited, so compatibility issues had been carefully tuned out by the system vendor.  But those systems were expensive.  In the Open Source world of Linux, where the system is expected to run on any combination of hardware on the commodity PC market, some outliers are to be expected.  For the average desktop Linux user, converting an old Windows machine to Linux will work just fine.  But, for “power users” and server applications, some engineering and testing may be required for optimum performance.  Certainly, the psensor and fwts software will be an important part of the Linux toolkit from now on.

Upgrade Woes: Changing the Way We’ve Always Done Things

A while back, a wave of learning curve frustration swept through Chaos Central.  One of the tenets of the computing life is that the march of time brings change at a mind-boggling rate.  Most Linux distributions have settled on a six-month update cycle, with almost daily patch-level updates in between.  The patch-level updates go mostly unnoticed by the average user, though they sometimes introduce quirks.  The major distribution updates, however, bring changes to the desktop decor ranging from the equivalent of new curtains and furniture slipcovers to knocking down walls and repainting.

Since we use our computers for productive work, we tend to keep the furniture and paint to basic office cubicle mode.  Since about 2007, we’ve tended toward running Ubuntu as our primary desktop system.  Unless there is some very good reason, we also tend to use the Long-Term Support versions.  However, our last new computers (December 2011) arrived with Ubuntu 11.10, which was OK even though it defaulted to Unity, since the previous LTS was 10.04, which had some deficiencies in the WiFi area.  Being resistant to change, we quickly reverted to the Gnome desktop: the new but not necessarily improved Unity desktop seemed to be a dumbing-down of the desktop, and got in the way of the usual cluttered, multi-tasking way of doing business (we don’t call our office Chaos Central for no reason).

Of course, in the spring of 2013, we suddenly were faced with the End-of-Life clock on Ubuntu 11.10.  The obvious choice, then, was to upgrade to 12.04 LTS, instead of the newly released 13.04 version.  The default on reboot is the Unity desktop, though Ubuntu now does provide Gnome (Ubuntu Classic) as a choice.  Having recently acquired an Android phone, our first “smart phone,” we decided to give Unity a try, for a while.

OK, it isn’t so bad, once you get used to the idea that you can have multiple screens, and clicking on the launcher icon of a running process switches to it (unless there are multiple copies, but we also discovered how to display everything, à la OS X).  However, we recently discovered the dark side to Unity: stability.

I had noticed that the fan seemed to be running on high on my main laptop, a quad-core machine with Nvidia GPU from ZaReason, one of the few Linux-only system vendors.  Then, while watching a 30-minute video in full-screen, the machine suddenly powered down, due to overtemperature.  This had only happened before when I inadvertently blocked the air intakes.  Not so, this time.  It was blowing very hot air.

A bit of research into system temperature monitors led me to the psensor package, along with the coretemp kernel module and supporting software.  Yow!  The graph showed the primary culprit to be the Nvidia card, which spiked to between 80 and 90 °C whenever a new graphics window was opened.  More research showed the most likely cause to be Unity.  Reverting to Gnome helped reduce the overall temperature, but the spikes are still there.  One of the issues here is the conflict between Ubuntu and the Nvidia native drivers.  We did have the Nvidia drivers installed until the 12.04 upgrade, but, since they don’t support the 3-D extensions in Unity, we left the Ubuntu drivers in place.
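
For those who would rather peek at the raw numbers than watch a graph, here is a minimal sketch of roughly the kind of data a monitor like psensor draws on.  It is not psensor's own code.  It assumes a kernel that exposes sensors through the standard hwmon sysfs tree (/sys/class/hwmon), which is where the coretemp module publishes its readings, in thousandths of a degree Celsius.  Whether the GPU shows up there depends on the graphics driver, so don't be surprised if only the CPU cores are listed.  The function names and the 80-degree warning limit are our own choices for illustration, the latter borrowed from the spikes described above.

```python
#!/usr/bin/env python3
"""List temperatures exposed through the Linux hwmon sysfs interface."""
from pathlib import Path

HWMON_ROOT = Path("/sys/class/hwmon")  # standard sysfs location on Linux
WARN_LIMIT_C = 80.0                    # illustrative limit, not a vendor spec


def read_temperatures(root=HWMON_ROOT):
    """Return a list of (chip, label, degrees_celsius) tuples."""
    readings = []
    for chip_dir in sorted(Path(root).glob("hwmon*")):
        name_file = chip_dir / "name"
        # Fall back to the directory name if the chip does not report one.
        chip = name_file.read_text().strip() if name_file.exists() else chip_dir.name
        for temp_file in sorted(chip_dir.glob("temp*_input")):
            try:
                millidegrees = int(temp_file.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor unreadable right now; skip it
            label_file = temp_file.with_name(
                temp_file.name.replace("_input", "_label"))
            label = (label_file.read_text().strip()
                     if label_file.exists() else temp_file.name)
            readings.append((chip, label, millidegrees / 1000.0))
    return readings


def hot_sensors(readings, limit_c=WARN_LIMIT_C):
    """Return only the readings at or above limit_c."""
    return [r for r in readings if r[2] >= limit_c]


if __name__ == "__main__":
    for chip, label, temp in read_temperatures():
        flag = "  <-- hot" if temp >= WARN_LIMIT_C else ""
        print(f"{chip:12s} {label:16s} {temp:5.1f} C{flag}")
```

Run from a terminal while opening a few graphics windows, it gives a quick before-and-after check without installing anything beyond Python.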

All of this brings to the fore the fact that, despite the attempt to make computers (all of them, including Apple, Microsoft, and Linux) look and feel like a big smart phone, the power desktop is not user-friendly.  The sealed-panel model employed by Apple and Microsoft means you have to live with what came out of the box, but Linux, with the open-source model, means that you can (and, by inference, must) tinker under the hood.  The upside is that, with diligence and perseverance, you can fix it and decorate it to suit your own tastes and workstyle.