Yesterday, we were hit with a thermal shutdown on the big laptop. Installing psensor and the coretemp module helped get a handle on the issue, which centered on the Nvidia GeForce 540M GPU. Hardware drivers have always been an issue for Linux, since the Open Source software model conflicts with the desire of peripheral vendors to keep the internals of their hardware secret, which they do by not releasing the source code for the software that links the hardware with the operating system. That’s fine for a closed, proprietary system like Microsoft Windows, the primary market. As Linux users, not in the business of redistributing systems, we would be happy with an add-on driver that works. But since Linux is a small portion of the market, there is little incentive for hardware vendors to write Linux-specific driver software. And the software that is available is not always optimized for the Linux kernel, with the result that it is either buggy or skimps on reliability features.
Nvidia has, in the past year, incurred the ire of Linux founder Linus Torvalds over just these issues. The Ubuntu Linux distribution that we run on our systems comes with a more or less generic Nvidia driver. While users can download an updated driver from Nvidia, installation is a bit daunting, requiring the system to be reconfigured for text-mode login in order to rebuild the X11 graphics links. Those of us who have been around long enough to remember hand-tuning the X Window System configuration files, fingers poised over the “kill” keyboard sequence, ready to shut down X11 in an instant to avoid burning out the monitor if the settings were wrong, have grave misgivings about tinkering with the graphics. Plus, the forums on the ‘Net seemed to show that users were having overheating problems no matter what combination of driver and distribution versions they used. Custom configurations are less than desirable for production machines, so we elected to look for another solution.
A bit more searching on the ‘Net turned up the fwts (Firmware Test Suite) utility package. Once installed and run, it pointed out compatibility issues between the computer BIOS and the kernel/driver configuration. One of the automatic corrective actions performed by fwts was to switch the operating mode from “performance” to “normal,” which immediately lowered the operating temperature of all the components. The GPU still shows a temperature increase under load, but the fan hardly runs anymore, whereas earlier in the week it was running on high most of the time.
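For anyone who wants to log temperatures before and after a change like this without a GUI, here is a minimal sketch in Python. It assumes the kernel exposes sensors under /sys/class/hwmon, with temp*_input files holding millidegrees Celsius, which is how modules such as coretemp normally report. The function name read_hwmon_temps is our own invention, and a GPU running the proprietary Nvidia driver may not show up in this tree at all, so treat it as an illustration rather than a replacement for psensor.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return a list of (chip, label, degrees_celsius) for readable sensors.

    Assumes the usual hwmon sysfs layout: each hwmonN directory has a
    'name' file and temp*_input files holding millidegrees Celsius.
    """
    readings = []
    for chip_dir in sorted(Path(root).glob("hwmon*")):
        name_file = chip_dir / "name"
        # Fall back to the directory name if the chip does not report one.
        chip = name_file.read_text().strip() if name_file.is_file() else chip_dir.name
        for input_file in sorted(chip_dir.glob("temp*_input")):
            prefix = input_file.name[: -len("_input")]  # e.g. "temp1"
            label_file = chip_dir / (prefix + "_label")
            # Use the human-readable label (e.g. "Core 0") when present.
            label = label_file.read_text().strip() if label_file.is_file() else prefix
            try:
                millidegrees = int(input_file.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor unreadable or reporting garbage; skip it
            readings.append((chip, label, millidegrees / 1000.0))
    return readings


if __name__ == "__main__":
    for chip, label, celsius in read_hwmon_temps():
        print(f"{chip:12s} {label:16s} {celsius:5.1f} C")
```

Run from a terminal or a cron job, this prints one line per sensor, which is enough to compare readings at idle and under load.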
The take-away message here is that updating kernels and/or drivers can, and sometimes will, result in conflicts with your hardware. Linux has come a long way toward a plug-and-play installation that runs “out of the box,” but it still pays to test and evaluate hardware configurations, just like the old days of Unix. Actually, in the “bad old days” of a few commercial Unix systems, the range of hardware combinations was often very limited, so compatibility issues had been carefully tuned out by the system vendor. But those systems were expensive. In the Open Source world of Linux, where the system is expected to run on any combination of hardware on the commodity PC market, some outliers are to be expected. For the average desktop Linux user, converting an old Windows machine to Linux will work just fine. But for “power users” and server applications, some engineering and testing may be required for optimum performance. Certainly, the psensor and fwts software will be an important part of our Linux toolkit from now on.