Category Archives: Linux

After the Storm: Powering up the Virtual Data Center

There is something to be said for colocation and cloud services, where organizations keep their data and maybe even physical servers in remote facilities that share backup generators, redundant air handlers, multiple network paths, and distributed systems with fail-over redundancy.  But, ultimately, unless you are a true road warrior and connect  wirelessly from coffee shops and airline waiting areas to your exclusively cloud-based or colocated resources, you will need to manage your own network in the face of long-term power outages.

Here at Chaos Central, we have the usual UPS units lurking under the desks, which we largely ignore until the power goes out and they start beeping.  The home-office/small-office versions of these units, which more or less promise Uninterruptible Power to your computers, are best for the momentary power glitches that plague any power grid during uncertain weather, or simply human error at the control center: the lights blink, the power supplies beep, and computing goes on.  During winter ice storms and summer heat waves, when everything goes dark for minutes or hours, one of two things happens:

Ideally, you have a flashlight nearby so you can see the keyboards of the servers and workstations well enough to save your current work and run through shutdown cycles (for those machines that don’t have software tied to the UPS that does this automatically) before the batteries run down.  If you have done your system design correctly, you have purchased enough capacity to run the systems for five to fifteen minutes on battery, long enough to shut them down gracefully.

On the other hand, if you haven’t provisioned properly, or if you haven’t paid attention to the age of that black or beige box under your desk, the systems will shut down ungracefully, sometimes in mid-keystroke.  Batteries need to be replaced every 3-5 years; the units themselves continue to improve, and the electronics have also been known to fail, so replacing the entire unit is sometimes easier and not much more expensive.  A good rule of thumb is to get a new UPS when you buy a new computer.  Laptops have a built-in UPS, the battery, so all you need is a surge protector.

In either case–orderly or disorderly shutdown–in an extended outage, the UPS units need to be turned off and everything gets quiet.  When the power is restored, it all needs to be turned back on manually.  (Tip: if you work from a home office and travel often during “the season,” it is a good idea to have a “network sitter,” someone you trust who can go to your house and turn the critical systems back on after an outage, in case you need remote access to your network.  We’ve been left “out in the cold” several times over the years, and, yes, have had a network sitter from time to time; it has paid off.)

At Chaos Central, some systems come on with the line power, but some need to be started manually.  Our main server is a Citrix XenServer host, running a variety of systems, some used for network services and some for experiments and development projects, so we leave those to be started manually from the XenServer console.  But the NFS shares and system-image shares need to be connected from the XenCenter GUI, which only runs on Windows.  We keep a Microsoft Windows XP image (converted to a VM from an old PC that now runs Linux) in the virtual machine stack for that, but, in order to use it, we have to attach to it from a Linux system running XVP.  Finally, after all the network shares, the DNS server, and the DHCP server are started, we can boot up the rest of the VMs and physical machines.

The next step in the process is to prime the backup system.  Here at Chaos Central, we use rsnapshot to do backups, with an SSH agent to let the backup server access the clients.  The agent needs to be started and passphrases entered into it.  The agent environment is kept in a file, which the cron jobs that run the backup process read.

ssh-agent > my_agent_env; source my_agent_env; ssh-add

starts the agent and puts the socket ID in a file, then sets the environment from the file and adds credentials to the agent.  Each backup client has previously had the backup server’s public key installed in its .ssh/authorized_keys file.  And, of course, since we use inexpensive USB drives instead of expensive tape drives, we need to mount the drives manually: those things never seem to come up fast enough to be mounted from /etc/fstab.  There is a setting in /etc/rsnapshot.conf (no_create_root) to prevent writing to the root drive in case you forget this step…
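The manual-mount step is easy to forget, so a wrapper along these lines can refuse to run the backup when the USB disk isn’t actually mounted.  This is just a sketch–the path and function name are made up, and the echo stands in for the real rsnapshot invocation:

```shell
#!/bin/sh
# Guard: only run the backup if the snapshot root is a real mount point.
# /mnt/backup is a placeholder path for this sketch.

backup_guard() {
    if mountpoint -q "$1"; then
        echo "mounted"       # safe: run "rsnapshot daily" (or similar) here
    else
        echo "not mounted"   # bail out instead of writing to the root drive
    fi
}

backup_guard /mnt/backup
```

Called from cron just ahead of the rsnapshot line, it turns a forgotten mount into a no-op instead of a full root partition.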

We also permit remote logins to our bastion server so we can access our files while on travel.  After the system has been off-line for a while, we can’t count on getting the same IP address for our network from our provider, so we have a cron job in the system that queries the router periodically for the current IP address, then posts changes to a file on our external web server shell account.  This also requires setting up an SSH agent.
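That watcher boils down to “fetch the address, compare it with the last one we saw, and publish if it changed.”  Here is a sketch of the logic; get_current_ip is a stub (the real job scrapes the router’s status page), and the scp destination in the comment is hypothetical:

```shell
#!/bin/sh
# Sketch of the dynamic-IP watcher run from cron.  The fetch is stubbed
# so the sketch is self-contained; a real version would query the router
# (or a "what is my IP" web service) and scp the result to the web host.

CACHE=${CACHE:-/tmp/last_known_ip}

get_current_ip() {
    echo "203.0.113.17"   # placeholder address (TEST-NET range)
}

check_ip() {
    new=$(get_current_ip)
    old=$(cat "$CACHE" 2>/dev/null)
    if [ "$new" = "$old" ]; then
        echo "unchanged"
    else
        echo "$new" > "$CACHE"
        echo "changed"    # real job: scp "$CACHE" user@webhost:current_ip.txt
    fi
}
```

A crontab entry on the order of `*/15 * * * * /usr/local/bin/check_ip` keeps the published address reasonably fresh without hammering the router.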

Now we are all up and running, usually just in time for the next power outage: here in the Pacific Northwest, winter ice storms usually result in several days of rolling blackouts.  Yesterday was up and down; today has been stable, except for a few blinks that don’t take down the network, so the backups are running and all the services are up.  Our son and grandsons have arrived with a pile of dead computers, cell phones, and rechargeable lighting, having endured two days of continuous blackout in their larger city.  The network is abuzz as everyone jacks in to catch up on work, news, mail, etc.

It’s a lot of work to run a multi-platform network in a home office, but worth it, when you consider that we didn’t have to scrape snow and ice off the car, shovel the driveway, or brave icy streets and snarled traffic to get to “the office.”

The benefits of telecommuting outweigh the chore of keeping up the network.

Teaching New Dogs Old Tricks: Ubuntu for Unix hacks

Here at Chaos Central, we’ve been wildly excited about our new crop of Ubuntu 11.10 Linux computers from Zareason, the arrival of which was discussed here a couple of weeks ago.  But, thrilled as we are with the speed and capacity of these boxes, there is a learning curve–for the box, not the user.  Unix, and, by extension, Linux, has been evolving for more than 40 years now, has accumulated a huge library of Useful Things, and has a history of competing distributions, each with its own “flavor” and set of favorite tools.

In the beginning, Unix was a toolbox for scientists and engineers to build things quickly and cheaply: initially, document processors and utility software, and, later, the Internet itself.  The growing popularity of Linux, and particularly the Ubuntu distribution, has driven the need to make it useful for the things “ordinary” people–i.e., those who don’t make a living from tweaking the innards of Big Iron computing–like to do, like surf the net and manage their music, video, and image collections.  And, since Linux is still the answer to what to do with your old PC when it gets too bloated with malware and spyware, the popular distributions still need to fit on a CD.

Most industrial-strength distros now come on DVDs, sometimes more than one, containing the entire Linux collection of free software.  But, for the masses, armed with CD-only PCs, something has to give, and what often gives are the venerable, legacy Unix features that most home users will never need.  Yet many of those are what we’ve been lugging around on our hard drives for 20 years or more: venerable editors like emacs and vi, superseded by simple menu-driven editors or integrated graphical development environments, but still more powerful, with more features than anyone will ever use.  The ones I’ve learned are extremely useful and well-practiced enough to be second nature, so I keep using them, even if they don’t come with the system anymore.  And, more recently, there are enterprise-level tools for building massive compute clusters, like GridEngine and MPICH2, along with software development libraries and specialized utility libraries for science and engineering.  We also need a lot more development tools than come with a standard desktop, since we develop software for the web and for high-performance computing clusters.

Fortunately, the Ubuntu software repositories have a lot of those tools packaged up and loadable from the Software Center application, so we don’t need to go through the ritual of downloading, unpacking, configuring, compiling, and installing nested sets of dependent programs like in the old days.  But our old computers have accumulated a unique set of software over the four or five years of their busy lives, so the new ones have a lot to learn: we first had to load the no-longer-included Synaptic Package Manager to grab some of the software libraries and utilities not available in the Software Center catalog.  And, of course, get rid of that silly Unity desktop, which only works well for folks who do one thing at a time with their computers.  We have to have lots of toolbars visible and lots of workspaces we can jump to with a single click, which Gnome gives us.

Not surprisingly, some of the more esoteric and least-used packaged software still has a few surprise unresolved dependency issues.  To my delight, GridEngine, a distributed job control system for compute clusters created by Sun Microsystems a dozen or more years ago, was available in the Software Center.  Since Oracle bought Sun a couple of years ago, a lot of these tools have disappeared from the free download list at Oracle, folded back into the supported product lines, and the old packages are sometimes hard to find.

GridEngine is one of those transitional systems that, unlike the new applications designed to run under the Gnome or KDE desktop management systems, was born and developed during the days of OpenWindows (contemporary with Microsoft Windows 2) and the Common Desktop Environment (CDE, which predates Windows 95), both high-end X Window System desktop managers in their day.  X11 programs used to be a lot harder to write, and were designed for networking more than for having the client and server on the same workstation, so they tended to leverage the Unix philosophy–lots of little programs, each doing one thing well, working together–much more than the more integrated and abstracted desktop applications of today.

GridEngine is more likely to be installed on an Ubuntu machine as a client or, at most, an execution node in an ad hoc cluster, rather than as a master host, so it would not usually run the graphical grid manager, qmon.  But, Open Source being what it is, the whole package is available.  However, the “just works” philosophy of Ubuntu breaks down here: the dependencies of the archaic and arcane OpenWindows flavor of the graphical component aren’t checked very thoroughly, and there is a bit of a problem.  The application depends on the X11 font server, a client-server application designed to facilitate running X11 clients on one machine and displaying them on a different X11 server that might not have all of the requisite fonts loaded.  Also, because CDE relied heavily on licensed Adobe fonts, the chain of dependency gets broken when it comes to fitting old non-GPL’d software into a Linux distribution.

When you get a GNU/Linux system distribution, everything in it is licensed under the GNU General Public License.  You can install anything you want in addition to that, but you can’t package the extras and redistribute them.  This also extends to the packaging system.  The Ubuntu Software Center has provisions for adding non-free (i.e., non-GPL) software repositories, but they aren’t going to interact with each other, so complex packages like GridEngine, which depend on non-free components, come with “some assembly required.”  In my case, someone had already solved the problem for Ubuntu, so a Google search turned up a list of the missing packages and how to install them.  I’ve used GridEngine for years, but on Solaris and Red Hat Linux systems.  Solaris, of course, was licensed from Sun (now Oracle) and had full support.  The old Sun GridEngine for Linux packages came with the non-free fonts and dependent packages integrated, because you got them from Sun–they weren’t on any of the five or six CDs (now two DVDs) that comprise the full Red Hat Enterprise Linux (or, for many of us who don’t need any hand-holding support from Red Hat, CentOS).
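For what it’s worth, the “adding non-free repositories” step usually amounts to enabling the extra components in the apt sources, something like the lines below (the mirror URL is the stock one and oneiric is the 11.10 release name–adjust both for your own system):

```
# /etc/apt/sources.list — example lines enabling the non-free components
deb http://archive.ubuntu.com/ubuntu oneiric main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu oneiric-updates main restricted universe multiverse
```

followed by a `sudo apt-get update` to pull in the new package lists.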

So, at Chaos Central, the new dogs are gradually getting housebroken and have largely quit chewing on the furniture–that is, they have learned enough that they don’t respond, like HAL 9000 from “2001: A Space Odyssey,” with “I’m afraid I can’t do that…” when asked to perform tasks the other computers have been doing for years.  They’ve even learned, with larger memory and faster processors, to do new things.

Upgrade Challenges: Avoiding the “Microsoft Tax” and Buying American

The signs that the Great Recession is receding can be found in the return of a tradition taken from the pages of Christian scripture, the Christmas Shopping Frenzy.  We’ve never understood how the one-paragraph note in the Gospel of Matthew–describing the arrival of the Magi bringing gifts to the infant Jesus–has created, two thousand years down the vortex of time, a world-wide phenomenon, a seasonal gifting orgy of conspicuous consumption that transcends and even obscures the religious symbology of the gifts.  Not to mention that the original story ended badly: the local political power (Herod) subsequently engaged in a horrific massacre of male infants in an attempt to eliminate a perceived threat of competition from the giftee, while the gift-givers fled to avoid interrogation and the target of the pogrom was whisked away to a foreign country.

Here at Chaos Central, we observe the Christmas tradition in a more subdued manner: a small celebration with close family who are practicing Christians, with small gifts, some hand-made, and exchanging end-of-the-year greetings with friends.  And, since, by coincidence, the holiday does happen at the end of the calendar (and tax) year, we do evaluate our balance sheet and make a few last-minute gifts to charities, as well as gifting ourselves with any tax deductible business purchases that were on the near-term planning cycle.  A truly secular side to the “shopping season.”

In this topsy-turvy economy, where millions are unemployed but there is no shortage of merchandise made “off-shore,” and the number two and three brands sell well only because the number one brand sold out during the enforced shopping frenzy, it makes sense to consider making small changes to how we buy things, to reverse the economic trends that have brought us to the brink.  Competition is healthy: the existence of near-monopolies stifles innovation, despite the fact that innovation may have created the monopoly in the first place.  That’s one reason we put off buying computers: There are simply too few alternate choices.  It is much too late to “Buy American,” because very little manufacturing is done in this country anymore.  Almost all computers, whether running Microsoft Windows or Apple OS/X, are manufactured offshore.  And, those are the only choices of operating systems in the big-box stores. But, we can at least buy things assembled in America, and with a choice of open operating systems, if we look for them.

In our case, it is time to upgrade our aging stable of computers, reassigning an 8-year-old workstation now suitable only as a graphics terminal for virtual machines (and not very good at that) and a 4-year-old laptop with limited memory and disk.  We need a laptop suitable for hosting multiple virtual machines and a full multimedia desktop to adequately support our business projects.  These are overdue, postponed while we weathered the deepest and most personal effects of the Recession.  Things fall apart.  Not literally, but the thread of progress gets unraveled when there is no growth due to lack of capital.  [Lest we get confused, capital is not profit–what is wrong with much of America is the failure to invest in capital, in an attempt to keep up the appearance of profit during downturns.]  Computers last upwards of ten years if properly maintained (we have one, a Sun workstation, still running after ten years, and 15- and 20-year-old “retired” computers that will still boot up), but they are not cost-effective after four or five years, as they simply cannot do the work to remain competitive against newer systems, and are often not able to run newer software efficiently, if at all.

For some years now, adding a new computer running Unix or Linux to the network at Chaos Central has usually involved buying an off-the-shelf machine and stripping off the unwanted default operating software–i.e., the currently shipping version of Microsoft Windows–or ordering piece parts and building a bare machine from scratch, which is possible for desktop machines but difficult in the case of laptops.  With either strategy, the machine is not ready for use until a suitable replacement environment has been installed.  Server-class machines can be (and have been) ordered with no installed operating environment, but the choice of portable systems and compatible desktop workstations has been limited to systems manufactured by Apple, running the OS/X operating system, or a wide variety of other Intel- and AMD-based machines–all running Microsoft.  While OS/X is a variant of Unix (Darwin, a BSD derivative built on the Mach kernel), GNU/Linux, Oracle Solaris, and FreeBSD are more commonly compatible with the server systems that we administer and program for clients, so those are what we want on our desks and in our luggage.

Fortunately, there is a large enough market, 20 years into the GNU/Linux revolution, that a number of enterprises have sprung up to build and sell systems that run Linux “out of the box.”  A few major manufacturers, like Dell, did offer Linux choices at one time, but for various reasons–too small a segment of the commodity desktop/laptop business at the time to diversify software choices, and/or problems with Microsoft OEM licensing agreements that applied to product lines rather than individual machines–they dropped the offerings, except in the much smaller and more customized server product lines, where they only sell licenses and media: installation and configuration is left up to the buyer.

Small-footprint desktop, preloaded with Ubuntu

However, near west coast port cities, like Seattle, San Francisco, and Long Beach, the ready availability of computer piece parts in economical small lots from tier 1 importers makes it possible for small businesses to build custom non-Microsoft computer systems at nearly-competitive prices.  As it turns out, the market for Linux workstations overlaps with the market for high-end video game machines–with powerful graphics, multi-core processors, and lots of memory–so there is a plentiful supply of components, most of which aren’t found in commodity desktop machines anyway, so the price difference is well within reason.

We like a bargain as well as anyone else, but, as a small home-based business ourselves, we prefer paying a little more, knowing that that extra is providing a living wage to fellow entrepreneurs and folks who love what they do, not boosting the portfolio of an executive as a bonus for outsourcing the entire product and support pipeline to southeast Asia.  We bought our new machines, a high-end laptop and a workstation powerful enough to serve multimedia applications, from Zareason, a small company whose owners we met at Linuxfest Northwest last spring, where we got to check out their offerings.

Buying locally-assembled products isn’t bringing back “Made in America” factories, but it’s a start toward turning a nation of consumers into a nation of producers who take pride in what they make with their own hands and minds.  We’ve written here a lot, recently, about our adventures on our “Made in Oregon” tandem bicycle, and we’re now configuring the next generation of Linux computers, “Made in California.”

Broadcom Wireless on Linux, redux

A couple of months ago, we posted an ongoing saga of getting the Broadcom wireless to work–again–after updating to Ubuntu 9.10 on our Compaq C714NR laptop. It was one of those trial-and-error issues we’ve gone through since we got the machine back in ’07 and first loaded Ubuntu 7.10. But, with 9.10, not only did we get the wireless to work again, but the whole process of connecting with hot spots was simplified. What used to be a grueling test of endurance and exercise in command-line prestidigitation was suddenly as simple as using an Apple, with the new Network Manager applet installed.

Well, all good things must come to an end. Last night, I started Update Manager, which included a kernel upgrade. This morning, the reboot came up with the little antenna icon on the laptop red, and the Network Manager icon showing no signal, wired connection only. Yow!

So, dust off the memory cells and google up the Ubuntu forums. Um, need to reload the bcmwl-kernel-source package. Nope, that causes the machine to freeze on boot. Boot to rescue mode, remove the package (dpkg -r, at a root prompt), then reboot and regroup. Ah, last time, we got the source package directly from Broadcom, compiled it, and installed it. As usual with any open source product, we ignore the package we already have on the machine and download a fresh one from Broadcom. Sure enough, it had been updated shortly after we downloaded it last time. Following the README.txt file, we are soon rewarded with the spinning icon, a “connected” splash, and the welcome antenna-with-four-bars. Back online, then a few tweaks to make sure it boots with wireless enabled (copying the driver to the current kernel’s driver directory).

Lessons learned, or, in my case, relearned, since I’ve known this practically forever (Linux user since 1996, Unix user since 1989):

  1. If you have compiled drivers not included in the distribution, you must recompile and reinstall them each and every time you update the kernel.
  2. When you install any open source package, always check for updates, especially if you have updated your system since you last downloaded it.
  3. Do the above before you reboot your machine after a kernel update, else you may be scrambling for a rescue disk when the machine doesn’t come back up.
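Lessons 1 and 3 can be semi-automated: before rebooting into a new kernel, check whether your hand-built module already exists under that kernel’s directory in /lib/modules. A sketch along these lines (the modules tree is a parameter so the example is self-contained; real use would pass /lib/modules, the new kernel version, and the driver file, e.g. wl.ko):

```shell
#!/bin/sh
# Does a given module file exist anywhere under <tree>/<kernel-version>?
# Usage: module_present /lib/modules 3.0.0-17-generic wl.ko
module_present() {
    tree=$1; kver=$2; mod=$3
    if find "$tree/$kver" -name "$mod" 2>/dev/null | grep -q .; then
        echo "present"
    else
        echo "missing"    # rebuild and install the driver before rebooting
    fi
}
```

A “missing” answer before the reboot is a lot cheaper than a rescue disk after it.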

Meanwhile, we’re waiting a while before grabbing the new 10.04 Ubuntu upgrade–patch downloads were very slow on Release Day. I hear it is worth the wait. Linux is almost ready for your grandmother’s desktop. Grandma already has hers at Chaos Central, and has since Red Hat Linux 7 (she’s running Ubuntu 9.04 x86_64 now), but only because she has a full-time system administrator (Grandpa, aka The Unix Curmudgeon).

Ciao, and happy computing with Linux…

Back to Business — sort of

Arrived back in Washington in time to take care of serially sick grandkids, in between looking at mysterious freeze-ups in a client’s HPC cluster (nothing new–it has been an issue through several OS upgrades, an elusive will-o’-the-wisp that has existed since $CLIENT == $WORK -> TRUE) and exploring new Linux tools.  New to me, anyway.  Fired up GKrellM, the Linux performance monitor.  It looks much like the old perfmeter tool that has been in Solaris since the OpenWindows days.

Beginning to settle in and get comfortable with Ubuntu 9.10, which has lots of subtle improvements over 8.10, which we’ve used since late 2008. Judy’s workstation is still at 9.04, as I haven’t had time to work out some issues that need tweaking on the upgrade. Hers is 64-bit, so there are some other issues there, too. We recently upgraded the HP-Compaq C714NR laptop to 2GB of RAM, which really makes a difference in performance, but we’re not quite ready to brave the 64-bit transition, given all the wireless issues we’ve had over the years. The Gnome Network Manager is nearly flawless, and gets us on wireless networks painlessly, at least since we resolved the cantankerous Broadcom 4311 driver problems.  Best of all, if it detects a strong network you have already configured, it simply and automatically connects. We still have a bit of a kludge in the wireless driver arena: we first let the b43 driver load, get the usual “you must update your firmware” message, then run a startup script to unload b43 and load the Broadcom driver, after which all is well. Hmm, time to go look at the firmware issue, as long as we don’t have a road trip planned for almost a month.  I did install b43-fwcutter and fiddle with this earlier, but without much success.  Sometimes us old hardware hackers just need a system that works, so we can get on with the revenue-producing work that doesn’t involve endless tweaking of drivers and firmware.