I’m writing the draft of this post on a word processor (LibreOffice, naturally), for a good reason. We’ve had the same web hosting provider for 17 years, ever since we moved from Missoula to Hamilton in Montana and lost access to ISDN services. For two years before that, we had hosted our own web sites and email from a couple of servers hung from the floor joists, in the basement.
When we needed to find a new home for our Internet presence, Modwest, a new and growing Missoula company, stepped in. The plans they provided were ideal: running on Linux servers, with SSH (Secure Shell) login access and the ability to park multiple domains on one account (at that time, we had two already: parkins.org and info-engineering-svc.com). Everything worked fine, and we added and deleted domains over the years, for ourselves (realizations-mt.com and judyparkins.com) and, temporarily, for clients (onewomandesigns.com). It just worked, and we also added WordPress engines to our personal/business domains. My programming sorted out which domain got served which landing page, and the links from there went to subdomains to keep the filesets separate.
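As an illustration only, here is a minimal Python sketch of how one shared account can hand a different landing page to each domain name. It is not the programming described above, which is not shown here; the file paths and the function name are made up for the example.

```python
# Hypothetical mapping of domain names to landing-page files (paths are invented).
LANDING_PAGES = {
    "parkins.org": "parkins/index.html",
    "info-engineering-svc.com": "ies/index.html",
    "judyparkins.com": "judy/index.html",
}
DEFAULT_PAGE = "parkins/index.html"  # assumed fallback for unknown names


def landing_page_for(host_header):
    """Return the landing page for the domain named in an HTTP Host header."""
    # Normalise the name: drop any port, lower-case it, strip a leading "www.".
    host = host_header.split(":")[0].strip().lower()
    if host.startswith("www."):
        host = host[4:]
    return LANDING_PAGES.get(host, DEFAULT_PAGE)


print(landing_page_for("www.judyparkins.com"))
```

The point of the sketch is that the domains are only different names for one file set, which is why migrating them one at a time could not work.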
We finally retired the old quilting web site, realizations-mt.com, when the registration expired in early 2018, but rolled the legacy pages into a subdomain of judyparkins.com to keep her client galleries on-line. Then, this spring, Modwest announced they had sold out to Intertune, another web hosting provider headquartered in San Diego. Billing transferred to the new host, with the same pricing. Fine. But, eventually, they told us, the websites and mailboxes would be transferred to the Intertune servers. The Internet thrives on economies of scale—the bigger the organization, the fewer resources are needed for failover, backup, and support.
So, at an inconvenient time (we planned to be out-of-town for a week), they informed us that our parkins.org account would be migrated, so we dutifully switched the collective domains to the new servers, upon which our blogs disappeared, and the other two domains disappeared entirely, along with mail service. Frantic exchanges by phone and email ensued:
Them: “Oh, we’re only migrating parkins.org at this time.”
Us: “But, they share a file set, database, and mailboxes, and have subdomains. It’s one account. And you didn’t migrate the subdomain content at all.”
Them: “Oh, gee, we’ve never seen anything like this. (ed. Note: almost all web hosting services support this.) Switch the others back to Modwest. We’ll get back to you.”
Us: “Unsatisfactory—they are all one and the same, just different names in DNS.”
Us: “Hello? Is anybody there?”
Us: “Our blogs still don’t work, and our mail is scattered across several mail servers.”
Them: “OK, we’ll do what you said, for judyparkins.com and the subdomains.”
Us: “You didn’t. The subdomains sort of work, but the WordPress installation doesn’t, because the three domains are intertwined.”
Them: “OK, try the judyparkins.com now.”
Us: “The blog works, sort of, for Judy’s, but mine doesn’t, and judyparkins.com isn’t receiving mail.”
Them: “Oh, wrong mail server. Try it now.”
Us: “OK, now do the same thing for info-engineering-svc.com”
Us: “Hello? Is anybody there? It looks like your servers are pointed the right way, but the email and blogs still don’t work.”
Them: “Oh, wrong mail server. Try it now.”
Us: “OK, the mail works now, but the Larye blog is totally broken: I can’t see blogs.info-engineering-svc.com at all, and the admin page on blogs.parkins.org is broken yet. I can’t publish anything or respond to comments, nada.”
Us: “Hello? Is anybody there?”
Us: “Hello? … Hello?”
Now, we could have decided early in this process to not trust them to migrate this successfully and moved everything to a different hosting service, but that would involve a setup fee and transferring all of our files ourselves—which, for the web, isn’t a big deal, but migrating 20,000 emails sitting in hundreds of folders on an IMAP server is—and working through an unfamiliar web control panel, so we didn’t. We should have, as the same level of service, with more space, is actually cheaper on the web hosting service we had before we moved to Montana in 1999.
But, meanwhile, we’re busy, so a few days passed, and they still hadn’t replied to my latest comm check. Rather than risk exposing my latent Tourette Syndrome with an expletive-laden outburst via email, I rechecked their servers to see if the info-engineering-svc.com site was working again. It was, so I switched the domain pointers in ICANN, and waited. The blog still didn’t work, and I sent another, unacknowledged message to say so. But, finally, the info-engineering-svc.com blog started working [no response from Intertune, but it’s not magic, so they did something], at least in the display mode. The administrative pages still did not work.
In between guests and visiting relatives, I decided to troubleshoot the WordPress installation, as something was definitely amiss, now that everything was on the same server. But, changes to the configuration files seemed to have no effect, and a check of the error logs showed the same errors, which didn’t reflect the changes I had made. A comparison of the installed file set with my backup I had taken before the migration showed no differences: The light begins to dawn, that my blog installation doesn’t match what Intertune is actually serving.
Sure enough, there is a ~/blogs folder, which I had instructed Intertune to configure as the blogs subdomain, and a ~/www/blogs folder, a different version of the wp-admin file set. After a quick check to make sure that the dozen or two different files in the wp-admin folder were the only differences in the several thousand files in the blog, I copied the version from my backup into the “live” folder, and voilà! the admin dashboard appeared, and here we are, composing the rest of the story on-line.
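The “quick check” of one file tree against another can be sketched in a few lines of Python using the standard filecmp module. This is an illustration under assumptions, not a record of the commands actually used: the function name is invented, and note that filecmp's directory comparison is “shallow,” trusting matching size and timestamp before it bothers to read file contents.

```python
import filecmp
import os


def differing_files(backup_dir, live_dir):
    """Return sorted relative paths that differ, or exist in only one tree."""
    found = []

    def walk(cmp, prefix):
        # diff_files: present in both trees but judged different.
        # left_only / right_only: present in just one of the two trees.
        for name in cmp.diff_files + cmp.left_only + cmp.right_only:
            found.append(os.path.join(prefix, name))
        # Recurse into the subdirectories that both trees have in common.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(backup_dir, live_dir), "")
    return sorted(found)
```

Called with the backup folder and the folder the host is actually serving, a result confined to wp-admin would point to the same conclusion reached above.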
As it turns out, Intertune did not follow my instructions, but did something different, and did not tell me (not for the first time, either). Somewhere in the middle of the migration, my blog installation got updated, and Judy’s did not, when the blogs were split between Modwest and Intertune, so that the update was only partial, breaking the installation when Intertune took it upon themselves to migrate files I had already manually installed, but to a different location.
So it goes. One clue was in the WordPress FAQ, which suggested that an HTTP 500 error might be corrected by re-installing the wp-admin directory, which turned out to be the case. Whether we stay with Intertune or not depends on whether we meet any more difficulty with a tech staff that seems incredibly inept, and how much more work we want to do to move our Internet presence to yet another new hosting service, with new rules, and setup fees. Our old webmail installation no longer works, being reconfigured by Intertune to use their web mail client instead, but we can live with that, and it’s less work for us, though with also less versatility and customization. Now, to finish tweaking all the PHP-language scripts in the webs for compatibility with PHP 5.3.29.
The old saw, “How do you get to Carnegie Hall?” Answer: “Practice, practice, practice,” is so true. All of us are impatient: we just want to “do artistic stuff” and have it turn out like the examples that inspired us in the first place. However, no matter how refined our tastes, our talents take time to develop. How long depends on how much help or critique we get along the way, plus a lot of hard work. What follows is a narrative example of and informal tutorial on making videos on a budget, with inexpensive equipment and open source software.
I’ve always wanted to learn to make video presentations. I imagined I might want to record test flights in the homebuilt airplane project that has languished, unfinished, in my cluttered and sometimes soggy workshop. Another project is documenting our bicycle travels. One obstacle was gear: quality video equipment is expensive. However, all modern digital cameras have a video mode. I started practicing several years ago, strapping my Fuji pocket camera and small tripod to the handlebars of our tandem bicycle, to document rides on the bike trails. It was pretty terrible, amplifying the bumps and roots on the trail and the clicking of the gear shift, as well as not being very well attached, with the camera flopping around from time to time.
The next year, I got a GoPro Hero 3 (Silver–the mid-range model) point-of-view sports camera, and a handlebar mount made for it, a modest investment. The GoPro web site has daily videos sent in by users, showing off amazing feats of surfing, bicycling, motorcycling, scuba diving, parachuting, wing suit plunges, and all manner of dangerous sport, seen from a camera strapped to the forehead, chest, or wrist of the participant. Some were exciting, some just plain scary, but all very professional-looking.
At first, I strapped the new camera on the handle bars of our Bike Friday Tandem Traveler “Q,” turned it on, and set off on a 35-km ride. The result was better than the first attempt with the Fuji, but still shaky, vibrating, and endless. OK, a bit of editing to show some particularly interesting parts, or at least cut out the really boring and really shaky parts. But, a lot of time sifting through gigabytes of footage. I eventually pared the hour and a half of “film” (I only recorded one way of the out-and-back ride) down to 11 minutes of not-very-exciting or informative view of lake and woods drifting by at 15km/hr bicycle speed, plus a few moments of 30km/hr downhill bouncing and shaking. The sound track was a muffled one-side conversation between me and my stoker, Judy, on the tandem, plus a lot of road noise transmitted through the frame, and the frequent clicking of the shifters and hissing of the brake shoes on muddy metal rims. A really round-about way of saying “We went for a really satisfying bike ride up the south shore of the lake, and came to nice waterfalls about once an hour. Wait for it.” Fast forward two years of trial and error…
After watching a lot of other people’s videos, and the progression in skill over the years of some of my favorites, like Dutch cycle tourists and videographers Blanche and Douwe, I have possibly picked up some hints of what makes a good video presentation. I mostly publish on Vimeo.com, which offers a set of short tutorials on making videos, but I also recently viewed some good tips by Derek Muller, a science educator who makes a living filming short YouTube videos on various topics in science (Veritasium.com), and Ira Glass, host of National Public Radio’s “This American Life,” who published a series of four short talks on storytelling on YouTube. Both agree that getting good takes practice. Lots of practice. Probably not as much as Malcolm Gladwell’s tale of 10,000 hours of solid practice (in “Outliers”), but a lot nevertheless.
The main point of Derek and Ira’s stories is: video is storytelling. As we found, it is not enough to simply record the world as it goes by on your adventure. The result has to tell a story: why you did it, where you went, what it was like, and what you learned, in a concise way that holds the interest of the viewer. I know that most of my efforts failed, because of my viewer numbers on Vimeo. Sometimes both of my loyal viewers watch a particular video, sometimes neither of them do. Obviously something needs work. Submissions to video contests garner a couple hundred views (compared with thousands for the winning entries, and millions for the viral baby, cat, and stupid human tricks videos on YouTube and Facebook), with no idea how many viewers actually watched all the way through. So, we evolved over time, failure after failure.
First, I got rid of the “native sound,” because what the camera mic picks up isn’t what I focus on or even consciously hear while riding. Instead, I find a piece of music that I think reflects the sense of motion and emotion in the ride, or one that at least fits the length of the film, or that I can cut the film to fit without making the visual too short or too long. A fast ride needs a beat reflective of the cadence; beautiful scenery or glorious weather deserves a stirring orchestration or piano number, a matter of taste. The next step is to trim the video clips to match the phrasing of the music, if possible.
I realized that, though I find looking forward to what is around the next bend exciting while on the bike, watching endless scenery flow by on the small screen isn’t particularly engaging. Most of other people’s videos I enjoy have clips (scenes) of 7-10 seconds each. Mine ran generally 20 seconds to several minutes. Boring. Furthermore, long takes don’t necessarily advance the story line, just as important to film as to the page, unless there is some interesting progression unfolding in the clip, much as a detailed sex scene in a novel is only necessary to define a key point in the development of the relationship between the characters: mostly, it is sufficient for the characters to retreat to the privacy of the bedroom behind a line of asterisks, as a transition between scenes. A video fade on a long stretch of empty road to the next bend suffices just as well. We’re not promoting “bike porn” here: no matter how much we personally enjoyed the ride. I’m beginning to appreciate the need for story-telling that doesn’t fall into the “shaggy dog” genre, i.e., drawn-out and pointless–suspense to boredom without a satisfying punch line.
Picking music is another issue. At first, I shuffled through my library of ripped CDs (no piracy here, just a convenient way to carry your music library with you, on the hard drive of your computer instead of a case full of plastic in a hot car). However, even if the audience is small (i.e., myself and others in the same room), such usage violates the copyright on commercial recordings, especially on a public post on the ‘Net. I’ve recently started re-editing some of the early videos I made this way, substituting from my new library of royalty-free music published under a creative commons license, and downloadable from several sites on the Internet, notably www.freemusicarchive.org, where musicians leave selections of their work as a calling card or audio resume, hoping for commission work or performance gigs, or to sell physical CDs in uncompressed, high-fidelity audio instead of downloading the lower-quality MP3 lossy compression version.
This is the wave of the future in a world where digital copies are easy: whether you buy a copy or get one free, play it, listen to it, use it to enhance your art, just don’t resell it whole. That’s the idea behind creative commons. Unfortunately, much of music publishing is still in the “for personal use” only, and if the pressed recording gets scratchy, buy another one, no “backup” copies allowed, and no sharing with friends: if you want them to hear a song, invite them to your house or take your iPod over to theirs: you can’t stream it or email it or share a copy on the cloud. ASCAP blanket licensing for broadcast or use in video is still on a corporate price scale, intended for production studios and well beyond the reach of a PC user who just wants to add her favorite pop tune to a video of her and her friends having fun. So it is that Kirby Erickson’s ballad of driving up US93 through the Bitterroot as background to a bike ride up US93 through the Bitterroot is gone, so viewers who aren’t familiar with his work won’t be tempted to buy the album the song came from, because they won’t ever hear of it. Restrictive licensing actually potentially reduces sales in the Internet age. By now, you can’t even upload videos if they have copyrighted music audible in them–Facebook, for one, matches audio signatures from video against a sound library and blocks them.
Although I see some improvement in quality in my amateur videos, I still have a long way to go. For one, the handlebar mount for the GoPro camera introduces too much vibration, so the picture is hard to watch, and doesn’t reflect the experience of riding. Some damping is needed. We did get some better results with the camera mounted on our trailer, but we only use that when touring. Some sort of counterweight to produce a “steady-cam” effect might work here, as the “real thing” is expensive and a bit bulky.
The story is more interesting when it shows the participants, which, for us, means using the trailer mount or some sort of “selfie stick” to put the camera to the side or front, or, as I did in one clip, turn the camera around briefly. During my convalescence from heart surgery last summer, we did a lot of hiking, where I devised a selfie-stick approach to give the impression of the viewer being with us instead of sharing our point of view. I’m a bit happier with some of those, particularly the ones where the camera boom isn’t in the view. More practice, and experimentation. I’ve been more satisfied with ones where we’re in the shot only when necessary to tell the story (an essential point, when the story was that I was OK, and getting better), and the scenery out in front when it was the story.
Now, the issue is to trim the scenes to the essential elements (who, what, where, when, why, and how). To that effect, part of the re-editing process to replace the audio tracks involves cutting the video to synchronize with the sound track phrasing, as well as reducing the length to the minimum necessary. To paraphrase E.B. White’s dictum on writing, “Omit needless frames.”
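The arithmetic behind cutting to the music is simple enough to sketch in Python. This is only an illustration, and it assumes a steady tempo with regular phrases (four bars of four beats by default), which real recordings may not have; the function names are invented for the example.

```python
def phrase_seconds(bpm, beats_per_bar=4, bars=4):
    """Length in seconds of one musical phrase at a steady tempo."""
    return bars * beats_per_bar * 60.0 / bpm


def cut_points(bpm, total_seconds, beats_per_bar=4, bars=4):
    """Times (in seconds) where a scene change would land on a phrase boundary."""
    step = phrase_seconds(bpm, beats_per_bar, bars)
    points = []
    n = 1
    # Multiply rather than accumulate, so rounding error does not build up.
    while n * step <= total_seconds:
        points.append(round(n * step, 3))
        n += 1
    return points


print(phrase_seconds(120))   # 8.0
print(cut_points(120, 30))   # [8.0, 16.0, 24.0]
```

At 120 beats per minute a four-bar phrase lasts 8 seconds, which happens to fall inside the 7-10 second scene length noted earlier.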
One of the issues with being the director, cameraman, and actor all at once is to keep the bicycle safe while planning the shot and operating the camera, as well as keeping the mission (travel) moving along. We miss some good shots that way, but it is inevitable. One popular technique is to set up the camera along the route and show the bicycle or hikers approaching or receding across or into the frame, which involves stopping and staging the shoot. This is less intrusive where there are two or more cyclists, so it is a matter of setting up the shot ahead of or behind the other rider(s), but that isn’t an option with the tandem, and we’ve used it in limited fashion by propping up the monopod/selfie-stick along the trail. We do have several sizes of tripods, but they aren’t convenient to carry when the photography is incidental to the main purpose of travel. I’ve long since taken to filming short takes “on the fly” rather than just leaving the camera on to pick up everything, which involves anticipating some scenery reveals or events, and, of course, missing some. But, editing “on the fly” to limit scenes does shorten the editing process and save battery life on the camera. We work with what we have.
Recently, we entered a video contest for a short travel documentary on the Newberry National Volcanic Monument, in central Oregon, which seemed to demand some dialogue in addition to the usual soundtrack and titles, so we experimented with voice-over to add a short narration where appropriate. This also wasn’t the best, since our microphone is the headset-attached variety, suitable for making Skype phone calls and video chats, but little else. Good quality condenser microphones for the computer and lapel microphones compatible with the GoPro are simply not in the budget, along with professional video cameras with microphone jacks or built-in directional microphones. Drones are all the rage, now, but one suitable for carrying a GoPro as a payload is stretching the budget, also, and presents safety and control issues for use in our primary video subject, i.e., bicycle touring and trail riding.
Besides finding a story in a video clip sequence, getting the story to flow smoothly, and finding an appropriate sound track to evoke the mood of the piece, the skill set also involves learning to use video editing software. Microsoft Windows comes with a decent simple video editor, but we don’t use Windows. We do have iPads, which have apps for making videos, but haven’t spent a lot of time on those, which also limit one’s ability to import material from multiple sources (the apps work best with the on-board iPad camera). There are a number of contenders in the Linux Open Source tool bag, some good, some complex. We chose Open Shot, a fairly simple but feature-full non-linear video editor, which gives us the ability to load a bunch of clips, select the parts we want, and set up multiple tracks for fades and transitions and overlays of sound and titles. We also found that the Audacity audio recording and mixing software can help clean up the sound from less-than-adequate equipment. ImageMagick and the GIMP are still our go-to tools for preparing still photos to add to the video. Open Shot uses Inkscape to edit titles and Blender for animated titles.
Video is memory and CPU-intensive, so it helps to have a fair amount of RAM and a fast multi-core CPU (or several). Our main working machine, a Zareason custom Linux laptop, has 8GB of RAM, an Nvidia GeForce Graphics Processor Unit, and a quad-core dual-thread CPU, which looks like an 8-processor array to Linux. This is barely adequate, and often slows down glacially unless I exit from a lot of other processes. The more clips and the longer the clips, the more RAM the process uses; often the total exceeds the physical memory, so swap space comes into play. I’m usually running the Google Chrome browser, too, with 40-50 tabs open, which tends to overload the machine all by itself.
This isn’t something you could do at all on a typical low-end Walmart Windows machine meant for browsing the ‘Net and watching cat videos on Facebook and YouTube, so investing in a professional-quality workstation is a must. Since we travel a lot and I like to keep our activity reporting current, that means a powerful laptop machine, running Unix, OS/X, or Linux. Fortunately, our laptop “strata” is in that class, though only in the mid-range, a concession to the budget as well as portability. We purchased the machine when we were developing software to run on the National Institutes of Health high-performance computing clusters, and it is roughly comparable to a single node in one of the handful of refrigerator-sized supercomputers in the laboratories that have several hundred CPU cores and several dedicated GPU chassis each.
In addition to Open Shot, we also sometimes use avidemux, a package that allows us to crop and resize video clips so we can shoot in HD 16:9 wide-screen format and publish in “standard” 4:3 screen format if necessary, or crop 4:3 stills and video automatically to 16:9 format to use with other HD footage. In addition to the GoPro, we now have a new FujiXP pocket camera that can shoot stills and video in 16:9 HD, and a Raspberry Pi camera unit, that is programmable (in Python), that we use for low-res timelapse and security monitoring. The programmable part means we write automated scripts that select the appropriate camera settings and frame timing and assemble a series of still photos into a timelapse movie, using the Linux ffmpeg command-line utility.
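Two of the steps above lend themselves to a short Python sketch: the arithmetic of a centered 16:9 crop, and the ffmpeg command that joins numbered stills into a movie. This is an illustration under assumptions, not the actual scripts: the function names, file names, and frame rate are invented, the crop helper only does the arithmetic (it is not how avidemux is driven), and the ffmpeg options assume a build that includes the libx264 encoder.

```python
import shlex


def crop_box_16x9(width, height):
    """Return (crop_w, crop_h, x, y) for a centered 16:9 crop of a frame."""
    crop_h = width * 9 // 16
    if crop_h <= height:
        crop_w = width                 # keep full width, trim top and bottom
    else:
        crop_h = height                # frame is wider than 16:9: trim the sides
        crop_w = height * 16 // 9
    return crop_w, crop_h, (width - crop_w) // 2, (height - crop_h) // 2


def timelapse_command(frame_pattern, output_file, fps=10):
    """Build (but do not run) an ffmpeg command that joins stills into a movie."""
    return [
        "ffmpeg",
        "-framerate", str(fps),   # stills consumed per second of video
        "-i", frame_pattern,      # numbered sequence, e.g. frame_0001.jpg, ...
        "-c:v", "libx264",        # H.264 video (assumes libx264 is available)
        "-pix_fmt", "yuv420p",    # pixel format most players accept
        output_file,
    ]


print(crop_box_16x9(1600, 1200))   # (1600, 900, 0, 150)
cmd = timelapse_command("frame_%04d.jpg", "timelapse.mp4", fps=12)
print(shlex.join(cmd))
```

To actually produce the movie, the list could be passed to subprocess.run(cmd, check=True) in the folder that holds the stills.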
So it goes–gradually, the videos we turn out get slimmer and more to the point, if not technically better quality, something we need to work on constantly with prose as well, as an intended 500-word blog post ended up a 3000-word tutorial instead.
Today is the 25th anniversary of the birth of the World Wide Web, which is known today simply as “The Web,” or, even more generally, “The Internet.” The “‘Net,” of course, is much more, encompassing email and communications other than HTML, but for most people, the Web is their main contact.
Tim Berners-Lee didn’t invent the Web out of thin air, though. During the period of proliferation of personal computers in the early 1980s, the diversity of languages, applications, and data formats made it difficult to exchange data between different systems. An early attempt to make a universally-readable data file was in the GEM (Graphical Environment Manager) system, which initially ran on top of CP/M, one of the first microcomputer operating systems. GEM was based on research at the Xerox Palo Alto Research Center (PARC).
The “killer app” that resulted in GEM-like interfaces being ported to Apple and MS-DOS was Ventura Publisher, an early desktop-publishing software system that promoted WYSIWYG (What you see is what you get) editing. The data files were plain text, “decorated” with markup language “tags” to identify elements like paragraphs, chapter and section headings, and all typographical marks that editors normally penciled in, hence “markup.” The first standard in this was SGML, or Standard Generalized Markup Language, and style manuals, most notably the one from the Chicago Press, were published to promote standardized markup notation. SGML really didn’t catch on, though, as popular word processors of the time, such as WordPerfect and Microsoft Word and desktop publishing software like Aldus Pagemaker (now owned by Adobe) retained their own binary document formats. GEM and Ventura Publisher became buried in history with the demise of CP/M and the introduction of the GEM look-alike Apple Macintosh and the decidedly inferior Microsoft Windows graphical desktops that took center stage.
Meanwhile, in the Unix world, networking and inter-networking was the focus, along with a graphical desktop environment, the X Window System. The X Window System was too cumbersome and tied to Unix to be used over the relatively slow inter-networks and with the now predominant IBM PC running MS-DOS and Microsoft Windows or the niche Apple environment. Meanwhile, Berners-Lee was experimenting with a subset and extension of SGML called HyperText Markup Language (HTML). Hypertext introduced not only markup for the appearance of rendered text, but the ability to cross-reference one section of text from another, even between documents. Hyper-threaded text concepts were known at least since the 1940s, and used to create alternate endings or story lines in printed fiction, but computers made it possible to achieve smooth navigation between paths.
At the same time, networks were becoming mature, with the introduction of the Domain Name System and the TCP/IP network protocol in an internetworking scheme that provided unique names and addresses for every computer in the network, on a global scale. Incorporation of DNS names and local paths in hypertext references made it possible to connect any document on any computer with references stored in different files on the same or any other computer in the network. To implement these connections, a new network protocol, Hypertext Transfer Protocol (HTTP), was developed, and the World Wide Web was born in a flash of inspiration.
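The way a hypertext reference combines a protocol, a DNS name, and a local path can be seen by taking one apart with Python's standard urllib.parse module. The address below is a made-up example, not one from this site.

```python
from urllib.parse import urlsplit

# A made-up hypertext reference, split into the pieces described above.
parts = urlsplit("http://www.example.org/docs/page.html#section2")

print(parts.scheme)    # http               (the protocol to speak)
print(parts.netloc)    # www.example.org    (the DNS name of the computer)
print(parts.path)      # /docs/page.html    (the local path on that computer)
print(parts.fragment)  # section2           (a place within the document)
```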
To complete the system required software that could speak HTTP and render HTML into formatted text and links on a computer screen, a server, and a client, which came to be known as a web browser. The first web server was on a NeXT (Unix) machine, and early browsers were text-only, as the Internet was still largely based on dial-up modem access over ordinary phone lines. But, with the increasing use of graphical displays, browsers also became graphics-capable, adding imaging and other visual effects to the HTML palette.
Today, 25 years later, most web servers are running some form of Unix or Unix-like operating system, mostly the open-source GNU/Linux system, though Microsoft Internet Information Services (IIS) runs many corporate internal services and public sites. Browsers are now graphical, either Microsoft Internet Explorer or based on the Mozilla project, an open-source release of the original Netscape code derived from Mosaic, the first graphical browser.
The Web itself has evolved, with the addition of scripting capabilities on both the server and the client browser to create dynamic pages created at the moment, tailored to changing data, with the ability to refresh elements on a page as well as the whole page, and the ability to stream video and audio data as well as display text and images. Indeed, HTML version 5 replaces many of the commonly-scripted constructs with simple markup tags that modern browsers know how to render. The advent of social media sites to connect multiple people in real-time as well as provide private messaging and real-time chat has largely replaced the spam-plagued email system for many people, and brought the promise of video-phone connections to reality.
The Web, in 25 years, has transformed society and our technology, nearly replacing newspapers, magazines, the telephone, and the television and video media player. Even the personal computer has been transformed: the advent of software as a service (SaaS) means that users no longer have to purchase expensive applications to install on one computer, but can “rent” the use of an application on a distant server through the web browser, and rent storage space in “the cloud” that is available on any computer, even small hand-held devices like tablets or phones. The web has also made possible the concept of wearable computers, such as Google Glass. The World Wide Web not only covers the planet, but beyond (with live feed from the International Space Station and the Mars rovers), and has infused itself into our experience of reality.
One of the conundrums of the past month at Chaos Central has been the problem of making changes to a web site for which the Unix Curmudgeon is the content editor but not the programmer. The site, a few years old, was a set of custom pages with a simple content editor that allowed the web editor to create, update, or delete some of the entries on some of the pages. The tool allowed photos to be inserted in some of the forms, and some of them were in calendar format, with input boxes for dates. The problem was updating documents in the download section, for which there was no editing form. This was becoming an acute problem because the documents in question tend to change year to year.
At first, the solution to the problem seemed to be to modify the source code to add the required functionality, which involved getting the source code from the author and permission to modify it, something we had done before to change a static list page to read from a tab delimited file. But, the types of changes didn’t always fit with the format of the administration forms and still required sending files to the server administrator, as we didn’t have FTP or SSH access to the site. Then, we noticed that the web hosting service had recently added WordPress to the stable of offerings. The solution was obvious: convert the entire site to WordPress. Of course, the convenience of fill-in-the-blank forms would be gone, but we would have the ability to create new pages, add member accounts, create limited-access content, and upload both documents and photos.
The process was fairly simple: using the stock, standard WordPress template, the content of the current site was simply copied and pasted into new pages, and the site configured as a web site with a blog rather than the default blog with pages format. Some editing of the content to fit with the standard WordPress theme style models, and juggling the background and header to fit with the color scheme and appearance of the old site, and incorporate the organization’s logo in the header, and it was done: the system administrator replaced the old site with the new, with appropriate redirection mapping from the old PHP URLs to the corresponding WordPress pages. This migration represented yet another step in the evolution of the web, or, more properly, in our experience with on-line content.
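The idea of a redirection mapping can be sketched in Python as a simple lookup table. This is an illustration only: the post does not list the real URLs or say how the administrator set up the redirects, so every path and name below is hypothetical.

```python
# Hypothetical old-to-new path pairs; the real site's URLs are not given here.
REDIRECTS = {
    "/index.php": "/",
    "/calendar.php": "/calendar/",
    "/downloads.php": "/documents/",
}


def redirect_for(old_path):
    """Return (status, location) for a retired URL, or None if it has no mapping."""
    # Ignore any query string when looking up the old page.
    path = old_path.split("?", 1)[0]
    new_path = REDIRECTS.get(path)
    if new_path is None:
        return None
    return 301, new_path  # 301 means "moved permanently"


print(redirect_for("/calendar.php?month=3"))   # (301, '/calendar/')
```

A permanent redirect tells browsers and search engines to update old bookmarks and links to the new page.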
In the beginning, there was the concept of markup languages. My first encounter with such was in the mid 1980s with the formatting tags in Ventura Publisher, which was the first desktop publishing tool, introduced in GEM (Graphical Environment Manager), a user interface originally developed for CP/M, the first microcomputer operating system, preceding MS-DOS by a few years (MS-DOS evolved from a 16-bit port of the 8-bit CP/M). Markup tags grew from the penciled editing marks used in the typewriter age, by which editors indicated changes to retype copy: Capitalize this, underline (bold) this, start a new paragraph, etc. In typesetting, markups indicated the composition element, i.e., chapter heading, subparagraph heading, bullet list, etc, rather than specific indent, typeface and size, etc. In electronic documents, tags were inserted as part of the text, like <tag>this</tag>. Where the tag delimiters needed to be installed in the text, they were described as special characters, like &lt;tag&gt; to print “<tag>” (which, if you look at the page source for this document, you will see nested several layers deep, since printing “&” requires yet another “escape,” &amp;).
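The nesting of escapes is easy to see with Python's standard html module. This is a small illustration, not part of the site's code.

```python
import html

once = html.escape("<tag>")   # what an author types to show a literal <tag>
twice = html.escape(once)     # what is needed to show that escaped form itself

print(once)    # &lt;tag&gt;
print(twice)   # &amp;lt;tag&amp;gt;
print(html.unescape(twice) == once)   # True
```

Each additional layer of “showing the markup” escapes the ampersands added by the layer before it.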
One of the reasons for the rise of markup languages as plain-text tags in documents was the proliferation of software systems, all of which were incompatible, and for which the markup tags were generally binary, i.e., not human readable. Gradually, the tags became standardized. When the World Wide Web was conceived, an augmented subset of the newly minted Standard Generalized Markup Language (SGML) was christened HyperText Markup Language (HTML). HTML added the anchor (<A>) tag primarily to facilitate linking different parts of text out of order, or even enable jumping to different documents. Later, further tags, such as <IMG>, were added to allow insertion of images and other elements.
Of course, these early web documents were static, and editors and authors had to memorize and type dozens of different markup tags. To make matters worse, the primary advantage of markup tags, the identification of composition elements independent of style, became subverted as browser software proliferated. In the beginning, the interpretation of tags was controlled by the Document Type Definition (DTD), the “back end” part of the markup language concept. The DTD is a formal description of which tags are allowed and how they nest, leaving the rendering of each tag to the particular typesetting or display system. Each HTML document is supposed to include a declaration that identifies the DTD for the set of tags used in the rest of the document. But, since different browsers might display a particular tag with different fonts–size, typeface, color, etc.–the HTML tags allowed style modifiers to specify how the particular element enclosed by that one tag would be displayed. This not only invites chaos, i.e., allows every instance of the same tag to be displayed differently, but most word processors, when converting from internal format to HTML, surround every text element with the precise style that applies to that text element, making it virtually impossible to edit for style in HTML form. To combat this proclivity toward creative styling, the Cascading Style Sheet (CSS) was invented, allowing authors and editors to globally define a specific style for a tag, or locally define styles within a cascade, by using the <DIV> and <SPAN> tags to define a block of text or subsection within a tag.
In order to use the Web as an interactive tool, and an interface for applications running on the server, it was necessary to augment the Hypertext Transfer Protocol (HTTP), the language that the server uses to process requests from the browser, to pass requests to internal programs on the server. This was implemented through the Common Gateway Interface (CGI, not to be confused with the computer-generated imagery used in movie-making animation and special effects). Originally, it was necessary to write all of the code to generate HTML documents from the CGI code and to parse the input from the browser. But, thanks to a Perl module, CGI.pm, written by renowned scientist Dr. Lincoln Stein, this became a lot easier and established the Perl scripting language as the de facto web programming language.
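For readers who never wrote one, a CGI program is little more than a script that reads the request from its environment and prints a header, a blank line, and a document. The sketch below is in Python rather than Perl and does not use CGI.pm; the function name build_response and the "name" form field are made up for illustration. It assumes the server passes the query string in the QUERY_STRING environment variable, as the CGI convention specifies.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(query_string: str) -> str:
    """Return a complete CGI response: header, blank line, then the HTML body.

    `build_response` and the "name" field are hypothetical, for illustration.
    """
    params = parse_qs(query_string)
    name = params.get("name", ["world"])[0]
    # Escape user input so it is displayed as text, not interpreted as markup.
    body = f"<html><body><p>Hello, {html.escape(name)}!</p></body></html>"
    # The empty line after the header tells the server where the document starts.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server sets QUERY_STRING before running the script.
    sys.stdout.write(build_response(os.environ.get("QUERY_STRING", "")))
```

Everything a module like CGI.pm did for us, parsing the input and emitting well-formed headers and tags, is compressed here into three or four lines, which is roughly why such libraries caught on.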
But, as the Web became ubiquitous, most content on the web was still static HTML, created by individuals using simple HTML tags in a text editor or saving their word processing documents as HTML, or using PC-based HTML editors like Homesite or Dreamweaver. Adding interactive elements to these pages required them to be rewritten as CGI programs that emitted (programmer-ese for “printed”) the now-dynamic content. By now, however, web servers had incorporated internal modules that could run CGI programs directly without calling the external interpreter software and incurring extra memory overhead. By adding special HTML tags interpreted by the server, snippets of script code could be added in-line with the page content, making it much easier to convert static pages to dynamic ones. Since the primary need was to add the ability to process form input, this led to the development of specialized server-side scripting languages, such as Rasmus Lerdorf’s PHP.
But, now that creating web pages was becoming a programming task more than a word-smithing task, there was a need for better authoring tools. The proliferation of PHP and the spread of high-speed Internet access made it more feasible to actually put interactive user applications on web servers. Web editing moved from the personal computer to the Web itself, as the concept of content management systems took hold. Early forms were Wikis, where users could enter data with Yet Another Markup Standard, that would be stored on the server and displayed as HTML. More free-form text form processors followed, making possible forums of dialogue and whole web formats for group interaction, using engines like PHP-Nuke and others, that used a database back-end to store input and PHP to render the stored content and collect new content. The expansion of the forums into tools for diarists and essayists in the form of blogs (from weB LOG) led to development of even more powerful content management systems, like Joomla and WordPress, capable of developing powerful web sites without programming.
So, we have progressed in evolution from desktop publishing to the Web, to interactive applications, to converting static sites to dynamic ones, and finally, to converting custom programs to templates for generalized site-building engines. The Web, through new forums for social interaction, allows ordinary folks who have never seen raw HTML code to converse with friends and relatives across the world, to post photos, videos, and links to other sites of interest, just as the original hypertext designers intended. What seemed arcane and innovative thinking 30 years ago is now just another form of natural human interaction.
But, for those of us who make our living interpreting dreams in current technology, the bar moves up again. Just as we no longer think about the double newline needed after the “Content-type” line and before the <HTML> tag, which is the first output emitted from a CGI program or from the server itself, we no longer need to write CSS files from scratch or PHP functions to perform common actions. But, we need to learn the new tools and still remember how to tweak the code for those distinctive touches that separate the ordinary from the special. And, there are still lots of sites to upgrade…
Having been a Unix systems administrator for about 20 years and “in the IT business” for more than twice that time, I have always been aware that the tedious task of backing up your data can pay off in the end. As we say, there are two types of computer users: those who have lost valuable data and those who will lose valuable data. But, backups can help.
The other day, while between crises at Chaos Central, our home/office/workshop, I was browsing through the web statistics for our public sites and actually paid attention to the list of ‘404’ (File Not Found) errors. Now, a lot of these are the usual and customary web site attacks that go on all the time, “bad guys” probing for security holes that a) don’t exist because we are not using Microsoft Internet Information Server, or b) have long since been plugged in Apache, so I more or less ignore those, along with the blind probes for informational pages to gather email data to feed spammers’ address lists. But, since I have been using Google’s Webmaster Tools to aid in Search Engine Optimization (SEO) for my clients, I’ve paid a bit more attention to little things, like having a ‘robots.txt’ file even if I don’t have any pages I want to hide from the search spiders. Another one is the ‘favicon.ico’ file: you don’t need one, but browsers always ask, so it is nice to have one. Besides, it adds a distinctive icon to the address bar, tab, or ‘favorites’ list, so I have been creating those tiny graphics and adding them to sites as well.
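For anyone who would rather pull the same list straight from the server logs than from a statistics page, here is a rough sketch in Python. It is not the tool I used; it assumes the log is in the common Apache access log layout (request in quotes, followed by the status code), and the function name missing_paths, the sample lines, and their addresses are all invented for illustration.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it,
# as laid out in the common Apache access log format (an assumption).
LOG_LINE = re.compile(r'"(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" (?P<status>\d{3}) ')


def missing_paths(lines):
    """Count how often each requested path produced a 404 (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            counts[match.group("path")] += 1
    return counts


# Made-up sample lines, using addresses reserved for documentation.
sample = [
    '203.0.113.5 - - [10/Oct/2010:13:55:36 -0700] "GET /favicon.ico HTTP/1.1" 404 209',
    '203.0.113.5 - - [10/Oct/2010:13:55:37 -0700] "GET /index.html HTTP/1.1" 200 5120',
    '198.51.100.7 - - [10/Oct/2010:13:56:01 -0700] "GET /favicon.ico HTTP/1.1" 404 209',
    '198.51.100.7 - - [10/Oct/2010:13:56:02 -0700] "GET /robots.txt HTTP/1.1" 404 208',
]

if __name__ == "__main__":
    # Most frequently missing paths first.
    for path, hits in missing_paths(sample).most_common():
        print(hits, path)
```

Sorting by count puts the requests every browser and spider makes, like favicon.ico and robots.txt, at the top, and leaves the one-off probes at the bottom where they are easy to ignore.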
But, this time, I also noticed missing pages where there shouldn’t have been any. Oh-oh. Until sometime last year, we had a number of links from our public server pages to our ‘extranet’ server located in our home office and port-mapped to our external router. Some issues with our Internet service at home and our impending move made it imperative to move these pages onto our public servers, which I more or less did, in bulk, to a virtual server, and changed all the links to point to the new location. I had thought I had tested all of the links, but here we were, many months after the transition, with a broken link. An entire page and all of its images, gone.
Meanwhile, the old extranet computer had not survived the household move. After sitting in a cold, damp basement for a month or so last fall, it refused to start. But, we still have the last set of backups! So, I extracted the files from the backup archive and uploaded them to the web site.
For the record, we are, of course, a Unix shop, running mostly Solaris and Linux, so we use Amanda, the open-source backup system from the University of Maryland, and back up to virtual tape on cheap USB drives. The drives are large enough to keep several weeks’ worth of backups, and we archive the last backup set from “retired” machines. For machines like my Linux laptop, that aren’t necessarily on or even in the network 100% of the time, I use rsync, freezing a checkpoint now and then, since we do upgrade Ubuntu versions, and we keep a pre-upgrade backup as well as current ones. Because the Amanda backup runs daily, but we do a lot of work during the day, we also keep a snapshot backup (using rsync) every couple of hours during the workday on our main server/workstation. For our one seldom-used physical Windows machine, we simply copy our working directories onto a file share on one of the Unix systems.
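The rsync snapshots can be sketched roughly as follows. This is not our actual script: the paths, the timestamp naming, and the function name snapshot_command are made up for illustration, and the use of rsync's --link-dest option (which hard-links unchanged files to a previous snapshot so each checkpoint costs little extra space) is one common approach that I am assuming here. The sketch only builds and prints the command, so it can be inspected before anything is run.

```python
from datetime import datetime
from pathlib import PurePosixPath
from typing import List, Optional


def snapshot_command(source: str, backup_root: str, now: datetime,
                     previous: Optional[str] = None) -> List[str]:
    """Build an rsync command line for one timestamped snapshot.

    Hypothetical helper: all names and paths are illustrative only.
    """
    dest = PurePosixPath(backup_root) / now.strftime("%Y-%m-%d_%H%M")
    cmd = ["rsync", "-a"]  # -a: archive mode, preserves permissions and times
    if previous:
        # Unchanged files become hard links into the previous snapshot.
        # Use an absolute path; a relative one is taken relative to dest.
        cmd.append(f"--link-dest={previous}")
    # The trailing slash on the source copies its contents, not the directory.
    cmd += [source.rstrip("/") + "/", str(dest)]
    return cmd


if __name__ == "__main__":
    cmd = snapshot_command("/home/work", "/mnt/usb/snapshots", datetime.now())
    # Inspect first; hand the list to subprocess.run(cmd, check=True) to execute.
    print(" ".join(cmd))
```

Run from cron every couple of hours during the workday, with the previous snapshot passed as the link destination, something like this would give a series of checkpoints between the daily Amanda runs.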
So, there are many ways to do backups, but it is important to do at least one of them, whether or not it seems to be a pain–it will pay off. Oh, yes, I am in the group of users who have lost data–in my case, I archived data, then failed to make a second copy before the original archive failed. And, just because I’ve lost data once, doesn’t mean I won’t again, backups or no backups. But, keeping good and current backups does reduce the chances that it will be soon or extensive.