2008-04-13

On Tools

The starting point for a recent post on Coding Horror about attitudes towards open source and free software is a discussion of programming tools for diffing code or building regular expressions. I won't debate the merits of free v. pay or open-source v. proprietary, since that's a mostly pointless argument and Atwood's readers already beat it into the ground. What I would like to take up is the narrow notion of tools for programming. Some smart aleck posted the following comment:

One more thing--real programmers don't need tools to work with regular expressions.


To which someone responded:

Yeah yeah. Neither do you need IDEs and nice flashy GUIs. Unfortunately "real programmers" are Martians on a totally different plane of existence from "real people", for which usable GUIs are made. We are on the business of helping humans, not Martians, so you "real programmers" can go ahead and marry your console apps for all we care.


I think the response misses the point. I am completely in favor of using tools to make software development easier, faster, and less error-prone. Syntax highlighting, fast access to files in your project, code auto-completion, visual diff tools like WinMerge, and one-key access to your language's documentation are all great. The computer's memory and ruthlessly error-free comparison skills are an excellent substitute for a human's, especially when it's late and you're tired, or when you have a repetitive stress injury and need to save a few keystrokes.

While I do expect a good programmer to remember most of their language's syntax and common libraries, I see no problem with having to look up the parameter order or the specific name of a method, especially when you're dealing with sprawling, messy beasts like PHP, the Win32 API or Java. What matters is that the developer knows what they want to accomplish and how to make it happen. Using visual tools for building regular expressions, on the other hand, suggests that the developer lacks a fundamental skill that should be part of their core skill set.

Again, I have no problem with someone looking up their language's specific syntax for the "word boundary" assertion, or the greed-inversion switch (the lazy, or non-greedy, quantifier), or any silly implementation detail of that nature. Remembering all of those details is great, but what I look for in a developer is someone who knows how to do things in general and has a sense of what the technology makes available to them. In other words, I'd rather hire someone who needs to look up the greed-inversion switch in PHP but knows it exists than someone who writes massively complicated regular expressions that could be simplified if only they knew about that switch.

For example, let's say you're trying to extract whatever is between tags in an XML document in PHP, and for some reason you don't want to use a parser. You could do something like:

$regex = "/>([A-Za-z0-9_-]+)</";

which means "give me any letter, number, underscore or hyphen between > and <". It's the naive implementation, and it will work most of the time, until you run into, say, an escaped entity such as &#38;, whose &, # and ; are not in the character class, at which point you have to go back and fix your regular expression.
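To see where the naive pattern gives up, here is a minimal sketch. The sample string is made up for illustration and the variable names are my own; preg_match_all is PHP's standard PCRE function.

```php
<?php
// Hypothetical sample input, invented for this example.
$xml = '<name>foo_bar-1</name><note>Tom &#38; Jerry</note>';

// The naive pattern: letters, digits, underscore or hyphen between > and <.
$regex = "/>([A-Za-z0-9_-]+)</";

// $matches[1] receives whatever the first capture group matched.
preg_match_all($regex, $xml, $matches);

// Only "foo_bar-1" is captured. The second element is skipped because
// the space, &, # and ; are not in the character class.
print_r($matches[1]);
```

The failure is silent: nothing errors out, the second element just never shows up in the results.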

Instead, you could do something like this:

$regex = "/>([^<]+)/";

which means "give me everything you find after > that's not a < ". In other words, "give me everything you find until you hit a <".
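Run against the same kind of input, the negated character class picks up the entity without complaint. Again, this is only a sketch with an invented sample string and variable names of my choosing.

```php
<?php
// Hypothetical sample input, invented for this example.
$xml = '<name>foo_bar-1</name><note>Tom &#38; Jerry</note>';

// Everything after > that is not a <.
$regex = "/>([^<]+)/";

preg_match_all($regex, $xml, $matches);

// Captures "foo_bar-1" and "Tom &#38; Jerry".
print_r($matches[1]);
```

One thing to keep in mind: a newline is also "not a <", so in a pretty-printed document this pattern will capture the whitespace between a closing tag and the next opening tag as well.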

But perhaps more elegantly you could write:

$regex = "/>(.*?)</";

which means "give me everything you find starting at > but stop as soon as you run into a <", using the greed-inversion switch ? after the * (more commonly called a lazy or non-greedy quantifier). I find this approach somewhat more satisfying, because it has the "everything between tags" semantics built in, and it is fairly robust, because it assumes little about the input it's expecting (other than "be a reasonably well-formed XML document"). One caveat: in PHP's PCRE, . does not match a newline by default, so text that spans several lines needs the s modifier after the closing delimiter. All it takes is the use of an extra switch that's sadly not mentioned in very many tutorials and that I often have to explain to very good, experienced developers.
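The difference the ? makes is easiest to see side by side. This is a rough sketch with made-up inputs and variable names of my own. The last two calls show the effect of PCRE's s modifier, which lets . match newlines.

```php
<?php
// Hypothetical inputs, invented for this example.
$oneLine  = '<a>one</a><b>two</b>';
$twoLines = "<c>line1\nline2</c>";

// Greedy: .* runs to the end of the line, then backs up to the LAST <.
preg_match_all("/>(.*)</", $oneLine, $greedy);

// Lazy: .*? stops at the FIRST < it can reach.
preg_match_all("/>(.*?)</", $oneLine, $lazy);

// Without the s modifier, . will not cross the newline, so nothing matches.
preg_match_all("/>(.*?)</", $twoLines, $noDotAll);

// With the s modifier, . matches newlines too.
preg_match_all("/>(.*?)</s", $twoLines, $dotAll);

print_r($greedy[1]);   // one capture: "one</a><b>two"
print_r($lazy[1]);     // three captures: "one", "" and "two"
print_r($noDotAll[1]); // empty array
print_r($dotAll[1]);   // one capture: "line1\nline2" (with a real newline)
```

The empty string in the lazy result is the zero-length gap between </a> and <b>. Depending on what you're after, you may want to filter those out or use +? instead of *?.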

The point of this digression is not to show off regex-building skills. The point is that a GUI tool for building regular expressions won't help much in this particular situation. The tool only helps you cobble an expression together; it won't tell you that a simpler, sturdier one exists, and in the end you still have to live with what you built. There's only so much you can look up about regular expressions, and if you don't understand them or know their capabilities very well, you'll only write mediocre or brittle ones.

The upshot of all this is that tools are perfectly fine, and even encouraged, when they substitute machine reliability for human frailty. But they're certainly no substitute for a solid, well-rounded skill set, and that well-roundedness requires a lot of painstaking, manual work. Microsoft Word's spellchecker will fix your typos, but it's not going to make you a better writer.
