One pattern that emerges from the examples we've examined so far in this chapter is this: If you want transparent code, the most effective route is simply not to layer too much abstraction over what you are manipulating with the code.
In Chapter�4's section on the value of detachment, our advice was to abstract and simplify and generalize, to try and detach from the particular, accidental conditions under which a design problem was posed. The advice to abstract does not actually contradict the advice against excessive abstractions we're developing here, because there is a difference between getting free of assumptions and forgetting the problem you're trying to solve. This is part of what we were driving at when we developed the idea that glue layers need to be kept thin.
One of the main lessons of Zen is that we ordinarily see the world through a haze of preconceptions and fixed ideas that proceed from our desires. To achieve enlightenment, we must follow the Zen teaching not merely to let go of desire and attachment, but to experience reality exactly as it is — without the preconceptions and the fixed ideas getting in the way.
This is excellent pragmatic advice for software designers. It's part of what's implicit in the classic Unix advice to be minimalist. Software designers are clever people who form ideas (abstractions) about the application domains they deal with. They organize the software they write around those ideas. Then, when debugging, they often find they have great trouble seeing through those ideas to what is actually going on.
Any Zen master would recognize this problem instantly, yell “Three pounds of flax!”, and probably clout the student a good one.[63] Consciously designing for transparency is a slightly less mystical way of addressing it.
In Chapter�4 we criticized object-oriented programming in terms likely to prove a bit shocking to programmers who were raised on the 1990s gospel of OO. Object-oriented design doesn't have to be over-complicated design, but we've observed that too often it is. Too many OO designs are spaghetti-like tangles of is-a and has-a relationships, or feature thick layers of glue in which many of the objects seem to exist simply to hold places in a steep-sided pyramid of abstractions. Such designs are the opposite of transparent; they are (notoriously) opaque and difficult to debug.
Transparency and discoverability, like modularity, are primarily properties of designs, not code. It is not sufficient to get right the low-level elements of style, such as indenting code in a clear and consistent way or having good variable-naming conventions. These qualities have much more to do with code properties that are less obvious to inspection. Here are a few to think about:
Does the code have invariant properties[64] that are both strong and visible? Invariant properties help human beings reason about code and detect problem cases.
Are the function calls in your APIs individually orthogonal, or do they have too many magic flags and mode bits that have a single call doing multiple tasks? Avoiding mode flags entirely can lead to a cluttered API with too many nigh-identical functions, but the obverse error (lots of easily-forgotten and confusable mode flags) is even more common.
Are there a handful of prominent data structures or a single global scoreboard that captures the high-level state of the system? Is this state easy to visualize and inspect, or is it diffused among many individual global variables or objects that are hard to find?
Is there a clean, one-to-one mapping between data structures or classes in your program and the entities in the world that they represent?
Is it easy to find the portion of the code responsible for any given function? How much attention have you paid to the readability not just of individual functions and modules but of the whole codebase?
Does the code proliferate special cases or avoid them? Every special case could interact with every other special case; all those potential collisions are bugs waiting to happen. But even more importantly, special cases make the code harder to understand.
How many magic numbers (unexplained constants) does the code have in it? Is it easy to discover the implementation's limits (such as critical buffer sizes) by inspection?
It's best for code to be simple. But if it answers these sorts of questions well, it can be very complex without putting an impossible cognitive burden on a human maintainer.
The reader might find it instructive to compare these with our checklist questions about modularity in Chapter�4.
Another theme that emerges from these examples is the value of programs that flip a problem out of a domain in which transparency is hard into one in which it is easy. Audacity, sng(1) and the tic(1)/infocmp(1) pair all have this property. The objects they manipulate are not readily conformable to the hand and eye; audio files are not visual objects, and although images expressed in PNG format are visual, the complexities of PNG annotation chunks are not. All three applications turn manipulation of their binary file formats into a problem to which human beings can more readily apply intuition and competences gained from everyday experience.
All the advantages of textual data-file formats that we discussed in Chapter�5 also apply to the textual formats that sng(1), infocmp(1) and their kin generate. One important application for sng(1) is robotic generation of PNG image annotations by scripts — because sng(1) exists, such scripts are easier to write.
Writing a textualizer or browser is a valuable exercise for at least four reasons:
After you've done this, you may well discover that it's possible to apply the “separated engine and interface” pattern (see Chapter�11) using your textualizer/debugger as the engine. All the usual benefits of this pattern will apply.
[64] An invariant is a property of a software design that is preserved by every operation in it. For example, in most databases it is an invariant that no two records may have the same key. In a C program that correctly manipulates strings, every string buffer must contain a terminating NUL byte on exit from each string function. In an inventory system, no parts count can hold a number less than zero.