Much of what constitutes best practice in the open-source community is a natural adaptation to distributed development; you'll read a lot in the rest of this chapter about behaviors that maintain good communication with other developers. Where Unix conventions are arbitrary (such as the standard names of files that convey metainformation about a source distribution) they often trace back either to Usenet in the early 1980s, or to the conventions and standards of the GNU project.
Here are some tips on how to get your patch accepted:
Before you send your patch, walk through it and delete any patch bands for files in it that are going to be automatically regenerated once the maintainer applies the patch and remakes. The classic examples of this error are C files generated by Bison or Flex.
Good comments in your code help the maintainer understand it. Bad comments don't.
Here's an example of a bad comment:
/* norman newbie fixed this 13 Aug 2001 */
Here's an example of a good comment:
/*
 * This conditional needs to be guarded so that crunch_data() never
 * gets passed a NULL pointer. <norman_newbie@foosite.com>
 */
A good general form of name has these parts in order:
Please don't use names like these:
This looks to many programs like an archive for a project called “foobar123” with no version number.
This looks to many programs like an archive for a project called “foobar1” at version 2.3.
Many programs think this goes with a project called “foobar-v1”.
The underscore is hard for people to speak, type, and remember.
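To see why such names cause trouble, here is a minimal sketch of the kind of heuristic an archive-handling program might use to split a tarball name into a project stem and a version. The function `split_archive_name`, its rules, and the sample filenames are illustrative assumptions, not the behavior of any particular tool.

```python
import re

# Hypothetical rule: a stem, a hyphen, then a dotted numeric version.
_STEM_VERSION = re.compile(r"(.+)-(\d+(?:\.\d+)*)")


def split_archive_name(filename):
    """Return (stem, version); version is None if none could be found."""
    base = filename
    for ext in (".tar.gz", ".tgz", ".tar.bz2", ".tar"):
        if base.endswith(ext):
            base = base[: -len(ext)]
            break
    match = _STEM_VERSION.fullmatch(base)
    if match:
        return match.group(1), match.group(2)
    # Fallback guess: everything after the first dot is the version.
    stem, dot, rest = base.partition(".")
    return stem, (rest if dot else None)


# Illustrative names only: one well-formed, three ambiguous.
for name in ("foobar-1.2.3.tar.gz", "foobar123.tar.gz",
             "foobar1.2.3.tar.gz", "foobar-v1.2.3.tar.gz"):
    print(name, "->", split_archive_name(name))
```

Only the first name yields the stem and version its author presumably intended; the others fall through to the fallback guess, which is where the misreadings come from.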
Some projects and communities have well-defined conventions for names and version numbers that aren't necessarily compatible with the above advice. For instance, Apache modules are generally named like mod_foo, and have both their own version number and the version of Apache with which they work. Likewise, Perl modules have version numbers that can be treated as floating point numbers (e.g., you might see 1.303 rather than 1.3.3), and the distributions are generally named Foo-Bar-1.303.tar.gz for version 1.303 of module Foo::Bar. (Perl itself, on the other hand, switched to using the conventions described here in late 1999.)
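The difference between the two numbering styles matters as soon as a tool has to decide which release is newer. The sketch below compares the same version strings both ways; the helper names `as_float` and `as_tuple` and the sample versions are assumptions made for illustration.

```python
def as_float(version):
    """Floating-point style: the whole string is one decimal number."""
    return float(version)


def as_tuple(version):
    """Dotted style: each component is a separate integer."""
    return tuple(int(part) for part in version.split("."))


# The same pair of strings orders differently under the two readings.
print(as_float("1.10") < as_float("1.9"))   # compares 1.1 with 1.9
print(as_tuple("1.10") < as_tuple("1.9"))   # compares (1, 10) with (1, 9)
```

Read as floating point, "1.10" is older than "1.9"; read as dotted components, it is newer. A tool that guesses the wrong convention for a project will sort its releases incorrectly.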
It confuses people when two different projects have the same stem name. So try to check for collisions before your first release. Two good places to check are the index file of ibiblio and the application index at Freshmeat. Another good place to check is SourceForge; do a name search there.
Therefore: Use the GNU autotools to handle portability issues, do system-configuration probes, and tailor your makefiles. People building from sources today expect to be able to type configure; make; make install and get a clean build — and rightly so. There is a good tutorial on these tools.
autoconf and autoheader are mature. automake, as we've previously noted, is still buggy and brittle as of mid-2003; you may have to maintain your own Makefile.in. Fortunately it's the least important of the autotools.
Regardless of your approach to configuration, do not ask the user for system information at compile-time. The user installing the package does not know the answers to your questions, and this approach is doomed from the start. The software must be able to determine for itself any information that it may need at compile- or install-time.
But autoconf should not be regarded as a license for knob-ridden designs. If at all possible, program to standards like POSIX and refrain also from asking the system for configuration information. Keep ifdefs to a minimum — or, better yet, have none at all.
If you're writing C/C++ using GCC, test-compile with -Wall and clean up all warning messages before each release. Compile your code with every compiler you can find — different compilers often find different problems. Specifically, compile your software on a true 64-bit machine. Underlying datatypes can change on 64-bit machines, and you will often find new problems there. Find a Unix vendor's system and run the lint utility over your software.
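One quick way to see how the underlying datatypes differ between machines is to print their sizes on each build host. This is a small sketch using Python's standard `ctypes` module; the helper name `c_type_sizes` is an assumption, and the output depends entirely on the platform it runs on.

```python
import ctypes


def c_type_sizes():
    """Return the size in bytes of a few C types on this platform."""
    return {
        "int": ctypes.sizeof(ctypes.c_int),
        "long": ctypes.sizeof(ctypes.c_long),
        "void *": ctypes.sizeof(ctypes.c_void_p),
        "size_t": ctypes.sizeof(ctypes.c_size_t),
    }


sizes = c_type_sizes()
for name, size in sizes.items():
    print(f"{name:8s} {size} bytes")

# C code that stores a pointer in an int is only safe when this holds.
print("pointer fits in int:", sizes["void *"] <= sizes["int"])
```

Running it on a 32-bit and a 64-bit host side by side shows which assumptions about type sizes will not survive the move.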
For Python projects, the PyChecker program can be a useful check. It often catches nontrivial errors.
If you're writing Perl, check your code with perl -c (and maybe -T, if applicable). Use perl -w and 'use strict' religiously. (See the Perl documentation for further discussion.)
If you are writing C, feel free to use the full ANSI features. Specifically, do use function prototypes, which will help you spot cross-module inconsistencies. The old-style K&R compilers are ancient history.
Always write your portability layer to select based on a feature, never based on a platform. Trying to create a separate portability layer for each supported platform results in a multiple-update maintenance nightmare. A “platform” is always selected on at least two axes: the compiler and the library/operating system release. In some cases there are three axes, as when Linux vendors select a C library independently of the operating system release. With M vendors, N compilers, and O operating system releases, the number of platforms quickly scales out of reach of any but the largest development teams. On the other hand, by using language and systems standards such as ANSI and POSIX 1003.1, the set of features is relatively constrained.
Example 19.1 shows a makefile trick that, assuming your distribution directory is named “foobar” and SRC contains a list of your distribution files, accomplishes this.
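The same principle applies outside C. As a rough sketch, the function below decides what to do by testing whether a feature exists and works, and never consults the platform name. The function `duplicate_file` is a hypothetical helper written for this illustration.

```python
import os
import shutil


def duplicate_file(src, dst):
    """Hard-link src to dst if that feature is usable, else copy it."""
    # Select on the feature (is os.link present, and does it work
    # on this filesystem?), not on the value of sys.platform.
    if hasattr(os, "link"):
        try:
            os.link(src, dst)
            return "link"
        except OSError:
            pass  # e.g. a filesystem without hard-link support
    shutil.copyfile(src, dst)
    return "copy"
```

Because the test is on the feature itself, a new platform that supports hard links needs no change to this code, and one that lacks them falls back to copying without a platform-specific branch.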
Include a file called README that is a roadmap of your source distribution. By ancient convention (originating with Dennis Ritchie himself before 1980, and promulgated on Usenet in the early 1980s), this is the first file intrepid explorers will read after unpacking the source.
Notes on the developer's build environment and potential portability problems.
Either build/installation instructions or a pointer to a file containing same (usually INSTALL).
Either a maintainers/credits list or a pointer to a file containing same (usually CREDITS).
Either recent project news or a pointer to a file containing same (usually NEWS).
Note the overall convention that filenames with all-caps names are human-readable metainformation about the package, rather than build components. This elaboration of the README was developed early on at the Free Software Foundation.
The Emacs, Python, and Qt projects have a good convention for handling this: version-numbered directories (another practice that seems to have been made routine by the FSF). Here's how an installed Qt library hierarchy looks (${ver} is the version number):
/usr/lib/qt
/usr/lib/qt-${ver}
/usr/lib/qt-${ver}/bin       # Where you find moc
/usr/lib/qt-${ver}/lib       # Where you find .so
/usr/lib/qt-${ver}/include   # Where you find header files
With this organization, multiple versions can coexist. Client programs have to specify the library version they want, but that's a small price to pay for not having the interfaces break on them. This good practice avoids the notorious “DLL Hell” failure mode of Windows.
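The layout above can be sketched as a small path-building function, to show how a client selects one installed version without disturbing another. The function name `versioned_layout` and the version numbers used here are assumptions for illustration.

```python
import posixpath


def versioned_layout(prefix, name, ver):
    """Build the version-numbered directory paths for one installed release."""
    root = posixpath.join(prefix, f"{name}-{ver}")
    return {
        "root": root,
        "bin": posixpath.join(root, "bin"),
        "lib": posixpath.join(root, "lib"),
        "include": posixpath.join(root, "include"),
    }


# Two hypothetical releases installed side by side.
old = versioned_layout("/usr/lib", "qt", "2.3")
new = versioned_layout("/usr/lib", "qt", "3.1")
print(old["lib"])
print(new["lib"])
```

Each release gets its own root, so a client built against the older library keeps finding the headers and shared objects it was compiled for after the newer one is installed.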
The de facto standard format for installable binary packages under Linux is that used by the Red Hat Package Manager, RPM. It's featured in the most popular Linux distribution, and supported by effectively all other Linux distributions (except Debian and Slackware; and Debian can install from RPMs). Accordingly, it's a good idea for your project site to provide installable RPMs as well as source tarballs.
Announce to Freshmeat. Besides being widely read itself, this site is a major feeder for Web-based technical news channels.
Never assume the audience has been reading your release announcements since the beginning of time. Always include at least a one-line description of what the software does. Bad example: “Announcing the latest release of FooEditor, now with themes and ten times faster”. Good example: “Announcing the latest release of FooEditor, the scriptable editor for touch-typists, now with themes and ten times faster”.
See the website examples in Chapter 16 for what a well-educated project website looks like.
An easy way to have a website is to put your project on one of the sites that specializes in providing free hosting. In 2003 the two most important of these are SourceForge (which is a demonstration and test site for proprietary collaboration tools) and Savannah (which hosts open-source projects as an ideological statement).
If you are running a project named ‘foo’, your developer list might be <foo-dev> or <foo-friends>; your announcement list might be <foo-announce>.
An important decision is just how private the “private” development list is. Wider participation in design discussions is often a good thing, but if the list is relatively open, sooner or later you will get people asking new-user questions on it. Opinions vary on how best to solve this problem. Just having the documentation tell the new users not to ask elementary questions on the development list is not a solution; such a request must be enforced somehow.
An announcements list needs to be tightly controlled. Traffic should be at most a few messages a month; the whole purpose of such a list is to accommodate people who want to know when something important happens, but don't want to hear about day-to-day details. Most such people will quickly unsubscribe if the list starts generating significant clutter in their mailboxes.
See the section Where Should I Look? in Chapter 16 for specifics on the major open-source archive sites. You should release your package to these.
Other important locations include:
The Python Software Activity site (for software written in Python).
The CPAN, the Comprehensive Perl Archive Network (for software written in Perl).