RPM:Philosophy

From Vermont Area Group of Unix Enthusiasts

Based on a post by Sam Hooker to the VAGUE mailing list.

The original question, paraphrased, was, "In 2007, what is the state of RPM-based package management, and how will it help me get things done?"

RPM is a package management system that automates the configuration, installation and removal of software packages for a computer operating system. Other noteworthy package managers exist, but RPM is a relatively popular one.

Why RPM?

If you're managing more than a handful of machines that need regular updates to operating system and application software -- which means anything serving more than a research/development function -- then any package management at all is crucial. RPM, in and of itself, is adequate for such a role. It is easy to use for installing and removing packages, it is easy to query, and it makes building new packages or altering existing SRPMs (source packages) relatively straightforward. RPM incorporates GPG-based code signatures and has a mature path for turning source code into packages. The latter means people have been building RPMs for long enough that the process is well-documented, stable and consistent.

RPM by itself improves on compiling from source, but it does not remove the hassle of chasing down a package's dependencies. That problem was solved by marrying RPM to a meta-package-management "coordinator" like YUM.

Enter YUM

Without something like YUM, you download RPM packages and try to install them. Assuming your system is "RPM compliant" (more on this later), either they install or the package manager complains about absent dependencies -- other software packages or libraries that need to be installed for the package to work. If requisites (any necessary binaries or linked libraries) are missing, you need to find the right packages, download them, and attempt the install again: "lather, rinse, repeat". If there are no such packages, then you're hunting for source code and compiling it yourself to satisfy these dependencies. This is usually only an issue if you're trying to install an RPM under a distribution other than the one it was built for, or an RPM built by someone who's not altogether forward-thinking.

With YUM (or similar), the package meta-data that defines things like dependencies is divorced from the packages themselves. When you run 'yum install blah', YUM doesn't start by downloading, say, blah-2.6-19.i386.rpm. It first fetches blah-2.6-19.i386.rpm.hdr (the aforementioned meta-data) and analyzes it for dependencies. If there aren't any, it downloads and installs blah-2.6-19.i386.rpm. Case closed.

If there are dependencies -- say, libfoo-0.7b2.rpm -- it downloads libfoo-0.7b2.rpm.hdr, analyzes that for dependencies, and reacts appropriately. The "lather, rinse, repeat" cycle is completely automated here. Once YUM is finished resolving dependencies, in interactive mode it presents you with a list of all the RPMs it needs to fetch and install to fulfill your original request. Once you approve, it performs the requested installations.
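The automated "lather, rinse, repeat" can be sketched in a few lines of Python. This is only an illustration of the idea, not how YUM is actually implemented: the HEADERS table is a made-up stand-in for the .hdr meta-data, the package names are hypothetical, and a real resolver also has to deal with versions, conflicts and virtual "provides".

```python
# Hypothetical meta-data index: package name -> names it depends on.
# It stands in for the downloaded header files; none of these are real packages.
HEADERS = {
    "blah": ["libfoo"],
    "libfoo": ["libbar"],
    "libbar": [],
}


def resolve(requested, headers, installed=frozenset()):
    """Return the packages to install for `requested`, dependencies first."""
    order = []
    seen = set(installed)  # already-installed packages need no work

    def visit(name):
        if name in seen:
            return
        if name not in headers:
            raise LookupError(f"no known package named {name}")
        seen.add(name)  # mark before recursing so dependency loops terminate
        for dep in headers[name]:
            visit(dep)
        order.append(name)  # appended only after all of its dependencies

    visit(requested)
    return order


# The list a user would be asked to approve before anything is installed.
print(resolve("blah", HEADERS))
```

Passing installed={"libbar"} shortens the plan accordingly, which is roughly what happens when some dependencies are already on the system.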

YUM, nowadays, has a modular configuration architecture that makes it trivial to add a "repository" of packages set up specifically for use with YUM. This includes not only the official packages for the distribution, but third-party repositories as well (Dag Wieers, RPMforge, freshrpms, and ATrpms are among the big names), where you can frequently find RPMified versions of cool packages that your distribution left out or doesn't have yet. YUM can also conveniently list all the packages available in the repositories for which it's configured, and check for updates to your other software installed with RPM. It caches all this information locally until you explicitly clear it, so queries after the first one are comparatively fast.
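To give a feel for how little a repository definition involves, here is a rough Python sketch that generates one. It assumes the conventional layout in which each repository is an INI-style stanza in a .repo file under /etc/yum.repos.d/. The repository id, name and URL below are invented for illustration, and repo_stanza is a hypothetical helper, not part of YUM.

```python
import configparser
import io


def repo_stanza(repo_id, name, baseurl, gpgcheck=True):
    """Build the text of a single YUM repository stanza (INI format)."""
    # interpolation=None keeps configparser from treating '%' specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser[repo_id] = {
        "name": name,
        "baseurl": baseurl,
        "enabled": "1",
        "gpgcheck": "1" if gpgcheck else "0",
    }
    buf = io.StringIO()
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


# Hypothetical repository; the .invalid host is a placeholder, not a real mirror.
print(repo_stanza("example-local",
                  "Example local mirror",
                  "http://mirror.example.invalid/fedora/"))
```

The printed text would be saved as something like example-local.repo. Leaving gpgcheck on is the safer default, since it ties in with the GPG signatures mentioned earlier.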

Dag Wieers (although it may be at RPMforge, now?) created a package called mrepo that makes it reasonably quick and convenient to mirror your favorite YUM repository (or repositories) to a local machine. You can then configure all your machines at that site to YUM against the local mirror, saving yourself 15 downloads of the latest kernel and glibc packages when updating your 16 Fedora machines.

The catch

There's always a catch, isn't there?

The catch probably applies to any package management system -- although others might make it easier to work around than RPM does. To get the maximum return on RPM, you pretty much want to "drink the Kool-Aid", and drink all of it. You will want to use an RPM version of your desired package when one is available, and satisfy its expectations of where to find files (its own AND those of its dependencies). Failing that, you'd probably want to download its SRPM, change its specification file (the file that determines what information gets fed to 'configure', 'make' and 'install', and which files to include in the package), and then run 'rpmbuild' to make your own package.

The danger in "going off the reservation" with RPM is that, irrespective of the convenience you experience today in compiling httpd (the Apache Web server) from source and placing everything in /opt/apache, someday in the foggy mists of "The Future" you will want to install mod_auth_mysql from RPM. At that point, mod_auth_mysql-x.y-z.rpm is going to expect to find things in /etc/httpd, where the httpd RPM would have put them, and when they're not located there, mod_auth_mysql will puke and enrage you.

This rage, however, will be nothing compared to the fury you will experience on "Day Four of your Compiling mod_auth_mysql from Source Extravaganza", the first two days of which were spent divining that there are, in fact, multiple forks of the mod_auth_mysql code, only one of which really works with Apache 2.x. This will cause you to resent package management, Linux, Linus Torvalds himself, your Mom, and the Pope.

No one needs that.

Finding religion

This is not to say, of course, that there won't be times during your happy residency in "RPM-ville" when you must revisit "Roll-Your-Own Land". It just means that you have something of a responsibility to support your expectation that "RPM will always just work" by following "RPM-compliant behavior". Again, the basic rules are:

  1. If there's an RPM version, preferably distributed by your distribution provider or by one of the bigger RPM-providing sites that also carry RPM versions of all of the dependencies, then use it, and
  2. if you can't get what you need out of that, build your own RPM of the package so that you can leverage today's work down the road, when the next version comes out. (Think, "If I spent an hour on the .spec file for version 6.8-1 today, I'll just have to copy it and tweak for 6.9 when they finish that double-secret-pinkney-flange-parsing feature I want to include...")
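The "copy it and tweak for 6.9" step in rule 2 can be as small as changing two tags. The Python sketch below assumes a spec file with ordinary Version: and Release: lines. The "flange" package and its spec contents are invented for illustration, and bump_spec is a hypothetical helper. A real bump would usually also need a new source tarball and a changelog entry.

```python
import re

# Hypothetical, heavily trimmed spec file for version 6.8-1.
OLD_SPEC = """\
Name: flange
Version: 6.8
Release: 1
Summary: Hypothetical example package
"""


def bump_spec(spec_text, new_version, new_release="1"):
    """Return spec_text with its Version and Release tags replaced."""
    # A function replacement avoids backreference ambiguity when the new
    # value starts with a digit.
    text, n_ver = re.subn(r"(?m)^(Version:[ \t]*).*$",
                          lambda m: m.group(1) + new_version, spec_text)
    text, n_rel = re.subn(r"(?m)^(Release:[ \t]*).*$",
                          lambda m: m.group(1) + new_release, text)
    if n_ver != 1 or n_rel != 1:
        raise ValueError("expected exactly one Version and one Release tag")
    return text


# Yesterday's hour of work, reused for the next upstream release.
print(bump_spec(OLD_SPEC, "6.9"))
```

The rewritten spec is what you would then hand to 'rpmbuild', as described under "The catch".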

Again, other package managers may be more flexible when it comes to this ('apt-install --apache-is-really-in=/opt/apache --please-work php-mysql'?), but given the very nature of binary packages, it'd be news to me if that were possible. At the very least, expect a stint in "Symlink Hell" if you aren't careful about mixing any package manager with apps you compile from source outside of the package manager's framework.

See also