Versioning Python projects

In the recent past I’ve typically versioned software based on either its most recent tag or, for non-release versions, the commit’s short SHA: the seven- or eight-digit hexadecimal hash of the latest commit. Because commit hashes are neither ordered nor intrinsically memorable, I’ll add the commit’s date in the form YYYYMMDD to remind myself at a glance how old the thing I’m looking at is. A major use of this is to confirm I’m debugging the most recent version of the software; with automated deployments it’s possible some technohiccup or stray gamma radiation interferes and the updated app isn’t deployed yet.

For the first time ever this week I packaged up a small program for inclusion in a private PyPI index. The primary reason is that the audience for the script is sysadmins, who aren’t necessarily Python programmers or even used to Git.

In the code, the version string is typically defined in a file version.py:

version = 'local development version'

This file is then imported wherever the version string is needed. In a command-line program, it’s probably printed out or logged at startup; in a web application it’ll be available via an HTML comment in the source or displayed somewhere unobtrusively. This worked fine until recently.
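As a rough sketch of the command-line case, the version string can feed both a startup line and a --version flag. The program name mytool and both function names are made up for illustration, and the version assignment is inlined here only so the snippet runs on its own; in a real project it would come from version.py.

```python
import argparse

# Stand-in for `from version import version`, inlined so the sketch runs alone.
version = "local development version"


def startup_banner(prog: str = "mytool") -> str:
    """Line to print or log when the program starts."""
    return f"{prog} starting, version {version}"


def build_parser(prog: str = "mytool") -> argparse.ArgumentParser:
    """Parser whose --version flag prints the version string and exits."""
    parser = argparse.ArgumentParser(prog=prog)
    parser.add_argument(
        "--version", action="version", version=f"%(prog)s {version}"
    )
    return parser
```

Calling build_parser().parse_args() at startup and then logging startup_banner() covers both uses.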

In following the GitLab instructions on publishing to the built-in package registry I learned that, as is true for all things, there is a PEP describing what’s appropriate here. Right off the bat, “local development version” doesn’t match acceptable version string formats, and neither does 20210113/4de6e191.

PEP 440 prescribes version formats for Python projects. As with all the PEPs I’ve read or skimmed it’s reasoned and descriptive. I’ll have to make a couple of changes.
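To see why the old strings fail, here is a quick check against a deliberately simplified pattern. It covers the normalized spellings plus an optional leading v, not every alternative spelling the PEP tolerates, and the names PEP440_SUBSET and looks_like_pep440 are invented for this sketch.

```python
import re

# Simplified PEP 440 pattern: normalized spellings plus an optional leading
# "v". The full specification accepts more variants than this.
PEP440_SUBSET = re.compile(
    r"""
    v?
    (?:\d+!)?                         # epoch
    \d+(?:\.\d+)*                     # release segment, e.g. 1.1
    (?:(?:a|b|rc)\d+)?                # pre-release
    (?:\.post\d+)?                    # post-release
    (?:\.dev\d+)?                     # development release
    (?:\+[a-z0-9]+(?:\.[a-z0-9]+)*)?  # local version label
    """,
    re.VERBOSE | re.IGNORECASE,
)


def looks_like_pep440(candidate: str) -> bool:
    """True when the whole string matches the simplified pattern."""
    return PEP440_SUBSET.fullmatch(candidate) is not None


for candidate in ("local development version", "20210113/4de6e191", "1.1"):
    print(candidate, looks_like_pep440(candidate))
```

For anything beyond a quick check, the Version class in the packaging library implements the full specification.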

For what I refer to as releases (basically anything I’ve tagged) the format is fine as it is. I use a major.minor scheme, which is generally enough for what I do with Python. This adheres to the Public version identifiers section of the PEP, and hopefully it’s all “regular” users will ever see.

My CI is set up so that any tagged commit is “branded” with the tag into the version string in version.py before the container image is built or the script is packaged and uploaded.
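The branding step might look something like the following. This is a guess at the shape of it rather than the actual pipeline: the function names are hypothetical, and reading the tag from CI_COMMIT_TAG (the variable GitLab CI sets on tag pipelines) is an assumption about the setup.

```python
import os
from pathlib import Path


def brand_version(path: Path, version: str) -> None:
    """Overwrite the version file with a single assignment."""
    path.write_text(f"version = {version!r}\n", encoding="utf-8")


def brand_from_ci(path: Path = Path("version.py")) -> bool:
    """Brand the file when the pipeline runs for a tag; report whether it did.

    Assumption: the tag name arrives in the CI_COMMIT_TAG environment
    variable, as it does on GitLab tag pipelines.
    """
    tag = os.environ.get("CI_COMMIT_TAG")
    if not tag:
        return False
    brand_version(path, tag)
    return True
```

Running brand_from_ci() before the build step leaves the stub untouched on untagged commits.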

But as I mentioned earlier, I need something different for development builds. I still need to deploy these or package them up (though the latter may be much less common once my project’s reached any maturity), so I need a versioning method that complies with PEP 440.

The Local version identifiers section of the PEP doesn’t exactly track to my use case. The primary intent is for API-compatible local patches on upstream projects. However, the format of the local version identifier is robust enough for what I want to express, and the use case in my mind sort of fits if I squint and maybe dim the lights.

The other option I could use would be the Development release format. This is basically x.y.devN where N is a non-negative integer. I could do this based on the commit distance from the last tagged version, so that the release is 1.1 and three commits later the development version deployed is 1.1.dev3. That’s fine and nice and simple and I might go there at some point, but what I don’t like is that with multiple branches there could be two different snapshots of the code with the same version string.
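A small sketch of that scheme shows the collision. The function name dev_release is made up, and the tag and distance are passed in by hand rather than read from Git. One caveat worth knowing: PEP 440 sorts 1.1.dev3 before 1.1, so installers would treat it as a preview of 1.1 rather than something newer.

```python
def dev_release(tag: str, distance: int) -> str:
    """Version for a commit `distance` commits after `tag` (0 is the tag itself)."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    # Drop the optional leading "v" so the result is in normalized form.
    base = tag[1:] if tag.startswith("v") else tag
    return base if distance == 0 else f"{base}.dev{distance}"


# Two branches, each three commits past the 1.1 tag, get the same version.
on_main = dev_release("1.1", 3)
on_feature = dev_release("1.1", 3)
print(on_main, on_feature, on_main == on_feature)
```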

So, back to the local version. The format is, in addition to the public version identifier, a plus sign and then some sequence of letters, digits and periods, starting and ending with a letter or digit. The commit hash (short form) fits nicely, so I could have something like 1.1+4de6e191.
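The label rule is easy to enforce when gluing the two parts together. A minimal sketch, with LOCAL_LABEL and with_local_label as invented names:

```python
import re

# Letters, digits and periods; the first and last characters must be
# alphanumeric, and empty segments such as "a..b" are rejected too.
LOCAL_LABEL = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")


def with_local_label(public: str, label: str) -> str:
    """Append `label` to `public` as a local version label."""
    if LOCAL_LABEL.fullmatch(label) is None:
        raise ValueError(f"not a valid local version label: {label!r}")
    return f"{public}+{label}"


print(with_local_label("1.1", "4de6e191"))
```

The old date-slash-hash form fails here because of the slash, which is not among the permitted characters.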

When looking for the best way to get this hash without a bunch of git log pipe through tail pipe through awk nonsense I came across git describe. Long story short, so long as there is at least one tag in the path from the current node to the root, git describe --tags will come up with a little string like this:

v1.2-12-g894c33c

The above indicates that:

  • v1.2 is the most recent tag reachable from this commit (the v is allowed by PEP 440 but extraneous and discouraged)
  • 12 is the number of commits since that tag
  • g indicates this reference is to a Git repository
  • 894c33c is the seven-digit commit hash uniquely identifying the commit.

So this version scheme tells me at a glance how far from the last release this version is, and if I need to, tells me exactly where in the development history it came from.
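The pieces listed above can be pulled apart with a short regular expression. This is a sketch with hypothetical names (Described, parse_describe); it treats any output that lacks the distance and hash suffix as a bare tag, which is what git describe prints on a tagged commit.

```python
import re
from typing import NamedTuple, Optional


class Described(NamedTuple):
    tag: str
    distance: int
    sha: Optional[str]


# <tag>-<commits since tag>-g<abbreviated hash>
DESCRIBE = re.compile(r"(?P<tag>.+)-(?P<distance>\d+)-g(?P<sha>[0-9a-f]+)")


def parse_describe(text: str) -> Described:
    """Split git describe output into tag, commit distance and short hash."""
    text = text.strip()
    match = DESCRIBE.fullmatch(text)
    if match is None:
        # No suffix: assume the commit itself is tagged.
        return Described(text, 0, None)
    return Described(match["tag"], int(match["distance"]), match["sha"])


print(parse_describe("v1.2-12-g894c33c"))
```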

This doesn’t adhere to the local versioning scheme, though, so I run it through a couple of sed filters that turn the first hyphen into a plus sign and the remaining hyphens into periods:

$ git describe --tags | sed -e 's/-/+/' -e 's/-/./g'
v1.2+12.g894c33c

There we go.
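If the version is needed from Python rather than from a shell step, the same transformation is a few lines. This is one possible equivalent, not part of the original setup; describe_to_local and current_version are names made up for the sketch.

```python
import subprocess


def describe_to_local(described: str) -> str:
    """Mirror the two sed expressions: first "-" becomes "+", the rest "."."""
    public, hyphen, rest = described.strip().partition("-")
    return public + ("+" + rest.replace("-", ".") if hyphen else "")


def current_version(repo: str = ".") -> str:
    """Run git describe --tags in `repo` and convert its output."""
    result = subprocess.run(
        ["git", "describe", "--tags"],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return describe_to_local(result.stdout)


print(describe_to_local("v1.2-12-g894c33c"))
```

Like the sed version, this splits at the first hyphen, so a tag name that itself contains a hyphen would be cut in the wrong place.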

The only other thing is, returning to the original version stub given above, local development version doesn’t comply, and if I’m just testing the packaging locally, this can be a problem. I’ve now updated this string to be 0+local.dev.version. It’s ugly but it works and nobody should see it but me.

