Versioning Python projects
In the recent past I’ve typically versioned software based on either its most recent tag or, for non-release versions, the commit short SHA: the seven- or eight-digit hexadecimal hash of the latest commit. Because commit hashes are neither ordered nor intrinsically memorable, I’ll add the commit’s date in the form YYYYMMDD to remind myself at a glance how old the thing I’m looking at is. A major benefit of this is that it tells me whether I’m debugging the most recent version of the software. With automated deployments it’s possible that some technohiccup or stray gamma radiation interferes and the updated app hasn’t been deployed yet.
For the first time ever this week I packaged up a small program for inclusion in a private PyPI index. The primary reason is that the audience for the script is sysadmins, who are not necessarily Python programmers or even used to using Git.
In the code, the version string is typically defined in a file, version.py:
version = 'local development version'
This file is then included wherever the version string is needed. In a command-line program, it’s probably printed out or logged at startup; in a web application it’ll be available via an HTML comment in the source or displayed somewhere unobtrusively. This worked fine until recently.
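As a rough sketch of the command-line case, the snippet below wires the version string into argparse’s built-in version action and logs it at startup. The program name mytool and the function names are made up for illustration, and the version string is inlined rather than imported from version.py so the example runs on its own.

```python
import argparse
import logging

# In a real project this would be `from version import version`;
# it is inlined here so the sketch is self-contained.
version = "local development version"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose --version flag prints the version string."""
    parser = argparse.ArgumentParser(prog="mytool")  # hypothetical program name
    parser.add_argument(
        "--version",
        action="version",
        version=f"%(prog)s {version}",
    )
    return parser


def main(argv=None) -> None:
    """Parse arguments, then log the version once at startup."""
    build_parser().parse_args(argv)
    logging.basicConfig(level=logging.INFO)
    logging.info("starting mytool, version %s", version)
```

Calling main(["--version"]) prints the program name and version and exits; calling main([]) just logs the startup line.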
In following the GitLab instructions on publishing to the built-in package registry I learned that, as is true for all things, there is a PEP describing what’s appropriate here. Right off the bat, “local development version” doesn’t match the acceptable version string formats, and neither does 20210113/4de6e191.
PEP 440 prescribes version formats for Python projects. As with all the PEPs I’ve read or skimmed it’s reasoned and descriptive. I’ll have to make a couple of changes.
For what I refer to as releases (basically anything I’ve tagged) the format is fine as it is. I use a major.minor scheme, which is generally enough for what I do with Python. This adheres to the Public version identifiers section of the PEP, and hopefully this is all “regular” users will ever see. My CI is set up so that any tagged commit is “branded” with the tag: the tag is written into the version string in version.py before the container image is built or the script is packaged and uploaded.
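The branding step could look roughly like the sketch below. This is an illustration rather than my actual CI script: the function names are invented, and it assumes the tag arrives in the CI_COMMIT_TAG environment variable, which is what GitLab CI sets on tag pipelines. Other CI systems will use a different variable.

```python
import os
from pathlib import Path


def write_version_file(version: str, path: Path = Path("version.py")) -> None:
    """Overwrite the version module with a single assignment."""
    # repr() takes care of quoting the string safely.
    path.write_text(f"version = {version!r}\n", encoding="utf-8")


def brand_from_environment(path: Path, env=os.environ) -> bool:
    """Write the tag into the version file if this is a tagged build.

    Returns True if the file was written, False if no tag was found.
    Assumes the tag name is exposed as CI_COMMIT_TAG (GitLab CI).
    """
    tag = env.get("CI_COMMIT_TAG", "")
    if not tag:
        return False
    write_version_file(tag, path)
    return True
```

Run before the build or packaging step, this leaves version.py holding the tag for tagged commits and untouched otherwise.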
But as I mentioned earlier, I need something different for how I version development builds. I still need to deploy these or package them up (though the latter may be much less common once my project has reached some maturity), so I need a versioning method that complies with PEP 440.
The Local version identifiers section of the PEP doesn’t exactly track to my use case. The primary intent is for API-compatible local patches on upstream projects. However, the format of the local version identifier is robust enough for what I want to express, and the use case in my mind sort of fits if I squint and maybe dim the lights.
The other option would be the Development release format. This is basically x.y.devN, where N is a non-negative integer. I could base N on the commit distance from the last tagged version, so that the release is 1.1 and three commits later the development version deployed is 1.1.dev3. That’s fine and nice and simple and I might go there at some point, but what I don’t like is that with multiple branches there could be two different snapshots of the code with the same version string.
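To make the collision concrete, here is a minimal sketch of the commit-distance idea. The function name dev_version is hypothetical, and it takes the tag and the distance as plain arguments rather than asking Git for them.

```python
def dev_version(last_tag: str, commits_since_tag: int) -> str:
    """Build an x.y.devN string from the last tag and the commit distance."""
    if commits_since_tag < 0:
        raise ValueError("commit distance cannot be negative")
    # Drop a single leading "v" so "v1.1" and "1.1" behave the same.
    release = last_tag[1:] if last_tag.startswith("v") else last_tag
    if commits_since_tag == 0:
        return release  # exactly on the tag: this is the release itself
    return f"{release}.dev{commits_since_tag}"


# Two branches that are each three commits past the 1.1 tag
# produce the same string even though the code differs.
branch_a = dev_version("1.1", 3)
branch_b = dev_version("1.1", 3)
print(branch_a, branch_b, branch_a == branch_b)
```

One thing worth double-checking before adopting this: as I read PEP 440, a development release sorts before the release it names, so 1.1.dev3 is considered older than 1.1, not newer.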
So, back to the local version. The format is the public version identifier, followed by a plus sign and then some sequence of letters, digits and periods, starting and ending with a letter or digit. The commit hash (short form) fits nicely, so I could have something like 1.1+4de6e191.
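A quick way to sanity-check candidate strings is a small regular expression. The one below is a deliberately simplified sketch, not the full pattern from the PEP: it only knows about plain release numbers, an optional .devN suffix and an optional local part, and ignores epochs and pre/post-releases. The name looks_like_pep440 is made up.

```python
import re

# Simplified subset of PEP 440: optional "v", dotted release numbers,
# optional ".devN", optional "+local" made of dot-separated alphanumerics.
_PUBLIC = r"v?\d+(?:\.\d+)*(?:\.dev\d+)?"
_LOCAL = r"[a-z0-9]+(?:\.[a-z0-9]+)*"
_VERSION_RE = re.compile(rf"{_PUBLIC}(?:\+{_LOCAL})?", re.IGNORECASE)


def looks_like_pep440(candidate: str) -> bool:
    """Return True if the string fits the simplified pattern above."""
    return _VERSION_RE.fullmatch(candidate.strip()) is not None


for candidate in [
    "1.1",
    "1.1.dev3",
    "1.1+4de6e191",
    "local development version",
    "20210113/4de6e191",
]:
    print(f"{candidate!r}: {looks_like_pep440(candidate)}")
```

For anything beyond a quick check, a real parser for PEP 440 versions is the safer choice, since this pattern will reject valid forms it doesn’t know about.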
When looking for the best way to get this hash without a bunch of git log piped through tail piped through awk nonsense, I came across git describe. Long story short, so long as there is at least one tag in the path from the current commit to the root, git describe --tags will come up with a little string like this:
v1.2-12-g894c33c
The above indicates that:
- v1.2 is the most recent tag in this branch (the v is allowed by PEP 440 but extraneous and discouraged)
- 12 is the number of commits since that tag
- g indicates this reference is to a Git repository
- 894c33c is the seven-digit commit hash uniquely identifying the commit.
So this version scheme tells me at a glance how far from the last release this version is, and if I need to, tells me exactly where in the development history it came from.
This doesn’t adhere to the local versioning scheme though so I have to run it through a couple of filters:
$ git describe --tags | sed -e 's/-/+/' -e 's/-/./g'
v1.2+12.g894c33c
There we go.
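If the conversion needs to live in Python rather than a shell pipeline, the same two substitutions can be sketched as below. The function names are my own invention; git_describe simply shells out to the same git describe --tags command and is not called unless you ask for it.

```python
import subprocess


def git_describe() -> str:
    """Return the output of `git describe --tags` for the current directory."""
    result = subprocess.run(
        ["git", "describe", "--tags"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def describe_to_local_version(described: str) -> str:
    """Mirror the sed filters: first '-' becomes '+', the rest become '.'."""
    head, sep, rest = described.strip().partition("-")
    if not sep:
        # No hyphen at all: the commit is exactly on a tag.
        return head
    return f"{head}+{rest.replace('-', '.')}"


print(describe_to_local_version("v1.2-12-g894c33c"))
```

Like the sed version, this assumes the tag itself contains no hyphens; a tag such as v1.2-rc1 would end up split in the wrong place.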
The only other thing is, returning to the original version stub given above, local development version doesn’t comply, and if I’m just testing the packaging locally, this can be a problem. I’ve now updated this string to be 0+local.dev.version. It’s ugly but it works and nobody should see it but me.