Now that the broken laptops are mostly in a usable state (well, about 67% anyway) and I have some spare time at the end of vacation, I want to get Kubernetes running on them. First, I need a base install. And I don’t want to have to do that manually. Setting up a PXE boot server so these things automatically install an OS should be fairly straightforward, right?
Why do I punish myself this way?
Despite having managed systems like this before, for literally years, this was more painful than I expected. I never learn, though–I do things differently every time (because exciting! new! things!) and then I have to figure out a whole bunch of new stuff even though I’m doing the same old thing. (Because I’m not, really.)
In January 2018 I bought a bag (literally) of broken laptops from a local guy wanting to get out of the laptop-repair business. Of the six laptops, all of which had higher specs than my home desktop–I am not a fancy man–I eventually managed to get five of them to work in some capacity. The sixth, alas, had no screen, which meant it would completely refuse to boot and according to the intertubes there was no way around it, at least with that model.
The most powerful laptop with the nicest screen had a broken and irreparable hinge assembly, so it was basically useless for laptop use. I re-soldered the loose power connection, which was incredibly tricky given the lack of wire to work with, and mounted it on a piece of aluminum tread plate I bought at a hardware store, and now it hangs on my wall. With a wireless keyboard/mouse combo, it’s my new workstation/objet d’art.
That leaves me with four laptops for my Kubernetes cluster, which is completely adequate. One still has a thermal issue because it requires a thermal pad instead of paste for its graphics chip, and none of the local shops know what a thermal pad is. So I have three I can use.
Except I’m temporarily using one for something else and haven’t freed it up yet, so I have two. But that’s fine. I will have one master and one worker. That’s good enough to start, and with this automated process I should be able to rebuild pretty easily. So let’s go.
I am working towards a Kubespray installation. I have used Kubespray at work a number of times (and have submitted various patches to its configuration and documentation) but that has all been with OpenStack. Since I will be using bare metal, deploying Kubernetes should be quite a bit simpler. I have found with Kubespray that the bug discovery, patching and reconfiguration I have to do literally every time I deploy is with the Terraform stuff that handles OpenStack to build a node inventory. The Ansible part that takes over is generally fairly solid, since it’s much more tested.
So to that end I need to have these laptops built up with a base OS install with SSH access, hopefully passwordless. Each node must have its hostname defined but that’s about it.
Build up intranet management server
I have set up a separate intranet for the Kubernetes cluster. My old workstation, Blue, has two NICs and I’ve bridged the second, previously unused NIC for this purpose. Blue also has mirrored hard drives and so is a sturdy box for the more critical tasks. It runs LXC containers under LXD. LXC containers are more system-like than Docker containers, so they’re sort of like “lite” VMs. One of these containers, Cyan, will be the intranet management server and will tell the laptops who they are and what they do.
To start with, I created a bridge interface, k8sbr, on Blue using the second NIC. I now use the k8sbr device instead of eth1. I have the following interface configuration:
# k8s cluster
auto k8sbr
iface k8sbr inet static
    address 192.168.4.1
    network 192.168.4.0
    netmask 255.255.255.0
    gateway 192.168.4.1
    broadcast 192.168.4.255
    dns-nameservers 192.168.1.2 192.168.1.254
    bridge_ports eth1
    bridge_fd 0
    bridge_maxwait 0
Blue is an Ubuntu 16.04 box. Newer releases use Netplan so that configuration will be different.
When I originally created Cyan, I built it up using lxc commands. I then set this whole project aside for months. Now I have rebuilt Cyan using Terraform, which has worked quite well, although I am still working some things out.
Once Cyan was up, I added the bridged network for the Kubernetes cluster using:
lxc config device add cyan eth1 nic nictype=bridged parent=k8sbr
This means Cyan has the regular LXD network on eth0 and the Kubernetes (“k8s”) intranet on eth1. This allows me to get to Cyan from workstations without using a bastion host (aka jump host) because the default network for the LXC containers is bridged to the normal house intranet.
I also have the Ubuntu 18.04 ISO mounted on Blue and make it available on Cyan with the following:
lxc config device add cyan u18iso disk source=/mnt/ubuntu-18.04.2-iso path=/mnt/ubuntu-18.04.2-iso
This will be necessary later.
Once Cyan is ready for direct management, I log in and set up DHCP and DNS services. In a previous iteration I used the ISC DHCP server package, with which I am somewhat familiar. On the second build (remember how I mentioned I set this aside for some time) I restarted with Dnsmasq. This is what is used on OpenWRT, which I ran for years on an early-generation WRT-54G router. It’s simple, well-documented and stable, it handles DHCP, DNS and TFTP, and it ships with a sample configuration for PXE booting.
The following is the result of grep -Ev '^#|^\s*$' /etc/dnsmasq.conf with some comments added back in:
# I don't want to serve DHCP for the house intranet. This is badly serviced
# by the wireless router--I might replace that but not yet.
no-dhcp-interface=eth0
# Add domain to hostnames
expand-hosts
domain=k8s
# DHCP range
dhcp-range=192.168.4.16,192.168.4.128,12h
# read MAC addresses from /etc/ethers, which is just sort of tidy
read-ethers
# these provide necessary information for PXE booting
dhcp-option-force=208,f1:00:74:7e
dhcp-option-force=210,/tftp/
dhcp-option-force=211,30i
dhcp-boot=pxelinux.0
# enable and configure TFTP
enable-tftp
tftp-root=/tftp
# only serve files under /tftp which are owned by user running dnsmasq
tftp-secure
Dnsmasq is straightforward enough that this basically worked on the first try. It took me a little time to work through the well-commented, self-documented /etc/dnsmasq.conf provided by the APT package, but I basically got everything right. This is a pretty impressive piece of software.
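The read-ethers option above relies on /etc/ethers, whose contents are not shown here. As a rough sketch only, the following renders one fixed address per node, assuming the documented ethers format of a hardware address followed by an IP address, plus matching /etc/hosts lines. The node names, MAC addresses and the starting host number are all made up for illustration.

```python
import ipaddress

# Hypothetical inventory: these names and MAC addresses are invented.
NODES = {
    "node1": "AA:BB:CC:DD:00:11",
    "node2": "AA:BB:CC:DD:00:12",
}


def render_reservations(nodes, network="192.168.4.0/24", first_host=20):
    """Return (ethers_text, hosts_text) with one fixed address per node.

    first_host is an arbitrary starting host number chosen for this sketch.
    """
    net = ipaddress.ip_network(network)
    ethers, hosts = [], []
    for offset, (name, mac) in enumerate(sorted(nodes.items())):
        ip = net.network_address + first_host + offset
        if ip not in net:
            raise ValueError(f"{name}: {ip} falls outside {net}")
        ethers.append(f"{mac.lower()} {ip}")  # /etc/ethers: MAC then address
        hosts.append(f"{ip} {name}")          # /etc/hosts: address then name
    return "\n".join(ethers) + "\n", "\n".join(hosts) + "\n"


if __name__ == "__main__":
    ethers_text, hosts_text = render_reservations(NODES)
    print(ethers_text, end="")
    print(hosts_text, end="")
```

Generating both files from one inventory keeps the MAC-to-address and address-to-name mappings from drifting apart; whether Dnsmasq then hands out the hostname as well depends on how the rest of the configuration is set up.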
Cyan will also need to provide an outgoing network route for the k8s intranet. Enter good old IP masquerading.
The following commands must be applied to support this:
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
iptables -A FORWARD -i eth1 -o eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -i eth0 -o eth1 -j ACCEPT
Also, IP forwarding must be turned on, which seems to be the default:
net.ipv4.ip_forward = 1
If not, this can be set at runtime with sysctl and made persistent by adding the line to /etc/sysctl.conf or to an appropriate sysctl drop-in file.
PXE stands for Preboot eXecution Environment and is a set of instructions given to a booting computer to tell it what to do before it loads the OS. When you choose “Network boot” from your computer’s BIOS boot manager, this is what it means.
PXE booting works by sending out a DHCP query and receiving as part of the normal response an extended portion that identifies a file to download and execute. This file is similar to the SYSLINUX boot execution typically used to boot Linux machines and is actually managed under that project. This sets up a minimal operating environment to begin OS installation.
TFTP boot and sources
TFTP (Trivial File Transfer Protocol) is handled by Dnsmasq, as configured above. TFTP is a basic file transfer service with no authentication and presumably simpler to implement than a basic web server and client (full disclosure: I’ve never implemented TFTP or HTTP protocols) and so suitable for the minimal operating footprint of a booting client.
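To give a feel for how small the protocol is, here is a minimal sketch of the three packet types a read-only client needs, following the layout in RFC 1350. It only builds and parses bytes; it does not open a socket, and the function names are illustrative rather than taken from any library.

```python
import struct

# Opcodes from RFC 1350
OP_RRQ = 1   # read request
OP_DATA = 3  # data block
OP_ACK = 4   # acknowledgement


def build_rrq(filename, mode="octet"):
    """Encode a read request: opcode, filename, NUL, transfer mode, NUL."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def build_ack(block):
    """Encode an acknowledgement for the given block number."""
    return struct.pack("!HH", OP_ACK, block)


def parse_data(packet):
    """Split a DATA packet into (block number, payload)."""
    opcode, block = struct.unpack("!HH", packet[:4])
    if opcode != OP_DATA:
        raise ValueError(f"expected DATA (3), got opcode {opcode}")
    return block, packet[4:]


if __name__ == "__main__":
    print(build_rrq("pxelinux.0"))
```

A client sends the read request, then acknowledges each numbered data block until a short block marks the end of the file. There is no login step anywhere in that exchange, which is why the tftp-secure restriction in the Dnsmasq configuration is worth having.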
The first file made available for PXE booting is pxelinux.0. This is the file pointed to by the DHCP extensions described above. Once this loads, it looks for a PXE configuration for the host by searching pxelinux.cfg for the first file matching the host, in the following hierarchy:
abcdef01-2345-6789-abcd-ef0123456789
01-AA-BB-CC-DD-00-11
C0A8025B
C0A8025
C0A802
C0A80
C0A8
C0A
C0
C
default
In this example, the first file will match a client with this UUID, the second will match a client with that MAC address (prepended with 01-), and the files after that represent IPv4 addresses converted to hex with decreasing specificity, until the last matches any host that hasn’t matched any of the others.
I have only created default because any host using this DHCP server during a network boot should be an install candidate. When the machines boot off of their hard drives and are already running the OS when they first send out a DHCP request, they will ignore the PXE boot extensions. (Possibly they won’t request them. I could go look that up, but not right now.)
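The lookup hierarchy can be sketched in a few lines. This is an illustration of the naming scheme only, not PXELINUX's own code, and the function name is made up. It lowercases the UUID and MAC address because that is how the PXELINUX documentation describes those file names, while the IP address is rendered as uppercase hex.

```python
import ipaddress


def pxelinux_candidates(mac, ip, uuid=None):
    """List the pxelinux.cfg file names tried for a client, most specific first."""
    names = []
    if uuid:
        names.append(uuid.lower())
    # ARP hardware type 01 (Ethernet), then the MAC with dashes
    names.append("01-" + mac.lower().replace(":", "-"))
    # IPv4 address as eight uppercase hex digits, dropping one digit per step
    hex_ip = f"{int(ipaddress.IPv4Address(ip)):08X}"
    for length in range(len(hex_ip), 0, -1):
        names.append(hex_ip[:length])
    names.append("default")
    return names


if __name__ == "__main__":
    for name in pxelinux_candidates("AA:BB:CC:DD:00:11", "192.168.2.91"):
        print(name)
```

Because each step drops one hex digit, a shorter name covers a wider block of addresses. A file named C0A804, for instance, would match any client in 192.168.4.0/24, which could be a middle ground between a per-host file and default.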
I also recursively copy the contents of /mnt/ubuntu-18.04.2-iso/install/netboot to the TFTP root. These files are either expected by the PXE boot binary in pxelinux.0 or referenced by the configuration, described in the next section.
The PXE configuration contained in /tftp/pxelinux.cfg/default looks like this:
DEFAULT linux
SAY Now starting k8s node install from PXE...
LABEL linux
  KERNEL ubuntu-installer/amd64/linux
  APPEND auto=true vga=788 initrd=ubuntu-installer/amd64/initrd.gz \
    preseed/url=tftp://192.168.4.3/preseed/k8s.preseed \
    preseed/interactive=false locale=en_CA.UTF-8 \
    console-setup/ask_detect=false console-setup/layoutcode=us \
    keyboard-configuration/layoutcode=us \
    mirror/http/mirror=ca.archive.ubuntu.com netcfg/get_hostname=
This tells the boot loader which installer kernel and initial ramdisk to load and where to find the preseed configuration, and it sets some options for the Ubuntu installer that are needed before the preseed file is downloaded, such as locale and keyboard layout.
Preseeding was not straightforward.
Preseeding is Ubuntu’s (actually, Debian’s) automation of its installer. Every decision requiring interactive response in the Ubuntu installer can have an answer pre-selected in the preseeding configuration. This is similar to the KickStart system developed by Red Hat for that distribution, but less well documented, less user-friendly and less popular–according to my experience, opinion and impression, respectively, at least. I have experience with KickStart and found Ubuntu’s preseeding to be rather finicky.
Actually, scratch that: samples are well-commented, and I basically configured the preseeding configuration using a current example from the 18.04 documentation in the same manner as I configured Dnsmasq, but a lot more tweaking, online research and frustration was required.
Preseeding configuration is way out of scope, but for reference, here is the configuration that has worked at this point:
d-i debian-installer/language string en
d-i debian-installer/country string CA
d-i debian-installer/locale string en_US.UTF-8
d-i console-setup/ask_detect boolean false
d-i keyboard-configuration/xkb-keymap select us
d-i netcfg/choose_interface select auto
d-i netcfg/get_hostname string unassigned-hostname
d-i netcfg/get_domain string unassigned-domain
d-i netcfg/wireless_wep string
d-i mirror/country string ca
d-i mirror/http/hostname string ca.archive.ubuntu.com
d-i mirror/http/directory string /ubuntu
d-i mirror/http/proxy string
d-i passwd/root-login boolean true
d-i passwd/make-user boolean false
d-i passwd/root-password-crypted password <crypt>
d-i user-setup/encrypt-home boolean false
d-i clock-setup/utc boolean true
d-i time/zone string Canada/Pacific
d-i clock-setup/ntp boolean true
d-i partman-auto/disk string /dev/sda
d-i partman-auto/method string regular
d-i partman-lvm/device_remove_lvm boolean true
d-i partman-md/device_remove_md boolean true
d-i partman-auto/choose_recipe select atomic
d-i partman-md/confirm boolean true
d-i partman-partitioning/confirm_write_new_label boolean true
d-i partman/choose_partition select finish
d-i partman/confirm boolean true
d-i partman/confirm_nooverwrite boolean true
d-i live-installer/net-image string tftp://192.168.4.3/ubuntu/install/filesystem.squashfs
tasksel tasksel/first multiselect openssh-server
d-i pkgsel/upgrade select safe-upgrade
d-i pkgsel/language-packs multiselect en
d-i pkgsel/update-policy select unattended-upgrades
d-i pkgsel/updatedb boolean false
d-i grub-installer/only_debian boolean true
d-i grub-installer/with_other_os boolean true
d-i grub-installer/bootdev string /dev/sda
d-i finish-install/reboot_in_progress note
d-i cdrom-detect/eject boolean false
This almost certainly involves unnecessary statements as I struggled to figure out how to skip the mirror selection. It’s worth grabbing a recent example configuration and working through the well-commented file, with the above as an added reference for a known-to-work configuration.
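Since a malformed line is an easy way to lose an afternoon, a small sanity check before copying the file to the TFTP root may help. The sketch below assumes the usual preseed line layout of owner, question name, question type and an optional value. It is a hypothetical helper, not part of the installer, and it does not handle backslash-continued lines.

```python
def parse_preseed(text):
    """Parse preseed text into (owner, question, type, value) tuples.

    Blank lines and comment lines are skipped. A line with fewer than
    three fields raises ValueError that names the offending line number.
    """
    entries = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split into at most four fields so the value keeps its own spaces
        parts = line.split(None, 3)
        if len(parts) < 3:
            raise ValueError(
                f"line {lineno}: expected owner, question and type: {raw!r}"
            )
        owner, question, qtype = parts[:3]
        value = parts[3] if len(parts) == 4 else ""
        entries.append((owner, question, qtype, value))
    return entries


if __name__ == "__main__":
    sample = (
        "d-i netcfg/get_hostname string unassigned-hostname\n"
        "d-i mirror/http/proxy string\n"
        "tasksel tasksel/first multiselect openssh-server\n"
    )
    for entry in parse_preseed(sample):
        print(entry)
```

This only checks the shape of each line. It cannot tell whether a question name is one the installer actually asks, so a test install is still the real verification.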
I have configured the BIOS on the client laptops to boot from the hard drive by default. In this way, if I ever want to rebuild the OS on one of these machines, I press F12 at the BIOS boot screen to engage the network boot, but otherwise, the laptop will boot as previously configured. It is possible to control this via the PXE boot process itself by configuring the BIOS to boot from the network and updating the PXE configuration to instruct the machine to boot from its disk or re-install as desired, but for this situation this is simpler to implement and manage going forward.
In order to avoid confusion about what NIC to boot with I have removed the WiFi card from each laptop. This is a very simple operation on most laptops for some reason–changing out the hard drive is often harder. I could specify the NIC in the boot parameters but since I don’t need WiFi and I don’t want to worry about configuring (disabling) unnecessary networks (and attack vectors) I have simply removed the cards. I’ve set them aside just in case, and will undoubtedly come across them years from now and remember this project fondly, long after the laptops have been retired and recycled.