Adding bricks to the k8s/gluster cluster
I’ve brought a second node into the cluster, but it didn’t go perfectly right off the bat, so the third brick will be the proof. The reason it failed is that I had some fancy automation set up that had already created an unattached volume on the second node. Adding pre-existing volumes is ludicrous, now that I think about it, and I only thought about it once I tried to do it and was told “No.”
Here’s the story.
Adding the second brick
I’ve expanded MicroK8s clusters before but never Gluster, so let’s try Gluster first. I’ve already got a volume going, and I want to extend the volume to another brick–another storage node. I’ve got the second node set up similarly, with the same size partition mounted with the same filesystem. In fact, I’ve gone the extra distance and created a volume there, exactly the same as I have done on the first node.
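For the record, the storage prep on the new node amounts to something like the following; the device name and XFS are my assumptions for this sketch, so substitute whatever the first node actually uses:

# on brick1: make the filesystem and mount it
# (/dev/sdb1 and XFS are assumptions, not necessarily what I used)
sudo mkfs.xfs -i size=512 /dev/sdb1
sudo mkdir -p /srv/brick1
echo '/dev/sdb1 /srv/brick1 xfs defaults 1 2' | sudo tee -a /etc/fstab
sudo mount -a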
To add a node to a Gluster cluster, one probes it from an existing node, which adds it to the pool of trusted peers. This failed immediately.
brick0 $ sudo gluster
gluster> peer probe brick1
peer probe: failed: brick1 is either already part of another cluster or having volumes configured
This makes sense. So eventually I go to brick1 and issue sudo gluster volume stop gv0, then sudo gluster volume delete gv0. Now Gluster on brick1 believes the volume is gone, and I was able to connect from brick0.
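From memory, that exchange on brick1 looked roughly like this (a sketch; the prompts may not be word-for-word), after which the probe from brick0 succeeded:

brick1 $ sudo gluster
gluster> volume stop gv0
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: gv0: success
gluster> volume delete gv0
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: gv0: success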
gluster> peer status
Number of Peers: 0
gluster> peer probe brick1
peer probe: success.
gluster> peer status
Number of Peers: 1

Hostname: brick1
Uuid: 5d0e6754-4211-4105-b001-58bc83dc4bd6
State: Peer in Cluster (Connected)
Next I tried to add a Gluster brick and that didn’t go well, either. In three attempts, I got the following messages from Gluster:
volume add-brick: failed: Pre Validation failed on brick1. Failed to create brick directory for brick brick1:/data/brick1/gv0. Reason : No such file or directory
This again. I don’t remember if I did anything on brick1.
volume add-brick: failed: Pre Validation failed on brick1. /srv/brick1/gv0 is already part of a volume
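(For what it’s worth, the usual remedy for this error is to either remove the old brick directory or strip the Gluster attributes that mark it as a brick. A sketch, run on brick1 against the path from the message; destructive, so only on a brick you mean to abandon:

sudo setfattr -x trusted.glusterfs.volume-id /srv/brick1/gv0
sudo setfattr -x trusted.gfid /srv/brick1/gv0
sudo rm -rf /srv/brick1/gv0/.glusterfs

I didn’t know that at the time, hence what follows.)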
Here’s where I go back to brick1 and try to clean up everything again. What I didn’t realize is that the previous attempt had actually managed to get both bricks into the volume. I’m not certain of this, because I never checked on brick0: it looked like the operation had been completely unsuccessful. But it must have, and I must have issued the command to delete gv0, because then, on the third attempt in this sequence to add brick1’s gv0:
volume add-brick: failed: Unable to get volinfo for volume name gv0
I find there are no volumes in the cluster at all, not even the one on brick0. The data was still there on disk, and I could even see the test data I’d placed there yesterday; I’d just stopped the volume and torn down its definition. I spent a little time trying to find the equivalent of mdadm’s --assemble, something that would make Gluster recognize an inactive volume and re-activate it. In the end it was this:
gluster> volume create gv0 brick0:/srv/brick1/gv0
volume create: gv0: failed: /srv/brick1/gv0 is already part of a volume
gluster> volume create gv0 brick0:/srv/brick1/gv0 force
volume create: gv0: success: please start the volume to access data
gluster> volume start gv0
volume start: gv0: success
So, back to where I was at the beginning, but I’ve got my volume back.
In the end what worked was this: with brick0’s gv0 back to being a single-brick volume, I went to brick1, stopped and deleted its gv0, stopped Gluster, and removed the /srv/brick1/gv0 directory. Then I started Gluster back up, went back to brick0, saw the peer was connected again using peer status, and then:
gluster> volume add-brick gv0 replica 2 brick1:/srv/brick1/gv0
Replica 2 volumes are prone to split-brain. Use Arbiter or Replica 3 to avoid this. See: http://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/.
Do you still want to continue? (y/n) y
volume add-brick: success
By the way, before confirming, I went and found documentation on this (the link given in the warning was a 404), and I think I’m fine since I’ll be adding another replica after this. (Split-brain is when multiple storage nodes have inconsistent views of the data; I assume that with three replicas, quorum will be more effective.)
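If I want that quorum to actually do something, my understanding (noted for later, not something I’ve applied yet) is that it’s enforced through volume options, along these lines:

gluster> volume set gv0 cluster.quorum-type auto
gluster> volume set gv0 cluster.server-quorum-type server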
I poked around a bit and it looks like all the basics are working. I am ready to go with the third node, and then to start expanding k8s similarly.
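The poking was along these lines: check the volume’s info and status, then write a file through a client mount and confirm it shows up in the brick directory on both nodes. Roughly (the /mnt mount point is just an example):

brick0 $ sudo gluster volume info gv0
brick0 $ sudo gluster volume status gv0
brick0 $ sudo mount -t glusterfs brick0:/gv0 /mnt
brick0 $ echo hello | sudo tee /mnt/hello.txt
brick1 $ ls /srv/brick1/gv0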
Adding the third brick
I brought it up with Gluster installed and the partition and filesystem ready, so on the first brick:
$ sudo gluster
gluster> peer probe brick2
peer probe: success.
gluster> peer status
Number of Peers: 2

Hostname: brick1
Uuid: 5d0e6754-4211-4105-b001-58bc83dc4bd6
State: Peer in Cluster (Connected)

Hostname: brick2
Uuid: dc05b74e-0d8a-4704-b1d0-c855fdbf79ca
State: Peer in Cluster (Connected)
gluster> volume add-brick gv0 replica 3 brick2:/srv/brick1/gv0
volume add-brick: success
gluster>
And that’s that.
Now back to k8s on the second brick
Bring up second node and configure it as for the first
Remember to create a new CSR template like the one for the first brick, and run
sudo microk8s refresh-certs
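That template lives at /var/snap/microk8s/current/certs/csr.conf.template; the change is adding the node’s address under [ alt_names ]. The index and IP below are examples for this setup, not gospel:

# /var/snap/microk8s/current/certs/csr.conf.template (excerpt)
[ alt_names ]
DNS.1 = kubernetes
IP.1 = 127.0.0.1
IP.2 = 10.152.183.1
IP.3 = 10.0.0.17    # this node's address (example)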
Enable the important add-ons I forgot to enable earlier, on each node:
$ microk8s enable dns storage rbac
From first node, add node:
$ microk8s add-node
Copy the given command and paste it into a terminal on the second brick.
$ microk8s join 10.0.0.16:25000/a1ca3395f08c9796977c3acca0689a2c
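After the join completes, it’s worth confirming both nodes show up; I’d check from either node with:

$ microk8s kubectl get nodes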
Repeating this for the third brick went the same way.
I now have a high-availability K8s cluster:
$ microk8s status
microk8s is running
high-availability: yes
  datastore master nodes: 10.0.0.16:19001 10.0.0.17:19001 10.0.0.18:19001
  datastore standby nodes: none
And that’s the cluster.
There’s my bricks. The next step is to deploy a few things here that I want and kick at the tires a bit before I go to the next level and invest the time in getting this going on the laptops.
The first thing to do in deploying this on the laptops will be to plan it out. For example, unlike these virtual machines, the laptops are heterogeneous and, particularly relevant for Gluster, have different-sized hard drives. Probably I’ll maximize the disk use for Gluster on the smallest one (that is, everything but the OS partition), and that will be the size of the Gluster partition on the rest.
Choosing the right-sized partitions will be trickier with the laptops as well, because I won’t be able to change them later. The laptops’ hard drives are also quite a bit bigger than I expect to need for local storage, so maybe splitting the space with CockroachDB might be useful, but I don’t see needing a tonne of space for that either.