Cluster, Part 6: Superglue Service Discovery | Setting up Consul | Stardust

Hey, welcome back to another weekly installment of cluster configuration for colossal computing control. Apparently I'm managing to keep this up as a weekly series every Wednesday.

Last week, we sorted out managing updates to the host machines in our cluster to keep them fully patched. We achieved this by first setting up an apt caching server with apt-cacher-ng. Then we configured our host machines to use it. Finally, we set up automated updates with unattended-upgrades so we don't have to keep installing them manually all the time.

For reference, here are all the posts in this series so far:

In this part, we're going to install and configure Consul - the first part of the Hashicorp stack. Consul doesn't sound especially exciting, but it is an extremely important part of our (diabolical? I never said that) plan. It serves a few purposes:

Clusters together, so Nomad (the task scheduler) can find other nodes
Keeps track of which services are running where

It uses the Raft Consensus Algorithm between its servers, plus a gossip protocol for keeping track of cluster membership (the gossip side is what wesher from part 4 uses too; it would appear they share the same library under-the-hood), to provide a relatively decentralised approach to the problem, allowing for some nodes to fail without impacting the cluster's operation as a whole.

It also provides a DNS API, which we'll be tying into with Unbound later in this post.

Before continuing, you may find reading through the official Consul guides a useful exercise. Try out some of the examples too to get your head around what Consul is useful for.

(Above: NASA's DSN dish in Canberra, Australia just before major renovations are carried out. Credit: NASA/Canberra Deep Space Communication Complex)

Installation and Preamble

To start with, we need to install it. I've done the hard work of packaging it already, so you can install it from my apt repository - which, if you've been following this series, you should already have set up (if not, follow the link and read the instructions there).

Install consul like this:

sudo apt install consul

Next, we need a systemd service file. Again, I have packages in my apt repository for this. There are 2 packages:

hashicorp-consul-systemd-client
hashicorp-consul-systemd-server

The only difference between the 2 packages is where it reads the configuration file from. The client package reads from /etc/consul/client.hcl, and the server from /etc/consul/server.hcl. They also conflict with each other, so you can't install both at the same time. This is because - as far as I can tell - servers can expose the client interface in just the same way as any other normal client.

To get a feel for the bigger picture, let's talk architecture. Because Consul uses the Raft Consensus Algorithm, we'll want an odd number of servers (Raft needs a strict majority of servers to agree on who's the current leader of the cluster, so adding an even-numbered server raises the size of that majority without letting you lose any more servers). In my case, I have 5 Raspberry Pi 4s:

1 x 2GB RAM (controller)
4 x 4GB RAM (workers)

In this case, I'm going to use the controller as my first Consul server, and pick 2 of the workers at random to be my other 2, to make up 3 servers in total. Note that in future parts of this series you'll need to keep track of which ones are the servers, since we'll be doing this all over again for Nomad.
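To make the arithmetic behind the server count concrete, here's a minimal sketch. The function names are made up for illustration and aren't part of Consul; the only assumption is that Raft needs a strict majority of servers to be reachable.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of Consul): Raft needs a strict majority of servers.

# Smallest number of servers that forms a majority of $1 servers.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# How many servers can fail while a majority is still reachable.
failure_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for servers in 3 4 5; do
    echo "${servers} servers: quorum $(quorum_size "$servers"), tolerates $(failure_tolerance "$servers") failure(s)"
done
```

Both 3 and 4 servers tolerate a single failure, so the 4th server adds overhead without adding any safety, and it takes 5 servers to survive 2 failures.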

With this in mind, install the hashicorp-consul-systemd-server package on the nodes you'll be using as your servers, and the hashicorp-consul-systemd-client package on the rest.

A note about cluster management

This is probably a good point to talk about cluster management before continuing. Specifically the automation of said management. Personally, I have the goal of making the worker nodes completely disposable - which means completely automating the setup from installing the OS right up to being folded into the cluster and accepting jobs.

To do this, we'll need a tool to help us out. In my case, I've opted to write one from scratch using Bash shell scripts. This is not something I can recommend to anyone else, unless you want to gain an understanding of how such tools work. My inspiration was efs2, which appears to be a Go program - and Dockerfiles. As an example, my job file for automating the install of a Consul server agent looks like this:

#!/usr/bin/env bash

SCRIPT "${JOBFILE_DIR}/common.sh";

COPY "../consul/server.hcl" "/tmp/server.hcl"

RUN "sudo mv /tmp/server.hcl /etc/consul/server.hcl";
RUN "sudo chown root:root /etc/consul/server.hcl";
RUN "sudo apt-get update";
RUN "sudo apt-get install --yes hashicorp-consul-systemd-server";

RUN "sudo systemctl enable consul.service";
RUN "sudo systemctl restart consul.service";

...I'll be going through all the steps in a moment. Of course, if there's the demand for it then I'll certainly write a post or 2 about my shell scripting setup here (comment below), but I still recommend another solution :P

Note that the firewall configuration is absent here - this is because I've set it to allow all traffic on the wgoverlay network interface, which I talked about in part 4. If you did want to configure the firewall, here are the rules you'd need to create:

sudo ufw allow 8301 comment consul-serf-lan;
sudo ufw allow 8300/tcp comment consul-rpc;
sudo ufw allow 8600 comment consul-dns;
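Going back to the job file for a moment: for a rough idea of how directives like COPY and RUN could work, here's a minimal sketch. This is an illustration only and not the actual tool used in this post. The TARGET_HOST and DRY_RUN variables, the _exec helper, and the choice of scp and ssh as the transport are all assumptions.

```bash
#!/usr/bin/env bash
# Illustrative only: NOT the real tool from this post.
# TARGET_HOST and DRY_RUN are assumed names for this sketch.
TARGET_HOST="worker1"
DRY_RUN=1   # 1 = only print the commands; set to 0 to actually run them

# Print the command when DRY_RUN=1, otherwise execute it.
_exec() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# COPY <local path> <remote path>: upload a file to the target host.
COPY() {
    _exec scp "$1" "${TARGET_HOST}:$2"
}

# RUN <command string>: run a command on the target host over SSH.
RUN() {
    _exec ssh "$TARGET_HOST" "$1"
}

COPY "../consul/server.hcl" "/tmp/server.hcl"
RUN "sudo mv /tmp/server.hcl /etc/consul/server.hcl"
```

With DRY_RUN left at 1, this only prints the scp and ssh commands it would have run, which is a handy way to review a job file before pointing it at a real node.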

Many other much more mature tools exist - you should use one of those instead of writing your own:

Ansible - uses YAML configuration files; organises things logically into 'playbooks' (personally I really seriously can't stand YAML, which is another reason for writing my own)
Puppet
and more

The other thing to be aware of is version control. You should absolutely put all your configuration files, scripts, Ansible playbooks, etc under version control. My preference is Git, but you can use anything you like. This will help provide a safety net in case you make an edit and break everything. It's also a pretty neat way to keep it all backed up by pushing it to a remote such as your own Git server (you do have automated backups, right?), GitHub, or GitLab.

Configuration

Now that we've got that sorted out, we need to deal with the configuration files. Let's do the server configuration file first. This is written in the Hashicorp Configuration Language. It's probably a good idea to get familiar with it - I have a feeling we'll be seeing a lot of it. Here's my full server configuration (at the time of typing, anyway - I'll try to keep this up-to-date).

bind_addr = "{{ GetInterfaceIP \"wgoverlay\" }}"

# When we have this many servers in the cluster, automatically run the first leadership election
# Remember that the Hashicorp stack uses the Raft consensus algorithm.
bootstrap_expect = 3
server = true
ui = true

client_addr = "127.0.0.1 {{ GetInterfaceIP \"docker0\" }}"

data_dir = "/srv/consul"
log_level = "INFO"

domain = "mooncarrot.space."


retry_join = [
    // "172.16.230.100"
    "bobsrockets",
    "seanssatellites",
    "tillystelescopes"
]

This might look rather unfamiliar. There's also a lot to talk about, so let's go through it bit by bit. If you haven't already, I do suggest coming up with an awesome naming scheme for your servers. You'll thank me later.

The other thing you'll need to do before continuing is buy a domain name. It sounds silly, but it's actually really important. As we talked about in part 3, we're going to be running our own internal DNS - and Consul is a huge part of this.

By default, Consul serves DNS under the .consul top-level domain, which is both unregistered and very bad practice (because it's unregistered). Someone could come along tomorrow and register the .consul top-level domain and start using it, and then things would get awkward if you ever wanted to visit an external domain ending in .consul while you're using it internally.

I've chosen mooncarrot.space myself, but if you don't yet have one, I recommend taking your time and coming up with a really clever one you like - since you'll be using it for a long time - and updating it later is a pain in the behind. If you're looking for a recommendation for a DNS provider, I've been finding Gandi great so far (much better than GoDaddy, who have tons of hidden charges).

Once you've decided on a domain name (and bought it, if necessary), set it in the server configuration file via the domain directive:

domain = "mooncarrot.space."

Don't forget that trailing dot. As we learned in part 3, it's important, since it indicates an absolute domain name.
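If the domain ends up being passed around in automation scripts, a tiny guard can make sure the dot is always there. This is a minimal sketch, and ensure_trailing_dot is a made-up helper name:

```bash
#!/usr/bin/env bash
# Hypothetical helper: make sure a domain is written as an absolute name.
ensure_trailing_dot() {
    case "$1" in
        *.) printf '%s\n' "$1" ;;    # already ends in a dot
        *)  printf '%s.\n' "$1" ;;   # append the missing dot
    esac
}

ensure_trailing_dot "mooncarrot.space"   # prints: mooncarrot.space.
ensure_trailing_dot "mooncarrot.space."  # prints: mooncarrot.space.
```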

Also note that I'm using a subdomain of the domain in question here. This is because of an issue whereby I'm unable to get Unbound to forward queries that Consul is unable to resolve on to CloudFlare.

Another thing of note is the data_dir directive. Note that this is the data storage directory local to the specific node, not shared storage (we'll be tackling that in a future post).

The client_addr directive here tells Consul which network interfaces to bind the client API to. In our case, we're binding it to the local loopback (127.0.0.1) and the docker0 network interface by dynamically grabbing its IP address - so that Docker containers on the host can use the API.

The bind_addr directive is similar, but for the inter-node communication interfaces. This tells Consul that the other nodes in the cluster are accessible over the wgoverlay interface that we set up in part 4. This is important, since Consul doesn't encrypt or authenticate its traffic by default as far as I can tell - and I haven't yet found a good way to do this that doesn't involve putting a password directly into a configuration file.

In this way the WireGuard mesh VPN provides the encryption & authentication that Consul lacks by default (though I'm certainly going to be looking into it anyway).
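The GetInterfaceIP templates are filled in by Consul itself when it starts. To preview roughly what address an interface would yield, a sketch like the one below pulls the IPv4 address out of the one-line output of the ip command. The first_ipv4 helper name is made up, the sample line is illustrative rather than captured from a real machine (the /16 prefix in particular is an assumption), and the awk field number assumes the usual `ip -o -4 addr show` layout.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract the IPv4 address from one-line `ip` output on stdin.
# Assumes the `ip -o -4 addr show dev <iface>` format, where field 4 is ADDRESS/PREFIX.
first_ipv4() {
    awk '{ split($4, parts, "/"); print parts[1]; exit }'
}

# Illustrative sample line, so the sketch runs without the interface existing.
sample='4: wgoverlay    inet 172.16.230.100/16 scope global wgoverlay'
printf '%s\n' "$sample" | first_ipv4

# On a real node: ip -o -4 addr show dev wgoverlay | first_ipv4
```

If the address printed on a real node isn't the one you expect the rest of the cluster to reach, it's worth sorting that out before starting Consul.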

bootstrap_expect is also interesting. If you've decided on a different number of Consul server nodes, then you should change this value to equal the number of server nodes you're going to have. 3, 5, and 7 are all good numbers - though don't go overboard. More servers means more overhead. Servers take more computing power than clients, so try not to have too many of them.

Finally, retry_join is also very important. It should contain the domain names of all the servers in the cluster. In my case, I'm using the hostnames of the other servers in the network, since Wesher (our WireGuard mesh VPN program) automatically adds the machine names of all the nodes in the VPN cluster to your /etc/hosts file. In this way we ensure that the cluster always talks over the wgoverlay VPN network interface.

Oh yeah, and I should probably note here that your servers should not use FQDNs (Fully Qualified Domain Names) as their hostnames. I found out the hard way: it messes with Consul, and it ends up serving node IPs via DNS on something like harpsichord.mooncarrot.space.node.mooncarrot.space instead of something sensible like harpsichord.node.mooncarrot.space. If anyone has a solution for this that doesn't involve using non-FQDNs as hostnames, I'd love to know (since FQDNs as hostnames is my preference).
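The hostname problem is easier to see spelled out. The sketch below just glues strings together in the same shape as the names above; both helper names are made up, and this isn't how Consul itself is implemented.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating the naming problem; not Consul code.

# Node lookups take the form <node name>.node.<domain>.
node_dns_name() {
    printf '%s.node.%s\n' "$1" "$2"
}

# Keep only the part of a hostname before the first dot.
short_hostname() {
    printf '%s\n' "${1%%.*}"
}

domain="mooncarrot.space"

# FQDN as the hostname: the domain ends up in the name twice.
node_dns_name "harpsichord.mooncarrot.space" "$domain"

# Short hostname: the sensible version.
node_dns_name "$(short_hostname "harpsichord.mooncarrot.space")" "$domain"
```

The first call prints harpsichord.mooncarrot.space.node.mooncarrot.space, and the second prints harpsichord.node.mooncarrot.space.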

That was a lot of words. Next, let's do the client configuration file:

bind_addr = "{{ GetInterfaceIP \"wgoverlay\" }}"

bootstrap = false
server = false

domain = "mooncarrot.space."

client_addr = "127.0.0.1 {{ GetInterfaceIP \"docker0\" }}"

data_dir = "/srv/consul"
log_level = "INFO"

retry_join = [
    "wopplefox",
    "spatterling",
    "sycadil"
]

Not much to talk about here, since the configuration is almost identical to that of the server, except you don't have to tell it how many servers there are, and retry_join should contain the names of the servers that the client should try to join, as explained above.
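Since only the domain and the server names change between clusters, the client configuration file above lends itself to being generated by a script. Here's a minimal sketch of that idea. The render_client_hcl function is a made-up name, and everything it prints is copied from the client configuration file above.

```bash
#!/usr/bin/env bash
# Hypothetical generator: prints a client.hcl like the one above.
# Usage: render_client_hcl <domain> <server> [<server> ...]
render_client_hcl() {
    local domain="$1"
    shift

    # Quoted heredoc delimiter: the {{ ... }} templates and backslashes stay literal.
    cat <<'EOF'
bind_addr = "{{ GetInterfaceIP \"wgoverlay\" }}"

bootstrap = false
server = false

client_addr = "127.0.0.1 {{ GetInterfaceIP \"docker0\" }}"

data_dir = "/srv/consul"
log_level = "INFO"

EOF
    printf 'domain = "%s"\n\n' "$domain"

    # One quoted entry per server, with a comma after all but the last.
    printf 'retry_join = [\n'
    local remaining=$#
    local server
    for server in "$@"; do
        remaining=$(( remaining - 1 ))
        if [ "$remaining" -gt 0 ]; then
            printf '    "%s",\n' "$server"
        else
            printf '    "%s"\n' "$server"
        fi
    done
    printf ']\n'
}

render_client_hcl "mooncarrot.space." wopplefox spatterling sycadil
```

Redirecting the output to a file gives you something you can copy to /etc/consul/client.hcl on each client node.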

Once you've copied the configuration files onto all the nodes in your cluster (/etc/consul/server.hcl for servers; /etc/consul/client.hcl for clients), it's now time to boot up the cluster. On all nodes (probably starting with the servers), do this:

# Enable the Consul service on boot
sudo systemctl enable consul.service
# Start the Consul service now
sudo systemctl start consul.service
# Or, you can do it all in 1 command:
sudo systemctl enable --now consul.service

It's probably a good idea to follow the cluster's progress along in the logs. On a server node, do this after starting the service & enabling start on boot:

sudo journalctl -u consul --follow

You'll probably see a number of error messages, but be patient - it can take a few minutes after starting Consul on all nodes for the first time for them to start talking to each other, sort themselves out, and calm down.

Now, you should have a working Consul cluster! On one of the server nodes, do this to list all the nodes in the cluster (servers and clients alike):

consul members

If you like, you can also run this from your local machine. Simply install the consul package (but not the systemd service file), and make some configuration file adjustments. Update your ~/.bashrc on your local machine to add something like this:

export CONSUL_HTTP_ADDR=http://consul.service.mooncarrot.space:8500;

...replacing mooncarrot.space with your own domain, of course :P

Next, update the server configuration file to make the client_addr directive look like this:

client_addr = "127.0.0.1 {{ GetInterfaceIP \"docker0\" }} {{ GetInterfaceIP \"wgoverlay\" }}"

Upload the new version to your servers, and restart them one at a time (unless you're ok with downtime - I'm trying to practice avoiding downtime now so I know all the processes for later):

sudo systemctl restart consul.service

At this point, we've got a fully-functioning Consul cluster. I recommend following some of the official guides to learn more about how it works and what you can do with it.
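As a quick sanity check, the output of consul members can be piped through a small filter to count how many servers are alive, and that number compared against bootstrap_expect. The sketch below makes 2 assumptions that you should check against your own output: that the first 4 columns are Node, Address, Status, and Type, and that healthy nodes are reported as alive. The sample text is made up in that layout (including the IP addresses) so that the sketch runs without a cluster, and count_alive_servers is a made-up name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count alive servers in `consul members` output read from stdin.
# Assumes the columns start with: Node, Address, Status, Type.
count_alive_servers() {
    awk 'NR > 1 && $3 == "alive" && $4 == "server" { n++ } END { print n + 0 }'
}

# Made-up sample in the assumed layout, so the sketch runs without a cluster.
count_alive_servers <<'EOF'
Node              Address              Status  Type
bobsrockets       172.16.230.100:8301  alive   server
seanssatellites   172.16.230.101:8301  alive   server
tillystelescopes  172.16.230.102:8301  failed  server
harpsichord       172.16.230.103:8301  alive   client
EOF

# On a real node: consul members | count_alive_servers
```

For the sample above this prints 2, since one of the 3 servers is marked as failed and the client isn't counted.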

Unbound

Before we finish for today, we've got 1 more task to complete. As I mentioned back in part 3, we're going to configure our DNS server to conditionally forward queries to Consul. The end result we're aiming for is best explained with a diagram:

A diagram showing how we're aiming for Unbound to resolve queries.

In short:

Try the localzone data
If nothing was found there (or it didn't match), see if it matches Consul's subdomain
If so, forward the query to Consul and return the result
~~If Consul couldn't resolve the query, forward it to CloudFlare via DNS-over-TLS~~

The only bit we're currently missing of this process is the Consul bit, which we're going to do now. Edit /etc/unbound/unbound.conf on your DNS server (mine is on my controller node), and insert the following:

###
# Consul
###
forward-zone:
    name: "node.mooncarrot.space."
    forward-addr: 127.0.0.1@8600
forward-zone:
    name: "service.mooncarrot.space."
    forward-addr: 127.0.0.1@8600

...replace mooncarrot.space. with your domain name (not forgetting the trailing dot, of course). Note that we have 2 separate forward zones here.

Unfortunately, I can't seem to find a way to get Unbound to fall back to a more generic forward zone in the event that a more specific one is unable to resolve the query (I've tried both a forward-zone and a stub-zone). To this end, we need to define multiple more specific forward-zones if we want to be able to forward queries out to CloudFlare for additional DNS records. Here's an example:

1. tuner.service.mooncarrot.space is an internal service that is resolved by Consul
2. peppermint.mooncarrot.space is an external DNS record defined with my registrar

If we then ask Unbound to resolve them both, only #1 will be resolved correctly. Unbound will do something like this for #2:

Check the local-zone for a record (not found)
Ask Consul (not found)
Return error

If you are not worried about defining DNS records in your registrar's web interface / whatever they use, then you can just do something like this instead:

###
# Consul
###
forward-zone:
    name: "mooncarrot.space."
    forward-addr: 127.0.0.1@8600
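The difference between the 2 layouts comes down to a suffix match on the query name. Here's a minimal model of that decision. It's a sketch of the behaviour described above rather than anything taken from Unbound, and pick_upstream is a made-up name.

```bash
#!/usr/bin/env bash
# Hypothetical model of which upstream Unbound picks; not Unbound code.
# Usage: pick_upstream <query name> <forward zone> [<forward zone> ...]
# Prints "consul" if the name falls under any listed zone, else "cloudflare".
pick_upstream() {
    local name="${1%.}."   # normalise to exactly one trailing dot
    shift
    local zone
    for zone in "$@"; do
        case "$name" in
            "$zone"|*".$zone") echo "consul"; return ;;
        esac
    done
    echo "cloudflare"
}

# 2 specific zones: the registrar-defined record is sent upstream.
pick_upstream "peppermint.mooncarrot.space" "node.mooncarrot.space." "service.mooncarrot.space."
pick_upstream "tuner.service.mooncarrot.space" "node.mooncarrot.space." "service.mooncarrot.space."

# 1 generic zone: everything under the domain is sent to Consul.
pick_upstream "peppermint.mooncarrot.space" "mooncarrot.space."
```

This prints cloudflare, consul, and consul in that order. The last line is the problem case: with the generic zone, peppermint.mooncarrot.space goes to Consul, which doesn't know about it.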

For advanced users, Consul's documentation on the DNS interface is worth a read, as it gives the format of all DNS records Consul can serve.

Note also here that the recursors configuration option is an alternative solution, but I don't see an option to force DNS-over-TLS queries there.

If you have a better solution, please get in touch by commenting below or answering my ServerFault question.

With this done, you should be able to ask Consul what the IP address of any node in the cluster is like so:

dig +short harpsichord.node.mooncarrot.space
dig +short grandpiano.node.mooncarrot.space
dig +short oboe.node.mooncarrot.space
dig +short some_machine_name_here.node.mooncarrot.space

Again, you'll need to replace mooncarrot.space of course with your domain name.

Conclusion

Phew! There were a lot of steps and moving parts to set up in this post. Take your time, and re-read this post a few times to make sure you've got all your ducks in a row. Make sure to test your new Consul cluster by reading the official guides as mentioned above too, as it'll cause lots of issues later if you've got bugs lurking around in Consul.

I can't promise that it's going to get easier in future posts - it's probably going to get a lot more complicated with lots more to keep track of, so make sure you have a solid understanding of what we've done so far before continuing.

To summarise, we've managed to set up a brand-new Consul cluster. We've configured Unbound to forward queries to Consul, to enable seamless service discovery (regardless of which host things are running on) later on.

We've also picked an automation framework for automating the setup and configuration of the various services and such we'll be configuring. I recommend taking some time to get to know your chosen framework - many have lots of built-in modules to make common tasks much easier. Try going back to previous posts in this series (links at the top of this post) and implementing them in your chosen framework.

Finally, we've learnt a bit about version control and its importance in managing configuration files.

In the next few posts in this series (unless I get distracted - likely - or have a change of plans), we're going to be setting up Nomad, the task scheduler that will be responsible for managing what runs where and informing Consul of this. We're also going to be setting up and configuring a private Docker registry and Traefik - the latter of which is known as an edge router (more on that in future posts).

See you next time, when we dive further down into what's looking less like a rabbit hole and more like a cavernous sinkhole of epic proportions.

Found this useful? Confused about something? Got a suggestion? Comment below! It's really awesome to hear that my blog posts have helped someone else out.
