
Cluster, Part 5: Staying current | Automating apt updates and using apt-cacher-ng

Hey there! Welcome to another cluster blog post. In the last post, we looked at setting up a WireGuard mesh VPN as a trusted private network for management interfaces and inter-cluster communication. As a refresher, here's a list of all the parts in this series so far:

Before we get to the Hashicorp stack though (next week, I promise!), there's an important housekeeping matter we need to attend to: managing updates.

In Debian-based Linux distributions such as Raspbian (that I'm using on my Raspberry Pis), updates are installed through apt - and this covers everything from the kernel to the programs we have installed - hence the desire to package everything we're going to be using to enable easy installation and updating.

There are a number of different related command-line tools, but the ones we're interested in are apt (the easy-to-use front-end CLI) and apt-get (the original tool for installing updates).
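
For reference, this is the manual process we're automating here - refreshing the package metadata and then installing anything that's upgradeable:

sudo apt update
sudo apt upgrade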

There are 2 key issues we need to take care of:

  1. Automating the installation of package updates
  2. Caching the latest packages locally to avoid needless refetching

Caching package updates locally with apt-cacher-ng

Issue #2 here is slightly easier to tackle, surprisingly enough, so we'll do that first. We want to cache the latest packages locally, because if we have lots of machines in our cluster (I have 5, all running Raspbian), then when they update they all have to download the package lists and the packages themselves from the remote sources each time. Not only is this bandwidth-inefficient, but it takes longer and puts more strain on the remote servers too.

For apt, this can be achieved through the use of apt-cacher-ng. Other distributions require different tools - in particular I'll be researching and tackling Alpine Linux's apk package manager in a future post in this series - since I intend to use Alpine Linux as my primary base image for my Docker containers (I also intend to build my own Docker containers from scratch too, so that will be an interesting experience that will likely result in a few posts too).

Anyway, installation is pretty simple:

sudo apt install apt-cacher-ng

Once done, there's a little bit of tuning we need to attend to. apt-cacher-ng by default listens for HTTP requests on TCP port 3142 and has an administrative interface at /acng-report.html. This admin interface is not, by default, secured - so this is something we should do before opening a hole in the firewall.

This can be done by editing the /etc/apt-cacher-ng/security.conf configuration file. It should read something like this:


# This file contains confidential data and should be protected with file
# permissions from being read by untrusted users.
#
# NOTE: permissions are fixated with dpkg-statoverride on Debian systems.
# Read its manual page for details.

# Basic authentication with username and password, required to
# visit pages with administrative functionality. Format: username:password

AdminAuth: username:password

....you may need to use sudo to view and edit it. Replace username and password with your own username and a long unguessable password that's not similar to any existing passwords you have (especially since it's stored in plain text!).

Then we can (re) start apt-cacher-ng:

sudo systemctl enable apt-cacher-ng
sudo systemctl restart apt-cacher-ng

The last thing we need to do here is to punch a hole through the firewall, if required. As I explained in the previous post, I'm using a WireGuard mesh VPN and I'm allowing all traffic on that interface (for reasons that will - eventually - become clear), so I don't need to open a separate hole in my firewall unless I want other devices on my network to use it too (which wouldn't be a bad idea, all things considered).

Anyway, ufw can be configured like so:

sudo ufw allow 3142/tcp comment apt-cacher-ng

With the apt-cacher server installed and configured, you can now get apt to use it:

echo 'Acquire::http { Proxy "http://X.Y.Z.W:3142"; }' | sudo tee -a /etc/apt/apt.conf.d/proxy

....replacing X.Y.Z.W with the IP address (or hostname!) of your apt-cacher-ng server. Note that it will get upset if you use https anywhere in your apt sources, so you'll have to inspect /etc/apt/sources.list and all the files in /etc/apt/sources.list.d/ manually and update them.
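
A quick way to check whether any of your sources are using https is a recursive grep - a rough sketch, so adjust the paths if your system keeps its sources elsewhere:

grep -rn 'https://' /etc/apt/sources.list /etc/apt/sources.list.d/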

Automatic updates with unattended-upgrades

Next on the list is installing updates automatically. This is useful because we don't want to have to manually install updates every day on every node in the cluster. There are positives and negatives to automatic updates - I recommend giving the top of this article a read.

First, we need to install unattended-upgrades:

sudo apt install unattended-upgrades

Then, we need to edit the /etc/apt/apt.conf.d/50unattended-upgrades file - don't forget to sudo.

Unfortunately, I haven't yet automated this process (or properly developed a replacement configuration file that can be automatically deployed to a target system by a script), so for now we'll have to do this manually (the mssh command might come in handy).

First, find the line that starts with Unattended-Upgrade::Origins-Pattern, and uncomment the lines that end in -updates, label=Debian, label=Debian-Security. For Raspberry Pi users, add the following lines in that bit too:

"origin=Raspbian,codename=${distro_codename},label=Raspbian";

// Additionally, for those running Raspbian on a Raspberry Pi,
// match packages from the Raspberry Pi Foundation as well.
"origin=Raspberry Pi Foundation,codename=${distro_codename},label=Raspberry Pi Foundation";

unattended-upgrades will only install packages that are matched by a list of origins. Unfortunately, the way that you specify which updates to install is a total mess, and it's not obvious how to configure it. I did find an Ask Ubuntu answer that explains how to get unattended-upgrades to install updates. If anyone knows of a cleaner way of doing this - I'd love to know.
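
To illustrate, once the relevant lines are uncommented and the Raspbian lines added, the Origins-Pattern block should end up looking something like this (the exact set of lines in the stock file varies between releases, so treat this as a rough guide rather than something to paste in verbatim):

Unattended-Upgrade::Origins-Pattern {
    "origin=Debian,codename=${distro_codename}-updates";
    "origin=Debian,codename=${distro_codename},label=Debian";
    "origin=Debian,codename=${distro_codename},label=Debian-Security";
    "origin=Raspbian,codename=${distro_codename},label=Raspbian";
    "origin=Raspberry Pi Foundation,codename=${distro_codename},label=Raspberry Pi Foundation";
};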

The other decision to make here is whether you'd like your hosts to automatically reboot. This could be disruptive, so only enable it if you're sure that it won't interrupt any long-running tasks.

To enable it, find the line that starts with Unattended-Upgrade::Automatic-Reboot and set it to true (uncommenting it if necessary). Then find the Unattended-Upgrade::Automatic-Reboot-Time setting and set it to a time of day you're ok with it rebooting at - e.g. 03:00 for 3am - but take care to avoid all your servers rebooting at the same time, as this might cause issues later.
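
For example, the relevant lines on one node might end up looking like this (stagger the reboot time across your nodes - e.g. 03:00, 03:20, 03:40, and so on):

Unattended-Upgrade::Automatic-Reboot "true";
Unattended-Upgrade::Automatic-Reboot-Time "03:00";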

A few other settings need to be updated too. Here they are, with their correct values:

APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Download-Upgradeable-Packages "1";
APT::Periodic::Unattended-Upgrade "1";
APT::Periodic::AutocleanInterval "7";

Make sure you find the existing settings and update them, because otherwise if you just paste these in, they may get overridden. In short these settings:

  • Enable automatic updates to the package metadata indexes
  • Download upgradeable packages
  • Install downloaded updates
  • Automatically clean the cache up every 7 days

Once done, save and close that file. Finally, we need to enable and start the unattended-upgrades service:

sudo systemctl enable unattended-upgrades
sudo systemctl restart unattended-upgrades

To learn more about automatic upgrades, these guides might help shed some additional light on the subject:

Conclusion

In this post, we've taken a look at apt package caching and unattended-upgrades. In the future, I'm probably going to have to sit down and either find an alternative to unattended-upgrades that's easier to configure, or rewrite the entire configuration file and create my own version. Comments and suggestions on this are welcome in the comments.

In the next post, we'll finally be getting to the Hashicorp stack by installing and configuring Consul. Hold on to your hats - next week's post is significantly more complicated.

Edit 2020-05-09: Add missing instructions on how to get apt to actually use an apt-cacher-ng server

Cluster, Part 4: Weaving Wormholes | Peer-to-Peer VPN with WireGuard

(Above: The WireGuard and wesher logos. Background photo: from Unsplash by Clint Adair.)

Hey - welcome back! Last week, we set Unbound up as our primary DNS server for our network. We also configured cluster member devices to use it for DNS resolution. As a recap, here are links to all the posts in this series so far:

In this part, we're going to setup a WireGuard peer-to-peer VPN. This is a good idea for several reasons:

  • It provides defence-in-depth
  • It protects non-encrypted / unprotected private services from the rest of the network

The latter point here is particularly important - especially if you've got other devices on your network like I have. If you're somehow following along with this series with devices fancy enough to have multiple network interfaces, you can connect the 2nd network interface of every server to a separate switch that doesn't connect to anywhere else. Don't forget that you'll need to setup a DHCP server on this new mini-network (or configure static IPs manually on each device, which I don't recommend) - but this is out-of-scope of this article.

In the absence of such an opportunity, a peer-to-peer VPN should do the trick just as well. We're going to be using WireGuard, which I discovered recently. It's very cool indeed - and it's apparently special enough to be merged directly into the Linux Kernel v5.6! It's been given high praise from security experts too.

What I like most is its simplicity. It follows the UNIX Philosophy, and as such while it's very simple in the way it works, it can be used and applied in so many different ways to solve so many different problems.

With this in mind, let's get to installing it! If you're on an Ubuntu or Debian-based machine, then you should just be able to install it directly:

sudo apt install wireguard

Note that you might have to have your kernel's development headers installed if you experience issues. For Raspbian users (like me), installation is slightly more complicated. We'll need to setup the debian-backports apt repository to pull it in, since the Debian developers have backported it to the latest stable version of Debian (e.g. like a hotfix) - but Raspbian hasn't yet pulled it in. This is done in 2 steps:

# Add the debian-backports GPG key
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 04EE7237B7D453EC 648ACFD622F3D138
# Add the debian-backports apt repo
echo 'deb http://httpredir.debian.org/debian buster-backports main contrib non-free' | sudo tee /etc/apt/sources.list.d/debian-backports.list

Now, we should be able to update to download the new repository metadata:

sudo apt update

Next, we need to install the Raspberry Pi Linux kernel headers. Unlike other distributions (which use the linux-headers-generic package), this is done in a slightly different way:

sudo apt install raspberrypi-kernel-headers

This might take a while. Once done, we can now install WireGuard itself:

sudo apt install wireguard

Again, this will take a while. Don't forget to pay close attention to the output - I've noticed that it's fond of throwing error messages, but not actually counting it as an error and ultimately claiming it completed successfully.

With this installed, we can now setup our WireGuard peer-to-peer VPN. WireGuard itself works on a public-private keypair per-device setup. A device first generates a keypair, and then the public key thereof needs copying to all other devices it wants to connect to. In this fashion, both a peer-to-peer setup (like we're after), and a client-server setup (like a more traditional VPN such as IPSec or OpenVPN) can be configured.
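
For the curious, generating such a keypair manually with the wg command-line tool looks like this (we won't actually need to do this by hand, as we'll see in a moment):

# Generate a private key and derive the matching public key from it
wg genkey | tee privatekey | wg pubkey > publickey
# The private key should be kept secret
chmod 600 privatekey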

The overhead of configuring such a peer-to-peer WireGuard VPN is considerable though, since every time a device is added to the VPN every existing device needs updating to add the public key thereof to establish communications between the 2.

While researching an easier solution to the problem, I came across wesher, which does much of the heavy-lifting for you. It does of course come at the cost of slightly reduced security (since the entire VPN network is protected by a single pre-shared key) and reduced configurability, but from my experiences so far it works like a charm for my purposes.

It is distributed as a single Go binary that uses the Raft Consensus Algorithm (the same library that Nomad and Consul use, actually) to manage cluster membership and provide self-healing properties. This makes it very easy for me to package into an apt package, which I've added to my apt repository.

The instructions to add my apt repository can be found on its page, but here they are directly:

# Add the repository
echo "deb https://apt.starbeamrainbowlabs.com/ ./ # apt.starbeamrainbowlabs.com" | sudo tee /etc/apt/sources.list.d/sbrl.list
# Import the signing key
wget -q https://apt.starbeamrainbowlabs.com/aptosaurus.asc -O- | sudo apt-key add -
# Alternatively, import the signing key from a keyserver:
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys D48D801C6A66A5D8
# Update apt's cache
sudo apt update

Don't forget that if you're using a caching server like apt-cacher-ng (which we'll be setting up in the next post in this series), you'll probably want to change the https in the first command there to regular http in order for the caching to take effect. Note that this doesn't affect the security of the downloaded packages, since apt will verify the integrity and GPG signature of all packages downloaded. It only affects the privacy of the connection used to download the packages themselves (more on this in a future post).

Once setup, install wesher like so:

sudo apt install wesher

If you've got a systemd-based system (as I have sadly with Raspbian), I provide a systemd service file in a separate package:

sudo apt install wesher-systemd

Don't forget to perform these steps on all the machines you want to enter the cluster. Next, we need to configure our firewall. I'm using UFW - I hope you set up something similar when you first configured the servers you're clustering (I recommend UFW. Note also that this tutorial series will not cover basics like this - instead, I'll link to other tutorials and such as I think of them). For UFW users, do this:

sudo ufw allow 7946 comment wesher-gossip
sudo ufw allow 51820/udp comment wesher-wireguard

Wesher requires 2 ports: 1 for the clustering traffic for managing cluster membership, and another for the WireGuard traffic itself. Now, we can initialise the cluster. This has to be done on an interactive terminal for the first time, to avoid leaking the cluster's pre-shared key to log files. Do it like this in a terminal:

sudo wesher

It should tell you that it's generated a new cluster key - it will save it in a private configuration directory. Save this somewhere safe, such as in your password manager. Now, you can press Ctrl + C to quit, and start the systemd service:

sudo systemctl enable --now wesher.service

It's perhaps good practice to check that the service has started successfully:

sudo systemctl status wesher.service

Having 1 node setup is nice, but not very useful. Adding additional nodes to the cluster is a bit different. Follow this tutorial up to and including the installation of the wesher and wesher-systemd packages, and then instead of just running sudo wesher, do this:

sudo wesher --cluster-key CLUSTER_KEY_HERE --join IP_OF_ANOTHER_NODE --overlay-net 172.31.250.0/16 --log-level info

...replacing CLUSTER_KEY_HERE with your cluster key (don't forget to prefix the entire command with a space to stop it from being written to your shell history file), and IP_OF_ANOTHER_NODE with the IP address of another node in the cluster on the external untrusted network. Note that the --overlay-net there is required because of the way I wrote the systemd service file in the wesher-systemd package. I did this:

[Unit]
Description=wesher - wireguard mesh builder
After=network-online.target
After=syslog.target
After=rsyslog.service

[Service]
EnvironmentFile=-/etc/default/wesher
ExecStart=/usr/local/sbin/wesher --overlay-net 172.31.250.0/16 --log-level info
Restart=on-failure
Type=simple
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=wesher

[Install]
WantedBy = multi-user.target

I explicitly specify the subnet of the VPN network here to avoid clashes with other networks in the 10.0.0.0/8 range. I don't have a network in that range, but I know that others do. Unfortunately there's currently a known bug that means that IP address collisions may occur, and the cluster can't currently sort them out. So you need to use a pretty large subnet for now to avoid such collisions (here's hoping this one is patched soon).

Note that if you want to set additional configuration options, you can do so in /etc/default/wesher in the format VAR_NAME=VALUE - 1 per line. A full reference of the supported environment variables can be found here.
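
As a purely illustrative sketch, such a file might look like the following - the variable name here is made up for the example, so check wesher's README for the ones it actually supports:

# /etc/default/wesher - environment variables for the wesher systemd service
# (illustrative only - consult the wesher documentation for the real variable names)
WESHER_LOG_LEVEL=info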

Anyway, once wesher has joined the cluster on the new node, press Ctrl + C to exit and then start and enable the systemd service as before (note that wesher saves all the configuration details to a configuration directory, so the cluster key doesn't need to be provided more than once):

sudo systemctl enable --now wesher.service
sudo systemctl status wesher.service

With this, you should have a fully-functional WireGuard peer-to-peer VPN setup with wesher. You can ask a node what IP address it has on the VPN by using the following command:

ip address

The IP address should be shown next to the wgoverlay network interface.
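
To quickly check that the mesh is actually working, you can inspect that interface directly and ping another node's VPN IP address (replace the example address below with one belonging to a node in your own cluster):

ip address show wgoverlay
ping -c 3 172.31.250.2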

The last thing to do here is to configure our firewall. In my case, I'm using UFW, so the instructions I include here will be specific to UFW (if you use another firewall, translate these commands for your firewall, and comment below!).

In my specific case, I want to take the unusual step of allowing all traffic in on the VPN. The reason for this will become apparent in a future post, but in short Nomad dynamically allocates outward-facing ports for services. There's a good reason for this I'll get into at the time, but for now, this is how you'd do that:

sudo ufw allow in on wgoverlay

Of course, we can add override rules here that block traffic if we need to. Note that this only allows in all traffic on the wgoverlay network interface, and not on all network interfaces as the rules from my previous blog post about UFW would have done - i.e. like this:

sudo ufw allow 1234/tcp

In the next part, we'll take a look at setting up an apt caching server to improve performance and reduce Internet data usage when downloading system updates.

Found this useful? Spotted a mistake? Confused about something? Comment below! It really helps motivate me to write these posts.

Document your network infrastructure with a wiki

A screenshot of my infrastructure wiki

Recently, I've realised that I actually manage quite an extensive network of interconnected devices. Some of them depend on each other in a non-obvious fashion, so after reading a comment on Reddit I decided to begin the process of documenting it all on a wiki.

If you've followed this blog for a while, then you'll probably know that I'm the author of a wiki engine called Pepperminty Wiki. If you're wondering, I'm still working away at it on and off - and have no plans to abandon it any time soon. After all, I have multiple instances of it setup - so I've got more reasons than 1 to keep developing it!

Anyway, I've recently setup a new instance of it to document the network infrastructure I maintain and manage. I'm currently going for 1 page per device, with sub-pages for the services they host. I tried putting the services under sub-headings on each device's page, but it quickly got cluttered and difficult to sort through.

Even if you just manage a Raspberry Pi acting as a server (I've got several of these myself actually), I do recommend looking into it. You'll save yourself so much time when asking questions like "what does this do?", "how does that work again?", and "where was it that I put the configuration file for that?".....

It's almost as if I'm speaking from experience :P

Don't forget to avoid putting passwords in a wiki though - as tempting as it sounds. As I recommend in my earlier post, Keepass 2 is my password manager of choice for that.

I might be a bit biased, but I can thoroughly recommend Pepperminty Wiki for this - and many other - purposes. If there's something missing, then don't hesitate to open an issue and I'll do my best to help you out :-)

Found this useful? Got a different way of documenting stuff? Comment below! It really helps motivate me to write and program more.

Cluster, Part 3: Laying groundwork with Unbound as a DNS server

Looking up at the canopy of some trees

(Above: A picture from my wallpapers folder. If you took this, comment below and I'll credit you)

Welcome to another blog post about my cluster! Although I had to replace the ATX PSU I was planning on using to power the thing with a USB power supply instead, I've got all the Pis powered up and networked together now - so it's time to finally start on the really interesting bit!

In this post, I'm going to show you how to install Unbound as a DNS server. This will underpin the entire stack we're going to be setting up - and over the next few posts in this series, I'll take you through the entire process.

Starting with DNS is a logical choice, which comes with several benefits:

  • We get a minor performance boost by caching DNS queries locally
  • We get to configure our DNS server to make encrypted DNS-over-TLS queries on behalf of the entire network (even if some of the connected devices - e.g. phones - don't support doing so)
  • If we later change our minds and want to shuffle around the IP addresses, it's not such a big deal as if we used IP addresses in all our config files

With this in mind, I'm starting with DNS before moving on to Docker and the Hashicorp stack:

A picture of the homepages of the Hashicorp stack

Before we begin, let's set out our goals:

  • We want a caching resolver, to avoid repeated requests across the Internet for the same query
  • We want to encrypt queries that leave the network via DNS-over-TLS
  • We want to be able to add our own custom DNS records for a domain of our choosing, for internal resolution only.

The last point there is particularly important. We want to resolve something like server1.bobsrockets.com to 172.16.230.100 internally, but not externally outside the network. This way we can include server1.bobsrockets.com in config files, and if the IP changes then we don't have to go back and edit all our config files - just reload or restart the relevant services.

Without further delay, let's start by installing unbound:

sudo apt install unbound

If you're on another system, translate this for your package manager. See this amazing wiki page for help on translating package manager commands :-)

Next, we need to write the config file. The default config file for me is located at /etc/unbound/unbound.conf:

# Unbound configuration file for Debian.
#
# See the unbound.conf(5) man page.
#
# See /usr/share/doc/unbound/examples/unbound.conf for a commented
# reference config file.
#
# The following line includes additional configuration files from the
# /etc/unbound/unbound.conf.d directory.
include: "/etc/unbound/unbound.conf.d/*.conf"

There's not a lot in the /etc/unbound/unbound.conf.d/ directory, so I'm going to be extending /etc/unbound/unbound.conf. First, we're going to define a section to get Unbound to forward requests via DNS-over-TLS:

forward-zone:
    name: "."
    # Cloudflare DNS
    forward-addr: 1.0.0.1@853
    # DNSlify - ref https://www.dnslify.com/services/resolver/
    forward-addr: 185.235.81.1@853
    forward-ssl-upstream: yes

The . name there simply means "everything". If you haven't seen it before, the fully-qualified domain name for seanssatellites.io for example is as follows:

seanssatellites.io.

Notice the extra trailing dot . there. That's really important, as it signifies the DNS root (not sure on its technical name. Comment if you know it, and I'll update this). The io bit is the top-level domain (commonly abbreviated as TLD). seanssatellites is the actual domain bit that you buy.

It's a hierarchical structure, and apparently used to be inverted here in the UK before the formal standard was defined by the IETF (Internet Engineering Task Force) - of which RFC 1034 was a part.

Anyway, now that we've told Unbound to forward queries, the next order of business is to define a bunch of server settings to get it to behave the way we want it to:

server:
    interface: 0.0.0.0
    interface: ::0

    ip-freebind: yes

    # Access control - default is to deny everything apparently

    # The local network
    access-control: 172.16.230.0/24 allow
    # The docker interface
    access-control: 172.17.0.1/16 allow

    username: "unbound"

    harden-algo-downgrade: yes
    unwanted-reply-threshold: 10000000


    prefetch: yes

There's a lot going on here, so let's break it down.

Property Meaning
interface Tells unbound what interfaces to listen on. In this case I tell it to listen on all interfaces on both IPv4 and IPv6.
ip-freebind Tells it to try and listen on interfaces that aren't yet up. You probably don't need this, so you can remove it. I'm including it here because I'm currently unsure whether unbound will start before docker, which I'm going to be using extensively. In the future I'll probably test this and remove this directive.
access-control unbound has an access control system, which functions rather like a firewall from what I can tell. I haven't had the time yet to experiment (you'll be seeing that a lot), but once I've got my core cluster up and running I intend to experiment and play with it extensively, so expect more in the future from this.
username The name of the user on the system that unbound should run as.
harden-algo-downgrade Protect against downgrade attacks when making encrypted connections. For some reason the default is to set this to no, so I enable it here.
unwanted-reply-threshold Another security-hardening directive. If this many DNS replies are received that unbound didn't ask for, then take protective actions such as emptying the cache just in case of a DNS cache poisoning attack
prefetch Causes unbound to prefetch updated DNS records for cache entries that are about to expire. Should improve performance slightly.

If you have a flaky Internet connection, you can also get Unbound to return stale DNS cache entries if it can't reach the remote DNS server. Do that like this:

server:
    # Service expired cached responses, but only after a failed 
    # attempt to fetch from upstream, and 10 seconds after 
    # expiration. Retry every 10s to see if we can get a
    # response from upstream.
    serve-expired: yes
    serve-expired-ttl: 10
    serve-expired-ttl-reset: yes

With this, we should have a fully-functional DNS server. Enable it to start on boot and (re)start it now:

sudo systemctl enable unbound.service
sudo systemctl restart unbound.service

If it's not already started, the restart action will start it.
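
At this point it's worth checking that resolution actually works. If you have dig installed (on Debian-based systems it's in the dnsutils package), you can point a query directly at the new server like so:

# Ask the local Unbound instance to resolve a domain
dig @127.0.0.1 example.com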

Internal DNS records

If you're like me and want some custom DNS records, then continue reading. Unbound has a pretty nifty way of declaring custom internal DNS records. Let's enable that now. First, you'll need a domain name that you want to return custom internal DNS records for. I recommend buying one - don't use an unregistered one, just in case someone else comes along and registers it.

Gandi are a pretty cool registrar - I can recommend them. Cloudflare are also cool, but they don't allow you to register several years at once yet - so they are probably best set as the name servers for your domain (that's a free service), leaving your domain name with a registrar like Gandi.

To return custom DNS records for a domain name, we need to tell unbound that it may contain private DNS records. Let's do that now:

server:
    private-domain: "mooncarrot.space"

This of course goes in /etc/unbound/unbound.conf, as before. See the bottom of this post for the completed configuration file.

Next, we need to define some DNS records:

server:
    local-zone: "mooncarrot.space." transparent
    local-data: "controller.mooncarrot.space.   IN A 172.16.230.100"
    local-data: "piano.mooncarrot.space.        IN A 172.16.230.101"
    local-data: "harpsichord.mooncarrot.space.  IN A 172.16.230.102"
    local-data: "saxophone.mooncarrot.space.    IN A 172.16.230.103"
    local-data: "bag.mooncarrot.space.          IN A 172.16.230.104"

    local-data-ptr: "172.16.230.100 controller.mooncarrot.space."
    local-data-ptr: "172.16.230.101 piano.mooncarrot.space."
    local-data-ptr: "172.16.230.102 harpsichord.mooncarrot.space."
    local-data-ptr: "172.16.230.103 saxophone.mooncarrot.space."
    local-data-ptr: "172.16.230.104 bag.mooncarrot.space."

The local-zone directive tells it that we're defining custom DNS records for the given domain name. The transparent bit tells it that if it can't resolve using the custom records, to forward it out to the Internet instead. Other interesting values include:

Value Meaning
deny Serve local data (if any), otherwise drop the query.
refuse Serve local data (if any), otherwise reply with an error.
static Serve local data, otherwise reply with an nxdomain or nodata answer (similar to the responses you'd expect from a DNS server that's authoritative for the domain).
transparent Respond with local data, but resolve other queries normally if the answer isn't found locally.
redirect Serve the zone data for any subdomain in the zone.
inform The same as transparent, but logs client IP addresses.
inform_deny Drops queries and logs client IP addresses.

Adapted from /usr/share/doc/unbound/examples/unbound.conf, the example Unbound configuration file.

Others exist too if you need even more control, like always_refuse (which always responds with an error message).

The local-data directives define the custom DNS records we want Unbound to return, in DNS records syntax (again, if there's an official name for the syntax leave a comment below). The local-data-ptr directive is a shortcut for defining PTR, or reverse DNS records - which resolve IP addresses to their respective domain names (useful for various things, but also commonly used as a step to verify email servers - comment below and I'll blog on lots of other shady and not so shady techniques used here).
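
Once Unbound has been restarted with the new configuration, a couple of quick dig queries should confirm that both the forward and reverse records resolve as expected:

# Forward lookup of a custom record
dig @127.0.0.1 controller.mooncarrot.space
# Reverse (PTR) lookup of the matching IP address
dig @127.0.0.1 -x 172.16.230.100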

With that, our configuration file is complete. Here's the full configuration file in its entirety:

# Unbound configuration file for Debian.
#
# See the unbound.conf(5) man page.
#
# See /usr/share/doc/unbound/examples/unbound.conf for a commented
# reference config file.
#
# The following line includes additional configuration files from the
# /etc/unbound/unbound.conf.d directory.
include: "/etc/unbound/unbound.conf.d/*.conf"

server:
    interface: 0.0.0.0
    interface: ::0

    ip-freebind: yes

    # Access control - default is to deny everything apparently

    # The local network
    access-control: 172.16.230.0/24 allow
    # The docker interface
    access-control: 172.17.0.1/16 allow

    username: "unbound"

    harden-algo-downgrade: yes
    unwanted-reply-threshold: 10000000

    private-domain: "mooncarrot.space"

    prefetch: yes

    # ?????? https://www.internic.net/domain/named.cache

    # Service expired cached responses, but only after a failed 
    # attempt to fetch from upstream, and 10 seconds after 
    # expiration. Retry every 10s to see if we can get a
    # response from upstream.
    serve-expired: yes
    serve-expired-ttl: 10
    serve-expired-ttl-reset: yes

    local-zone: "mooncarrot.space." transparent
    local-data: "controller.mooncarrot.space.   IN A 172.16.230.100"
    local-data: "piano.mooncarrot.space.        IN A 172.16.230.101"
    local-data: "harpsichord.mooncarrot.space.  IN A 172.16.230.102"
    local-data: "saxophone.mooncarrot.space.    IN A 172.16.230.103"
    local-data: "bag.mooncarrot.space.          IN A 172.16.230.104"

    local-data-ptr: "172.16.230.100 controller.mooncarrot.space."
    local-data-ptr: "172.16.230.101 piano.mooncarrot.space."
    local-data-ptr: "172.16.230.102 harpsichord.mooncarrot.space."
    local-data-ptr: "172.16.230.103 saxophone.mooncarrot.space."
    local-data-ptr: "172.16.230.104 bag.mooncarrot.space."

    fast-server-permil: 500

forward-zone:
    name: "."
    # Cloudflare DNS
    forward-addr: 1.0.0.1@853
    # DNSlify - ref https://www.dnslify.com/services/resolver/
    forward-addr: 185.235.81.1@853
    forward-ssl-upstream: yes

Where do we go from here?

Remember that it's important to not just copy and paste a configuration file, but to understand what every single line of it does.

In a future post in this series, we'll be revising this to forward requests for *.service.mooncarrot.space to Consul, a clustered service discovery system that keeps track of what is running where and presents a DNS server as an interface (there are others).

In the next post, we'll (probably) be looking at setting up Consul - unless something else happens first :P Nomad should be next after that, followed closely by Vault.

Once I've got all that set up (which is a blog post series in and of itself!), I'll then look at encrypting all communications between all nodes in the cluster. After that, we'll (finally?) get to Docker and my plans there. Other things include an apt and apk (the Alpine Linux package manager) caching servers - which will have to be tackled separately.

Oh yeah, and I want to explore Traefik, which is apparently like Nginx, but intended for containerised infrastructure.

All this is definitely going to keep me very busy!

Found this interesting? Got a suggestion? Comment below!

Sources and Further Reading

Measuring maximum RAM usage with Bash on Linux

While running a simulation on my University's Viper HPC, I found that I needed to measure the maximum RAM usage of a simulation that I was running. Since the solution wasn't particularly easy to find, I thought I'd quickly blog about it here.

Doing this isn't actually as easy as you might think. In the end, I used this:

/usr/bin/time --format 'Max RAM working set size: %Mk' command_here --foo bar

....replacing command_here --foo bar with the command you want to measure.

/usr/bin/time is a program that measures how long a command takes to execute, but it's evident that it measures a bunch of other different things as well. While the output format leaves something to be desired (hence the --format in the above), it does the job pretty well.

Note that /usr/bin/time is distinct from the time built-in you get in Bash. Depending on your shell, you may need to explicitly specify the full path to the time binary. In addition, if your system doesn't have the command (like Viper), you may need to copy it from another system that does (one with the same CPU architecture).

I forget where it was that I found this solution, but if you comment below then I'll add the credit to this post if your post looks familiar.

As a quick extra, you can limit the time a command can execute for like this:

timeout 60 command_here

That will limit command_here to executing for only 60 seconds.
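
The two compose nicely too. For example, to measure the peak RAM usage of a command while also capping its runtime at 60 seconds, something like this should do the trick:

/usr/bin/time --format 'Max RAM working set size: %Mk' timeout 60 command_here --foo bar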

Found this interesting? Comment below!

Happy Easter 2020!

Some lovely flowers at University last year.

Happy Easter 2020! I hope everyone has a peaceful day.

Since I'm currently unable to obtain a nice picture for this holiday blog post, here's one from the archives instead. I took it at University last spring. They have always had such amazing flowerbeds - especially in the spring. You can really smell the flowers, and it makes the experience of going into University that much more peaceful.

In terms of my plans on here, I've got a number of interesting posts coming up. I discovered a wonderfully easy way to measure the maximum RAM usage of a process in Bash which I've blogged about, and I've finally got all my ducks in a row and powered up my Raspberry Pi cluster for the first time - so expect regular blog posts about my progress configuring that!

Also, I compressed the above image with mozjpeg, a JPEG encoder for the web by Mozilla. Since I couldn't find a command-line tool for it and can't be bothered to implement one for just a single picture, I used this website to compress it.

How to hash and sign files with GPG and a bit of Bash

When making a release of your software or sending some important documents, it's pretty common practice (especially amongst larger projects) to distribute hashes and GPG signatures along with release binaries themselves. Example projects that do this include:

This is great practice, since it allows downloaders to verify that their download has not been corrupted, and that it was you who released them and not some imposter.

In this post, I'm going to outline how you can do this too.

I've recently both verified a number of signatures and generated some of my own, so I thought I'd post about it here to show others how to do it too.

Verification

Before we get into the generation of hashes and signatures, we should first talk about verifying them. I've mentioned that this is good practice already, so it makes sense to briefly talk about verification first. First, let's download Nomad version 0.10.5. You can grab the files here. Download the following files:

  • nomad_0.10.5_SHA256SUMS - The hashes themselves
  • nomad_0.10.5_SHA256SUMS.sig - The GPG signature of the above file
  • nomad_0.10.5_linux_arm.zip - Nomad itself, for the ARM architecture (feel free to pick whichever one you like and adapt these instructions accordingly)

Verifying the hashes is easier, so let's do that first. We can see from the filenames that we have SHA 256 hashes, so we'll want the sha256sum command. Windows users will need to use the Windows Subsystem for Linux, or setup an msys environment:

sha256sum --ignore-missing --check nomad_0.10.5_SHA256SUMS

It should output an OK message and return an exit code of zero (to check this in a script, you can do echo "$?" directly after running it to check the exit code). 99% of the time this check will succeed, but you'll be glad that you checked it the 1% of the time it fails.
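
In a script, that check might look something like this minimal sketch:

#!/usr/bin/env bash
# Verify the downloaded hashes, and bail out if anything doesn't match
if sha256sum --ignore-missing --check nomad_0.10.5_SHA256SUMS; then
    echo "Hashes OK";
else
    echo "Hash check FAILED - do not use these files!" >&2;
    exit 1;
fi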

Checking the hashes here is the bit that ensures that the files haven't been corrupted. Next, we'll verify the GPG signature. This is the bit that ensures that the files we've downloaded were actually originally released by who we think they were. Do that like this:

gpg --verify nomad_0.10.5_SHA256SUMS.sig

Doing this, you may get an error message telling you that it can't verify the signature because you haven't got the public key of the signer imported into your keyring. To remedy this, look for the bit that tells you the key id (e.g. using RSA key 51852D87348FFC4C). Copy it, and then do this:

gpg --recv-keys 51852D87348FFC4C

This will download it and import the key id into your local GPG keyring. Then re-run the gpg --verify command above, and it should work.

Generation

Now that we know how to verify a signature, let's generate our own. First, put the files you want to hash into a directory and cd into it in your terminal. Then, let's generate the hash file:

# Hash files
find . -type f -not -name "*.SHA256" -print0 | xargs -0 -n1 -I{} -P"$(nproc)" sha256sum -b "{}" >HASHES.SHA256

We use find here to locate all the files (other than the hash file itself), and then pass them to xargs, which calls sha256sum to hash the files in question. Finally, we write the hashes to HASHES.SHA256.

Next, let's generate a GPG signature for the hash file. For this, you'll need a GPG key. That's out of scope of this post really, but this tutorial looks like it will show you how to do it. Note that in order for other people to verify the GPG signature you create, you'll probably need to upload your GPG public key to a keyserver (the article I link to shows you how to do this too).

Once done, generate the signature like this:

# Sign the hashes
gpg --sign --detach-sign --armor HASHES.SHA256

Specifically, we generate a detached signature here - meaning that it's in a separate file to the file that is being signed. The --armor there just means to wrap the signature in base64 encoding (I'm pretty sure) and some text so that it's not a raw binary file that might confuse the uninitiated.

Finally, let's verify the signature we just created - just in case:

# Verify the signature, and check we used the right key
gpg --verify HASHES.SHA256.asc

If all's good, it should tell you that the signature is ok. If you've got multiple keys, ensure that you signed it with the correct key here. GPG will sign things with the key you have marked as the default key.
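
If you want to be explicit about which key gets used, gpg's --local-user (or -u) option lets you pick one at signing time - for example:

# Sign with a specific key instead of the default one
gpg --sign --detach-sign --armor --local-user "YOUR_KEY_ID_HERE" HASHES.SHA256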

Found this interesting? Having trouble? Want to say hi? Comment below!

PhD Update 3: Simulating simulations with some success

Hey there! Welcome to another PhD update blog post. The last time I posted, I was still working away at getting the rainfall radar data downloader working as intended.

Thankfully, since then I've managed to get it to complete (wow, that took much longer than expected) - and I've now turned my attention to running the physics-based simulation, and beginning to implement the AI(s) that will (hopefully) implicitly learn the parameters of the model in question.

Physics-based simulation patching

Getting to this point, as you might imagine, wasn't quite as straight-forward as I initially thought. The physics-based model I'm (currently) using is HAIL-CAESAR, a (supposedly) high-performance version of CAESAR-Lisflood (yes, it's SourceForge shudder). Unfortunately, the format for the rainfall data that it takes in is especially space inefficient - after writing a converter, I found that my 4.5GiB compressed JSON stream files (1 per year) would have turned into about 66GB of uncompressed ASCII - theoretically speaking, by my calculations. I don't have that much disk space free - so clearly another approach is in order.

This approach I speak of is convincing HAIL-CAESAR to take the data in via the standard input. I initially tried using a FIFO (also known as a named pipe), but I ran into this bug in Node.js.

HAIL-CAESAR by default doesn't support taking the data in on the standard input though, so I had to patch HAIL-CAESAR to add support. I did this by getting it to interpret the filename - (a single dash) to mean "use the standard input instead", which from my previous experience seems to be an unofficial convention that lots of other programs follow. Perhaps at some point soon I should consider contributing my patch back to HAIL-CAESAR for others to enjoy.

Heightmap tweaking

With that sorted, I also had to mess around with the heightmap I obtained (through my University's "Digimap" service thingy) to get it to be precisely the same size as the rainfall radar data I have.

It turned out that the service I got the heightmap from isn't smart enough to give you precisely the bit you asked for - instead giving you just the tiles that overlap the area you specify. In the end I found myself with ~170 separate tiles (some of which I had to get after the fact because I found I was missing some) - so I ended up implementing a program to stitch them all back together again.

That program ended up much more complete as a standalone tool than I thought it would. I'm pretty sure that these heightmap files I've been dealing with are in a standard format, but I'm not aware of its name (if you know, I'd love to hear from you - post a comment below!). It's for these reasons that I ended up releasing it as a pair of packages on npm.

You can find them here:

  • terrain50 - OS Terrain 50 manipulation library
  • terrain50-cli - Command-line interface for the above to make it easy to manipulate heightmaps from the command line

I'll probably make a separate blog post about them at some point soon. For the curious, the API docs (there's a link in the README of the library package too) are automatically updated with my Laminar CI setup :D

Tensor trouble

It is with a considerable amount of anticipation that I'm finally reaching the really interesting part of this experiment. This week, I've started work on implementing a Temporal Convolutional Neural Network (see also this paper). A Temporal CNN is a network type I discovered recently that takes advantage of multiple 3-dimensional CNN layers to allow a CNN-based model to learn temporal-based relationships in a dataset.

I'm not sure how well it's going to work on my particular dataset, given that the existing papers I've found on it use it for classification-based tasks, but I'm pretty hopeful that, with some tweaking, it should perform pretty well. While I haven't yet finished writing up the dataset input logic, I have implemented the core model using the Tensorflow.js layers API:

asciicast

In the end I've decided to give Tensorflow.js another go (I don't think I mentioned it, but I attempted to use it for my Master's summer project, and it didn't work out so well), since I realised that I've implemented a good portion of the data processing code in Javascript (Node.js) already (as mentioned above). Interestingly, HAIL-CAESAR spits out files in the same format as the heightmap I've been working with, which makes processing even easier!

What's next

From here, I intend to finish up my Temporal CNN implementation and get it running on the data I have so far from the HAIL-CAESAR model (which unfortunately isn't a lot - so far I've only got ~8K 5-minute time-steps worth of output which, if I'm calculating correctly, is just 29 days worth of simulation). I'm probably going to have to swap HAIL-CAESAR out at some point though, because it's really slow. Or perhaps I just don't know how to use it properly (maybe I should find someone more experienced with it and ask them first).

Anyway, I'm also going to try implementing a model inspired by the Google rainfall radar nowcasting paper I mentioned in my last post in this series. With both of these implemented, I can start to compare them and see which one is better suited for the task of flood prediction. I might even implement the Grid LSTM model I saw too.

In addition, I have my PhD panel 1 review coming up soon too - so apparently I've got a list of things I need to do to prepare for that - including writing a ~5K word report. I'll probably do this pretty soon - I don't want to be rushing it at the last minute.

Found this interesting? Got a suggestion? Want to say hi? Comment below!

Cluster, Part 2: Grand Designs

In the last part of this series, I talked about my plans for building an ARM-based cluster, because I'm growing out of the Raspberry Pi 3B+ I currently have at home. Since then, I have decided to focus on the compute cluster first, as I have a reasonable amount of room left on the 1TB WD PiDrive I have attached to my existing Raspberry Pi 3B+.

Hardware

To this end, I have been busy ordering parts and organising things to get construction of the compute cluster side of things going. The most important part of the whole cluster is the compute boards themselves. I've decided to go with 4 x Raspberry Pi 4s with 4GB RAM each for the worker nodes, and 1 x Raspberry Pi 4 with 2GB of RAM as the controller (it would have been a 1GB RAM model, but a recent announcement changed my mind :D):

(Above: The Raspberry Pi 4s I'm going to be using. The colourful heatsink cases there are to dissipate heat passively if possible and reduce the need for the fan to run as often. The one with the smaller red heatsink is the controller node - I don't anticipate the load on that node being high enough to need a bigger more expensive heatsink)

My reasoning for Raspberry Pis is software support. They are hugely popular - and from experience I can tell that they are pretty well supported on the software side of things. Issues with hardware features not being supported by the operating system are minimal - and where issues do arise they are more often than not sorted out. Regular kernel security updates are also provided - something that isn't always a thing with Linux distributions for other boards I've noticed.

Although the nodes in the cluster are very important, they are far from the only component I'll need. I'll also need a way to power it all - for which I've settled on using a desktop ATX power supply (generously donated by my University).

(Above: The ATX power supply, with a few wires cut and other bits and bobs attached. As of this blog post I'm in the middle of wiring it up, so I haven't finished it yet)

This adds some additional complications though, because wiring an ATX power supply up to a fleet of Raspberry Pi 4s isn't as easy as it sounds. To do that, I've decided to wire the 5V and ground wires up to 5 USB type-a breakout boards, with a 3 amp self-resettable fuse on each live (red) wire. Then I can use 5 short type-a to type-c converter cables to power the Raspberry Pi 4s.

(Above: The extra bits and bobs laid out that I'll be using to wire the ATX power supply up to the USB type-a breakout boards. From left to right: 3A self-resettable fuses, 18 AWG wire, Wagos, header pins, and finally the USB type-a breakout boards themselves)

With power to the Raspberry Pis, the core compute hardware is in place. I still need a bunch of things around the edges though, such as a (very quiet) fan to keep it cool:

(Above: A Noctua NF-P14s redux-1200)

I found this particular fan on quietpc.com. While their prices and shipping are somewhat expensive (I didn't actually buy it from there - I got a better deal on Amazon instead), they are a great place to look into the different options available for really quiet fans. I'm pretty sensitive to noise, so having a quiet fan is an important part of my cluster design.

This one is the large 14cm model, so that it fits in front of all 5 Raspberry Pis if they are stood up on their sides and stacked horizontally. It takes 12 volts, so I'll be connecting it to the 12V rail from the ATX power supply. The fan speed is also controllable via PWM (pulse-width modulation), so I plan on using an Arduino (probably one of the Arduino Unos I've got lying around) to control it and present a serial interface or something to the Raspberry Pi that's acting as the controller node in the cluster.

Lastly, another extremely important part of any cluster is a solid switch. Without a great switch at the base of the network, you'll have all sorts of connection issues and the performance of the cluster will be degraded significantly. I'm anticipating that I'll want to transfer significant amounts of data around very quickly (e.g. Docker container images, and later large blocks of data during a storage cluster rebalance).

For this reason, I've bought myself a Netgear GS116v2. While it's unmanaged, I can't currently afford a more expensive managed switch at this time. It is however gigabit and also has an array of other features such as energy efficient ethernet (802.3az), full duplex gigabit (i.e. 32Gbps of bandwidth across all ports, which is enough for every port to be transmitting and receiving gigabit at the same time), and a silent fanless design.

My Netgear GS116v2

(Above: The switch I'll be using. I watched eBay and got it used for much less than it's available new)

Networking

Hardware isn't the only thing I've been thinking about. While I've been waiting for packages to arrive, I've also been planning out the software I'm going to use and how I'm going to network all my Pis together.

My plans on the networking side of things are subject to significant change depending on how many responsibilities I can convince my home router to give up, but I have drawn up a network diagram showing what I'm currently aiming towards:

An ideal-case scenario network diagram. Explained below.

The cluster is represented on the left half of the diagram. This will probably entail some considerable persuasion of my router to pull off, but a quick look reveals that it's (probably) possible with some trial-and-error.

The idea is that I have a separate subnet for the cluster than the rest of the home network. Then I can do strange stuff and fiddle with it (hopefully) without affecting everyone else on the network.

Software

Meanwhile, out of all the different aspects of building this cluster I've got the clearest picture as to the software I'm going to be using.

I've decided that I'm going to use a container-based system. I've looked at a number of different options (such as podman and Singularity) - but I'm currently of the opinion that Docker is the most suitable option for what I'm going for. It's not as enterprisey as Singularity, and it seems to be more mature than podman. It also has a huge library of prebuilt containers too - but for learning purposes I'm going to be writing almost all my container scripts from scratch - probably using some sort of Alpine Linux container as a base. If I ever run into a situation where Docker isn't suitable and I need something closer to a VM, I'll probably use LXC, which I believe sits on top of the same underlying container runtime that Docker does.

I'm anticipating that container-based tech is going to be great for managing the stuff that's running on my cluster - so you can expect more posts that go into some depth about how it all works and how I'm setting my system up in the future.

To complement my container-based tech, I'm also going to be using a workload orchestrator. The Viper High-Performance Computer I've recently gained access to has lots of nodes in it and uses Slurm for workload orchestration, but that seems more geared towards environments that have lots of jobs that each have a defined running time. Great for scientific simulations and other such things, but not so great for personal self-hosted applications and the like.

Instead, I'm probably going to use Nomad. It looks seriously cool, and an initial look at the documentation reveals that it's probably going to be much simpler and easier to understand than Kubernetes (see also), which seems to be the other competing software in the business. It also seems to integrate well with other programs made by the same company (Hashicorp), like Consul for service networking management (I'm hoping I can get DNS resolution for the services running on the cluster under control with it) and Vault for secret management (e.g. API keys, passwords, and other miscellaneous secrets) - all of which I'm going to install and experiment with (expect more on that soon).

All of those for now will be backed by an NFS share on all nodes in the cluster for the persistent volumes attached to running containers.

On the controller node I mentioned earlier I'm also going to be running a few extra items to aid in the management of the cluster:

  • A Docker registry, from which the worker nodes will be pulling containers for execution (worker nodes will not have access to the public Docker registry at hub.docker.com)
  • An apt caching proxy - probably apt-cacher-ng. Since all the nodes in the cluster are going to be using the same OS, have the same packages installed, and the same configuration settings etc, it doesn't make much sense for them to be downloading apt packages from the Internet every time - so I'll be caching them locally on the controller node
  • Potentially some sort of reverse proxy that sits in front of all the services running on the cluster, but I haven't decided on how this will fit into the larger puzzle just yet (more research is required). I'm already very familiar with Nginx, but I've seen Traefik recommended for dynamic container-based setups, so I'm going to investigate that too.

That about covers my high-level design ideas. As of the time of typing, the next thing I need to do is organise a case for it all to go in, fix the loose connections in the screw terminals (not pictured; they arrived after I took the pictures), and then find a place to put it....

Testing storage devices with f3

Some microSD cards (Above: Some microSD cards. Thankfully none of these are fake, but you never know.....)

Always test storage devices after you buy them. I don't just mean check to see if they work (though that's a good idea too), but also that they can actually store the amount of stuff that they advertise they can.

Recently, I bought myself 5 64GB microSD cards for my cluster (more on this very soon in a future blog post!). The first thing I did when I got them was test them to make sure that they could actually store 64GB of stuff. My tool of choice was f3, which stands for Fight Flash Fraud or Fight Fake Flash. I'm glad I did - because 3 of them turned out to be faulty. 2 of them were actually 32GB cards in disguise, and 1 of them wouldn't mount at all.

While this might be my first experience with fake or faulty storage devices, it's hardly an uncommon occurrence. Everything from microSD cards to flash drives - and even regular hard drives! - may be faulty upon arrival - or worse, appear fine at first and then a few months down the line start corrupting random data for no reason.

f3 is a suite of tools for testing storage devices to make sure they function properly. They work best as a destructive test - i.e. one that destroys existing data on the disk - so if you've got some data on the target disk you want to test, now is the time to back it up (hopefully this is something you've been doing already - more on that in another post if there's the demand).

f3 consists of 3 principle tools:

  • f3probe, which runs a fast test to check for issues (sadly I couldn't get this to work reliably)
  • f3write, which fills a disk with test files
  • f3read, which reads the test files back from disk and validates them

It's a real shame that I can't get f3probe to work reliably. Maybe at some point I'll implement my own version that writes data to every nth block of a device to test it more quickly than the f3write/f3read mechanism I'll explain below (if anyone knows of a better tool that works on Linux, please comment below!)

To test a device, you first need to write the test files to it. I've taken to reformatting the device as ext4 (the Linux filesystem) first:

sudo umount /dev/sdXY; # Unmount it if it's currently mounted
sudo mkfs.ext4 /dev/sdXY; # Format it to ext4

....where /dev/sdXY is the partition you want to format. This isn't mandatory, but it is a quick way of making sure a disk is empty.

Next, we need to write the test files to the device. If it isn't already, you'll need to mount it first. This can be done like so:

# If it's not mounted automatically:
sudo mkdir /media/YOUR_USERNAME_HERE/SOME_NAME_HERE;
sudo mount /dev/sdXY /media/YOUR_USERNAME_HERE/SOME_NAME_HERE;
f3write /media/YOUR_USERNAME_HERE/SOME_NAME_HERE

This might take a while - don't forget to replace the paths there with those specific to your setup. With the test files written to the disk, we need to read them back again to make sure they are valid:

f3read /media/YOUR_USERNAME_HERE/SOME_NAME_HERE

This will read them all back again, and then print a summary report at the bottom to tell you what it found. Ideally, it should show a big number of blocks as succeeded, and no blocks in any of the other failure categories.

Running multiple commands like this is effort though, so surely we can do better than this. With some simple shell scripting, we can run both commands at once:

location=/media/YOUR_USERNAME_HERE/SOME_NAME_HERE; f3write "${location}" && f3read "${location}"; alert

If you're on a machine with a graphical desktop, then the ; alert bit on the end should generate a desktop notification when it's done. For other users (e.g. over SSH), this should be removed. Just in case you have a graphical desktop (e.g. Ubuntu Desktop) and the alert bit doesn't work for you, append this to your ~/.bashrc file and restart your terminal:

# Add an "alert" alias for long running commands.  Use like so:
#   sleep 10; alert
alias alert='notify-send --urgency=low -i "$([ $? = 0 ] && echo terminal || echo error)" "$(history|tail -n1|sed -e '\''s/^\s*[0-9]\+\s*//;s/[;&|]\s*alert$//'\'')"'

....I forget where this is from exactly.

If you're not likely to be at your computer when it finishes, then there's still something you can do. Personally I use XMPP for personal messaging, so I thought it would be great if I could get a notification when it was done. Since I've already written xmppbridge for easily sending XMPP messages from the terminal, it was pretty trivial to write a shell script for my bin folder that would send me a message when the process was complete:

#!/usr/bin/env bash

# f3test: Runs f3 on the current directory.
# 
# Usage:
#     f3test "alerts@xmpp.example.com"
# 

destination="$1";

f3write .;
f3read .;

echo "Card testing complete in ${SECONDS}s" | xmppbridge --groupchat --destination "${destination}";

I called this script f3test, and put it in my ~/bin folder. To use it, first cd to the root of the device you want to test (/media/YOUR_USERNAME_HERE/SOME_NAME_HERE in the above examples), and then set a pair of environment variables to let it know how to login to an XMPP account to send a message:

export XMPP_JID="someone@bobsrockets.com"; # The JID to login with.
export XMPP_PASSWORD="weN33dM0reBoost3rs"; # The password to use when logging in

...remove the --groupchat in the script if it's not a groupchat you want it to send a message to (I have a personal group chat that's just between me and various bots that notify me about various aspect of the systems I manage). If you don't have an XMPP account yet, you can get one at any public server in the XMPP directory, or run your own (see also snikket, which is a distribution of Prosody that's designed to be extremely easy to setup & run)!

Of course, you could just as easily swap the xmppbridge call there with a different command to send a message via a different channel. For example mailx can send emails.
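
For instance, a rough equivalent using mailx might look like this (assuming your machine is already set up to send mail):

echo "Card testing complete in ${SECONDS}s" | mailx -s "f3 test complete" "someone@bobsrockets.com";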

Found this interesting? Got a better tool? Need some help? Comment below!
