
Making an auto-updated downmuxed copy of my music

I like to buy and own music. That way, if the service goes down, I still get to keep both my music and the rights thereto that I've paid for.

To this end, I maintain an offline collection of music tracks that I've purchased digitally. Recently, it's been growing quite large (~15GiB at the moment) - which is quite a bit of disk space. While this doesn't matter too much on my laptop, on my phone it's quite a different story.

For this reason, I wanted to keep a downmuxed copy of my music collection on my Raspberry Pi 3B+ file server that I can sync to my phone. Said Raspberry Pi already has ffmpeg installed, so I decided to write a script to automate the process. In this blog post, I'm going to walk you through the script itself and what it does - and how you can use it too.

I've decided on a standard downmuxed format of 256kbps MP3. You can choose anything you like - you just need to tweak the appropriate lines in the script.

First, let's outline what we want it to do:

  1. Convert anything that isn't an mp3 to 256kbps mp3 (e.g. ogg, flac)
  2. Downmux mp3 files that are at a bitrate higher than 256kbps
  3. Leave mp3s that are at a bitrate lower than (or equal to) 256kbps alone
  4. Convert and optimise album art to 256x256
  5. Copy any unknown files as-is
  6. If the file exists in the target directory already, don't re-convert it again
  7. Max out the system resources when downmuxing to get it done as fast as possible

With this in mind, let's start outlining a script:

#!/usr/bin/env bash

input="${DIR_INPUT:-/absolute/path/to/Music}"
output="${DIR_OUTPUT:-/absolute/path/to/Music-Portable}"

export input;
export output;

temp_dir="$(mktemp --tmpdir -d "portable-music-copy-XXXXXXX")";

on_exit() {
    rm -rf "${temp_dir}";
}
trap on_exit EXIT;

# Library functions go here

# $1    filename
process_file() {
    filename="${1}";
    extension="${filename##*.}";

    # Process file here
}

export temp_dir;
export -f process_file;

cd "${input}" || { echo "Error: Failed to cd to input directory"; exit 1; };
find . -type f -print0 | nice -n20 xargs -P "$(nproc)" -0 -n1 -I{} bash -c 'process_file "{}"';

# Cleanup here....

Very cool. At the top of the script, we define the input and output directories we're going to work on. We use the ${VARIABLE_NAME:-default_value} syntax to allow for changing the input and output directories on the fly with the DIR_INPUT and DIR_OUTPUT environment variables.

Next, we create a temporary directory, and define an exit trap to ensure it gets deleted when the script exits (regardless of whether the exit is clean or not).

Then, we define the main driver function that will process a single file. This is called by xargs a little further down - which takes the file list in from a find call. The cd is important there, because we want the file paths from find to be relative to the input directory for easier mangling later. The actual process_file call is wrapped in bash -c '', because being a bash function it can't be called by xargs directly - so we have to export -f it and wrap it as shown.
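
If the export -f pattern is new to you, here's a tiny self-contained demonstration of just that mechanism (nothing to do with music):

# Define a function, export it, then call it from the child bash processes spawned by xargs
greet() { echo "hello, ${1}"; }
export -f greet;
printf 'alice\0bob\0' | xargs -0 -n1 -I{} bash -c 'greet "{}"';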

Next, we need to write some functions to handle converting different file types. First, let's write a simple copy function:

# $1    Source
# $2    Target
do_copy() {
    source="${1}";
    target="${2}";

    echo -n "cp ";
    cp "${source}" "${target}";
}

All it does is call cp, but it's nice to abstract like this so that if we wanted to add extra features (e.g. uploading via sftp or something) later, it's not as much of a bother.

We also need to downmux audio files and convert them to mp3. Let's write a function for that too:


# $1    Source
# $2    Target
do_downmux() {
    source="${1}";
    target="${2}";

    set +e;
    ffmpeg -hide_banner -loglevel warning -nostats -i "${source}" -vn -ar 44100 -b:a 256k -f mp3 "${target}";
    exit_code="${?}";
    if [[ "${exit_code}" -ne 0 ]] && [[ -f "${target}" ]]; then
        rm "${target}";
    fi
    return "${exit_code}";
}

It's got the same argument signature as do_copy, but it downmuxes instead of copying directly. The ffmpeg call is the line that does the magic. It looks complicated, but it's actually pretty logical. Let's break down all those arguments:

Argument Purpose
-hide_banner Hides the really rather wordy banner at the top when ffmpeg starts up
-loglevel warning Hides everything but warning messages to avoid too much unreadable output when converting many tracks at once
-nostats As above
-i "${source}" Specifies the input file
-vn Strips any video tracks found
-ar 44100 Force the sampling rate to 44.1KHz, just in case it's sampled higher
-b:a 256k Sets the output bitrate to 256kbps (change this bit if you like)
-f mp3 Output as mp3
"${target}" Write the output to the target location

That's not so bad, right? After calling it, we also need to capture the exit code. If it's not 0, then ffmpeg encountered some kind of issue. If so, we delete any output files it creates and return the same exit code - which we handle elsewhere.

Finally, we need a function to optimise images. For this I'm using optipng and jpegoptim to handle optimising JPEGs and PNGs respectively, and ImageMagick for the resizing operation.


# $1    Source
compress_image() {
    source="${1}";

    temp_file_png="$(mktemp --tmpdir="${temp_dir}" XXXXXXX.png)";
    temp_file_jpeg="$(mktemp --tmpdir="${temp_dir}" XXXXXXX.jpeg)";

    convert "${source}" -resize 256x256\> "${temp_file_jpeg}" >&2 &
    convert "${source}" -resize 256x256\> "${temp_file_png}" >&2 &
    wait

    jpegoptim --quiet --all-progressive --preserve "${temp_file_jpeg}" >&2 &
    optipng -quiet -fix -preserve "${temp_file_png}" >&2 &
    wait

    read -r size_png _ < <(wc --bytes "${temp_file_png}");
    read -r size_jpeg _ < <(wc --bytes "${temp_file_jpeg}");
    if [[ "${size_png}" -gt "${size_jpeg}" ]]; then
        # JPEG is smaller
        rm -rf "${temp_file_png}";
        echo "${temp_file_jpeg}";
    else
        # PNG is smaller
        rm -rf "${temp_file_jpeg}";
        echo "${temp_file_png}";
    fi
}

Unlike the previous functions, this one only takes a source file. It converts it using that temporary directory we created earlier, and echoes the filename of the smallest resulting format.

It's done in 2 stages. First, the source file is resized to 256x256 (maintaining aspect ratio, and avoiding upscaling smaller images) and written as both a JPEG and a PNG.

Then, jpegoptim and optipng are called on the resulting files. Once done, the filesizes are compared and the filepath to the smallest of the 2 is echoed.

With these in place, we can now write the glue that binds them to the xargs call by filling out process_file. Before we do though, we need to tweak the export statements from earlier to export our library functions we've written - otherwise process_file won't be able to access them since it's wrapped in bash -c '' and xargs. Here's the full list of export directives (directly below the end of process_file):

export temp_dir;
export -f process_file;
export -f compress_image;
export -f do_downmux;
export -f do_copy;
# $1    filename
process_file() {
    filename="${1}";
    extension="${filename##*.}";

    orig_destination="${output}/${filename}";
    destination="${orig_destination}";

    echo -n "[file] ${filename}: ";

    do_downmux=false;
    # Downmux, but only if the bitrate is above 256k
    if [[ "${extension}" == "flac" ]] || [[ "${extension}" == "ogg" ]] || [[ "${extension}" == "mp3" ]]; then
        probejson="$(ffprobe -hide_banner -v quiet -show_format -print_format json "${filename}")";
        is_above_256k="$(echo "${probejson}" | jq --raw-output '(.format.bit_rate | tonumber) > 256000')";
        exit_code="${?}";
        if [[ "${exit_code}" -ne 0 ]]; then
            echo -n "ffprobe failed; falling back on ";
            do_downmux=false;
        elif [[ "${is_above_256k}" == "true" ]]; then
            do_downmux=true;
        fi
    fi

    if [[ "${do_downmux}" == "true" ]]; then
        echo -n "downmuxing/";
        destination="${orig_destination%.*}.mp3";
    fi

    # ....
}

We use 2 variables to keep track of the destination location here, because we may or may not successfully manage to convert any given input file to a different format with a different file extension.

We also use ffprobe (part of ffmpeg) and jq (a JSON query and manipulation tool) to detect the bitrate of input audio files, so that we can avoid re-encoding files that are already at or below 256kbps. Once we've determined that, we rewrite the destination filename to use the .mp3 extension.

Next, we need to deal with the images. We do this in a preprocessing step that comes next:

case "${extension}" in 
    png|jpg|jpeg|JPG|JPEG )
        compressed_image="$(compress_image "${filename}")";
        compressed_extension="${compressed_image##*.}";
        destination="${orig_destination%.*}.${compressed_extension}";
        ;;
esac

If the file is an image, we run it through the image optimiser. Then we look at the file extension of the optimised image, and alter the destination filename accordingly.

if [[ -f "${destination}" ]] || [[ -f "${orig_destination}" ]]; then
    echo "exists in destination; skipping";
    return 0;
fi

destination_dir="$(dirname "${destination}")";
if [[ ! -d "${destination_dir}" ]]; then
    mkdir -p "${destination_dir}";
fi

Next, we look to see if there's a file in the destination already. If so, then we skip out and don't continue processing the file. If not, we make sure that the parent directory exists to avoid issues later.

case "${extension}" in
    flac|mp3|ogg )
        # Use ffmpeg, but only if necessary
        if [[ "${do_downmux}" == "false" ]]; then
            do_copy "${filename}" "${orig_destination}";
        else
            echo -n "ffmpeg ";
            do_downmux "${filename}" "${destination}";
            exit_code="$?";
            if [[ "${exit_code}" -ne 0 ]]; then
                echo "failed, exit code ${exit_code}: falling back on ";
                do_copy "${filename}" "${orig_destination}";
            fi
        fi
        ;;

    png|jpg|jpeg|JPG|JPEG )
        mv "${compressed_image}" "${destination}";
        ;;

    * )
        do_copy "${filename}" "${destination}";
        ;;
esac
echo "done";

Finally, we get to the main case statement that handles the different files. If it's an audio file, we run it through do_downmux (which we implemented earlier) - but only if it would benefit us. If it's an image, we move the converted image from the temporary directory that was optimised earlier, and if we can't tell what it is, then we just copy it over directly.

That's process_file completed. Now all we're missing are a few clean-up tasks that make it more cron friendly:

echo "[ ${SECONDS} ] Setting permissions";
chown -R root:root "${output}";
chmod -R 0644 "${output}";
chmod -R ugo+X "${output}";

echo "[ ${SECONDS} ] Portable music copy update complete";

This goes at the end of the file, and resets the permissions on the output directory to avoid issues. It ensures that everyone can read it, but only root can write to it - as any modifications should be made to the original version, and not the portable copy.

That completes this script. By understanding how it works, hopefully you'll be able to apply it to your own specific circumstances.

For example, you could call it via cron. Edit your crontab:

sudo crontab -e

...and paste in something like this:

5 4 * * *   /absolute/path/to/script.sh

This won't work if your device isn't turned on at the time, however. In that case, there is an alternative. Simply drop the script (without a file extension) into /etc/cron.daily or /etc/cron.weekly and mark it executable, and anacron will run your job daily or weekly respectively.
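
For example, installing it as a daily anacron job might look something like this (the destination filename here is just an example):

sudo cp /absolute/path/to/script.sh /etc/cron.daily/portable-music-copy
sudo chmod +x /etc/cron.daily/portable-music-copy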



Website change detection with headless Firefox and ImageMagick

This wasn't the script I had in mind in the previous blog post (so you can look forward to another blog post about it), but have you ever wanted to know when a web page changes? If it does change, it's almost impossible to tell where on the page it's changed. Recently, I was thinking about the problem, and realised a few things:

  • Firefox can be operated headlessly (with --headless) to take screenshots
  • ImageMagick must be advanced enough to diff images

With this in mind, I set about implementing a script. Before we continue, here's an example diff image:

It's rather tall because of the webpage I chose, but the bits that have changed appear in red. The script I've written also generates an animated PNG showing the difference too:

Again, it's very tall because of the page I tested with, but I think it's pretty cool!

If you'd like to check the script out for yourself, you can find it in the following git repository: sbrl/url-diff

For the curious, the script in question is written in Bash. It uses apcalc (available in Debian / Ubuntu based Linux distributions with sudo apt install apcalc) to crunch the numbers, and headless Firefox + Imagemagick as described above to take the screenshots and do the image processing. It should in theory work on Windows, but you'll need to jump through a number of hoops:

  • Install Git Bash, and call url-diff.sh from it
  • Install ImageMagick and make sure the binaries are in your PATH
  • Install Firefox and make sure firefox is in your PATH
  • Explicitly set the URLDIFF_STORAGE_DIR environment variable when calling the script (do this by prefixing the command at the bottom of this post with URLDIFF_STORAGE_DIR=path/to/directory)

With my fancy new embed system, I can show you the code behind it:

(Can't see the above? Check it out in the git repository.)

I'm working on line numbers (sadly the author of highlight.js doesn't like them, so an alternative solution is required).

Anyway, the basic layout of the script is as follows:

  1. First, the settings are read in and the default values set
  2. Then, I define some utility functions.
    • The calculate_percentage_colour function is integral to the image change detection algorithm. It counts the percentage of an image that is a given colour.
  3. Next, the help text is displayed if necessary
  4. The case statement that follows allows multiple subcommands to be implemented. Currently I only have a check subcommand, but you never know!
  5. Inside this case statement, the screenshots are taken and compared.
    • A new screenshot is taken with headless Firefox
    • If we don't have a screenshot stored away already, we stash the new screenshot and exit
    • If we do have a pre-existing screenshot, we continue with the comparison, starting by generating a diff image where pixels that have changed are given 1 colour, and pixels that haven't changed another
    • It's at this point that calculate_percentage_colour is called to calculate how much of the image has changed - the diff image is passed in and the changed pixels are counted
    • If more than 2% (by default) has changed, then we continue on to generate the output images
    • The first output image consists of the new screenshot with the diff image overlaid - this is generated with some ImageMagick wizardry: -compose over -composite
    • The second is an animated PNG comprised of the old and new screenshots. This is generated with ffmpeg - which supports animated PNGs
    • Finally, the old screenshot that we have stored away is replaced with the new one

It sounds more complicated than it is - hopefully my above explanation makes sense (post a comment below if you're confused about something!).
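
To give a flavour of the core steps, here's a simplified sketch (with made-up filenames) of roughly what the screenshot and diff stages boil down to - the real commands in url-diff differ in their details:

firefox --headless --window-size=1920 --screenshot new.png "https://example.com/"
compare old.png new.png diff.png || true                        # changed pixels are highlighted in diff.png; compare exits non-zero when the images differ, hence the || true
convert new.png diff.png -compose over -composite overlay.png   # overlay the diff onto the new screenshot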

You can call the script like so:

git clone https://git.starbeamrainbowlabs.com/sbrl/url-diff.git
cd url-diff;
./url-diff.sh check URL_HERE path/to/output_diff.png path/to/output.apng

....replacing URL_HERE with the URL to check, and the paths with the places you'd like to write the output images to.

Cluster, Part 8: The Shoulders of Giants | NFS, Nomad, Docker Registry

Welcome back! It's been a bit of a while, but now I'm back with the next part of my cluster series. As a refresher, here's a list of all the parts in the series so far:

In this one, we're going to look at running our first job on our Nomad cluster! If you haven't read the previous posts in this series, you'll probably want to go back and read them now, as we're going to be building on the infrastructure and groundwork we've laid so far.

Before we get to that though, we need to sort out shared storage - as we don't know which node in the cluster tasks will be running on. In my case, I'll be setting up NFS. This is hardly the only solution to the issue though - other options include:

If you're going to choose NFS like me though, you should be warned that it's neither encrypted nor authenticated. You should ensure that NFS is only run on a trusted network. If you don't have a trusted network, use the WireGuard Mesh VPN trick in part 4 of this series.

NFS: Server

Setting up a server is relatively easy. Simply install the relevant package:

sudo apt install nfs-kernel-server

....edit /etc/exports to look something like this:

/mnt/somedrive/subdirectory 10.1.2.0/24(rw,async,no_subtree_check)

/mnt/somedrive/subdirectory is the directory you'd like clients to be able to access, and 10.1.2.0/24 is the IP range that should be allowed to talk to your NFS server.
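
If the NFS server is already running when you edit /etc/exports, you can apply the new export list without restarting anything:

sudo exportfs -ra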

Next, open up the relevant ports in your firewall (I use UFW):

sudo ufw allow nfs

....and you're done! Pretty easy, right? Don't worry, it'll get harder later on :P

NFS: Client

The client, in theory, is relatively straightforward too. This must be done on all nodes in the cluster - except the node that's acting as the NFS server (although having the NFS server as a regular node in the cluster is probably a bad idea). First, install the relevant package:

sudo apt install nfs-common

Then, update /etc/fstab and add the following line:

10.1.2.10:/mnt/somedrive/subdirectory   /mnt/shared nfs auto,nofail,noatime,intr,tcp,bg,_netdev 0   0

Again, 10.1.2.10 is the IP of the NFS server, and /mnt/somedrive/subdirectory must match the directory exported by the server. Finally, /mnt/shared is the location that we're going to mount the directory from the NFS server to. Speaking of, we should create that directory:

sudo mkdir /mnt/shared

I have yet to properly tune the options there on both the client and the server. If I find that I have to change anything here, I'll both come back and edit this and mention it in a future post that I did.

From here, you should be able to mount the NFS share like so:

sudo mount /mnt/shared

You should see the files from the NFS server located in /mnt/shared. You should check to make sure that this auto-mounts it on boot too (that's what the auto and _netdev are supposed to do).
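
One way to partially test this without a full reboot (it doesn't exercise the boot ordering itself, just the fstab entry) is to unmount the share and let fstab mount it again:

sudo umount /mnt/shared
sudo mount -a           # mounts everything in fstab that isn't already mounted
findmnt /mnt/shared     # confirm it's back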

If you experience issues on boot (like me), you might see something like this buried in /var/log/syslog:

mount[586]: mount.nfs: Network is unreachable

....then we can quickly hack around this by creating a script in the directory /etc/network/if-up.d. Something like this should fix the issue:

#!/usr/bin/env bash
mount /mnt/shared

Save this to /etc/network/if-up.d/cluster-shared-nfs for example, not forgetting to mark it as executable:

sudo chmod +x /etc/network/if-up.d/cluster-shared-nfs

Alternatively, there's autofs that can do this more intelligently if you prefer.

First Nomad Job: Docker Registry

Now that we've got shared storage online, it's time for the big moment. We're finally going to start our very first job on our Nomad cluster!

It's going to be a Docker registry, and in my very specific case I'm going to be marking it as insecure (gasp!) because it's only going to be accessible from the WireGuard VPN - which I figure provides the encryption and authentication for us to get started reasonably simply without jumping through too many hoops. In the future, I'll probably revisit this in a later post to tighten things up.

Tasks on a Nomad cluster take the form of a Nomad job file. These can be written in JSON or HCL (HashiCorp Configuration Language). I'll be using HCL here, because it's easier to read and we're not after machine legibility at this stage.

Nomad job files work a little bit like Nginx config files, in that they have nested sequences of blocks in a hierarchical structure. They loosely follow the following pattern:

job > group > task

The job is the top-level block that contains everything else. tasks are the items that actually run on the cluster - e.g. a Docker container. groups are a way to logically group tasks in a job, and are not required as far as I can tell (but we'll use one here anyway just for illustrative purposes). Let's start with the job spec:

job "registry" {
    datacenters = ["dc1"]
    # The Docker registry *is* pretty important....
    priority = 80

    # If this task was a regular task, we'd use a constraint here instead & set the weight to -100
    affinity {
        attribute   = "${attr.class}"
        value       = "controller"
        weight      = 100
    }

    # .....

}

This defines a new job called registry, and it should be pretty straightforward. We don't need to worry about the datacenters definition there, because we've only got the 1 (so far?). We set a priority of 80, and get the job to prefer running on nodes with the controller class (though I observe that this hasn't actually made much of a difference to Nomad's scheduling algorithm at all).

Let's move on to the real meat of the job file: the task definition!

group "main" {
    task "registry" {
        driver = "docker"

        config {
            image = "registry:2"
            labels { group = "registry" }

            volumes = [
                "/mnt/shared/registry:/var/lib/registry"
            ]

            port_map {
                registry = 5000
            }
        }

        resources {
            network {
                port "registry" {
                    static = 5000
                }
            }
        }

        # .......
    }
}

There's quite a bit to unpack here. The task itself uses the Docker driver, which tells Nomad to run a Docker container.

In the config block, we define the Docker driver-specific settings. The Docker image we're going to run is registry:2, where registry is the image name and 2 is the tag. This will be automatically pulled from the Docker Hub. Future tasks will pull Docker images from our very own private Docker registry, which we're in the process of setting up :D

We also mount a directory into the Docker container to allow it to persist the images that we push to it. This is done through a volume, which is the Docker word for bind-mounting a specific directory on the host system into a given location inside the guest container. I'm (currently) going to store the Docker registry data at /mnt/shared/registry - update this if you want to store it elsewhere. Remember that this needs to be a location on your shared storage, as we don't know in advance which node in the cluster the Docker registry is going to run on.

The port_map allows us to tell Nomad the port(s) that our service inside the Docker container listens on, and attach a logical name to them. We can then expose them in the resources block. In this specific case, I'm forcing Nomad to statically allocate port 5000 on the host system to point to port 5000 inside the container, for reasons that will become apparent later. This is done with the static keyword there. If we didn't do this, Nomad would allocate a random port number (which is normally what we'd want, because then we can run lots of copies of the same thing at the same time on the same host).

The last block we need to add to complete the job spec file is the service block. With a service block, Nomad will inform Consul that a new service is running, which in turn allows us to query it via DNS.

service {
    name = "${TASK}"
    tags = [ "infrastructure" ]

    address_mode = "host"
    port = "registry"
    check {
        type        = "tcp"
        port        = "registry"
        interval    = "10s"
        timeout     = "3s"
    }

}

The service name here is pulled from the name of the task. We tell Consul about the port number by specifying the logical name we assigned to it earlier.

Finally, we add a health check, to allow Consul to keep an eye on the health of our Docker registry for us. This will appear as a green tick if all is well in the web interface, which we'll be getting to in a future post. The health check in question simply ensures that the Docker registry is listening via TCP on the port it should be.

Here's the completed job file:

job "registry" {
    datacenters = ["dc1"]
    # The Docker registry *is* pretty important....
    priority = 80

    # If this task was a regular task, we'd use a constraint here instead & set the weight to -100
    affinity {
        attribute   = "${attr.class}"
        value       = "controller"
        weight      = 100
    }

    group "main" {

        task "registry" {
            driver = "docker"

            config {
                image = "registry:2"
                labels { group = "registry" }

                volumes = [
                    "/mnt/shared/registry:/var/lib/registry"
                ]

                port_map {
                    registry = 5000
                }
            }

            resources {
                network {
                    port "registry" {
                        static = 5000
                    }
                }
            }

            service {
                name = "${TASK}"
                tags = [ "infrastructure" ]

                address_mode = "host"
                port = "registry"
                check {
                    type        = "tcp"
                    port        = "registry"
                    interval    = "10s"
                    timeout     = "3s"
                }

            }
        }

        // task "registry-web" {
        //  driver = "docker"
        // 
        //  config {
        //      // We're going to have to build our own - the Docker image on the Docker Hub is amd64 only :-/
        //      // See https://github.com/Joxit/docker-registry-ui
        //      image = ""
        //  }
        // }
    }
}

Save this to a file, and then run it on the cluster like so:

nomad job run path/to/job/file.nomad

I'm as yet unsure as to whether Nomad needs the file to persist on disk to avoid it getting confused - so it's probably best to keep your job files in a permanent place on disk to avoid issues.
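
If you'd like to preview what Nomad intends to do before actually submitting the job, there's also a dry-run mode:

nomad job plan path/to/job/file.nomad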

Give Nomad a moment to start the job, and then you can check on its status like so:

nomad job status

This will print a summary of the status of all jobs on the cluster. To get detailed information about our new job, do this:

nomad job status registry

It should show that 1 task is running, like this:

ID            = registry
Name          = registry
Submit Date   = 2020-04-26T01:23:37+01:00
Type          = service
Priority      = 80
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
main        0       0         1        5       6         1

Latest Deployment
ID          = ZZZZZZZZ
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
main        1        1       1        0          2020-06-17T22:03:58+01:00

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created   Modified
XXXXXXXX  YYYYYYYY  main        4        run      running  6d2h ago  2d23h ago

Ignore the Failed, Complete, and Lost there in my output - I ran into some snags while learning the system and setting mine up :P

You should also be able to resolve the IP of your Docker registry via DNS:

dig +short registry.service.mooncarrot.space

mooncarrot.space is the root domain I've bought for my cluster. I highly recommend you do the same if you haven't already. Consul exposes all services under the service subdomain, so in the future you should be able to resolve the IP of all your services in the same way: service_name.service.DOMAIN_ROOT.
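
Consul also answers SRV queries, which additionally return the port number a service is registered on - handy later on for services that use dynamically allocated ports (again, substitute your own domain):

dig +short registry.service.mooncarrot.space SRV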

Take care to ensure that it's showing the right IP address here. In my case, it should be the IP address of the wgoverlay network interface. If it's showing the wrong IP address, you may need to carefully check the configuration of both Nomad and Consul. Specifically, start by checking the network_interface setting in the client block of your Nomad worker nodes from part 7 of this series.

Conclusion

We're getting there, slowly but surely. Today we've setup shared storage with NFS, and started our first Nomad job. In doing so, we've started to kick the tyres of everything we've installed so far:

  • wesher, our WireGuard Mesh VPN
  • Unbound, our DNS server
  • Consul, our service discovery superglue
  • Nomad, our task scheduler

Truly, we are standing on the shoulders of giants: a whole host of open-source software that thousands of people from across the globe have collaborated together to produce which makes this all possible.

Moving forwards, we're going to be putting that Docker registry to good use. More immediately, we're going to be setting up Fabio (whose documentation is only marginally better than Traefik's, but just good enough that I could figure out how to use it....) in order to take a peek at those cool web interfaces for Nomad and Consul that I keep talking about.

We're also going to be looking at setting up Vault for secret (and certificate, if all goes well) management.

Until then, happy cluster configuration! If you're confused about anything so far, please leave a comment below. If you've got a suggestion to make it even better, please comment also! I'd love to know.


Ensuring a Linux machine's network connection stays up with Bash

Recently, I had the unpleasant experience of my Lab machine at University dropping offline. It has a tendency to do this randomly - and normally I'd just reboot it myself, but since I'm working from home at the moment it meant that I couldn't go in to fix it. This unfortunately meant that I was stuck waiting for a generous technician to go in and reboot it for me.

With access now restored I decided that I really didn't want this to happen again, so I've written a simple Bash script to resolve the issue.

It works by checking for an Internet connection every hour by pinging starbeamrainbowlabs.com - and if it doesn't manage to do so successfully, then it will reboot. A simple concept, but I discovered a number of things that needed considering while writing it:

  1. To avoid detecting transient network issues, we should make multiple attempts before giving up and rebooting
  2. Those multiple attempts need to be delayed to be effective
  3. We mustn't reboot more than once an hour to avoid getting into a 'reboot loop'
  4. If we're running an experiment, we need a way to temporarily stop it from doing its checks that resumes automatically
  5. We could try to diagnose the network error or turn the networking off and on again, but if it gets stuck halfway through then we're locked out (very undesirable) - so it's easier / safer to just reboot

With these considerations in mind, I came up with this: ensure-network.sh (link to part of a GitHub Gist, as it's quite long)

This script requires Bash version 4+ and has a number of environment variables that can configure its behaviour:

Environment Variable Description
CHECK_EXTERNAL_HOST The domain name or IP address to ping to check the connection
CHECK_INTERVAL The interval to check the connection in seconds
CHECK_TIMEOUT Wait at most this long for a reply to our ping
CHECK_RETRIES Retry this many times before giving up and rebooting
CHECK_RETRY_DELAY Delay this many seconds in between retries
CHECK_DRY_RUN If true, then don't actually reboot (useful for testing)
CHECK_REBOOT_DELAY Leave at least this many minutes in between reboots
CHECK_POSTPONE_FILE If this file exists and has a recent last-modified time (mtime), don't actually reboot
CHECK_POSTPONE_MAXAGE The maximum age in minutes of the CHECK_POSTPONE_FILE to consider it fresh and avoid rebooting

With these environment variables, it covers the points in the above list. To expand on CHECK_POSTPONE_FILE: if, for example, I'm running an experiment and I don't want the machine to reboot in the middle of it, then I can simply run touch /path/to/postpone_file to delay network connection-related reboots for 7 days (by default). After this time, it will automatically start rebooting again if it drops off the network. This ensures that monitoring will always resume eventually - with a more manual system I'd forget to re-enable it and then lose access.

Another consideration is that the /var/cache directory must exist. This is because an empty tracking file is created there to keep track of when the last network connection-related reboot occurred.

With the script written, the next step is to have it run automatically on boot. For systemd-based systems such as my lab machine, a systemd service is the order of the day. This is relatively simple:

[Unit]
Description=Reboot if the network connection is down
After=network.target

[Service]
Type=simple
# Because it needs to be able to reboot
User=root
Group=root
EnvironmentFile=-/etc/default/ensure-network
ExecStartPre=/bin/sleep 60
ExecStart=/bin/bash "/usr/local/lib/ensure-network/ensure-network.sh"
SyslogIdentifier=ensure-access
StandardError=syslog
StandardOutput=syslog

[Install]
WantedBy=multi-user.target

(View the latest version in the GitHub Gist)

This assumes that the ensure-network.sh script is located at /usr/local/lib/ensure-network/ensure-network.sh. It also allows for an environment file to optionally be created at /etc/default/ensure-network, so that you can customise the parameters. Here's an example environment file:

CHECK_EXTERNAL_HOST=example.com
CHECK_INTERVAL=60

The above example environment file checks against example.com every minute instead of the default starbeamrainbowlabs.com every hour. You can, of course, specify any (or all) of the environment variables detailed above in the environment file if you wish.

That completes my setup - so hopefully I don't encounter any more network-related issues that lock me out of accessing my lab machine remotely! To install it yourself, you can do this:

# Create the directory for the script to live in
sudo mkdir /usr/local/lib/ensure-network
# Download the script & service file
sudo curl -L -o /usr/local/lib/ensure-network/ensure-network.sh https://gist.githubusercontent.com/sbrl/08e13f2ceedafe35ac7f8dbdfb8bfde7/raw/cc5ab4226472c08b09e448a257256936cc749193/ensure-network.sh
sudo curl -L -o /etc/systemd/system/ensure-network.service https://gist.githubusercontent.com/sbrl/08e13f2ceedafe35ac7f8dbdfb8bfde7/raw/adf5ed4009b3e1a09f857936fceb3581897072f4/ensure-network.service
# Start the service & enable it on boot
sudo systemctl daemon-reload
sudo systemctl start ensure-network.service
sudo systemctl enable ensure-network.service

You might need to replace the URLs there with the latest ones that download the raw content from the GitHub Gist.

Did you find this useful? Got a suggestion to make it better? Running into issues? Comment below!

Analysing logs with lnav

Before I forget about it, I want to make a note on here about lnav. It's available in the default Ubuntu repositories, and I discovered it a while back.

A screenshot of lnav at work

(Above: a screenshot of lnav. The pixellated bits are the IPs, which I've hidden for privacy.)

Essentially, it's a tool to make reading and analysing log files much easier. It highlights the interesting bits, and also allows you to filter log lines in or out with regular expressions. It even allows you to query your logs with SQLite if they are in any of the well-known formats that it can parse - and you can write your own log line parser definitions too with a JSON configuration file!
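
For example, pressing ; inside lnav opens the SQL prompt. Assuming your logs are in one of the built-in web access log formats (the table and column names vary by format), a query like this counts requests per client IP:

SELECT c_ip, count(*) AS hits FROM access_log GROUP BY c_ip ORDER BY hits DESC LIMIT 10;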

I find it a great tool to use every now and then to get an overview of the various devices that I manage, and to see if there are any issues I need to take care of. The error and warning message highlighting (while not perfect) is also rather useful for spotting the things that require my attention.

If you're on a Debian-based distribution of Linux, you should be able to install it like so:

sudo apt install lnav

Then, to analyse some log files:

lnav path/to/log/files

You can also use Bash's glob-star feature to specify multiple log files. It can automatically unpack gzipped logfiles too:

lnav /var/log/syslog*

Of course, don't forget to prefix with sudo if you require it to read a given logfile.

Pipes, /dev/shm, or a TCP socket: Which is faster?

I've been busy patching HAIL-CAESAR (a simplified 2D flood simulation program designed for HPC supercomputers) to make it more suitable for the scale of my PhD project, and as part of this I'm trying to use the standard input & output where possible to speed up data transfer for the pre and post-processing steps, since I need to convert the data to and from different formats.

As part of this, it crossed my mind that there are actually a number of different ways of getting data in and out of a program, so I decided to do a quick (relatively informal) test to see which was fastest.

In my actual project, I'm going to be doing the following data transfers:

  • From .jsonstream.gz files to a Node.js process
  • From the Node.js process to HAIL-CAESAR
  • From HAIL-CAESAR to another Node.js process (there's a LOT of data in this bit)
  • From that Node.js process to disk as PNG files

That's a lot of transferring. In particular the output of HAIL-CAESAR, which I'm currently writing directly to disk, appears to be absolutely enormous - due mainly to the hugely inefficient storage format used.

Anyway, the 3 mechanisms I'm putting to the test here are:

  • A pipe (e.g. writing to standard output)
  • Writing to a file in /dev/shm
  • A TCP socket

If anyone can think of any other mechanisms for rapid inter-process communication, please do get in touch by leaving a comment below.

Pipe

I'm simulating a pipe with the following code:

timeout --signal=SIGINT 30s dd if=/dev/zero status=progress | cat >/dev/null

The timeout --signal=SIGINT 30s bit lets it run for 30 seconds before stopping it with a SIGINT (the same as Ctrl + C). I'm reading from /dev/zero here, because I want to test the performance of the pipe and not be limited by the speed of random number generation if I were to use /dev/urandom.

Running this on my laptop resulted in a speed of ~396 MB/s.

/dev/shm

/dev/shm is the shared memory area on Linux - and is usually backed by a tmpfs file system (i.e. an in-memory ramdisk).

Here are the commands I'm using to test this:

dd if=/dev/zero of=/dev/shm/test-1gb bs=1024 count=1000000
dd if=/dev/shm/test-1gb of=/dev/null bs=1024 count=1000000

This writes a 1GB file to /dev/shm, and then reads it back again (to be consistent with the pipe test). To calculate the overall MB/s speed, we need to know the time it took to do the read and write operations. I observed the following:

Operation Speed Time
Write 692 MB/s 1.4788s
Read 890 MB/s 1.1501s

....so that's 2.6289s in total. Then, we can calculate the MB/s by dividing 1GB by the total time, giving us a total transfer speed of ~380 MB/s. This seemed quite variable though - as when I tested it the other day I got only ~273 MB/s.

TCP Socket

Finally, to test a TCP socket, I devised the following:

nc -l 8888 >/dev/null &
timeout --signal=SIGINT 30s dd status=progress if=/dev/zero | nc 127.0.0.1 8888

The first line sets up the listener, and the 2nd line is the sender. As before with the pipe test, I'm stopping it after 30 seconds. It took a moment to stabilise, but towards the end it levelled off at about ~360 MB/s.

Conclusion

After running the 3 tests, the results were as follows:

Test Speed
Pipe 396 MB/s
/dev/shm 380 MB/s
TCP Socket 360 MB/s

According to this, the pipe (i.e. writing to the standard output and reading from the standard input) is the fastest. This isn't particularly surprising (since the other methods have overhead), but interesting to test all the same. Here's a quick graph of that:

A quick bar chart of the above data

Of course, there are other considerations to take into account. For example, if you need scalable multi-core processing, then /dev/shm or TCP sockets (the latter especially, since Linux has a special mechanism allowing multiple processes to listen on the same port and load-balance between them) might be a better option - despite the additional overhead.

Other CPU architectures may have an effect on it too due to different CPU instructions being available - I ran these tests on Ubuntu 19.10 on the Intel Core i7-7500U in my laptop.

As of yet I'm unsure as to how much post-processing the data coming from HAIL-CAESAR will require - and whether it will require multiple processes to handle the load or not. I hope not - since HAIL-CAESAR is written in C++, and TCP sockets would be awkward and messy to implement (since you would probably have to use the low-level socket API, and I don't have any experience with networking in C++ yet) - and the HPC in question doesn't appear to have inotifywait installed to make listening for file writes on disk easier.

Quick File Management with Gossa

Recently a family member needed to access some documents at a remote location that didn't support USB flash drives. Awkward to be sure, but I did some searching around and found a nice little solution that I thought I'd blog about here.

At first, I thought about setting up Filestash - but I discovered that only installation through Docker is officially supported (if it's written in Go, then shouldn't it end up as a single binary? What's Docker needed for?).

Docker might be great, but for a quick solution to an awkward issue I didn't really want to go to the trouble of installing Docker and figuring out all the awkward plumbing problems for the first time. It definitely appeared to me that it's better suited to a setup where you're already using Docker.

Anyway, I then discovered Gossa. It's also written in Go, and is basically a web interface that lets you upload, download, and rename files (click on a file or directory's icon to rename).

A screenshot of Gossa listing the contents of my CrossCode music folder. CrossCode is awesome, and you should totally go and play it - after finishing reading this post of course :P

Is it basic? Yep.

Do the icons look like something from 1995? Sure.

(Is that Times New Roman I spy? I hope not)

Does it do the job? Absolutely.

For what it is, it's solved my problem fabulously - and it's so easy to setup! First, I downloaded the binary from the latest release for my CPU architecture, and put it somewhere on disk:

curl -o gossa -L https://github.com/pldubouilh/gossa/releases/download/v0.0.8/gossa-linux-arm

chmod +x gossa
sudo chown root: gossa
sudo mv gossa /usr/local/bin/gossa;

Then, I created a systemd service file to launch Gossa with the right options:

[Unit]
Description=Gossa File Manager (syncthing)
After=syslog.target rsyslog.service network.target

[Service]
Type=simple
User=gossa
Group=gossa
WorkingDirectory=/path/to/dir
ExecStart=/usr/local/bin/gossa -h [::1] -p 5700 -prefix /gossa/ /path/to/directory/to/serve
Restart=always

StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=gossa


[Install]
WantedBy=multi-user.target

(Top tip! Use systemctl cat service_name to quickly see the service file definition for any given service.)

Here I start Gossa listening on the IPv6 local loopback address on port 5700, set the prefix to /gossa/ (I'm going to be reverse-proxying it later on using a subdirectory of a pre-existing subdomain), and send the standard output & error to syslog. Speaking of which, we should tell syslog what to do with the logs we send it. I put this in /etc/rsyslog.d/gossa.conf:

if $programname == 'gossa' then /var/log/gossa/gossa.log
if $programname == 'gossa' then stop

After that, I configured log rotate by putting this into /etc/logrotate.d/gossa:

/var/log/gossa/*.log {
    daily
    missingok
    rotate 14
    compress
    delaycompress
    notifempty
    create 0640 root adm
    postrotate
        invoke-rc.d rsyslog rotate >/dev/null
    endscript
}

Very similar to the configuration I used for RhinoReminds, which I blogged about here.

Lastly, I configured Nginx on the machine I'm running this on to reverse-proxy to Gossa:

server {

    # ....

    location /gossa {
        proxy_pass http://[::1]:5700;
    }

    # ....

}

I've configured authentication elsewhere in my Nginx server block to protect my installation against unauthorised access (and you probably should too). All that's left to do is start Gossa and reload Nginx:

sudo systemctl daemon-reload
sudo systemctl start gossa
# Check that Gossa is running
sudo systemctl status gossa

# Test the Nginx configuration file changes before reloading it
sudo nginx -t
sudo systemctl reload nginx

Note that reloading Nginx is more efficient than restarting it, since it doesn't kill the process - it only reloads the configuration from disk. It doesn't matter much here, but in a production environment that receives a high volume of traffic, it's a great way to make configuration changes while avoiding dropping client connections.

In your web browser, you should see something like the image at the top of this post.

Found this interesting? Got another quick solution to an otherwise awkward issue? Comment below!

Ensure your SSH server is secure with SSH Check

We've got ssllabs.com for testing HTTPS servers to ensure they are setup to be secure, and personally I've been using it for years now (psst, starbeamrainbowlabs.com gets an A+!).

SSH servers are a very different story, however. While I've blogged about them before, I mainly focused on preventing unauthorised access to a server by methods such as password cracking attacks.

Now that I'm coming to the end of my MSc in Security and Distributed Computing, however, I've realised there's a crucial element missing here: the security of the connection itself. HTTPS isn't the only protocol with a complicated set of supported cipher suites that needs configuring correctly.

The solution here is to check the SSH server in the same way that we do for a HTTPS web server. For this though we need a tool to do this for us and tell us what's good and what's not about our configuration - which is where SSH Check comes in.

I discovered it recently. It starts connecting to an SSH server to gauge its configuration - after which it quickly disconnects, before the remote server asks it for credentials to log in.

A screenshot of a test of the example ssh server

Because SSH allows for every stage of the encryption process to be configured individually, SSH Check tests 4 main areas:

  • The key exchange algorithm (the algorithm used to exchange the secret key for symmetric encryption going forwards)
  • The algorithms used in the server's host SSH keys (the key whose fingerprint is shown to you when you first connect, asking you if you want to continue)
  • The encryption algorithm (the symmetrical encryption algorithm used after key exchange)
  • The MAC algorithm (the Message Authentication Code algorithm - used to ensure integrity of messages)

It displays whether each algorithm is considered safe or not, and which ones are widely considered to be either deprecated or contain backdoors. In addition, it also displays the technical names of each one so that you can easily reconfigure your SSH server to disable unsafe algorithms, which is nice (good luck deciphering the SSL Labs encryption algorithms list and matching it up to the list already configured in your web server......).
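
As a rough sketch of what that reconfiguration looks like, the relevant directives live in /etc/ssh/sshd_config. Note that the algorithm lists below are purely illustrative examples - build your own from SSH Check's output and your own requirements:

# Illustrative example only - substitute the algorithms you actually want to allow
KexAlgorithms curve25519-sha256,curve25519-sha256@libssh.org
HostKeyAlgorithms ssh-ed25519,rsa-sha2-512
Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com
MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com

Remember to restart the SSH daemon afterwards (e.g. sudo systemctl restart ssh on Ubuntu), and keep an existing session open while you test a new connection in case you lock yourself out.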

It also presents a bunch of other interesting information too, which is nice. It identified a number of potential issues with the way that I had SSH setup for starbeamrainbowlabs.com along with some suggested improvements, which I've now fixed.

If you have a server that you access via SSH, I recommend checking it with SSH Check - especially if you expose SSH publicly over the Internet.

Found this interesting? Got another testing tool you'd like to share? Comment below!

shunction: Self-hosted Azure Functions!

It's not 1st April, but if it was - this post would be the perfect thing to release! It's about shunction, a parody project I've written. While it is a parody, hopefully it is of some use to someone :P

The other week, I discovered Azure Functions - thanks to Rob Miles. In the same moment, I thought that it would be fairly trivial to build a system by which you could self-host it - and shunction was born.

Built upon the lantern build engine, shunction lets you keep a folder of executable scripts and execute them at will. It supports 4 modes of operation:

  • adhoc - One-off runs
  • cron - Regularly scheduled execution
  • inotify - Execute when something changes on disk
  • http - Listen for HTTP requests

Of course, the system can be upgraded at a later date to support additional operating modes. The last 2 in the list start a persistent and long-running process, which will only exit if terminated.

Shunction is designed to run on Linux systems. Because it relies on the shebang line, any executable script (be it a shell script, a Python script, a Node.js script, or otherwise) is supported.
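
A task script can therefore be as simple as something like this (a made-up example with hypothetical paths, not one taken from the shunction repository):

#!/usr/bin/env bash
# Hypothetical nightly backup task
echo "[$(date --iso-8601=seconds)] backup starting";
rsync -a /srv/data/ /mnt/backup/data/;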

If you've got multiple tasks you want your server to run and want to keep them all in one place, then shunction would allow you to do so. For example, you could keep the task scripts in a git repository that you clone down onto the server, and then use shunction to execute them as needed.

How to quickly run TUI programs via SSH

Hello, and welcome to another blog post! I hope everyone had a lovely and restful Easter.

Very often, I want to run a command on a remote machine via SSH and leave it in a terminal in 1 corner of my screen whilst I work in another terminal on that same machine.

Up until now, I've always SSHed into the machine in question and then run the command manually:

user@local:~$ ssh bob@bobsrockets.com
# .....
bob@bobsrockets.com:~$ sudo htop

This is fine, but it takes a moment to connect & setup the terminal on the remote end. What if there was a way to specify the command to run remotely?

Well, it turns out there is. SSH lets you specify the command to run on the remote server instead of the default shell:

ssh sean@seanssatellites.io apt search beanstalk

Sadly, this doesn't always yield the results expected. Colour disappears from the output, and sometimes things like htop (ssh bill@billsboosters.co.uk htop) and sudo (ssh edgar@edsengineering.eu sudo apt update) break altogether:

Error opening terminal: unknown.

I can't remember how I figured it out, but I discovered that the issue is that when you specify the command instead of letting the default shell initialise, it treats it as some sort of 'script-mode', and doesn't allocate a pseudo-terminal on the remote machine.

Thankfully, there's a way to force it to allocate a pseudo-terminal. This is done with the -t flag:

ssh -t bob@bobsrockets.com sudo htop

This then enables interactive commands to work as intended, and causes colour to be displayed again :D
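
The same trick is handy for dropping straight into a terminal multiplexer on the remote machine - for example, attaching to (or creating) a tmux session called main (hostname invented, as above):

ssh -t alice@alicesairships.com tmux new-session -A -s main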

Found this useful? Got another great SSH tip? Comment below!
