Mailing List Articles Atom Feed Comments Atom Feed Twitter Reddit Facebook

Tag Cloud

3d 3d printing account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blog bookmarklet booting bug hunting c sharp c++ challenge chrome os cluster code codepen coding conundrums coding conundrums evolved command line compilers compiling compression containerisation css dailyprogrammer data analysis debugging demystification distributed computing docker documentation downtime electronics email embedded systems encryption es6 features ethics event experiment external first impressions future game github github gist gitlab graphics hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs learning library linux lora low level lua maintenance manjaro network networking nibriboard node.js operating systems own your code pepperminty wiki performance phd photos php pixelbot portable privacy problem solving programming problems project projects prolog protocol protocols pseudo 3d python reddit redis reference releases rendering resource review rust searching secrets security series list server software sorting source code control statistics storage svg talks technical terminal textures thoughts three thing game three.js tool tutorial tutorials twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 xmpp xslt

Automatically downloading emails and extracting their attachments

I have an all-in-one printer that's also a scanner - specifically the Epson Ecotank 4750 (though annoyingly the automated document feeder doesn't support duplex). While it's a great printer (very eco-friendly, and the inks last for ages!), my biggest frustration with it is that it doesn't scan directly to an SMB file share (i.e. a Windows file share). It does support SANE though, which allows you to use it through a computer.

This is ok, but the ability to scan directly from the device itself without needing to use a computer was very convenient, so I set out to remedy this. The printer does have a cloud feature they call "Epson Connect", which allows one to upload to various cloud services such as Google Drive and Box, but I don't want to upload potentially sensitive data to such services.

Fortunately, there's a solution at hand - email! The printer in question also supports scanning to a an email address. Once the scanning process is complete, then it sends an email to the preconfigured email address with the scanned page(s) attached. It's been far too long since my last post about email too, so let's do something about that.

Logging in to my email account just to pick up a scan is clunky and annoying though, so I decided to automate the process to resolve the issue. The plan is as follows:

  1. Obtain a fresh email address
  2. Use IMAP IDLE to instantly download emails
  3. Extract attachments and save them to the output directory
  4. Discard the email - both locally and remotely

As some readers may be aware, I run my own email server - hence the reason why I wrote this post about email previously, so I reconfigured it to add a new email address. Many other free providers exist out there too - just make sure you don't use an account you might want to use for anything else, since our script will eat any emails sent to it.

Steps 2, 3, and 4 there took some research and fiddling about, but in the end I cooked up a shell script solution that uses fetchmail, procmail (which is apparently unmaintained, so I should consider looking for alternatives), inotifywait, and munpack. I've also packaged it into a Docker container, which I'll talk about later in this post.

To illustrate how all of these fit together, let's use a diagram:

A diagram showing how the whole process fits together - explanation below.

fetchmail uses IMAP IDLE to hold a connection open to the email server. When it receives notification of a new email, it instantly downloads it and spawns a new instance of procmail to handle it.

procmail writes the email to a temporary directory structure, which a separate script is watching with inotifywait. As soon as procmail finishes writing the new email to disk, inotifywait triggers and the email is unpacked with munpack. Any attachments found are moved to the output directory, and the original email discarded.

With this in mind, let's start drafting up a script. The first order of the day is configuring fetchmail. This is done using a .fetchmailrc file - I came up with this:

poll protocol IMAP port 993
    user "" with pass "PASSWORD_HERE"

...where is the email address you want to watch, is the domain part of said email address (everything after the @), and PASSWORD_HERE is the password required to login.

Save this somewhere safe with tight file permissions for later.

The other configuration file we'll need is one for procmail. let's do that one now:



Replace /tmp/maildir with the temporary directory you want to use to hold emails in. Save this as procmail.conf for later too.

Now we have the mail config files written, we need to install some software. I'm using apt on Debian (a minideb Docker container actually), so you'll need to adapt this for your own system if required.

sudo apt install ca-certificates fetchmail procmail inotify-tools mpack
# or, if you're using minideb:
install_packages ca-certificates fetchmail procmail inotify-tools mpack

fetchmail is for some strange reason extremely picky about the user account it runs under, so let's update the pre-created fetchmail user account to make it happy:

groupadd --gid 10000 fetchmail
usermod --uid 10000 --gid 10000 --home=/srv/fetchmail --uid=10000 --gi=10000 fetchmail
chown fetchmail:fetchmail /srv/fetchmail

fetchmail now needs that config file we created earlier. Let's update the permissions on that:

chmod 10000:10000 path/to/.fetchmailrc

If you're running on bare metal, move it to the /srv/fetchmail directory now. If you're using Docker, keep reading, as I recommend that this file is mounted using a Docker volume to make the resulting container image more reusable.

Now let's start drafting a shell script to pull everything together. Let's start with some initial setup:

#!/usr/bin/env bash

if [[ -z "${TARGET_UID}" ]]; then
    echo "Error: The TARGET_UID environment variable was not specified.";
    exit 1;
if [[ -z "${TARGET_GID}" ]]; then
    echo "Error: The TARGET_GID environment variable was not specified.";
    exit 1;
if [[ "${EUID}" -ne 0 ]]; then
    echo "Error: This Docker container must run as root because fetchmail is a pain, and to allow customisation of the target UID/GID (although all possible actions are run as non-root users)";
    exit 1;


fetchmail_uid="$(id -u "fetchmail")";
fetchmail_gid="$(id -g "fetchmail")";

temp_dir="$(mktemp --tmpdir -d "imap-download-XXXXXXX")";
on_exit() {
    rm -rf "${temp_dir}";
trap on_exit EXIT;

log_msg() {
    echo "$(date -u +"%Y-%m-%d %H:%M:%S") imap-download: $*";

This script will run as root, and fetchmail runs as UID 10000 and GID 10000, The reasons for this are complicated (and mostly have to do with my weird network setup). We look for the TARGET_UID and TARGET_GID environment variables, as these define the uid:gid we'll be setting files to before writing them to the output directory.

We also determine the fetchmail UID/GID dynamically here, and create a second temporary directory to work with too (the reasons for which will become apparent).

Before we continue, we need to create the directory procmail writes new emails to. Not because procmail won't create it on its own (because it will), but because we need it to exist up-front so we can watch it with inotifywait:

mkdir -p "${dir_newmail}";
chown -R "${fetchmail_uid}:${fetchmail_gid}" "${dir_mail_root}";

We're running as root, but we'll want to spawn fetchmail (and other things) as non-root users. Technically, I don't think you're supposed to use sudo in non-interactive scripts, and it's also not present in my Docker container image. The alternative is the setpriv command, but using it is rather complicated and annoying.

It's more powerful than sudo, as it allows you to specify not only the UID/GID a process runs as, but also the capabilities the process will have too (e.g. binding to low port numbers). There's a nasty bug one has to work around if one is using Docker too, so given all this I've written a wrapper function that abstracts all of this complexity away:

# Runs a process as another user.
# Ref
# $1    The UID to run the process as.
# $2    The GID to run the process as.
# $3-*  The command (including arguments) to run
run_as_user() {
    run_as_uid="${1}"; shift;
    run_as_gid="${1}"; shift;
    if [[ -z "${run_as_uid}" ]]; then
        echo "run_as_user: No target UID specified.";
        return 1;
    if [[ -z "${run_as_gid}" ]]; then
        echo "run_as_user: No target GID specified.";
        return 2;

    # Ref
    # WORKAROUND for `setpriv: libcap-ng is too old for "all" caps`, previously "-all" was used here
    # create a list to drop all capabilities supported by current kernel
    caps="$cap_prefix$(seq -s ",$cap_prefix" 0 "$(cat /proc/sys/kernel/cap_last_cap)")";

    setpriv --inh-caps="${caps}" --reuid "${run_as_uid}" --clear-groups --regid "${run_as_gid}" "$@";
    return "$?";

With this in hand, we can now wrap fetchmail and procmail in a function too:

do_fetchmail() {
    log_msg "Starting fetchmail";

    while :; do
        run_as_user "${fetchmail_uid}" "${fetchmail_gid}" fetchmail --mda "/usr/bin/procmail -m /srv/procmail.conf";

        if [[ "$exit_code" -eq 127 ]]; then
            log_msg "setpriv failed, exiting with code 127";
            exit 127;

        log_msg "Fetchmail exited with code ${exit_code}, sleeping 60 seconds";
        sleep 60

In short this spawns fetchmail as the fetchmail user we configured above, and also restarts it if it dies. If setpriv fails, it returns an exit code of 127 - so we catch that and don't bother trying again, as the issue likely needs manual intervention.

To finish the script, we now need to setup that inotifywait loop I mentioned earlier. Let's setup a shell function for that:

do_attachments() {
    while :; do # : = infinite loop
        # Wait for an update
        # inotifywait's non-0 exit code forces an exit for some reason :-/
        inotifywait -qr --event create --format '%:e %f' "${dir_newmail}";

        # Process new email here

Processing new emails is not particularly difficult, but requires a sub loop because:

  • More than 1 email could be written at a time
  • Additional emails could slip through when we're processing the last one
while read -r filename; do

    # Process each email

done < <(find "${dir_newmail}" -type f);

Finally, we need to process each email we find in turn. Let's outline the steps we need to take:

  1. Move the email to that second temporary directory we created above (since the procmail directory might not be empty)
  2. Unpack the attachments
  3. chown the attach

Let's do this in chunks. First, let's move it to the temporary directory:

log_msg "Processing email ${filename}";

# Move the email to a temporary directory for processing
mv "${filename}" "${temp_dir}";

The filename environment variable there is the absolute path to the email in question, since we used find and passed it an absolute directory to list the contents of (as opposed to a relative path).

To find the filepath we moved it to, we need to do this:

filepath_temp="${temp_dir}/$(basename "${filename}")"

This is important for the next step, where we unpack it:

# Unpack the attachments
munpack -C "${temp_dir}" "${filepath_temp}";

Now that we've unpacked it, let's do a bit of cleaning up, by deleting the original email file and the .desc description files that munpack also generates:

# Delete the original email file and any description files
rm "${filepath_temp}";
find "${temp_dir}" -iname '*.desc' -delete;

Great! Now we have the attachments sorted, now all we need to do is chown them to the target UID/GID and move them to the right place.

chown -R "${TARGET_UID}:${TARGET_GID}" "${temp_dir}";
chmod -R a=rX,ug+w "${temp_dir}";

I also chmod the temporary directory too to make sure that the permissions are correct, because otherwise the mv command is unable to read the directory's contents.

Now to actually move all the attachments:

# Move the attachment files to the output directory
while read -r attachment; do
    log_msg "Extracted attachment ${attachment}";
    chmod 0775 "${attachment}";
    run_as_user "${TARGET_UID}" "${TARGET_GID}" mv "${attachment}" "${target_dir}";
done < <(find "${temp_dir}" -type f);

This is rather overcomplicated because of an older design, but it does the job just fine.

With that done, we've finished the script. I'll include the whole script at the bottom of this post.


If you're running on bare metal, then you can skip to the end of this post. Because I have a cluster, I want to be able to run this thereon. Since said cluster works with Docker containers, it's natural to Dockerise this process.

The Dockerfile for all this is surprisingly concise:

(Can't see the above? View it on my personal Git server instead)

To use this, you'll need the following files alongside it:

It exposes the following Docker volumes:

  • /mnt/fetchmailrc: The fetchmailrc file
  • /mnt/output: The target output directory

All these files can be found in this directory on my personal Git server.


We've strung together a bunch of different programs to automatically download emails and extract their attachments. This is very useful as for ingesting all sorts of different files. Things I haven't covered:

  • Restricting it to certain source email addresses to handle spam
  • Restricting the file types accepted (the file command is probably your friend)
  • Disallowing large files (most 3rd party email servers do this automatically, but in my case I don't have a limit that I know of other than my hard disk space)

As always, this blog post is both a reference for my own use and a starting point for you if you'd like to do this for yourself.

If you've found this useful, please comment below! I find it really inspiring / motivating to learn how people have found my posts useful and what for.

Sources and further reading script

(Can't see the above? Try a this link, or alternatively this one (bash))

A much easier way to install custom versions of Python

Recently, I wrote a rather extensive blog post about compiling Python from source: Installing Python, Keras, and Tensorflow from source.

Since then, I've learnt of multiple other different ways to do that which are much easier as it turns out to achieve that goal.

For context, the purpose of running a specific version of Python in the first place was because on my University's High-Performance Computer (HPC) Viper, it doesn't have a version of Python new enough to run the latest version of Tensorflow.

Using miniconda

After contacting the Viper team at the suggestion of my supervisor, I discovered that they already had a mechanism in place for specifying which version of Python to use. It seems obvious in hindsight - since they are sure to have been asked about this before, they already had a solution in the form of miniconda.

If you're lucky enough to have access to Viper, then you can load miniconda like so:

module load python/anaconda/4.6/miniconda/3.7

If you don't have access to Viper, then worry not. I've got other methods in store which might be better suited to your environment in later sections.

Once loaded, you can specify a version of Python like so:

conda create -n py38 python=3.8

The -n py38 specifies the name of the environment you'd like to create, and can be anything you like. Perhaps you could use the name of the project you're working on would be a good idea. The python=3.8 is the version of Python you want to use. You can list the versions of Python available like so:

conda search -f python

Then, to activate the new environment, do this:

conda init bash
conda activate py38
exec bash

Replace py38 with the name of the environment you created above.

Now, you should have the specific version of Python you wanted installed and ready to use. You can also install packages with pip, and it should all come out in the wash.

For Viper users, further information about miniconda can be found here: Applications/Miniconda Last

Gentoo Project Prefix

Another option I've been made aware of is Gentoo's Project Prefix. Essentially, it installs Gentoo (a distribution of Linux) inside a directory without root privileges. It doesn't work very well on Ubuntu, however due to this bug, but it should work on other systems.

They provide a bootstrap script that you can run that helps you bootstrap the system. It asks you a few questions, and then gets to work compiling everything required (since Gentoo is a distribution that compiles everything from source).

If you have multiple versions of gcc available, try telling it about a slightly older version of GCC if it fails to install.

If you can get it to install, a Gentoo Prefix install allows the installation whatever software you like!


The last solution to the problem I'm aware of is pyenv. It automates the process of downloading and compiling specified versions of Python, and also updates you shell automatically. It does require some additional dependencies to be installed though, which could be somewhat awkward if you don't have sudo access to your system. I haven't actually tried it myself, but it may be worth looking into if the other 2 options don't work for you.


There's always more than 1 way to do something, and it's always worth asking if there's a better way if the way you're currently using seems hugely complicated.

Installing Python, Keras, and Tensorflow from source

I found myself in the interesting position recently of needing to compile Python from source. The reasoning behind this is complicated, but it boils down to a need to use Python with Tensorflow / Keras for some natural language processing AI, as Tensorflow.js isn't going to cut it for the next stage of my PhD.

The target upon which I'm aiming to be running things currently is Viper, my University's high-performance computer (HPC). Unfortunately, the version of Python on said HPC is rather old, which necessitated obtaining a later version. Since I obviously don't have sudo permissions on Viper, I couldn't use the default system package manager. Incredibly, pre-compiled Python binaries are not distributed for Linux either, which meant that I ended up compiling from source.

I am going to be assuming that you have a directory at $HOME/software in which we will be working. In there, there should be a number of subdirectories:

  • bin: For binaries, already added to your PATH
  • lib: For library files - we'll be configuring this correctly in this guide
  • repos: For git repositories we clone

Make sure you have your snacks - this was a long ride to figure out and write - and it's an equally long ride to follow. I recommend reading this all the way through before actually executing anything to get an overall idea as to the process you'll be following and the assumptions I've made to keep this post a reasonable length.

Setting up

Before we begin, we need some dependencies:

  • gcc - The compiler
  • git - For checking out the cpython git repository
  • readline - An optional dependency of cpython (presumably for the REPL)

On Viper, we can load these like so:

module load utilities/multi
module load gcc/10.2.0
module load readline/7.0

Compiling openssl

We also need to clone the openssl git repo and build it from source:

cd ~/software/repos
git clone git://;    # Clone the git repo
cd openssl;                                     # cd into it
git checkout OpenSSL_1_1_1-stable;              # Checkout the latest stable branch (do git branch -a to list all branches; Python will complain at you during build if you choose the wrong one and tell you what versions it supports)
./config;                                       # Configure openssl ready for compilation
make -j "$(nproc)"                              # Build openssl

With openssl compiled, we need to copy the resulting binaries to our ~/software/lib directory:

cp lib*.so* ~/software/lib;
# We're done, cd back to the parent directory
cd ..;

To finish up openssl, we need to update some environment variables to let the C++ compiler and linker know about it, but we'll talk about those after dealing with another dependency that Python requires.

Compiling libffi

libffi is another dependency of Python that's needed if you want to use Tensorflow. To start, go to the libgffi GitHub releases page in your web browser, and copy the URL for the latest release file. It should look something like this:

Then, download it to the target system:

cd ~/software/lib

Note that we do it this way, because otherwise we'd have to run the script which requires yet more dependencies that you're unlikely to have installed.

Then extract it and delete the tar.gz file:

tar -xzf libffi-3.3.tar.gz
rm libffi-3.3.tar.gz

Now, we can configure and compile it:

./configure --prefix=$HOME/software
make -j "$(nproc)"

Before we install it, we need to create a quick alias:

cd ~/software;
ln -s lib lib64;
cd -;

libffi for some reason likes to install to the lib64 directory, rather than our pre-existing lib directory, so creating an alias makes it so that it installs to the right place.

Updating the environment

Now that we've dealt with the dependencies, we now need to update our environment so that the compiler knows where to find them. Do that like so:

export LDFLAGS="-L$HOME/software/lib -L$HOME/software/include $LDFLAGS";
export CPPFLAGS="-I$HOME/software/include -I$HOME/software/repos/openssl/include -I$HOME/software/repos/openssl/include/openssl $CPPFLAGS"

It is also advisable to update your ~/.bashrc with these settings, as you may need to come back and recompile a different version of Python in the future.

Personally, I have a file at ~/software/ which I run with source $HOME/software/ in my ~/.bashrc file to keep things neat and tidy.

Compiling Python

Now that we have openssl and libffi compiled, we can turn our attention to Python. First, clone the cpython git repo:

git clone
cd cpython;

Then, checkout the latest tag. This essentially checks out the latest stable release:

git checkout "$(git tag | grep -ivP '[ab]|rc' | tail -n1)"

Important: If you're intention is to use tensorflow, check the Tensorflow Install page for supported Python versions. It's probable that it doesn't yet support the latest version of Python, so you might need to checkout a different tag here. For some reason, Python is really bad at propagating new versions out to the community quickly.

Before we can start the compilation process, we need to configure it. We're going for performance, so execute the configure script like so:

./configure --with-lto --enable-optimizations --with-openssl=/absolute/path/to/openssl_repo_dir

Replace /absolute/path/to/openssl_repo with the absolute path to the above openssl repo.

Now, we're ready to compile Python. Do that like so:

make -j "$(nproc)"

This will take a while, but once it's done it should have built Python successfully. For a sanity check, we can also test it like so:

make -j "$(nproc)" test

The Python binary compiled should be called simply python, and be located in the root of the git repository. Now that we've compiled it, we need to make a few tweaks to ensure that our shell uses our newly compiled version by default and not the older version from the host system. Personally, I keep my ~/bin folder under version control, so I install host-specific to ~/software, and put ~/software/bin in my PATH like so:

export PATH=$HOME/software/bin

With this in mind, we need to create some symbolic links in ~/software/bin that point to our new Python installation:

cd $HOME/software/bin;
ln -s relative/path/to/python_binary python
ln -s relative/path/to/python_binary python3
ln -s relative/path/to/python_binary python3.9

Replace relative/path/to/python_binary with the relative path tot he Python binary we compiled above.

To finish up the Python installation, we need to get pip up and running, the Python package manager. We can do this using the inbuilt ensurepip module, which can bootstrap a pip installation for us:

python -m ensurepip --user

This bootstraps pip into our local user directory. This is probably what you want, since if you try and install directly the shebang incorrectly points to the system's version of Python, which doesn't exist.

Then, update your ~/.bash_aliases and add the following:

export LD_LIBRARY_PATH=/absolute/path/to/openssl_repo_dir/lib:$LD_LIBRARY_PATH;
alias pip='python -m pip'
alias pip3='python -m pip'

...replacing /absolute/path/to/openssl_repo_dir with the path to the openssl git repo we cloned earlier.

The next stage is to use virtualenv to locally install our Python packages that we want to use for our project. This is good practice, because it keeps our dependencies locally installed to a single project, so they don't clash with different versions in other projects.

Before we can use virtualenv though, we have to install it:

pip install virtualenv

Unfortunately, Python / pip is not very clever at detecting the actual Python installation location, so in order to actually use virtualenv, we have to use a wrapper script - because the [shebang]() in the main ~/.local/bin/virtualenv entrypoint does not use /usr/bin/env to auto-detect the python binary location. Save the following to ~/software/bin (or any other location that's in your PATH ahead of ~/.local/bin):

#!/usr/bin/env bash

exec python ~/.local/bin/virtualenv "$@"

For example:

# Write the script to disk
nano ~/software/bin/virtualenv;
# chmod it to make it executable
chmod +x ~/software/bin/virtualenv

Installing Keras and tensorflow-gpu

With all that out of the way, we can finally use virtualenv to install Keras and tensorflow-gpu. Let's create a new directory and create a virtual environment to install our packages in:

mkdir tensorflow-test
cd tensorflow-test;
virtualenv "$PWD";
source bin/activate;

Now, we can install Tensorflow & Keras:

pip install tensorflow-gpu

It's worth noting here that Keras is a dependency of Tensorflow.

Tensorflow has a number of alternate package names you might want to install instead depending on your situation:

  • tensorflow: Stable tensorflow without GPU support - i.e. it runs on the CPU instead.
  • tf-nightly-gpu: Nightly tensorflow for the GPU. Useful if your version of Python is newer than the version of Python supported by Tensorflow

Once you're done in the virtual environment, exit it like this:


Phew, that was a huge amount of work! Hopefully this sheds some light on the maddenly complicated process of compiling Python from source. If you run into issues, you're welcome to comment below and I'll try to help you out - but you might be better off asking the Python community instead, as they've likely got more experience with Python than I have.

Sources and further reading

Running multiple local versions of CUDA on Ubuntu without sudo privileges

I've been playing around with Tensorflow.js for my PhD (see my PhD Update blog post series), and I had some ideas that I wanted to test out on my own that aren't really related to my PhD. In particular, I've found this blog post to be rather inspiring - where the author sets up a character-based recurrent neural network to generate text.

The idea of transcoding all those characters to numerical values and back seems like too much work and too complicated just for a quick personal project though, so my plan is to try and develop a byte-based network instead, in the hopes that I can not only teach it to generate text as in the blog post, but valid Unicode as well.

Obviously, I can't really use the University's resources ethically for this (as it's got nothing to do with my University work) - so since I got a new laptop recently with an Nvidia GeForce RTX 2060, I thought I'd try and use it for some machine learning instead.

The problem here is that Tensorflow.js requires only CUDA 10.0, but since I'm running Ubuntu 20.10 with all the latest patches installed, I have CUDA 11.1. A quick search of the apt repositories on my system reveals nothing that suggests I can install older versions of CUDA alongside the newer one, so I had to devise another plan.

I discovered some months ago (while working with Viper - my University's HPC - for my PhD) that you can actually extract - without sudo privileges - the contents of the CUDA .run installers. By then fiddling with your PATH and LD_LIBRARY_PATH environment variables, you can get any program you run to look for the CUDA libraries elsewhere instead of loading the default system libraries.

Since this is the second time I've done this, I thought I'd document the process for future reference.

First, you need to download the appropriate .run installer for the CUDA libraries. In my case I need CUDA 10.0, so I've downloaded mine from here:

Next, we need to create a new subdirectory and extract the .run file into it. Do that like so:

cd path/to/runfile_directory;
mkdir cuda-10.0
./ --extract=${PWD}/cuda-10.0/

Make sure that the current working directory contains no spaces, no preferably no other special characters either. Also, adjust the file and directory names to suit your situation.

Once done, this will have extract 3 subfiles - which also have the suffix .run. We're only interested in CUDA itself, so we only need to extract the the one that starts with cuda-linux. Do that like so (adjusting file/directory names as before):

cd cuda-10.0;
./ -noprompt -prefix=$PWD/cuda;
rm *.run;
mv cuda/* .;
rmdir cuda;

If you run ./ --help, it's actually somewhat deceptive - since there's a typo in the help text! I corrected it for this above though. Once done, this should leave the current working directory containing the CUDA libraries - that is a subdirectory next to the original .run file:

+ /path/to/some_directory/
    + ./
    + cuda-10.0/
        + version.txt
        + bin/
        + doc/
        + extras/
        + ......

Now, it's just a case of fiddling with some environment variables and launching your program of choice. You can set the appropriate environment variables like this:

export PATH="/absolute/path/to/cuda-10.0/bin:${PATH}";
if [[ ! -z "${LD_LIBRARY_PATH}" ]]; then
    export LD_LIBRARY_PATH="/absolute/path/to/cuda-10.0/lib64:${LD_LIBRARY_PATH}";
    export LD_LIBRARY_PATH="/absolute/path/to/cuda-10.0/lib64";

You could save this to a shell script (putting #!/usr/bin/env bash before it as the first line, and then running chmod +x path/to/, and then execute it in the context of the current shell for example like so:

source path/to/

Many deep learning applications that use CUDA also use CuDNN, a deep learning library provided by Nvidia for accelerating deep learning applications. The archived versions of CuDNN can be found here:

When downloading (you need an Nvidia developer account, but thankfully this is free), pay attention to the version being requested in the error messages generated by your application. Also take care to download the version of CUDA you're using, and match the CuDNN version appropriately.

When you download, select the "cuDNN Library for Linux" option. This will give you a tarball, which contains a single directory cuda. Extract the contents of this directory over the top of your CUDA directory from following my instructions above, and it should work as intended. I used my graphical archive manager for this purpose.

Avoiding accidental array mutation when iterating arrays in PHP

Pepperminty Wiki is written in PHP, and I've posted before about the search engine I've implemented for it that's powered by an inverted index. In this post, I want to talk about an anti-feature of PHP that doesn't behave the way you'd expect, and how to avoid running into the same problem I did.

To do this, let's introduce a simple example of the problem at work:

$arr = [];
for($i = 0; $i < 3; $i++) {
    $key = random_int(0, 2000);
    $arr[$key] = $i;
    echo("[init] key: $key, i: $i\n");

foreach($arr as $key => &$value) {
    // noop

echo("structure before: "); var_dump($arr);

foreach($arr as $key => $value) {
    echo("key: $key, i was $value\n");

echo("structure after: "); var_dump($arr);

The above code initialises an associative array with 3 elements. The contents might look like this:

Key Value
469 0
1777 1
1685 2

Pretty simple so far. It then iterates over it twice: once referring to the values by reference (that's what the & there is for), and the second time referring to the items by value.

You'd expect the array to be identical before and after the second foreach loop, but you'd be wrong:

Key Value
469 0
1777 1
1685 1

Wait, what? That's very odd. What's going on here? How can a foreach loop that's iterating an array by value mutate an array? To understand why, let's take a step back for a moment. Here's another snippet:


$arr = [ 1, 2, 3 ];

foreach($arr as $key => $value) {
    echo("$key: $value\n");

echo("The last value was $key: $value\n");

What do you expect to happen here? While in Javascript with a for..of loop with a let declaration both $key and $value would have fallen out of scope by now, in PHP foreach statements don't create a new scope for variables. Instead, they inherit the scope from their parent - e.g. the global scope in the above or their containing function if defined inside a function.

To this end, we can still access the values of both $key and $value in the above example even after the foreach loop has exited! Unexpected.

It gets better. Try prefixing $value with an ampersand & in the above example and re-running it - note that both $key and $value are both still defined.

This leads us to why the unexpected behaviour occurs. For some reason because of the way that PHP's foreach loop is implemented, if we re-use the same variable name for $value here in a subsequent loop it replaces the value of the last item in the array.

Shockingly enough this is actually documented behaviour (see also this bug report), though I'm somewhat confused as to how it happens on the last element in the array instead of the first.

With this in mind, to avoid this problem in future if you iterate an array by reference with a foreach loop, always remember to unset() the $value, like so:

$arr = [];
for($i = 0; $i < 3; $i++) {
    $key = random_int(0, 2000);
    $arr[$key] = $i;
    echo("[init] key: $key, i: $i\n");

foreach($arr as $key => &$value) {
    // noop
unset($key); unset($value);

echo("structure before: "); var_dump($arr);

foreach($arr as $key => $value) {
    echo("key: $key, i was $value\n");

echo("structure after: "); var_dump($arr);

By doing this, you can ensure that you don't accidentally mutate your arrays and spend weeks searching for the bug like I did.

It's language features like these that catch developers out: and being aware of the hows and whys of their occurrence can help you to avoid them next time (if anyone can explain why it's the last element in the array that's affected instead of the first, I'd love to know!).

Regardless, although I'm aware of how challenging implementing a programming language is, programming language designers should take care to avoid unexpected behaviour like this that developers don't expect.

Found this interesting? Comment below!

Sources and further reading

PhD Aside: Reading a file descriptor line-by-line from multiple Node.js processes

Phew, that's a bit of a mouthful. We're taking a short break from the cluster series of posts (though those will be back next week I hope), because I've just run into a fascinating problem, the solution to which I thought I'd share here - since I didn't find a solution elsewhere on the web.

For my PhD, I've got a big old lump of data, and it all needs preprocessing before I train an AI model (or a variant thereof, since I'm effectively doing video-to-image translation). Unfortunately, one of the preprocessing steps is really slow. And because I'll naturally be training my AI for multiple epochs, the problem is multiplied.....

The solution, of course, is to do all the preprocessing up front such that I can just read the data in and push it directly into a Tensor in the right format. However, doing this on such a large dataset would take forever if I did the items 1 by 1. The thing is that Javascript isn't inherently multithreaded. I like this quote, as it describes the situation rather well:

In Javascript everything runs in parallel... except your code

--Felix Geisendörfer

In other words, when Node.js is reading or writing to and from the network, disk, or other places it can do lots of things at the same time because it does them asynchronously. The Javascript that gets executed though is only done on a single thread though.

This is great for io-bound tasks (such as a web server), as Node.js (a Javascript runtime) can handle many requests at the same time. On a side note, this is also the reason why Nginx is more efficient than Apache (because Nginx is event based too like Javascript, unlike Apache which is thread based).

It's not so great though for CPU bound tasks, such as the one I've got on my hands. All is not lost though, because Node.js has a number of useful functions inbuilt that we can use to tackle the issue.

Firstly, Node.js has a clever forking system. By using child_process.fork(), a single Node.js process can create multiple copies of itself to act as workers:

// main.js
import child_process from 'child_process';
import os from 'os';

let workers = [];

for(let i = 0; i &lt; os.cpus().length; i++) {
// worker.js
console.log(`Hello, world from a child process!`);

Very useful! The next much more sticky problem though is how to actually preprocess the data in a performant manner. In my specific case, I'm piping the data in from a shell script that decompresses a number of gzip archives in a specific order (as of the time of typing I have yet to implement this).

Because this is a single pipe we're talking about here, the question now arises of how to allow all the child processes to access the data that's coming in from the standard input of the master process.

I've actually encountered an issue like this one before. I initially tried reading it in on the master process, and then using worker.send(message) to send it to the worker processes for processing. This didn't end up working very well, because the master process became a bottleneck as it couldn't read from the standard input and send stuff to the workers fast enough.

With this in mind, I came up with a new plan. In Node.js, when you're forking to create a worker process, you can supply it with some custom file descriptors upon initialisation. So long as it has at least IPC (inter-process communication) channel for passing messages back and forth with the .send() and .on("message", (message) => ....) method and listeners, it doesn't actually care what you do with the others.

Cue file descriptor cloning:

// main.js
import child_process from 'child_process';
import os from 'os';

let workers = [];

for(let i = 0; i 

I've highlighted the key line here (line 10 for those who can't see it). Here we tell it to clone file descriptors 0, 1, and 2 - which refer to stdin, stdout, and stderr respectively. This allows the worker processes direct access to the master process' stdin, stdout, and stderr.

With this, we can read from the same pipe with as many worker processes as we like - so long as they do so 1 at a time.

With this sorted, it gives rise to the next issue: reading line-by-line. Packages exist on npm (such as nexline, my personal favourite) to read from a stream line-by-line, but they have the unfortunate side-effect of maintaining a read buffer. While this is great for performance, it's not so great in my situation because it ends up scrambling the input! This is because said read buffer would be local to each worker process, so when the next worker along reads, it will skip a random number of bytes and start reading from the next bit along.

This means that I need to implement a custom method that reads a single line from a given file descriptor without maintaining a read buffer. I came up with this:

import fs from 'fs';

//  .....

// Global buffer to avoid unnecessary memory churn
let buffer = Buffer.alloc(4096);
function read_line_unbuffered(fd) {
    let i = 0;
    while(true) {
        let bytes_read = fs.readSync(fd, buffer, i, 1);
        if(bytes_read !== 1 || buffer[i] == 0x0A) {
            if(i == 0 && bytes_read == null) return null;
            return buffer.toString("utf-8", 0, i); // This is not inclusive, so we can abuse it to trim the \n off the end

        if(i == buffer.length) {
            let new_buffer = new Buffer(Math.ceil(buffer.length * 1.5));
            buffer = new_buffer;

I read from the given file descriptor character by character directly into a buffer. As soon as it detects a new line character (\n, or character code 0x0A), it returns the new line. If we run out of space in the buffer, then we create a new larger one, copy the old buffer's contents into it, and keep going.

I maintain a global buffer here, because this helps to avoid unnecessary memory churn. In my case, the lines I'm reading in a rather long (hence the need to clone the file descriptor in the first place), and if I didn't keep a shared buffer I'd be allocating and deallocating a new pretty large buffer every time.

This also has the nice side-effect that we keep the largest buffer we've had to use so far around for next time, avoiding the need for subsequent copies to larger and larger buffers.

Finally, we can also guarantee that it won't be a problem if we call this multiple times, because as I explained above Javascript is single-threaded, so if we call the function multiple times in quick succession each read will happen 1 after another.

With this chain of Node.js features, we can read a large amount of data from and efficiently process the content of a pipe. The trick from here is to implement a proper messaging and locking system to avoid reading from the stream at the same time, and avoid write to the standard output at the same time.

Taking this further, I ended up with this:

(Licence: Mozilla Public Licence 2.0)

This correctly ensures that only 1 worker process reads from the stream at the same time. It doesn't do anything with the result though except log a message to the console, but when I implement that I'll implement a similar messaging system to ensure that only 1 process writes to the output at once.

On that note, my data is also ordered, so I'll have to implement a complicated cache system // ordering system to ensure that I write them to the standard output in the same order I read them in. When I do implement that, I'll probably blog about that too....

The main problem I still have with this solution is that I'm reading from the input stream. I haven't done any proper testing, but I'm pretty sure that doing so will be really slow. I not sure I can avoid this though and read a few KiBs at a time, because I don't currently know of any way to put the extra characters back into the input stream.

If anyone has a solution to that that increases performance, I'd love to know. Leave a comment below!

Exporting an SQLite3 database to a directory of CSV files

Recently I was working with a dataset I acquired for my PhD, and to pre-process said dataset into something more sensible I imported it into an SQLite3 database. Once I was finished processing it, I then needed to export it again into regular CSV files so that I could do other things, such as plot it with GNUPlot, or import it into InfluxDB (more on InfluxDB in a later post).

With the help of Stack Overflow and the SQLite3 man page, this didn't prove to be too difficult. To export a single SQLite3 table to a CSV file, you do this:

sqlite3 -bail -header -csv "bobsrockets.sqlite3" "SELECT * FROM 'table_name';" >"path/to/output_file.csv";

This is great for a single table, but what if we want to export all the tables? Well, we can iterate over all the tables in an SQLite3 database like so:

while read table_name; do
    echo "Exporting ${table_name}";

    # Do stuff
done < <(sqlite3 "bobsrockets.sqlite3" ".tables");

If we combine this with the previous snippet, we can export all the tables like so:

while read table_name; do
    log "Exporting ${table_name}";

    sqlite3 -bail -header -csv "bobsrockets.sqlite3" "SELECT * FROM '${table_name}';" >"${table_name}.csv"; 
done < <(sqlite3 "bobsrockets.sqlite3" ".tables");

Cool! We can make it even better with some simple improvements though:

  1. It's a pain to have to edit the script every time we want to change the database we're exporting
  2. It would be nice to be able to specify the output directory without editing the script too

Satisfying both of these points isn't particularly challenging. 10 minutes of fiddling got this the final completed script:

#!/usr/bin/env bash
set -e; # Don't allow errors

show_usage() {
    echo -e "Usage:";
    echo -e "\t./ {db_filename} {output_dir}";

log() {
    echo -e "[ $(date +"%F %T") ] ${@}";



if [ -z "${db_filename}" ]; then
    echo "Error: No database filename specified.";
    show_usage; exit;
if [ -z "${output_dir}" ]; then
    echo "Error: No output directory specified.";
    show_usage; exit;

if [ ! -d "${output_dir}" ]; then
    mkdir -p "${output_dir}"; 

log "Output directory is ${output_dir}";

while read table_name; do
    log "Exporting ${table_name}";

    sqlite3 -bail -header -csv "${db_filename}" "SELECT * FROM '${table_name}';" >"${output_dir}/${table_name}.csv";    
done < <(sqlite3 "${db_filename}" ".tables");

log "Complete!";

Found this useful? Comment below!

Summer Project Part 5: When is a function not a function?

Another post! Looks like I'm on a roll in this series :P

In the last post, I looked at the box I designed that was ready for 3D printing. That process has now been completed, and I'm now in possession of an (almost) luminous orange and pink box that could almost glow in the dark.......

I also looked at the libraries that I'll be using and how to manage the (rather limited) amount of memory available in the AVR microprocessor.

Since last time, I've somehow managed to shave a further 6% program space off (though I'm not sure how I've done it), so most recently I've been implementing 2 additional features:

  • An additional layer of AES encryption, to prevent The Things Network for having access to the decrypted data
  • GPS delta checking (as I'm calling it), to avoid sending multiple messages when the device hasn't moved

After all was said and done, I'm now at 97% program space and 47% global variable RAM usage.

To implement the additional AES encryption layer, I abused LMiC's IDEETRON AES-128 (ECB mode) implementation, which is stored in src/aes/ideetron/AES-128_V10.cpp.

It's worth noting here that if you're doing crypto yourself, it's seriously not recommended that you use ECB mode. Please don't. The only reason that I used it here is because I already had an implementation to hand that was being compiled into my program, I didn't have the program space to add another one, and my messages all start with a random 32-bit unsigned integer that will provide a measure of protection against collision attacks and other nastiness.

Specifically, it's the method with this signature:

void lmic_aes_encrypt(unsigned char *Data, unsigned char *Key);

Since this is an internal LMiC function declared in a .cpp source file with no obvious header file twin, I needed to declare the prototype in my source code as above - as the method will only be discovered by the compiler when linking the object files together (see this page for more information about the C++ compilation process. While it's for regular Linux executable binaries, it still applies here since the Arduino toolchain spits out a very similar binary that's uploaded to the microprocessor via a programmer).

However, once I'd sorted out all the typing issues, I slammed into this error:

/tmp/ccOLIbBm.ltrans0.ltrans.o: In function `transmit_send':
sketch/transmission.cpp:89: undefined reference to `lmic_aes_encrypt(unsigned char*, unsigned char*)'
collect2: error: ld returned 1 exit status

Very strange. What's going on here? I declared that method via a prototype, didn't I?

Of course, it's not quite that simple. The thing is, the file I mentioned above isn't the first place that a prototype for that method is defined in LMiC. It's actually in other.c, line 35 as a C function. Since C and C++ (for all their similarities) are decidedly different, apparently to call a C function in C++ code you need to declare the function prototype as extern "C", like this:

extern "C" void lmic_aes_encrypt(unsigned char *Data, unsigned char *Key);

This cleaned the error right up. Turns out that even if a function body is defined in C++, what matters is where the original prototype is declared.

I'm hoping to release the source code, but I need to have a discussion with my supervisor about that at the end of the project.

Found this interesting? Come across some equally nasty bugs? Comment below!

Own your code, Part 2: The curious case of the unreliable webhook

In the last post, I talked about how to setup your own Git server with Gitea. In this one, I'm going to take bit of a different tack - and talk about one of the really annoying problems I ran into when setting up my continuous integration server, Laminar CI.

Since I wanted to run the continuous integration server on a different machine to the Gitea server itself, I needed a way for the Gitea server to talk to the CI server. The natural choice here is, of course, a Webhook-based system.

After installing and configuring Webhook on the CI server, I set to work writing a webhook receiver shell script (more on this in a future post!). Unfortunately, it turned out that that Gitea didn't like sending to my CI server very much:

A ton of failed attempts at sending a webhook to the CI server

Whether it succeeded or not was random. If I hit the "Test Delivery" button enough times, it would eventually go through. My first thought was to bring up the Gitea server logs to see if it would give any additional information. It claimed that there was an i/o timeout communicating with the CI server:

Delivery: Post read tcp>x.y.z.w:443: i/o timeout

Interesting, but not particularly helpful. If that's the case, then I should be able to get the same error with curl on the Gitea server, right?


.....wrong. It worked flawlessly. Every time.

Not to be beaten by such an annoying issue, I moved on to my next suspicion. Since my CI server is unfortunately behind NAT, I checked the NAT rules on the router in front of it to ensure that it was being exposed correctly.

Unfortunately, I couldn't find anything wrong here either! By this point, it was starting to get really rather odd. As a sanity check, I decided to check the server logs on the CI server, since I'm running Webhook behind Nginx (as a reverse-proxy): - - [04/Dec/2018:20:48:05 +0000] "POST /hooks/laminar-config-check HTTP/1.1" 408 0 "-" "GiteaServer"

Now that's weird. Nginx has recorded a HTTP 408 error. Looking is up reveals that it's a Request Timeout error, which has the following definition:

The server did not receive a complete request message within the time that it was prepared to wait.

Wait what? Sounds to me like there's an argument going on between the 2 servers here - in which each server is claiming that the other didn't send a complete request or response.

At this point, I blamed this on a faulty HTTP implementation in Gitea, and opened an issue.

As a workaround, I ended up configuring Laminar to use a Unix socket on disk (as opposed to an abstract socket), forwarding it over SSH, and using a git hook to interact with it instead (more on how I managed this in a future post. There's a ton of shell scripting that I need to talk about first).

This isn't the end of this tail though! A month or two after I opened the issue, I wound up in the situation whereby I wanted to connect a GitHub repository to my CI server. Since I don't have shell access on, I had to use the webhook.

When I did though, I got a nasty shock: The webhook deliveries exhibited the exact same random failures as I saw with the Gitea webhook. If I'd verified the Webhook server and cleared Gitea's HTTP implementation's name, then what else could be causing the problem?

At this point, I can only begin to speculate what the issue is. Personally, I suspect that it's a bug in the port-forwarding logic of my router, whereby it drops the first packet from a new IP address while it sets up a new NAT session to forward the packets to the CI server or something - so subsequent requests will go through fine, so long as they are sent within the NAT session timeout and from the same IP. If you've got a better idea, please comment below!

Of course, I really wanted to get the GitHub repository connected to my CI server, and if the only way I could do this was with a webhook, it was time for some request-wrangling.

My solution: A PHP proxy script running on the same server as the Gitea server (since it has a PHP-enabled web server set up already). If said script eats the request and emits a 202 Accepted immediately, then it can continue trying to get a hold of the webhook on the CI server 'till the cows come home - and GitHub will never know! Genius.

PHP-FPM (the fastcgi process manager; great alongside Nginx) makes this possible with the fastcgi_finish_request() method, which both flushes the buffer and ends the request to the client, but doesn't kill the PHP script - allowing for further processing to take place without the client having to wait.

Extreme caution must be taken with this approach however, as it can easily lead to a situation where the all the PHP-FPM processes are busy waiting on replies from the CI server, leaving no room for other requests to be fulfilled and a big messy pile-up in the queue forming behind them.

Warnings aside, here's what I came up with:


$settings = [
    "target_url" => "",
    "response_message" => "Processing laminar job proxy request.",
    "retries" => 3,
    "attempt_timeout" => 2 // in seconds, for a single attempt

$headers = "host:\r\n";
foreach(getallheaders() as $key => $value) {
    if(strtolower($key) == "host") continue;
    $headers .= "$key: $value\r\n";
$headers .= "\r\n";

$request_content = file_get_contents("php://input");

// --------------------------------------------

header("content-type: text/plain");
header("content-length: " . strlen($settings["response_message"]));


// --------------------------------------------

function log_message($msg) {
    file_put_contents("ci-requests.log", $msg, FILE_APPEND);

for($i = 0; $i < $settings["retries"]; $i++) {
    $start = microtime(true);

    $context = stream_context_create([
        "http" => [
            "header" => $headers,
            "method" => "POST",
            "content" => $request_content,
            "timeout" => $settings["attempt_timeout"]

    $result = file_get_contents($settings["target_url"], false, $context);

    if($result !== false) {
        log_message("[" . date("r") . "] Queued laminar job in " . (microtime(true) - $start_time)*1000 . "ms");

    log_message("[" . date("r") . "] Failed to laminar job after " . (microtime(true) - $start_time)*1000 . "ms.");

I've named it autowrangler.php. A few things of note here:

  • php://input is a special virtual file that's mapped internally by PHP to the client's request. By eating it with file_get_contents(), we can get the entire request body that the client has sent to us, so that we can forward it on to the CI server.
  • getallheaders() lets us get a hold of all the headers sent to us by the client for later forwarding
  • I use log_message() to keep a log of the successes and failures in a log file. So far I've got a ~32% failure rate, but never more than 1 failure in a row - giving some credit to my earlier theory I talked about above.

This ends the tale of the recalcitrant and unreliable webhook. Hopefully you've found this an interesting read. In future posts, I want to look at how I configured Webhook, the inner workings of the git hook I mentioned above, and the collection of shell scripts I've cooked to that make my CI server tick in a way that makes it easy to add new projects quickly.

Found this interesting? Run into this issue yourself? Found a better solution workaround? Comment below!

Note to self: Don't reboot the server at midnight....

You may (or may not) have noticed a small window of ~3/4 hour the other day when my website was offline. I thought I'd post about the problem, the solution, and what I'll try to avoid next time.

The problem occurred when I was about to head to bed late at night. I decided to quickly reboot the server to reboot into a new kernel to activate some security updates.

I have this habit of leaving a ping -O hostname running in a separate terminal to monitor the progress of the reboot. I'm glad I did so this time, as I noticed that it took a while to go down for rebooting. Then it took an unusually long time to come up again, and when it did, I couldn't SSH in again!

After a quick check, the website was down too - so it was time to do something about it and fast. Thankfully, I already knew what was wrong - it was just a case of fixing it.....

In a Linux system, there's a file called /etc/fstab that defines all the file systems that are to be mounted. While this sounds a bit counter-intuitive (since how does it know to mount the filesystem that the file itself described how to mount?), it's built into the initial ramdisk (also this) if I understand it correctly.

There are many different types of file system in Linux. Common ones include ext4 (the latest Linux filesystem), nfs (Network FileSystem), sshfs (for mounting remote filesystems over SSH), davfs (WebDav shares), and more.

Problems start to arise when some of the filesystems defined in /etc/fstab don't mount correctly. WebDav filesystems are notorious for this, I've found - so they generally need to have the noauto flag attached, like this:   /path/to/mount/point    davfs   noauto,user,rw,uid=1000,gid=1000    0   0

Unfortunately, I forgot to do this with the webdav filesystem I added a few weeks ago, causing the whole problem in the first place.

The unfortunate issue was that since it couldn't mount the filesystems, jt couldn't start the SSH server. If it couldn't start the SSH server, I couldn't get in to fix it!

Kimsufi rescue mode to the, erm rescue! It turned out that my provider, KimSufi, have a rescue mode system built-in for just this sort of occasion. At the click of a few buttons, I could reboot my server into a temporary rescue environment with a random SSH password.

Therein I could mount the OS file system, edit /etc/fstab, and reboot into normal mode. Sorted!

Just a note for future reference: I recommend using the rescuepro rescue mode OS, and not either of the FreeBSD options. I had issues trying to mount the OS disk with them - I kept getting an Invalid argumennt error. I was probably doing something wrong, but at the time I didn't really want to waste tones of time trying to figure that out in an unfamiliar OS.

Hopefully there isn't a next time. I'm certainly going to avoid auto webdav mounts, instead spawning a subprocess to mount them in the background after booting is complete.

I'm also going to avoid rebooting my server when I don't have time to deal with anyn potential fallout....

Art by Mythdael