Starbeamrainbowlabs

Stardust
Blog


Archive


Mailing List Articles Atom Feed Comments Atom Feed Twitter Reddit Facebook

Tag Cloud

3d 3d printing account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blog bookmarklet booting bug hunting c sharp c++ challenge chrome os cluster code codepen coding conundrums coding conundrums evolved command line compilers compiling compression containerisation css dailyprogrammer data analysis debugging demystification distributed computing docker documentation downtime electronics email embedded systems encryption es6 features ethics event experiment external first impressions future game github github gist gitlab graphics hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs learning library linux lora low level lua maintenance manjaro minetest network networking nibriboard node.js operating systems own your code pepperminty wiki performance phd photos php pixelbot portable privacy problem solving programming problems project projects prolog protocol protocols pseudo 3d python reddit redis reference releases rendering resource review rust searching secrets security series list server software sorting source code control statistics storage svg talks technical terminal textures thoughts three thing game three.js tool tutorial tutorials twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 xmpp xslt

Installing Python, Keras, and Tensorflow from source

I found myself in the interesting position recently of needing to compile Python from source. The reasoning behind this is complicated, but it boils down to a need to use Python with Tensorflow / Keras for some natural language processing AI, as Tensorflow.js isn't going to cut it for the next stage of my PhD.

The target upon which I'm aiming to be running things currently is Viper, my University's high-performance computer (HPC). Unfortunately, the version of Python on said HPC is rather old, which necessitated obtaining a later version. Since I obviously don't have sudo permissions on Viper, I couldn't use the default system package manager. Incredibly, pre-compiled Python binaries are not distributed for Linux either, which meant that I ended up compiling from source.

I am going to be assuming that you have a directory at $HOME/software in which we will be working. In there, there should be a number of subdirectories:

  • bin: For binaries, already added to your PATH
  • lib: For library files - we'll be configuring this correctly in this guide
  • repos: For git repositories we clone

Make sure you have your snacks - this was a long ride to figure out and write - and it's an equally long ride to follow. I recommend reading this all the way through before actually executing anything to get an overall idea as to the process you'll be following and the assumptions I've made to keep this post a reasonable length.

Setting up

Before we begin, we need some dependencies:

  • gcc - The compiler
  • git - For checking out the cpython git repository
  • readline - An optional dependency of cpython (presumably for the REPL)

On Viper, we can load these like so:

module load utilities/multi
module load gcc/10.2.0
module load readline/7.0

Compiling openssl

We also need to clone the openssl git repo and build it from source:

cd ~/software/repos
git clone git://git.openssl.org/openssl.git;    # Clone the git repo
cd openssl;                                     # cd into it
git checkout OpenSSL_1_1_1-stable;              # Checkout the latest stable branch (do git branch -a to list all branches; Python will complain at you during build if you choose the wrong one and tell you what versions it supports)
./config;                                       # Configure openssl ready for compilation
make -j "$(nproc)"                              # Build openssl

With openssl compiled, we need to copy the resulting binaries to our ~/software/lib directory:

cp lib*.so* ~/software/lib;
# We're done, cd back to the parent directory
cd ..;

To finish up openssl, we need to update some environment variables to let the C++ compiler and linker know about it, but we'll talk about those after dealing with another dependency that Python requires.

Compiling libffi

libffi is another dependency of Python that's needed if you want to use Tensorflow. To start, go to the libgffi GitHub releases page in your web browser, and copy the URL for the latest release file. It should look something like this:

https://github.com/libffi/libffi/releases/download/v3.3/libffi-3.3.tar.gz

Then, download it to the target system:

cd ~/software/lib
curl -OL URL_HERE

Note that we do it this way, because otherwise we'd have to run the autogen.sh script which requires yet more dependencies that you're unlikely to have installed.

Then extract it and delete the tar.gz file:

tar -xzf libffi-3.3.tar.gz
rm libffi-3.3.tar.gz

Now, we can configure and compile it:

./configure --prefix=$HOME/software
make -j "$(nproc)"

Before we install it, we need to create a quick alias:

cd ~/software;
ln -s lib lib64;
cd -;

libffi for some reason likes to install to the lib64 directory, rather than our pre-existing lib directory, so creating an alias makes it so that it installs to the right place.

Updating the environment

Now that we've dealt with the dependencies, we now need to update our environment so that the compiler knows where to find them. Do that like so:

export LD_LIBRARY_PATH="$HOME/software/lib:${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}";
export LDFLAGS="-L$HOME/software/lib -L$HOME/software/include $LDFLAGS";
export CPPFLAGS="-I$HOME/software/include -I$HOME/software/repos/openssl/include -I$HOME/software/repos/openssl/include/openssl $CPPFLAGS"

It is also advisable to update your ~/.bashrc with these settings, as you may need to come back and recompile a different version of Python in the future.

Personally, I have a file at ~/software/setup.sh which I run with source $HOME/software/setuop.sh in my ~/.bashrc file to keep things neat and tidy.

Compiling Python

Now that we have openssl and libffi compiled, we can turn our attention to Python. First, clone the cpython git repo:

git clone https://github.com/python/cpython.git
cd cpython;

Then, checkout the latest tag. This essentially checks out the latest stable release:

git checkout "$(git tag | grep -ivP '[ab]|rc' | tail -n1)"

Important: If you're intention is to use tensorflow, check the Tensorflow Install page for supported Python versions. It's probable that it doesn't yet support the latest version of Python, so you might need to checkout a different tag here. For some reason, Python is really bad at propagating new versions out to the community quickly.

Before we can start the compilation process, we need to configure it. We're going for performance, so execute the configure script like so:

./configure --with-lto --enable-optimizations --with-openssl=/absolute/path/to/openssl_repo_dir

Replace /absolute/path/to/openssl_repo with the absolute path to the above openssl repo.

Now, we're ready to compile Python. Do that like so:

make -j "$(nproc)"

This will take a while, but once it's done it should have built Python successfully. For a sanity check, we can also test it like so:

make -j "$(nproc)" test

The Python binary compiled should be called simply python, and be located in the root of the git repository. Now that we've compiled it, we need to make a few tweaks to ensure that our shell uses our newly compiled version by default and not the older version from the host system. Personally, I keep my ~/bin folder under version control, so I install host-specific to ~/software, and put ~/software/bin in my PATH like so:

export PATH=$HOME/software/bin

With this in mind, we need to create some symbolic links in ~/software/bin that point to our new Python installation:

cd $HOME/software/bin;
ln -s relative/path/to/python_binary python
ln -s relative/path/to/python_binary python3
ln -s relative/path/to/python_binary python3.9

Replace relative/path/to/python_binary with the relative path tot he Python binary we compiled above.

To finish up the Python installation, we need to get pip up and running, the Python package manager. We can do this using the inbuilt ensurepip module, which can bootstrap a pip installation for us:

python -m ensurepip --user

This bootstraps pip into our local user directory. This is probably what you want, since if you try and install directly the shebang incorrectly points to the system's version of Python, which doesn't exist.

Then, update your ~/.bash_aliases and add the following:

export LD_LIBRARY_PATH=/absolute/path/to/openssl_repo_dir/lib:$LD_LIBRARY_PATH;
alias pip='python -m pip'
alias pip3='python -m pip'

...replacing /absolute/path/to/openssl_repo_dir with the path to the openssl git repo we cloned earlier.

The next stage is to use virtualenv to locally install our Python packages that we want to use for our project. This is good practice, because it keeps our dependencies locally installed to a single project, so they don't clash with different versions in other projects.

Before we can use virtualenv though, we have to install it:

pip install virtualenv

Unfortunately, Python / pip is not very clever at detecting the actual Python installation location, so in order to actually use virtualenv, we have to use a wrapper script - because the [shebang]() in the main ~/.local/bin/virtualenv entrypoint does not use /usr/bin/env to auto-detect the python binary location. Save the following to ~/software/bin (or any other location that's in your PATH ahead of ~/.local/bin):

#!/usr/bin/env bash

exec python ~/.local/bin/virtualenv "$@"

For example:

# Write the script to disk
nano ~/software/bin/virtualenv;
# chmod it to make it executable
chmod +x ~/software/bin/virtualenv

Installing Keras and tensorflow-gpu

With all that out of the way, we can finally use virtualenv to install Keras and tensorflow-gpu. Let's create a new directory and create a virtual environment to install our packages in:

mkdir tensorflow-test
cd tensorflow-test;
virtualenv "$PWD";
source bin/activate;

Now, we can install Tensorflow & Keras:

pip install tensorflow-gpu

It's worth noting here that Keras is a dependency of Tensorflow.

Tensorflow has a number of alternate package names you might want to install instead depending on your situation:

  • tensorflow: Stable tensorflow without GPU support - i.e. it runs on the CPU instead.
  • tf-nightly-gpu: Nightly tensorflow for the GPU. Useful if your version of Python is newer than the version of Python supported by Tensorflow

Once you're done in the virtual environment, exit it like this:

deactivate

Phew, that was a huge amount of work! Hopefully this sheds some light on the maddenly complicated process of compiling Python from source. If you run into issues, you're welcome to comment below and I'll try to help you out - but you might be better off asking the Python community instead, as they've likely got more experience with Python than I have.

Sources and further reading

simple-dash fork: now with directory support!

A while back (I still have all sorts of projects I've forgotten to blog about - with many more to come), I forked an excellent project called simple-dash, which is a web dashboard. You can configure it to display 1 or more links, and it presents them nice and cleanly in the middle of the page.

I don't make forks lightly, but in this case I liked the project a lot - but I wanted to add enough features that I felt that I might be taking it in a different direction than the original project. The original project also hasn't been touched in 2+ years, and the author hasn't had any contributions on GitHub in that time either - so think it's fair to say that it's unlikely that any pull request I open wouldn't be looked at either (if the original author is reading this, I'm happy to open one!).

Anyway, before I continue too far, here's a screenshot of my improvements in action:

A screenshot of my improvements - explained in more detail below.

I use simple-dash in multiple places to provide a dashboard of links to the various services that I run so I both don't lose them and, in some cases, other people in my family can easily access said services.

I added a number of features here. The first is invisible, but I completely re-implemented the layout to use the CSS Grid (see also: a, b). If you've played with CSS before but aren't yet aware of the CSS grid yet - I can thoroughly recommend you take a moment to investigate - it will blow you away and solve all your layout problems all at the same time! In short, it's like a 2d version of the flexbox.

Since the original has full mobile support, I continue that trend in the rewrite with some CSS media queries to change the number of items per row based on the width of your screen.

The other invisible change is that I changed the language the configuration file is written in to TOML, which is a much more friendly language to write configuration files in.

Anyway, in terms of more visible changes, I also added the ability to set a background image, as well as the default random triangles background. Icons also got the same treatment - gaining the ability to display an image instead of a Font Awesome icon (I haven't actually used Font Awesome before, so this was an interesting experience - even if it was already setup in this project).

Last but certainly not least, I added the ability group pages into folders. Here's a screenshot of what the contents of that folder in the top left looks like when opened:

simple-dash with a folder open

You can't see it here, but it's even animated! Link to a demo at the end of this post.

There were a number of different challenges to overcome to get this working right actually - it was not trivial at all. There are 2 components to it: The CSS to style it, and the Javascript to fiddle the class list on the folder itself to add / remove the active class so that I could distinguish between open and closed folders in the CSS, and also prevent the click event from propagating through to the <a href="https://example.com/">links</a> links when the folder is closed.

Thinking about it, it may be possible use a clever pointer-events: none to avoid the Javascript.

The CSS does the heavy lifting here though. For inactive folders, I use a CSS grid with overflow: none to display the 1st 4 icons in a preview. When the folder becomes active, position: fixed breaks it out of the layout of the rest of the page (sadly leaving a placeholder behind would require an additional html element), and the content reflows to use the same CSS as the main grid of tiles.

Through some CSS grid wizardry (you can do anything with CSS grid, it's amazing) and a container element, I can even fade out the rest of the page while the folder is open.

Clicking on the items in a folder when the folder is open takes you to their destination as usual, while clicking anywhere else closes the folder again.

I've got a demo running over here if you'd like to play around with it:

sbrl's simple-dash fork demo

The background is set to a random image from Unsplash. It loads fine for me, but sometimes it takes a moment.

If this looks like something, you'd like to use for yourself, my fork is open-source! Check it out here:

sbrl/simple-dash on GitHub

You can find instructions on how to set it up for yourself in the README. You'll need npm to install dependencies - this should come bundled with Node.js. You can also find a lovingly-commented example configuration file here:

config.sample.toml

If you have any difficulties setting it up, want to request a feature, or even (gasp!) report a bug, please open an issue. While I do monitor the comments here on this blog, GitHub issues are a much better place to track bugs and feature requests.

applause-cli: A Node.js CLI handling library

Continuing in the theme of things I've forgotten to talk about, I'd like to post about another package I've released a little while ago. I've been building a number of command line interfaces for my PhD, so I thought it would be best to use a library for this function.

I found [clap](), but it didn't quite do what I wanted - so I wrote my own inspired by it. Soon enough I needed to use the code in several different projects, so I abstracted the logic for it out and called it applause-cli, which you can now find on npm.

It has no dependencies, and it allows you do define a set of arguments and have it parsed out the values from a given input array of items automatically. Here's an example of how it works:

import Program from 'applause-cli';

let program = new Program("path/to/package.json");
program.argument("food", "Specifies the food to find.", "apple")
    .argument("count", "The number of items to find", 1, "number");

program.parse(process.argv.slice(2)); // Might return { food: "banana", count: 6 }

I even have automated documentation generated with documentation and uploaded to my website via Continuous Integration: https://starbeamrainbowlabs.com/code/applause-cli/. I've worked pretty hard on the documentation for this library actually - it even has integrated examples to show you how to use each function!

The library can also automatically generate help output from the provided information when the --help argument is detected too - though I have yet to improve the output if a subcommand is called (e.g. mycommand dostuff --help) - this is on my todo list :-)

Here's an example of the help text it automatically generates:

If this looks like something you'd be interested in using, I recommend checking out the npm package here: https://www.npmjs.com/package/applause-cli

For the curious, applause-cli is open-source under the MPL-2.0 licence. Find the code here: https://github.com/sbrl/applause-cli.

Saving power in Linux Systems

Hey there! It's an impromptu blog post. Originally I wrote this in response to this Reddit post, but it got rather longer than I anticipated and I ended up expanding on it just a teensy bit more and turning into this blog post.

Saving power in a Linux system can be necessary for a number of reasons, from reducing one's electricity bill to extending battery life.

There are a number of different factors to consider to reduce power usage, which I'll be talking about in this blog post. I will be assuming a headless Linux server for the purposes of this blog post, but these suggestions can be applicable to other systems too (if there's the demand I may write a follow up specifically about Arduino and ESP-based systems, as there are a number of tricks that can be applied there that don't work the same way for a full Linux system).

Of course, power usage is highly situationally dependant, and it's all about trade-offs: less convenience, increased complexity, and so on. The suggestions below are suggestions and rules of thumb that may or may not be applicable to your specific situation.

Hardware: Older hardware is less power efficient than newer hardware. So while using that 10yr old desktop as a server sounds like a great idea to reduce upfront costs, if your electricity is expensive it might be more cost-effective to buy a newer machine such as an Intel NUC or Raspberry Pi.

Even within the realms of Raspberry Pis, not every Raspberry Pi is created equal. If you need a little low-power outpost for counting cows in field with LoRa, then something like a Raspberry Pi Zero as a base might be more suitable than a fully Raspberry Pi 4B+ for example.

CPU architecture: Different CPU architectures have different performance / watt ratios. For example. AMD CPUs are - on the whole - more efficient than Intel CPUs as of 2021. What really matters here is the manufacturing size and density - e.g. a 7nm chip will be more power efficient than a 12nm or 14nm one.

ARM CPUs (e.g. Raspberry Pi and friends) are more efficient again (though the rule-of-thumb about manufacturing size & density does not hold true here). If you haven't yet bought any hardware for your next project, this is definitely worth considering.

Auto-on: Depending on your task, you might only need your device on for a short time each day. Most BIOSes will have a setting to automatically power on at a set time, so you could do this and then set the server to automatically power off when it has completed it's task.

Another consideration is automatically entering standby. This can be done with the rtcwake command. While not as power efficient as turning completely off, it should still net measurable power savings.

Firmware: Tools such as powertop (sudo apt install powertop on Debian-based systems) can help apply a number of optimisations. In the case of powertop, don't forget to add the optimisations you choose to your /etc/rc.local to auto-apply them on boot. Example things that you can optimise using powertop include:

  • Runtime power management for WiFi / Bluetooth
  • SATA power management

Disk activity: Again situationally dependent, but if you have a lot of disks attached to your server, reducing writes can have a positive impact on power usage. Tuning this is generally done with the hdparm command (sudo apt install hdparm). See this Unix Stack Exchange question, and also this Ask Ubuntu answer for more details on how this is done.

Software: Different applications will use different amounts of system resources, which in turn will consume different amounts of power. For example, GitLab is rather resource inefficient, but Gitea is much more efficient with resources. Objectively evaluating multiple possible candidate programs that solve your given problem is important if power savings are critical to your use-case.

Measuring resource usage over time (e.g. checking the CPU Time column in htop for example) is probably the most effective way of measuring this, though you'd want to devise an experiment where you run each candidate program in turn for a defined length of time and measure a given set of metrics - e.g. CPU time.

Measurement: Speaking of metrics, it's worth noting that while all these suggestions are interesting, you should absolutely measure the real power savings you get from implementing these suggestions. Some will give you more of a net gain for less work than others.

The best way I know of to do this is to use a power monitor like this one that I've bought previously and plugging your device into it, and then coming back a given amount of time later to record the total number of watt hours of electricity used. For USB devices such as the Raspberry Pi, if I remember rightly I purchased this device a while back, and it works rather well.

This will definitively tell you whether implementing a given measure will net you a significant decrease in power usage or not, which you can then weight against the effort required.

Users and access control in the Mosquitto MQTT server

A while ago, I blogged about how to setup an MQTT server with Mosquitto. In this one, I want to talk about how to setup multiple user accounts and how to implement access control.

In this post, I'll assume that you've already followed my previous post to which I've linked above.

User accounts

User accounts are a great security measure, as they prevent anyone without a password from accessing your MQTT server. Thankfully, they are pretty easy to do too - you just need a user / password file, and a directive in the main mosquitto.conf file to get it to read from it.

First, let's create a new users file:

sudo touch /etc/mosquitto/mosquitto_users
sudo chown mosquitto:mosquitto /etc/mosquitto/mosquitto_users
sudo chmod 0640 /etc/mosquitto/mosquitto_users

Then you can create new users like this:

sudo mosquitto_passwd /etc/mosquitto/mosquitto_users new_username_1

...replacing new_username_1 with the username of the new account you want to create. Upon executing the above, it will prompt you to enter a new password. Personally I use Keepass2 for this purpose, but you can create good passwords on the command line directly too:

dd if=/dev/urandom bs=1 count=20 | base64 | tr -d '+/='

Now that we have a users file, we can tell mosquitto about it. Add the following to your /etc/mosquitto/mosquitto.conf file:

# Require a username / password to connect
allow_anonymous false
# ....which are stored in the following file
password_file /etc/mosquitto/mosquitto_users

This disables anonymous access, and tells mosquitto where the the username / password file.

In future if you want to delete a user, do that like this:

sudo mosquitto_passwd /etc/mosquitto/mosquitto_users -D new_username_1

Access control

Access control is similar to user accounts. First, we need an access control file - which describes who can access what - and then we need a directive in the mosquitto.conf file to tell Mosquitto about it. Let's start with that access control file. Mine is located at /etc/mosquitto/mosquitto_acls.

# Directives here affect anonymous users, but we've disabled anonymous access

user username_here
topic readwrite foo/#

user bob
topic read rockets/status

There are 2 parts to the ACL file. First, the user directive sets the current user for which any following topic directives apply.

The topic directive allows the current user to read, write, or readwrite (both at the same time) a given topic. MQTT as a protocol is built on the idea of publishing (writing) to or subscribing (reading from) topics. Mosquitto assumes that a user has no access at all unless 1 or more topic directives are present to allow access.

The topic directive is comprised of 3 parts. First, the word topic is the name of the directive.

Next, any 1 of the following words declares what kind of access is being granted:

  • read: Read-only access
  • write: Write-only access
  • readwrite: Both read and write access

Finally, the name of the topic that is being affected by the access rule is given. This may include a hash symbol (#) as a wildcard. For example, rockets/status would affect only that specific topic, but space/# would affect all topics that start with space/.

Here are some more examples:

# Allow read access to "my_app/news"
topic read my_app/news

# Allow write access to "rockets/status"
topic write rockets/status

# Allow read and write access to everything under "another_app/"
topic readwrite another_app/#

Once you've created your ACL file, add this to your mosquitto.conf (being careful to put it before any listener directives if you have TLS / MQTTS support enabled):

acl_file /etc/mosquitto/mosquitto_acls

This will tell Mosquitto about your new access control file.

Reloading changes

After making changes above, you'll want to tell Mosquitto to reload the configuration file. Do that like this:

sudo systemctl reload mosquitto-mqtt.service

If your systemd service file doesn't support reloading, then a restart will do. Alternatively, add this to your systemd service file to the [Service] section:

ExecReload=/bin/kill -s HUP $MAINPID

Conclusion

In this tutorially-kinda post, I've talked through how to manage user accounts for the Mosquitto MQTT. I've also talked about how to enable and manage access control lists too.

This should make your MQTT server more secure. The other thing you can do to make your MQTT server more secure is enable TLS encryption. I'm going to hold off on showing that in this file because I'm still unsure about the best way of doing it (getting Mosquitto to do it vs using Nginx as a reverse proxy - I'm currently testing the former), but if there's the demand I'll post about it in the future.

Rendering Time plan / Gantt charts: hourgraph

I have a number of tools and other programs I've implemented, but forgotten to blog about here - hourgraph is one such tool I stumbled across today again. Originally I implemented it for my PhD panel 1 topic project analysis report, as I realised that not only have I manually created a number of these, but I'm going to have to create a bunch more in the future, but I open-sourced it as I usually do with most of the things I write in the hopes that someone else will find it useful.

I've published it on NPM, so you can install it like this:

npm install --global hourgraph

You'll need Node.js installed, and Linux users will need to prefix the above with sudo.

The program takes in a TOML definition file. Here's an example:

width = 1500
height = 480
title = "Apples"

[[task]]
name = "Pick apples"
start = 0
duration = 3

[[task]]
name = "Make apple juice"
start = 2
duration = 2

[[task]]
name = "Enjoy!"
start = 4
duration = 4
colour = "hsl(46, 90%, 60%)"
ghost_colour = "hsla(46, 90%, 60%, 0.1)"

The full set of options are available in the default config file, which is loaded in to fill in any gaps of things you haven't specified in your custom file.

Comprehensive usage instructions are found in the README, but you can render a new time plan chart thingy like this:

hourgraph --input path/to/input.toml --output path/to/output.toml

The above renders to this:

Hourgraph output

Personally, I find it's much easier to create charts like this by defining them in a simple text file that is then rendered into the actual thing. That way, I don't have to fiddle with the layout myself - it all comes out in the wash automatically.

For those interested in the code, it can be found here: https://github.com/sbrl/hourgraph

Skyliner: Automated text document outlining

When editing large documents, it's often helpful to have a hierarchical "navigation view" of sorts. For a text document like the Markdown that I'm typing now for this blog post, it would consist of a list of headings in the document. For a Javascript or C♯ file, it might consist of classes, functions, and methods.

Either way, it's a helpful thing to have - but for some ridiculous reason as far as I know a generic text document outlining tool doesn't exist.

While GitHub's Atom (my code editor of choice) has an atom-ide has an outline view that works great, it only supports a limited number of languages (you have to have a plugin installed for every language), and sometimes it hangs and takes like 30 seconds plus to generate the outline.

To this end, I intend to remedy the situation with a new library I'm writing call Skyliner.

It's based on finite-state automata and regular expressions (also, Wikipedia and regexper), and it streams the input and generates an outline line-by-line. It provides both a command-line interface and a Javascript API (though the command-line interface currently consumes all input before generating any output, but the library is capable of streaming objects as it consumes the input). As of the time of typing, it supports the following languages:

  • clike (e.g. c, c++, header files)
  • csharp
  • go
  • ini
  • javascript
  • json
  • lua
  • markdown
  • php
  • rust
  • sh (including bash)
  • toml
  • xml

While using regular expressions means that the output won't be perfect (especially in the case of XML/HTML), it does mean that adding support for a new language is as simple as defining a new finite-state automaton in a Javascript object. Adding support for Lua was a 10-15 minute job - including automated tests! Support for more languages is definitely on the way.

The combination of line-by-line parsing, regular expressions, and finite-state automata also means that it's much faster than Atom IDE, because it doesn't have to parse the entire document into an abstract syntax tree before generating the outline.

My goal here is to have good support as many languages as possible, rather than amazing support for just a small handful of languages. This isn't to say that I won't fix issues with existing languages as they come up, but the focus is really on ensuring that whenever I open a text document in Atom.

Once I've added support for a bunch of languages, written some documentation, and published it on npm, I'm going to try my hand it implementing my first plugin for GitHub's Atom. Apparently plugins for Atom aren't too difficult to port to Visual Studio Code, so maybe I'll take a look at doing that too (but I don't use Visual Studio Code much, so not a priority).

The code is open-source already (link below) and completely usable, but since it's an ongoing process you can expect another post on here in the future about my progress.

Skyliner on GitHub: https://github.com/sbrl/skyliner

If you'd be interested in some more detail about how it works, I'll be writing some documentation soon, which will appear in the above Git repository.

Booting my multiboot flash drive via EFI

In December 2019, I posted about my new multiboot flash drive that I put together. In that post, I partitioned a flash drive and installed grub to it such that I could boot from any one of a selection of ISOs that I put in a directory.

Here's a link to that post: Multi-boot + data + multi-partition = octopus flash drive 2.0?

This has been working brilliantly, except that I noticed after posting it that it had a tendency to hang infinitely if I booted into an ISO via UEFI instead of legacy BIOS. I've been hunting for a solution to this awkward issue ever since then - and as you might have guessed by the fact that you're reading this post now, I've finally found a solution.

As I suspected - the solution is really very simple:

# Ref https://askubuntu.com/q/1186040/139735
# As far as I can tell, this bug only affects UEFI / EFI
rmmod tpm

Adding rmmod tpm to the top of the grub.cfg grub configuration file fixes the issue - as far as I can tell (I still want to run some more tests on different machines).

I've also taken the unusual step of retroactively going back and updating the original post.

I found the solution in an answer on Ask Ubuntu. As it turns out, sometimes it's just a case of trying different search terms. I had tried a number of times to find a solution previously, but I kept coming up against solutions such as recompiling grub from source myself, which would require an enormous amount of effort, or suggestions to install grub from a different version of Ubuntu (which didn't work).

As far as I understand it, the solution disables the Trusted Platform Module support built into grub. As for why this fixes my ISO booting issue, I'm not sure.

If you're trying to find a solution to a problem you're encountering, never give up! You'll find a solution eventually :-)

PhD, Update 7: Just out of reach

Oops! I must have forgotten about writing an entry for this series. Things have been complicated with the current situation, but I've got some time now to talk about what's been happening since my last post about my PhD. Before we continue though, here's a list of all the parts so far:

In this post, there are 2 different distinct areas to talk about. Firstly, the (limited) progress I've made on the Temporal CNN - and secondly the social media content.

Rainfall Radar / Temporal CNN

Things on the Temporal CNN front have been.... interesting. In the last post, I talked about how I was planning to update the model to use a cross-entropy loss function instead of mean squared error. In short, the idea here is to bin the water depth values we want to predict into a number of different categories, and then get the AI to predict which category each pixel belongs in.

The point of this is to allow for evaluating what the model is good at, and what it struggles with more effectively with the help of a confusion matrix.

Unfortunately, after going to a considerable amount of effort, the model hasn't yet been able to learn anything at all when using the cross-entropy loss error function. I've tried a whole array of different things by this point:

  • Using sparse / non-sparse categorical cross-entropy loss (ref)
  • Changing the number of filters in the model
  • Changing the format of the input data
  • Moving from average pooling → max pooling
  • Fiddling with the 2D CNN layer at the end of the model
  • ...and many more things - too many to list or remember here

Unfortunately, none of these have had a meaningful impact on the model's ability to learn anything. Despite this, we haven't run out of ideas yet. My current plan is to rebuild the model based on a known good model.

The known-good model in question is one I built earlier for a talk I did. It's purpose is classifying images, which it does fabulously with the well-known MNIST handwriting digits dataset. It's structure has 1 2D CNN layer, followed by a dense layer that outputs the probabilities as a 1D array:

See above for explanation

I devised 2 learning tasks to test the model with here. The "hard" task, which is to predict the exact digit in the picture, and the "easy" task - which is to predict whether the digit is greater than or equal to 5 or not (this simulates a binary cross-entropy task I've been trying my original model with). The original model works brilliantly with the hard task, gaining an easy 98% accuracy after 12 epochs.

After forking it and then refactoring significantly to decouple its various components, I started to modify the model's structure step by step to more closely match that of the Temporal CNN.

The initial results of this process (which only really got going on Tuesday 23rd February 2021) have been fascinating, as I've been running the MNIST dataset through it in between each step to check that it's still working as intended.

For example, I've discovered that the model has an intense dislike of pooling layers (both average and max). I suspect this might be because I'm not using it correctly, but I discovered that I could only get about 40% accuracy with the pooling layer in place, compared to ~99% without it.

Another thing I've done is removing the dense layer from the model, but this comes with its own set of problems though. The eventual goal is to do what is essentially image-to-video translation, so a key part of this process is to get the model to produce at least 2D tensor as an output instead of a 1D list of predictions for a single pixel.

To simulate this with the MNIST dataset, copied the output prediction. For the "hard" task, I copied the array of probabilities for each category into a cube, with 1 copy of the array for each pixel of the output. I found while doing this though that I got about 22% accuracy - though I suspect that the model was slower to converge than normal and if I'd maybe made the model a bit larger or let it train for longer, I'd be able to improve that somewhat.

It fared much better on the "easy" task though - easily achieving 99% accuracy fairly quickly with just 2 x 2D CNN layers in a row.

With these tests in mind, I'll be continuing the process of tweaking my new model bit by bit to match the original Temporal CNN, with the eventual goal of running my actual dataset through the model.

Social Media

I thought I'd be well into the social media part of my PhD by now, but things have been getting in the way (e.g. life stuff with respect to the current situation, and the temporal cnn being awkward) so I haven't yet been able to make a serious start on the social media side of things yet.

Still, I've been working away at the paperwork. I've now got ethical approval to work with publicly-available social media data (so long as I anonymise it, of course), and I've also been applying to get access to Twitter's new Academic API (which apparently went through successfully, but I'm currently troubleshooting the reason why it's asking me to apply for an account all over again).

I've also been reading a paper or 2, but since most of my energy has been spent elsewhere I have yet to dive seriously into this (I re-discovered the other day a folder full of interesting papers my supervisor sent me, so I'm going to dive into that as soon as I get a moment).

Papers looking into analysing social media data with advanced AI models appear to be in short supply - most papers I've read so far are either talking about analysing longer texts such as newspaper articles, or are using keyword-based and statistics-based methodologies to analyse data.

While this makes for an interesting research gap, I do feel slightly nervous that I've somehow missed something (which I guess I'll find out soon enough after reading some more papers). At any rate, my supervisor and I have some promising ideas and directions to look into moving forwards, so I'm not too worried here. I've also had some interesting discussions with people from the humanities side of my PhD (if you can call it that? I'm not sure what the right terminology is) over potential research questions too, so there's lots of scope here for investigation.

I anticipate that social media data isn't going to be as difficult a dataset to wrangle as the rainfall radar either (it's got to be better than a badly documented propriety binary format), as it's already encoded in JSON - so I'm not expecting I'll need to spend ages and ages writing programs to reformat and parse the data.

Conclusion

Things have been moving slowly recently, due in part to difficulties with the Temporal CNN, and due in part to life in general suddenly becoming rather challenging recently. Things are starting to calm down now though, so I'm starting to have more time to work on my PhD (but it's going to be a number of months yet until things are properly back to normal).

By changing tack with the Temporal CNN, I feel like I'm starting to make some more progress again, and the social media track of my PhD is showing lots of promise even though it's too early to tell exactly what direction I'll be heading in with it.

Hopefully by the time I make another post here in 2 months time, I'll have a working Temporal CNN and a start to the social media side of things - but this seems a tad ambitious based on how things have been going so far.

If you've got any comments or suggestions, do leave them below! I'd love to hear from you.

Cluster, Part 11: Lock and Key | Let's Encrypt DNS-01 for wildcard TLS certificates

Welcome one and all to another cluster blog post! Cluster blog posts always take a while to write, so sorry for the delay. As is customary, let's start this post off with a list of all the parts in the series so far:

With that out of the way, in this post we're going to look at obtaining a wildcard TLS certificate using the Let's Encrypt DNS-01 challenge. We want this because you need a TLS certificate to serve HTTPS without lighting everyone's browsers up with warnings like a Christmas tree.

The DNS-01 challenge is an alternate challenge to the default HTTP-01 challenge you may already me familiar with.

Unlike the HTTP-01 challenge which proves you have access to single domain by automatically placing a file on your web server, the DNS-01 challenge proves you have control over an entire domain - thus allowing you to obtain a wildcard certificate - which is valid for not only your domain, but all possible subdomains! This should save a lot of hassle - but it's important we keep it secure too.

As with regular Let's Encrypt certificates, we'll also need to ensure that our wildcard certificate we obtain will be auto-renewed, so we'll be setting up a periodic task on our Nomad cluster to do this for us.

If you don't have a Nomad cluster, don't worry. It's not required, and I'll be showing you how to do it without one too. But if you'd like to set one up, I recommend part 7 of this series.

In order to complete the DNS-01 challenge successfully, we need to automatically place a DNS record in our domain. This can be done via an API, if your DNS provider has one and it's supported. Personally, I have the domain name I'm using for my cluster (mooncarrot.space.) with Gandi. We'll be using certbot to perform the DNS-01 challenge, which has a plugin system for different DNS API providers.

We'll be installing the challenge provider we need with pip3 (a Python 3 package manager, as certbot is written in Python), so you can find an up-to-date list of challenge providers over on PyPi here: https://pypi.org/search/?q=certbot-dns

If you don't see a plugin for your provider, don't worry. I couldn't find one for Gandi, so I added my domain name to Cloudflare and followed the setup to change the name servers for my domain name to point at them. After doing this, I can now use the Cloudflare API through the certbot-dns-cloudflare plugin.

With that sorted, we can look at obtaining that TLS certificate. I opt to put certbot in a Docker container here so that I can run it through a Nomad periodic task. This proved to be a useful tool to test the process out though, as I hit a number of snags with the process that made things interesting.

The first order of business is to install certbot and the associate plugins. You'd think that simply doing an sudo apt install certbot certbot-dns-cloudflare would do the job, but you'd be wrong.

As it turns out, it does install that way, but it installs an older version of the certbot-dns-cloudflare plugin that requires you give it your Global API Key from your Cloudflare account, which has permission to do anything on your account!

That's no good at all, because if the key gets compromised an attacker could edit any of the domain names on our account they like, which would quickly turn into a disaster!

Instead, we want to install the latest version of certbot and the associated Cloudflare DNS plugin, which support regular Cloudflare API Tokens, upon which we can set restrictive permissions to only allow it to edit the one domain name we want to obtain a TLS certificate for.

I tried multiple different ways of installing certbot in order to get a version recent enough to get it to take an API token. The way that worked for me was a script called certbot-auto, which you can download from here: https://dl.eff.org/certbot-auto.

Now we have a way to install certbot, we also need the Cloudflare DNS plugin. As I mentioned above, we can do this using pip3, a Python package manager. In our case, the pip3 package we want is certbot-dns-cloudflare - incidentally it has the same name as the outdated apt package that would have made life so much simpler if it had supported API tokens.

Now we have a plan, let's start to draft out the commands we'll need to execute to get certbot up and running. If you're planning on following this tutorial on bare metal (i.e. without Docker), go ahead and execute these directly on your target machine. If you're following along with Docker though, hang on because we'll be wrapping these up into a Dockerfile shortly.

First, let's install certbot:

sudo apt install curl ca-certificates
cd some_permanent_directory;
curl -sS https://dl.eff.org/certbot-auto -o certbot-auto
chmod +x certbot-auto
sudo certbot-auto --debug --noninteractive --install-only

Installation with certbot-auto comprises downloading a script and executing it. with a bunch of flags. Next up, we need to shoe-horn our certbot-dns-cloudflare plugin into the certbot-auto installation. This requires some interesting trickery here, because certbot-auto uses something called virtualenv to install itself and all its dependencies locally into a single directory.

sudo apt install python3-pip
cd /opt/eff.org/certbot/venv
source bin/activate
pip install certbot-dns-cloudflare
deactivate

In short, we cd into the certbot-auto installation, activate the virtualenv local environment, install our dns plugin package, and then exit out of the virtual environment again.

With that done, we can finally add a convenience synlink so that the certbot command is in our PATH:

ln -s /opt/eff.org/certbot/venv/bin/certbot /usr/bin/certbot

That completes the certbot installation process. Then, to use certbot to create the TLS certificate, we'll need an API as mentioned earlier. Navigate to the API Tokens part of your profile and create one, and then create an INI file in the following format:

# Cloudflare API token used by Certbot
dns_cloudflare_api_token = "YOUR_API_TOKEN_HERE"

...replacing YOUR_API_TOKEN_HERE with your API token of course.

Finally, with all that in place, we can create our wildcard certificate! Do that like this:

sudo certbot certonly --dns-cloudflare --dns-cloudflare-credentials path/to/credentials.ini -d 'bobsrockets.io,*.bobsrockets.io' --preferred-challenges dns-01

It'll ask you a bunch of interactive questions the first time you do this, but follow it through and it should issue you a TLS certificate (and tell you where it stored it). Actually utilising it is beyond the scope of this post - we'll be tackling that in a future post in this series.

For those following along on bare metal, this is where you'll want to skip to the end of the post. Before you do, I'll leave you with a quick note about auto-renewing your TLS certificates. Do this:

sudo letsencrypt renew
sudo systemctl reload nginx postfix

....on a regular basis, replacing nginx postfix with a space-separated list of services that need reloading after you've renewed your certificates. A great way to do this is to setup a cron job.

Sweeping things under the carpet

For the Docker users here, we aren't quite finished yet: We need to package this mess up into a nice neat Docker container where we can forget about it :P

Some things we need to be aware of:

  • certbot has a number of data directories it interacts with that we need to ensure don't get wiped when the Docker ends instances of our container.
  • Since I'm serving the shared storage of my cluster over NFS, we can't have certbot running as root as it'll get a permission denied error when it tries to access the disk.
  • While curl and ca-certificates are needed to download certbot-auto, they aren't needed by certbot itself - so we can avoid installing them in the resulting Docker container by using a multi-stage Dockerfile.

To save you the trouble, I've already gone to the trouble of developing just such a Dockerfile that takes all of this into account. Here it is:

ARG REPO_LOCATION
# ARG BASE_VERSION

FROM ${REPO_LOCATION}minideb AS builder

RUN install_packages curl ca-certificates \
    && curl -sS https://dl.eff.org/certbot-auto -o /srv/certbot-auto \
    && chmod +x /srv/certbot-auto

FROM ${REPO_LOCATION}minideb

COPY --from=builder /srv/certbot-auto /srv/certbot-auto

RUN /srv/certbot-auto --debug --noninteractive --install-only && \
    install_packages python3-pip

WORKDIR /opt/eff.org/certbot/venv
RUN . bin/activate \
    && pip install certbot-dns-cloudflare \
    && deactivate \
    && ln -s /opt/eff.org/certbot/venv/bin/certbot /usr/bin/certbot

VOLUME /srv/configdir /srv/workdir /srv/logsdir

USER 999:994
ENTRYPOINT [ "/usr/bin/certbot", \
    "--config-dir", "/srv/configdir", \
    "--work-dir", "/srv/workdir", \
    "--logs-dir", "/srv/logsdir" ]

A few things to note here:

  • We use a multi-stage dockerfile here to avoid installing curl and ca-certificates in the resulting docker image.
  • I'm using minideb as a base image that resides on my private Docker registry (see part 8). For the curious, the script I use to do this located on my personal git server here: https://git.starbeamrainbowlabs.com/sbrl/docker-images/src/branch/master/images/minideb.
    • If you don't have minideb pushed to a private Docker registry, replace minideb with bitnami/minideb in the above.
  • We set the user and group certbot runs as to 999:994 to avoid the NFS permissions issue.
  • We define 3 Docker volumes /srv/configdir, /srv/workdir, and /srv/logsdir to contain all of certbot's data that needs to be persisted and use an elaborate ENTRYPOINT to ensure that we tell certbot about them.

Save this in a new directory with the name Dockerfile and build it:

sudo docker build --no-cache --pull --tag "certbot" .;

...if you have a private Docker registry with a local minideb image you'd like to use as a base, do this instead:

sudo docker build --no-cache --pull --tag "myregistry.seanssatellites.io:5000/certbot" --build-arg "REPO_LOCATION=myregistry.seanssatellites.io:5000/" .;

In my case, I do this on my CI server:

laminarc queue docker-rebuild IMAGE=certbot

The hows of how I set that up will be the subject of a future post. Part of the answer is located in my docker-images Git repository, but the other part is in my private continuous integration Git repo (but rest assured I'll be talking about it and sharing it here).

Anyway, with the Docker container built we can now obtain our certificates with this monster of a one-liner:

sudo docker run -it --rm -v /mnt/shared/services/certbot/workdir:/srv/workdir -v /mnt/shared/services/certbot/configdir:/srv/configdir -v /mnt/shared/services/certbot/logsdir:/srv/logsdir certbot certonly --dns-cloudflare --dns-cloudflare-credentials path/to/credentials.ini -d 'bobsrockets.io,*.bobsrockets.io' --preferred-challenges dns-01

The reason this is so long is that we need to mount the 3 different volumes into the container that contain certbot's data files. If you're running a private registry, don't forget to prefix certbot there with registry.bobsrockets.com:5000/.

Don't forget also to update the Docker volume locations on the host here to point a empty directories owned by 999:994.

Even if you want to run this on Nomad, I still advise that you execute this manually. This is because the first time you do so it'll ask you a bunch of questions interactively (which it doesn't do on subsequent times).

If you're not using Nomad, this is the point you'll want to skip to the end. As before with the bare-metal users, you'll want to add a cron job that runs certbot renew - just in your case inside your Docker container.

Nomad

For the truly intrepid Nomad users, we still have one last task to complete before our work is done: Auto-renewing our certificate(s) with a Nomad periodic task.

This isn't really that complicated I found. Here's what I came up with:

job "certbot" {
    datacenters = ["dc1"]
    priority = 100
    type = "batch"

    periodic {
        cron = "@weekly"
        prohibit_overlap = true
    }

    task "certbot" {
        driver = "docker"

        config {
            image = "registry.service.mooncarrot.space:5000/certbot"
            labels { group = "maintenance" }
            entrypoint = [ "/usr/bin/certbot" ]
            command = "renew"
            args = [
                "--config-dir", "/srv/configdir/",
                "--work-dir", "/srv/workdir/",
                "--logs-dir", "/srv/logsdir/"
            ]
            # To generate a new cert:
            # /usr/bin/certbot --work-dir /srv/workdir/ --config-dir /srv/configdir/ --logs-dir /srv/logsdir/ certonly --dns-cloudflare --dns-cloudflare-credentials /srv/configdir/__cloudflare_credentials.ini -d 'mooncarrot.space,*.mooncarrot.space' --preferred-challenges dns-01

            volumes = [
                "/mnt/shared/services/certbot/workdir:/srv/workdir",
                "/mnt/shared/services/certbot/configdir:/srv/configdir",
                "/mnt/shared/services/certbot/logsdir:/srv/logsdir"
            ]
        }
    }
}

If you want to use it yourself, replace the various references to things like the private Docker registry and the Docker volumes (which require "docker.volumes.enabled" = "True" in clientoptions in your Nomad agent configuration) with values that make sense in your context.

I have some confidence that this is working as intended by inspecting logs and watching TLS certificate expiry times. Save it to a file called certbot.nomad and then run it:

nomad job run certbot.nomad

Conclusion

If you've made it this far, congratulations! We've installed certbot and used the Cloudflare DNS plugin to obtain a DNS wildcard certificate. For the more adventurous, we've packaged it all into a Docker container. Finally for the truly intrepid we implemented a Nomad periodic job to auto-renew our TLS certificates.

Even if you don't use Docker or Nomad, I hope this has been a helpful read. If you're interested in the rest of my cluster build I've done, why not go back and start reading from part 1? All the posts in my cluster series are tagged with "cluster" to make them easier to find.

Unfortunately, I haven't managed to determine a way to import TLS certificates into Hashicorp Vault automatically, as I've stalled a bit on the Vault front (permissions and policies are wildly complicated), so in future posts it's unlikely I'll be touching Vault any time soon (if anyone has an alternative that is simpler and easier to understand / configure, please comment below).

Despite this, in future posts I've got a number of topics lined up I'd like to talk about:

  • Configuring Fabio (see part 9) to serve HTTPS and force-redirect from HTTP to HTTPS (status: implemented)
  • Implementing HAProxy to terminate port forwarding (status: initial research)
  • Password protecting the private docker registry, Consul, and Nomad (status: on the todo list)
  • Semi-automatic docker image rebuilding with Laminar CI (status: implemented)

In the meantime, please comment below if you liked this post, are having issues, or have any suggestions. I'd love to hear if this helped you out!

Sources and Further Reading

Art by Mythdael