
Own your code, Part 2: The curious case of the unreliable webhook

In the last post, I talked about how to set up your own Git server with Gitea. In this one, I'm going to take a bit of a different tack - and talk about one of the really annoying problems I ran into when setting up my continuous integration server, Laminar CI.

Since I wanted to run the continuous integration server on a different machine to the Gitea server itself, I needed a way for the Gitea server to talk to the CI server. The natural choice here is, of course, a Webhook-based system.

After installing and configuring Webhook on the CI server, I set to work writing a webhook receiver shell script (more on this in a future post!). Unfortunately, it turned out that Gitea didn't like sending to my CI server very much:

A ton of failed attempts at sending a webhook to the CI server

Whether it succeeded or not was random. If I hit the "Test Delivery" button enough times, it would eventually go through. My first thought was to bring up the Gitea server logs to see if it would give any additional information. It claimed that there was an i/o timeout communicating with the CI server:

Delivery: Post https://ci.bobsrockets.com/hooks/laminar-config-check: read tcp 5.196.73.75:54504->x.y.z.w:443: i/o timeout

Interesting, but not particularly helpful. If that's the case, then I should be able to get the same error with curl on the Gitea server, right?

curl https://ci.bobsrockets.com/hooks/testhook

.....wrong. It worked flawlessly. Every time.

Not to be beaten by such an annoying issue, I moved on to my next suspicion. Since my CI server is unfortunately behind NAT, I checked the NAT rules on the router in front of it to ensure that it was being exposed correctly.

Unfortunately, I couldn't find anything wrong here either! By this point, it was starting to get really rather odd. As a sanity check, I decided to check the server logs on the CI server, since I'm running Webhook behind Nginx (as a reverse-proxy):

5.196.73.75 - - [04/Dec/2018:20:48:05 +0000] "POST /hooks/laminar-config-check HTTP/1.1" 408 0 "-" "GiteaServer"

Now that's weird. Nginx has recorded an HTTP 408 error. Looking it up reveals that it's a Request Timeout error, which has the following definition:

The server did not receive a complete request message within the time that it was prepared to wait.

Wait what? Sounds to me like there's an argument going on between the 2 servers here - in which each server is claiming that the other didn't send a complete request or response.

At this point, I blamed this on a faulty HTTP implementation in Gitea, and opened an issue.

As a workaround, I ended up configuring Laminar to use a Unix socket on disk (as opposed to an abstract socket), forwarding it over SSH, and using a git hook to interact with it instead (more on how I managed this in a future post. There's a ton of shell scripting that I need to talk about first).
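I don't want to spoil the future post, but the socket-forwarding step is fairly generic OpenSSH: recent versions (6.7+) can forward Unix sockets with -L just like TCP ports. The rough shape of it is something like this (the paths and hostname here are illustrative, not my actual setup):

ssh -N -L /var/run/laminar-local.sock:/var/run/laminar.sock ci@ci.bobsrockets.com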

This isn't the end of this tale though! A month or two after I opened the issue, I wound up in the situation whereby I wanted to connect a GitHub repository to my CI server. Since I don't have shell access on github.com, I had to use the webhook.

When I did though, I got a nasty shock: The webhook deliveries exhibited the exact same random failures as I saw with the Gitea webhook. If I'd verified the Webhook server and cleared Gitea's HTTP implementation's name, then what else could be causing the problem?

At this point, I can only begin to speculate what the issue is. Personally, I suspect that it's a bug in the port-forwarding logic of my router, whereby it drops the first packet from a new IP address while it sets up a new NAT session to forward the packets to the CI server - meaning that subsequent requests go through fine, so long as they are sent within the NAT session timeout and from the same IP. If you've got a better idea, please comment below!

Of course, I really wanted to get the GitHub repository connected to my CI server, and if the only way I could do this was with a webhook, it was time for some request-wrangling.

My solution: A PHP proxy script running on the same server as the Gitea server (since it has a PHP-enabled web server set up already). If said script eats the request and emits a 202 Accepted immediately, then it can continue trying to get a hold of the webhook on the CI server 'till the cows come home - and GitHub will never know! Genius.

PHP-FPM (the fastcgi process manager; great alongside Nginx) makes this possible with the fastcgi_finish_request() method, which both flushes the buffer and ends the request to the client, but doesn't kill the PHP script - allowing for further processing to take place without the client having to wait.

Extreme caution must be taken with this approach however, as it can easily lead to a situation where all the PHP-FPM processes are busy waiting on replies from the CI server, leaving no room for other requests to be fulfilled and causing a big messy pile-up to form in the queue behind them.

Warnings aside, here's what I came up with:

<?php

$settings = [
    "target_url" => "https://ci.bobsrockets.com/hooks/laminar-git-repo",
    "response_message" => "Processing laminar job proxy request.",
    "retries" => 3,
    "attempt_timeout" => 2 // in seconds, for a single attempt
];

$headers = "host: ci.starbeamrainbowlabs.com\r\n";
foreach(getallheaders() as $key => $value) {
    if(strtolower($key) == "host") continue;
    $headers .= "$key: $value\r\n";
}
$headers .= "\r\n";

$request_content = file_get_contents("php://input");

// --------------------------------------------

http_response_code(202);
header("content-type: text/plain");
header("content-length: " . strlen($settings["response_message"]));
echo($settings["response_message"]);

fastcgi_finish_request();

// --------------------------------------------

function log_message($msg) {
    // Append to the log file, with a trailing newline so that entries stay on separate lines
    file_put_contents("ci-requests.log", $msg . "\n", FILE_APPEND);
}

for($i = 0; $i < $settings["retries"]; $i++) {
    $start_time = microtime(true);

    $context = stream_context_create([
        "http" => [
            "header" => $headers,
            "method" => "POST",
            "content" => $request_content,
            "timeout" => $settings["attempt_timeout"]
        ]
    ]);

    $result = file_get_contents($settings["target_url"], false, $context);

    if($result !== false) {
        log_message("[" . date("r") . "] Queued laminar job in " . (microtime(true) - $start_time)*1000 . "ms");
        break;
    }

    log_message("[" . date("r") . "] Failed to queue laminar job after " . (microtime(true) - $start_time)*1000 . "ms.");
}

I've named it autowrangler.php. A few things of note here:

  • php://input is a special virtual file that's mapped internally by PHP to the client's request. By eating it with file_get_contents(), we can get the entire request body that the client has sent to us, so that we can forward it on to the CI server.
  • getallheaders() lets us get a hold of all the headers sent to us by the client for later forwarding.
  • I use log_message() to keep a log of the successes and failures in a log file. So far I've got a ~32% failure rate, but never more than 1 failure in a row - lending some credence to the theory I mentioned above.

This ends the tale of the recalcitrant and unreliable webhook. Hopefully you've found this an interesting read. In future posts, I want to look at how I configured Webhook, the inner workings of the git hook I mentioned above, and the collection of shell scripts I've cooked up that make my CI server tick in a way that makes it easy to add new projects quickly.

Found this interesting? Run into this issue yourself? Found a better workaround? Comment below!

Summer Project Part 2: Random Number Analysis with Gnuplot

In my last post about my Masters Summer Project, I talked about my plans and what I'm doing. In this post, I want to talk about random number generator evaluation.

As part of the Arduino-based Internet of Things device that will be collecting the data, I need to generate high-quality random numbers in order to ensure that the unique ids I use in my project are both unpredictable and unique.

In order to generate such numbers, I've found a library that exploits the jitter in the inbuilt watchdog timer that's present in the Arduino Uno. It's got a manual which is worth a read, as it explains the concepts behind it quite well.

After some experimenting, I ended up with a program that would generate random numbers as fast as it could:

// Generate_Random_Numbers - This sketch makes use of the Entropy library
// to produce a sequence of random integers and floating point values.
// to demonstrate the use of the entropy library
//
// Copyright 2012 by Walter Anderson
//
// This file is part of Entropy, an Arduino library.
// Entropy is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// Entropy is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with Entropy.  If not, see <http://www.gnu.org/licenses/>.
// 
// Edited by Starbeamrainbowlabs 2019

#include "Entropy.h"

void setup() {
    Serial.begin(9600);

    // This routine sets up the watch dog timer with interrupt handler to maintain a
    // pool of real entropy for use in sketches. This mechanism is relatively slow
    // since it will only produce a little less than two 32-bit random values per
    // second.
    Entropy.initialize();
}

void loop() {
    uint32_t random_long;
    random_long = Entropy.random();
    Serial.println(random_long);
}

As you can tell, it's based on one of the examples. You may need to fiddle around with the imports in order to get it to work, because the Arduino IDE is terrible.

With this in place and uploaded to an Arduino, all I needed to do was log the serial console output to a file. Thankfully, this is actually really quite easy on Linux:

screen -S random-number-generator dd if=/dev/ttyACM0 of=random.txt bs=1

Since I connected the Arduino in question to a Raspberry Pi I have acting as a file server, I've included a screen call here that ensures that I can close the SSH session without it killing the command I'm executing - retaining the ability to 'reattach' to it later to check on it.

With it set off, I left it for a day or 2 until I had at least 1 MiB of random numbers. Once it was done, I ended up with a file looking a little like this:

216767155
986748290
455286059
1956258942
4245729381
3339111661
1821899502
3892736709
3658303796
2524261768
732282824
999812729
1312753534
2810553575
246363223
4106522438
260211625
1375011617
795481000
319056836

(Want more? Download the entire set here.)

In total, it generated 134318 numbers for me to play with, which should be plenty to graph their distribution.

Graphing such a large amount of numbers requires a special kind of program. Since I've used it before, I reached for Gnuplot.

A histogram is probably the best kind of graph for this purpose, so I looked up how to get gnuplot to draw one and tweaked it for my own purposes.

I didn't realise that you could do arbitrary calculations inside a Gnuplot graph definition file, but apparently you can. The important bit below is the bin_width variable:

set key off
set border 3

# Add a vertical dotted line at x=0 to show centre (mean) of distribution.
#set yzeroaxis

# Each bar is half the (visual) width of its x-range.
#set boxwidth 0.05 absolute
#set style fill solid 1.0 noborder

bin_width = 171797751.16;

bin_number(x) = floor(x / bin_width)

rounded(x) = bin_width * ( bin_number(x) + 0.5 )

set terminal png linewidth 3 size 1920,1080

plot 'random.txt' using (rounded($1)):(1) smooth frequency with boxes

It specifies the width of each bar on the graph. To work this out, we need to know the maximum number in the dataset. Then we can divide it by the target number of bins to get the width thereof. Time for some awk!

awk 'BEGIN { a = 0; b=999999999999999 } { if ($1>0+a) a=$1; if ($1 < 0+b) b=$1; } END { print(a, b); }' <random.txt

This looks like a bit of a mess, so let's unwind that awk script so that we can take a better look at it.

BEGIN {
    max = 0;
    min = 999999999999999
}
{
    if ($1 > max)
        max = $1;
    if ($1 < min)
        min = $1;
}
END {
    print("Max:", max, "Min:", min);
}

Much better. In short, it keeps a record of the maximum and minimum numbers it's seen so far, and updates them if it sees a better one. Let's run that over our random numbers:

awk -f minmax.awk <random.txt

Excellent. 4294943779 ÷ 25 = 171797751.16 - which is how I arrived at that value for the bin_width earlier.

Now we can render our graph:

gnuplot random.plt >histogram.png && optipng -o7 histogram.png

I always optimise images with either optipng or jpegoptim to save on storage space on the server, and bandwidth for readers - and in this case the difference was particularly noticeable. Here's the final graph:

The histogram generated by the above command.

As you can see, the number of numbers in each bin is pretty even, so we can reasonably conclude that the algorithm isn't too terrible.

What about uniqueness? Well, that's much easier to test than the distribution. If we count the numbers before and after removing duplicates, it should tell us how many duplicates there were. There's even a special command for it:

wc -l <random.txt 
134318
sort <random.txt | uniq | wc -l
134317
sort <random.txt | uniq --repeated
1349455381

Very interesting. Out of ~134K numbers, there's only a single duplicate! I'm not sure whether that's a good thing or not, as I haven't profiled any other random number generator in this manner before.
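As a rough sanity check of my own (assuming the generator outputs uniform 32-bit values), the birthday problem predicts roughly n² ÷ (2 × N) collisions when drawing n numbers from N possible values:

134318² ÷ (2 × 4294967296) ≈ 2.1

So a single duplicate in ~134K numbers is about what you'd expect from a uniform source.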

Since I'm planning on taking 1 reading a minute for at least a week (that's 10080 readings), I'm not sure that I'm going to get close to running into this issue - but it's always good to be prepared I guess......

Found this interesting? Got a better way of doing it? Comment below!


Own your Code, Part 1: Git Hosting - How did we get here?

Somewhat recently, I posted about how I fixed a nasty problem with an lftp upload. I mentioned that I'd been setting up continuous deployment for an application that I've been writing.

There's actually quite a bit of a story behind how I got to that point, so I thought I'd post about it here. Starting with code hosting, I'm going to show how I set up my own private git server, followed by Laminar (which, I might add, is not for everyone. It's actually quite involved), and finally I'll take a look at continuous deployment.

The intention is to do so in a manner that enables you to do something similar for yourself too (If you have any questions along the way, comment below!).

Of course, this is far too much to stuff into a single blog post - so I'll be splitting it up into a little bit of a mini-series.

Personally, I use git for practically all the code I write, so it makes sense for me to use services such as GitLab and GitHub for hosting these in a public place so that others can find them.

This is all very well, but I do find that I've acquired a number of private projects (say, for University work) that I can't / don't want to open-source. In addition, I'd feel a lot better if I had a backup mirror of the important code repositories I host on 3rd party sites - just in case.

This is where hosting one's own git server comes into play. I've actually blogged about this before, but since then I've moved from Go Git Service to Gitea, a fork of Gogs, through a (rather painful; also this) migration.

This post will be more of a commentary on how I went about it, whilst giving some direction on how to do it for yourself. Every server is very different, which makes giving concrete instructions challenging. In addition, I ended up with a seriously non-standard install procedure - which I can't recommend! I need to get around to straightening a few things out at some point.....

So without further hesitation, let's set up Gitea as our Git server! To do so, we'll need an Nginx web server set up already. If you haven't got one, try following this guide and then come back here.

DNS

Next, you'll need to point a new subdomain at your server that's going to be hosting your Git server. If you've already got a domain name pointed at it (e.g. with A / AAAA records), I can recommend using a CNAME record that points at this pre-existing domain name.

For example, if I have a pair of records for control.bobsrockets.com:

A       control.bobsrockets.com.    1.2.3.4
AAAA    control.bobsrockets.com.    2001::1234:5678

...I could create a CNAME record (a bit like a symlink) like this:

CNAME   git.bobsrockets.com         control.bobsrockets.com.

(Note: For the curious, this isn't actually official DNS record syntax. It's just pseudo-code I invented on-the-fly)

Installation

With that in place, the next order of business is actually installing Gitea. This is relatively simple, but a bit of a pain - because native packages (e.g. sudo apt install ....) aren't a thing yet.

Instead, you download a release binary from the releases page. Once done, we can do some setup to get all our ducks in a row. When setting it up myself, I ended up with a rather weird configuration - as I actually started with a Go Git Service instance before Gitea was a thing (and ended up going through a rather painful migration) - so you should follow their guide and have a 'normal' installation :P
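For reference, grabbing a release binary is just a case of downloading it and making it executable - something like this (the version number and architecture here are illustrative; check the releases page for the latest):

wget https://github.com/go-gitea/gitea/releases/download/v1.8.0/gitea-1.8.0-linux-amd64 -O gitea
chmod +x gitea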

Once done, you should have Gitea installed and the right directory structure setup.

A note here is that if you're like me and you have SSH running on a non-standard port, you've got 2 choices. Firstly, you can alter the SSH_PORT directive in the configuration file (which should be called app.ini) to match that of your SSH server.
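For example, if your SSH daemon listens on port 2222, the relevant bit of app.ini would look something like this (I believe the directive lives in the [server] section):

[server]
SSH_PORT = 2222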

If you decide that you want it to run its own inbuilt SSH server on port 22 (or any port below 1024), what the guide doesn't tell you is that you need to explicitly give the gitea binary permission to listen on a privileged port. This is done like so:

setcap 'cap_net_bind_service=+ep' gitea

Note that every time you update Gitea, you'll have to re-run that command - so it's probably a good idea to store it in a shell script that you can re-execute at will.
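Something like this would do the trick (a minimal sketch - adjust the path to wherever your gitea binary actually lives):

#!/usr/bin/env bash
# setcap-gitea.sh: re-grant Gitea permission to bind to privileged ports.
# Needs to be run as root (e.g. with sudo) after every Gitea update.
setcap 'cap_net_bind_service=+ep' /srv/git/gitea/gitea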

At this point it might also be worth looking through the config file (app.ini I mentioned earlier). There's a great cheat sheet that details the settings that can be customised - some may be essential to configuring Gitea correctly for your environment and use-case.

Updates

Updates to Gitea are, of course, important. GitHub provides an Atom Feed that you can use to keep up-to-date with the latest releases.

Later on in this series, we'll take a look at how we can automate the process by taking advantage of cron, Laminar CI, and fpm - amongst other tools. I haven't actually done this yet as of the time of typing and we've got a looong way to go until we get to that point - so it's a fair ways off.

Service please!

We've got Gitea installed and we've considered updates, so the natural next step is to configure it as a system service.

I've actually blogged about this process before, so if you're interested in the details, I recommend going and reading that article.

This is the service file I use:

[Unit]
Description=Gitea
After=syslog.target
After=rsyslog.service
After=network.target
#After=mysqld.service
#After=postgresql.service
#After=memcached.service
#After=redis.service

[Service]
# Modify these two values and uncomment them if you have
# repos with lots of files and get an HTTP error 500 because
# of that
###
#LimitMEMLOCK=infinity
#LimitNOFILE=65535
Type=simple
User=git
Group=git
WorkingDirectory=/srv/git/gitea
ExecStart=/srv/git/gitea/gitea web
Restart=always
Environment=USER=git HOME=/srv/git

[Install]
WantedBy=multi-user.target

I believe I took it from here when I migrated from Gogs to Gitea. Save this as /etc/systemd/system/gitea.service, and then do this:

sudo systemctl daemon-reload
sudo systemctl start gitea.service

This should start Gitea as a system service.
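If you want Gitea to come back up automatically after a reboot, you'll probably also want to enable the service:

sudo systemctl enable gitea.service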

Wiring it up

The next step now that we've got Gitea running is to reverse-proxy it with the Nginx server that we set up earlier.

Create a new file at /etc/nginx/conf.d/2-git.conf, and paste in something like this (not forgetting to customise it to your own use-case):

server {
    listen  80;
    listen  [::]:80;

    server_name git.starbeamrainbowlabs.com;
    return 301 https://$host$request_uri;
}

upstream gitea {
    server  [::1]:3000;
    keepalive 4; # Keep 4 connections open as a cache
}   

server {
    listen  443 ssl http2;
    listen  [::]:443 ssl http2;

    server_name git.starbeamrainbowlabs.com;
    ssl_certificate     /etc/letsencrypt/live/git.starbeamrainbowlabs.com-0001/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/git.starbeamrainbowlabs.com-0001/privkey.pem;

    add_header strict-transport-security    "max-age=31536000;";
    add_header access-control-allow-origin  https://nextcloud.starbeamrainbowlabs.com   always;
    add_header content-security-policy      "frame-ancestors http://*.starbeamrainbowlabs.com";

    #index  index.html index.php;
    #root   /srv/www;

    location / {
        proxy_pass          http://gitea;

        #proxy_set_header   x-proxy-server      nginx;
        #proxy_set_header   host                $host;
        #proxy_set_header   x-originating-ip    $remote_addr;
        #proxy_set_header   x-forwarded-for     $remote_addr;

        proxy_hide_header   X-Frame-Options;
    }

    location ~ /.well-known {
        root    /srv/letsencrypt;
    }

    #include /etc/nginx/snippets/letsencrypt.conf;

    #location = / {
    #   proxy_pass          http://127.0.0.1:3000;
    #   proxy_set_header    x-proxy-server      nginx;
    #   proxy_set_header    host                $host;
    #   proxy_set_header    x-originating-ip    $remote_addr;
    #   proxy_set_header    x-forwarded-for     $remote_addr;
    #}

    #location = /favicon.ico {
    #   alias /srv/www/favicon.ico;
    #}
}

You may have to comment out the listen 443 blocks and put in a listen 80 temporarily whilst configuring letsencrypt.
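If you're using certbot, issuing the certificate via the .well-known location defined above looks something like this (the domain and webroot path here are illustrative - adjust them to match your own setup):

sudo certbot certonly --webroot -w /srv/letsencrypt -d git.bobsrockets.com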

Then, reload Nginx: sudo systemctl reload nginx

Conclusion

Phew! We've looked at installing and setting up Gitea behind Nginx, and using a systemd service to automate the management of Gitea.

I've also talked a bit about how I set my own Gitea instance up and why.

In future posts, I'm going to talk about Continuous Integration, and how I setup Laminar CI. I'll also talk about alternatives for those who want something that comes with a few more batteries included.... :P

Found this interesting? Got stuck and need help? Spotted a mistake? Comment below!

Summer Project Part 1: LoRaWAN Signal Mapping!

What? A new series (hopefully)! My final project for my Masters in Science course at University is taking place this summer, and on the suggestion of Rob Miles I'll be blogging about it along the way.

In this first post, I'd like to talk a little bit about the project I've picked and my initial thoughts.

As you have probably guessed from the title of this post, the project I've picked is on mapping LoRaWAN signal coverage. I'm specifically going to look at that of The Things Network, a public LoRaWAN network. I've actually posted about LoRa before, so I'd recommend you go back and read that post first before continuing with this one.

The plan I've drawn up so far is to build an Internet of Things device with an Arduino and an RFM95 (a LoRa modem chip) to collect a bunch of data, which I'll then push through some sort of AI to fill in the gaps.

The University have been kind enough to fund some of the parts I'll need, so I've managed to obtain some of them already. This mainly includes:

  • Some of the power management circuitry
  • An Arduino Uno
  • A bunch of wires
  • A breadboard
  • A 9V battery holder (though I suspect I'll need a different kind of battery that can be recharged)
  • Some switches

(Above: The parts that I've collected already. I've still got a bunch of parts to go though.)

I've ordered the more specialised parts through my University, and they should be arriving soon.

I'll also need a project box to keep it all in if I can't gain access to the University's 3D printers, but I'll tackle that later.

For each message, I'll store a random id and the location it was sent from on a local microSD card. I'll transmit the location and the unique id via LoRaWAN, and the server will store it - along with the received signal strength from the different gateways that received the message.

Once a run is complete, I'll process the data and pair the local readings from the microSD card up with the ones the database has stored, so that we have readings from the 'black spots' where there isn't currently any coverage.

By using a unique random id instead of a timestamp, I can help preserve the privacy of the person carrying the device. Of course, I can't actually ask anyone to carry the device around until I've received ethical approval from the University to do so. I've already submitted the form, and I'm waiting to hear back on that.

While I'm waiting, I'm starting to set up the backend application server. I've decided to write it in Node.js using SQLite to store the data, so that if I want to do multiple separate runs to compare coverage before and after a gateway is installed, I can do so easily by just moving on to a new SQLite database file.
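I haven't written the server yet, so the following is only a rough sketch of the sort of thing I have in mind - assuming the better-sqlite3 npm package, and with made-up table and column names:

import Database from 'better-sqlite3';

// One database file per run, so that different runs can be compared later
const db = new Database('run-2019-05-01.sqlite');

// A single message may be received by several gateways,
// so there may be multiple rows per id.
db.exec(`CREATE TABLE IF NOT EXISTS readings (
    id      TEXT,    -- the random id transmitted by the device
    gateway TEXT,    -- the gateway that received the message
    rssi    INTEGER  -- received signal strength, in dBm
)`);

const insert_reading = db.prepare(`INSERT INTO readings (id, gateway, rssi) VALUES (?, ?, ?)`);
insert_reading.run("a1b2c3d4", "example-gateway-1", -97);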

In the next post, I might talk a little bit about how I'm planning on generating the random ids. I'd like to do some research into the built-in random() function and how it compares to other unpredictable sources of randomness, such as comparing clocks.

Note to self: Don't reboot the server at midnight....

You may (or may not) have noticed a small window of ~3/4 hour the other day when my website was offline. I thought I'd post about the problem, the solution, and what I'll try to avoid next time.

The problem occurred when I was about to head to bed late at night. I decided to quickly reboot the server into a new kernel to activate some security updates.

I have this habit of leaving a ping -O hostname running in a separate terminal to monitor the progress of the reboot. I'm glad I did so this time, as I noticed that it took a while to go down for rebooting. Then it took an unusually long time to come up again, and when it did, I couldn't SSH in again!

After a quick check, the website was down too - so it was time to do something about it and fast. Thankfully, I already knew what was wrong - it was just a case of fixing it.....

In a Linux system, there's a file called /etc/fstab that defines all the file systems that are to be mounted. While this sounds a bit counter-intuitive (since how does it know how to mount the filesystem that the file itself lives on?), it's built into the initial ramdisk (also this) if I understand it correctly.

There are many different types of file system in Linux. Common ones include ext4 (the default on most modern Linux distributions), nfs (Network FileSystem), sshfs (for mounting remote filesystems over SSH), davfs (WebDav shares), and more.

Problems start to arise when some of the filesystems defined in /etc/fstab don't mount correctly. WebDav filesystems are notorious for this, I've found - so they generally need to have the noauto flag attached, like this:

https://dav.bobsrockets.com/path/to/directory   /path/to/mount/point    davfs   noauto,user,rw,uid=1000,gid=1000    0   0

Unfortunately, I forgot to do this with the webdav filesystem I added a few weeks ago, causing the whole problem in the first place.

The unfortunate issue was that since it couldn't mount the filesystems, it couldn't start the SSH server. If it couldn't start the SSH server, I couldn't get in to fix it!

Kimsufi rescue mode to the, erm, rescue! It turned out that my provider, Kimsufi, have a rescue mode system built in for just this sort of occasion. At the click of a few buttons, I could reboot my server into a temporary rescue environment with a random SSH password.

Therein I could mount the OS file system, edit /etc/fstab, and reboot into normal mode. Sorted!

Just a note for future reference: I recommend using the rescuepro rescue mode OS, and not either of the FreeBSD options. I had issues trying to mount the OS disk with them - I kept getting an Invalid argument error. I was probably doing something wrong, but at the time I didn't really want to waste tons of time trying to figure that out in an unfamiliar OS.

Hopefully there isn't a next time. I'm certainly going to avoid auto webdav mounts, instead spawning a subprocess to mount them in the background after booting is complete.
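One low-tech way of doing that (just a sketch - the mount point matches the fstab example above) is a cron @reboot job that waits a little for the network to come up and then mounts it:

@reboot sleep 60 && mount /path/to/mount/point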

I'm also going to avoid rebooting my server when I don't have time to deal with any potential fallout....

Responding to "the Internet is disintegrating"

The following post is my opinion on an article I've recently read and the issues it raises.

This isn't really my typical sort of post, but after I read this article on BBC Future recently I felt I had to post to set a few things straight.

The article talks about how authoritarian governments are increasingly controlling Internet traffic that's flowing through their countries. This part is true - China has their "great firewall", and several countries have controversial "off-switches" that they occasionally flip.

Internet protocols specify how all information must be addressed by your computer, in order to be transmitted and routed across the global wires; it’s a bit like how a Windows machine knows it can’t boot up an Apple operating system.

This is where it starts to derail. While Internet protocols such as HTTP and DNS do specify how different machines should talk to each other, it bears no resemblance to how a computer boots into its operating system. In fact, it's perfectly possible to boot into macOS on a PC running Windows. It's important to distinguish between the hardware and the software running on it.

It also talks about "digital decider countries" being "scared" of an "open Internet".

“Nations like Zimbabwe and Djibouti, and Uganda, they don’t want to join an internet that’s just a gateway for Google and Facebook” to colonise their digital spaces, she says. Neither do these countries want to welcome this “openness” offered by the Western internet only to see their governments undermined by espionage.

Again, with this theme of not distinguishing between the hardware or network, and the software that's running on it.

An open Internet is not a space in which the likes of Facebook and Google hold total control. An open Internet is one in which people like you and me have choice. Choice of social network. Choice of email provider. Choice of news outlet.

An open Internet is defined by the choices we, as consumers, make. If everyone decides to use Facebook, then the Internet will then be subsequently dominated by Facebook. However, decentralised options do exist (also this) - and are increasing in popularity. Under a decentralised system, no 1 company has control over everyone's data, how it's processed, and who sees what.

While the obvious solution here is to simply 'block' misinformation and illegal content, it's not an easy one to implement, and it certainly comes with serious moral and ethical implications. How do we decide what is 'illegal' and what is 'legal'? How do we tell if a news article is 'real', or 'fake'? With over 10 million requests / second to CloudFlare alone, that's certainly far too much data flying around the Internet to handle manually.

Let's say we built an AI to detect what was legitimate, or implemented a set of rules (say, like a blacklist). What about satire? How do you decide out of the 1.3 billion web servers (not to mention that a single web server is likely to host several websites) which is legitimate, without potentially damaging vital competition to bigger businesses?

No algorithm is going to ever be 100% accurate. With this in mind, utilising such an algorithm would carry the terrible cost of limiting freedom of speech and communication. How can a government, which to my understanding is there to serve and represent the people, in good faith control and limit the freedom of expression of the people it is supposed to represent? There's a huge scope for abuse here, as has been clearly demonstrated by multiple countries around the world.

If blocking doesn't work, then how can we deal with misinformation online? Well, Mozilla seems to have some ideas. The solution is a collective effort: Everything from cross-verification of facts to linking to sources. When writing an article, it's imperative to link to appropriate sources to back up what you're saying. Cross-checking something you read in 1 article with another on a different website helps to ensure you have more sides of the story.

Although governments may claim that internet sovereignty protects its citizens from malware, many fear losing the freedom of the "open internet"

Malware is indeed another serious problem. Again the solution here is not "blocking" content, as malware authors will always find another way (also exhibits b, and c) to deliver their payload.

Again, the solution is in the hands of the people. Keeping systems up-to-date and making use of good password practices all help. Ensuring that devices, networks, and software are all secure-by-design is great too (the website I was reading the article on doesn't even implement HTTPS correctly).

The article feels very one-sided. It makes no mention of the other alternative ways I've touched on above, or those tackling the challenges the article talks about in ways that don't harm freedom of expression. In fact, I'd say that it's more of an opinion piece (much like this post), but since it doesn't mention that it is as such, I feel it's rather deceptive.

...one thing is clear – the open internet that its early creators dreamed of is already gone.

Indeed, the early vision of the Internet has changed drastically as it's grown. However, your personal data belongs to you, so you've got the power to choose who processes it.


Using prefers-color-scheme to display alternate website themes

Support for the prefers-color-scheme media query landed in Firefox 67 beta recently. While it's a little bit awkward to use at the moment, it enables support for a crucial new feature on the web: changing the colour scheme of a webpage based on the user's preference.

If a user is outside on a sunny day, they might prefer a light theme whilst browsing the web. However, if they are at home on an evening when it's dark, they might prefer a dark theme.

Traditionally, some websites have supported enabling a site-specific option in their settings, but this requires that the user enable an option manually when they visit the website - and has to be reset on every device and in every private browsing session.

The solution to this is the new prefers-color-scheme CSS level 5 media query. It lets you do something like this:

body {
    background: red; /* Fallback */
}

/* Dark theme */
@media (prefers-color-scheme: dark) {
    body { background: #333; color: #efefef; }
}
/* Light theme */
@media (prefers-color-scheme: light),
    (prefers-color-scheme: no-preference) {
    body { background: #efefef; color: #333; }
}

Hey presto - the theme will now auto-change based on the user's preference!

Here's a live demo of that in action:

See the Pen prefers-color-scheme demo by Starbeamrainbowlabs (@sbrl) on CodePen.

(Can't see the above? Try a direct link.)

Current Support

Currently, Firefox 67+ and Safari (wow, who knew?) support the new media query - as of the time of typing. For Firefox users, the preferred colour scheme is autodetected based on your operating system's colour scheme. Here's the caniuse.com support table, which should always be up-to-date:

If this doesn't work correctly for some reason (like it didn't for me on Ubuntu with the Unity Desktop - apparently it's based on the GTK theme on Ubuntu), then you can set it manually in about:config. Create a new integer value called ui.systemUsesDarkTheme, and set it to 1.

If you accidentally create a string by mistake, you'll need to close Firefox, find your profile folder, and edit prefs.js to force-change it, as about:config isn't capable of deleting or altering configuration directive types at this time (yep, I ran into this one!).
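For reference, once the preference has been recreated as an integer, the line you're after in prefs.js should look something like this:

user_pref("ui.systemUsesDarkTheme", 1);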

Making use of CSS Properties

Altering your website for dark-mode readers isn't actually all that hard. If you're using CSS Properties already, you might do something like this:

:root {
    --background: #efefef;
    --text-colour: #333;
}

@media (prefers-color-scheme: dark) {
    :root {
        --background: #333;
        --text-colour: #efefef;
    }
}

body {
    background: var(--background);
    color: var(--text-colour);
}

(Want a demo? Click here!)

This way, you can keep all the 'settings' that you might want to change later at the beginning of your CSS files, which makes it easy to add support for dark mode later.

I know that I'm going to be investigating adding support to my website at some point soon (stay tuned for that!).

Found this interesting? Got a cool resource to aid in colour scheme design? Found another great CSS feature? Comment below!


Powahroot: Client and Server-side routing in Javascript

The powahroot logo, which is a 16x16 pixel-art image and looks like a purple-red carrot with bright orange stripes and yellow light lines coming out of the sides

If I want to really understand something, I usually end up implementing it myself. That's the case with my latest library, powahroot - though partly it's also because I didn't really like the way any of the alternatives functioned, because I'm picky.

Originally I wrote it for this project (although it's actually for a little satellite project that isn't open-source unfortunately - maybe at some point in the future!) - but I liked it so much that I decided that I had to turn it into a full library that I could share here.

In short, a routing framework helps you get requests handled in the right places in your application. I've actually blogged about this before, so I'd recommend you go and read that post first before continuing with this one.

For all the similarities between the server side (as mentioned in my earlier post) and the client side, the 2 environments are different enough that they warrant having 2 distinctly separate routers. In powahroot, I provide both a ServerRouter and a ClientRouter.

The ServerRouter is designed to handle Node.js HTTP request and response objects. It provides shortcut methods .get(), .post(), and others to quickly create routes for different request types - and also supports middleware to enable logical separation of authentication, request processing, and response generation.

The ClientRouter, on the other hand, is essentially a stripped-down version of the ServerRouter that's tailored to functioning in a browser environment. It doesn't support middleware (yet?), but it does support the pushstate that's part of the History API.

I've also published it on npm, so you can install it like this:

npm install --save powahroot

Then you can use it like this:

// On the server
import ServerRouter from 'powahroot/Server.mjs';

// ....

const router = new ServerRouter();
router.on_all(async (context, next) => { console.debug(context.url); await next()})
router.get("/files/::filepath", (context, _next) => context.send.plain(200, `You requested ${context.params.filepath}`));
// .....
// On the client
import ClientRouter from 'powahroot/Client.mjs';

// ....

const router = new ClientRouter({
    // Options object. Default settings:
    verbose: false, // Whether to be verbose in console.log() messages
    listen_pushstate: true, // Whether to react to browser pushstate events (excluding those generated by powahroot itself, because that would cause an infinite loop :P)
});

As you can see, powahroot uses ES6 Modules, which makes it easy to split up your code into separate independently-operating sections.

In addition, I've also generated some documentation with the documentation tool on npm. It details the API available to you, and should serve as a good reference when using the library.

You can find that here: https://starbeamrainbowlabs.com/code/powahroot/docs/

It's automatically updated via continuous integration and continuous deployment, which I really do need to get around to blogging about (I've spent a significant amount of time setting up the base system upon which powahroot's CI and CD works. In short I use Laminar CI and a GitHub Webhook, but there's a lot of complicated details).

Found this interesting? Used it in your own project? Got an idea to improve powahroot? Comment below!

shunction: Self-hosted Azure Functions!

It's not 1st April, but if it was - this post would be the perfect thing to release! It's about shunction, a parody project I've written. While it is a parody, hopefully it is of some use to someone :P

The other week, I discovered Azure Functions - thanks to Rob Miles. In the same moment, I thought that it would be fairly trivial to build a system by which you could self-host it - and shunction was born.

Built upon the lantern build engine, shunction lets you keep a folder of executable scripts and execute them at will. It supports 4 modes of operation:

  • adhoc - One-off runs
  • cron - Regularly scheduled execution
  • inotify - Execute when something changes on disk
  • http - Listen for HTTP requests

Of course, the system can be upgraded at a later date to support additional operating modes. The last 2 in the list start a persistent and long-running process, which will only exit if terminated.

Shunction is designed to run on Linux systems. Because of the shebang, any executable script (be it a shell script, a Python script, a Node.js script, or otherwise) is supported.
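For example, a task script can be as simple as this (a made-up example - the filename and contents are entirely up to you):

#!/usr/bin/env bash
# backup-database.sh: an example shunction task script
echo "Backing up the database...."
# ....do the actual backup work here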

If you've got multiple tasks you want your server to run and want to keep them all in one place, then shunction allows you to do just that. For example, you could keep the task scripts in a git repository that you clone down onto the server, and then use shunction to execute them as needed.

How to quickly run TUI programs via SSH

Hello, and welcome to another blog post! I hope everyone had a lovely and restful Easter.

Very often, I want to run a command on a remote machine via SSH and leave it in a terminal in 1 corner of my screen whilst I work in another terminal on that same machine.

Up until now, I've always SSHed into the machine in question and then run the command manually:

user@local:~$ ssh bob@bobsrockets.com
# .....
bob@bobsrockets.com:~$ sudo htop

This is fine, but it takes a moment to connect & setup the terminal on the remote end. What if there was a way to specify the command to run remotely?

Well, it turns out there is. SSH lets you specify the command to run on the remote server instead of the default shell:

ssh sean@seanssatellites.io apt search beanstalk

Sadly, this doesn't always yield the results expected. Colour disappears from the output, and sometimes things like htop (ssh bill@billsboosters.co.uk htop) and sudo (ssh edgar@edsengineering.eu sudo apt update) break altogether:

Error opening terminal: unknown.

I can't remember how I figured it out, but I discovered that the issue is that when you specify the command instead of letting the default shell initialise, it treats it as some sort of 'script-mode', and doesn't allocate a pseudo-terminal on the remote machine.

Thankfully, there's a way to force it to allocate a pseudo-terminal. This is done with the -t flag:

ssh -t bob@bobsrockets.com sudo htop

This then enables interactive commands to work as intended, and causes colour to be displayed again :D

Found this useful? Got another great SSH tip? Comment below!
