Starbeamrainbowlabs

Stardust
Blog


Archive


Mailing List Articles Atom Feed Comments Atom Feed Twitter Reddit Facebook

Tag Cloud

3d 3d printing account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blog bookmarklet booting bug hunting c sharp c++ challenge chrome os code codepen coding conundrums coding conundrums evolved command line compilers compiling compression css dailyprogrammer data analysis debugging demystification distributed computing documentation downtime electronics email embedded systems encryption es6 features ethics event experiment external first impressions future game github github gist gitlab graphics hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs learning library linux lora low level lua maintenance manjaro network networking nibriboard node.js operating systems performance photos php pixelbot portable privacy problem solving programming problems project projects prolog protocol protocols pseudo 3d python reddit redis reference releases resource review rust searching secrets security series list server software sorting source code control statistics storage svg talks technical terminal textures thoughts three thing game three.js tool tutorial twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 xmpp xslt

Setting up a Mosquitto MQTT server

I recently found myself setting up a mosquitto instance (yep, for this) due to a migration we're in the middle of doing and it got quite interesting, so I thought I'd post about it here. This post is also partly documentation of what I did and why, just in case future people come across it and wonder how it's setup, though I have tried to make it fairly self-documenting.

At first, I started by doing sudo apt install mosquitto and seeing if it would work. I can't remember if it did or not, but it certainly didn't after I played around with the configuration files. To this end, I decided that enough was enough and I turned the entire configuration upside-down. First up, I needed to disable the existing sysV init-based service that ships with the mosquitto package:

sudo systemctl stop mosquitto # Just in case
sudo systemctl start mosquitto

Next, I wrote a new systemd service file:

[Unit]

Description=Mosquitto MQTT Broker
After=syslog.target rsyslog.target network.target

[Service]
Type=simple
PIDFile=/var/run/mosquitto/mosquitto.pid
User=mosquitto

PermissionsStartOnly=true
ExecStartPre=-/bin/mkdir /run/mosquitto
ExecStartPre=/bin/chown -R mosquitto:mosquitto /run/mosquitto

ExecStart=/usr/sbin/mosquitto --config-file /etc/mosquitto/mosquitto.conf
ExecReload=/bin/kill -s HUP $MAINPID

StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=mosquitto


[Install]
WantedBy=multi-user.target

This is broadly similar to the service file I developed in my earlier tutorial post, but it's slightly more complicated.

For one, I use PermissionsStartOnly=true and a series of ExecStartPre directives to allow mosquitto to create a PID file in a directory in /run. /run is a special directory on Linux for PID files and other such things, but normally only root can modify it. mosquitto will be running under the mosquitto user (surprise surprise), so we need to create a subdirectory for it and chown it so that it has write permissions.

A PID file is just a regular file on disk that contains the PID (Process IDentifier) number of the primary process of a system service. System service managers such as systemd and OpenRC use this number to manage the health of the service while it's running and send it various signals (such as to ask it to reload its configuration file).

With this in place, I then added an rsyslog definition at /etc/rsyslog.d/mosquitto.conf to tell it where to put the log files:

if $programname == 'kraggwapple' then /var/log/mosquitto/mosquitto.log
if $programname == 'kraggwapple' then stop

Thinking about it, I should probably check that a log rotation definition file is also in place.

Just in case, I then chowned the pre-existing log files to ensure that rsyslog could read & write to it:

sudo chown -R syslog: /var/log/mosquitto

Then, I filled out /etc/mosquitto/mosquitto.conf with a few extra directives and restarted the service. Here's the full configuration file:

# Place your local configuration in /etc/mosquitto/conf.d/
#
# A full description of the configuration file is at
# /usr/share/doc/mosquitto/examples/mosquitto.conf.example

# NOTE: We can't use tab characters here, as mosquitto doesn't like it.

pid_file /run/mosquitto/mosquitto.pid

# Persistence configuration
persistence true
persistence_location /var/lib/mosquitto/


# Not a file today, thanks
# Log files will actually end up at /var/llog/mosquitto/mosquitto.log, but will go via syslog
# See /etc/rsyslog.d/mosquitto.conf
#log_dest file /var/log/mosquitto/mosquitto.log
log_dest syslog


include_dir /etc/mosquitto/conf.d


# Documentation: https://mosquitto.org/man/mosquitto-conf-5.html

# Require a username / password to connect
allow_anonymous false
# ....which are stored in the following file
password_file /etc/mosquitto/mosquitto_users

# Make a log entry when a client connects & disconnects, to aid debugging
connection_messages true

# TLS configuration
# Disabled at the moment, since we don't yet have a letsencrypt cert
# NOTE: I don't think that the sensors currently connect over TLS. We should probably fix this.
# TODO: Point these at letsencrypt
#cafile /etc/mosquitto/certs/ca.crt
#certfile /etc/mosquitto/certs/hostname.localdomain.crt
#keyfile /etc/mosquitto/certs/hostname.localdomain.key

As you can tell, I've still got some work to do here - namely the TLS setup. It's a bit of a chicken-and-egg problem, because I need the domain name to be pointing at the MQTT server in order to get a Let's Encrypt TLS certificate, but that'll break all the sensors using the current one..... I'm sure I'll figure it out.

But wait! We forgot the user accounts. Before I started the new service, I added some user accounts for client applications to connect with:

sudo mosquitto_passwd /etc/mosquitto/mosquitto_users username1
sudo mosquitto_passwd /etc/mosquitto/mosquitto_users username1

The mosquitto_passwd program prompts for a password - that way you don't end up with the passwords in your ~/.bash_history file.

With all that taken care of, I started the systemd service:

sudo systemctl daemon-reload
sudo systemctl start mosquitto-broker.service

Of course, I ended up doing a considerable amount of debugging in between all this - I've edited it down to make it more readable and fit better in a blog post :P

Lastly, because I'm paranoid, I double-checked that it was running with htop and netstat:


sudo netstat -peanut | grep -i mosquitto
tcp        0      0 0.0.0.0:1883            0.0.0.0:*               LISTEN      112        2676558    5246/mosquitto      
tcp        0      0 x.y.z.w:1883           x.y.z.w:54657       ESTABLISHED 112        2870033    1234/mosquitto      
tcp        0      0 x.y.z.w:1883           x.y.z.w:39365       ESTABLISHED 112        2987984    1234/mosquitto      
tcp        0      0 x.y.z.w:1883           x.y.z.w:58428       ESTABLISHED 112        2999427    1234/mosquitto      
tcp6       0      0 :::1883                 :::*                    LISTEN      112        2676559    1234/mosquitto      

...no idea why it want to connect to itself, but hey! Whatever floats its boat.

Orange Pi 3 in review

An Orange Pi 3, along with it's logo. Of course, I'm not affiliated with the manufacturers in any way. In fact, they are probably not aware that this post even exists

I recently bought an Orange Pi 3 (based on the Allwinner H6 chipset) to perform a graphics-based task, and I've had an interesting enough time with it that I thought I'd share my experiences in a sort of review post here.

The first problem when it arrived was to find an operating system that supports it. My initial thought was to use Devuan, but I quickly realised that practically the only operating system that supports it at the moment is Armbian.

Not to be deterred, after a few false starts I got Armbian based on Ubuntu 18.04 Bionic Beaver installed. The next order of business was to install the software I wanted to use.

For the most part, I didn't have too much trouble with this - though it was definitely obvious that the arm64 (specifically sunxi64) architecture isn't a build target that's often supported by apt repository owners. This wasn't helped by the fact that apt has a habit of throw really weird error messages when you try to install something that exists in an apt repository, but for a different architecture.

After I got Kodi installed, the next order of business was to get it to display on the screen. I ended up managing this (eventually) with the help of a lot of tutorials and troubleshooting, but the experience was really rather unpleasant. I kept getting odd errors, like failed to load driver sun4i-drm when trying to start Kodi via an X11 server and other strangeness.

The trick in the end was to force X11 to use the fbdev driver, but I'm not entirely sure what that means or why it fixed the issue.

Moving on, I then started to explore the other capabilities of the device. Here, too, I discovered that a number of shortcomings in the software support provided by Linux, such as a lack of support for audio via HDMI and Bluetooth. I found the status matrix of the SunXI project, which is the community working to add support for the Allwinner H6 chipset to the Linux Kernel.

They do note that support for the H6 chipset is currently under development and is incomplete at the moment - and I wish I'd checked on software support before choosing a device to purchase.

The other big problem I encountered was a lack of kernel headers provided by Armbian. Normally, you can install the headers for your kernel by installing the linux-headers-XXXXXX package with your favourite package manager, where XXXXXX is the same as the string present in the linux-image-XXXXXX package you've got installed that contains the kernel itself.

This is actually kind of a problem, because it means that you can't compile any software that calls kernel functions yourself without the associated header files, preventing you from installing various dkms-based kernel modules that auto-recompile against the kernel you've got installed.

I ended up finding this forum thread, but the response who I assume is an armbian developer was less than stellar - they basically said that if you want kernel headers, you need to compile the kernel yourself! That's a significant undertaking, for those not in the know, and certainly not something that should be undertaken lightly.

While I've encountered a number of awkward issues that I haven't seen before, the device does have some good things worth noting. For one, it actually packs a pretty significant punch: it's much more powerful than a Raspberry Pi 3B+ (of which I have one; I bought this device before the Raspberry Pi 4 was released). This makes it an ideal choice for more demanding workloads, which a Raspberry Pi wouldn't quite be suitable for.

In conclusion, while it's a nice device, I can't recommend it to people just yet. Software support is definitely only half-baked at this point with some glaring holes (HDMI audio is one of them, which doesn't look like it's coming any time soon).

I think part of the problem is that Xunlong (that company that makes the device and others in it's family) don't appear to be interested in supporting the community at all, choosing instead to dump custom low-quality firmware for people to use as blobs of binary code (which apparently doesn't work) - which causes the SunXI community a lot of extra work to reverse-engineer it all and figure out how it all works before they can start implementing support in the Linux Kernel.

If you're interested in buying a similar embedded board, I can recommend instead using HackerBoards to find one that suits your needs. Don't forget to check for operating system support!

Found this interesting? Thinking of buying a board yourself? Had a different experience? Comment below!

The infrastructure behind Air Quality Web

For a while now, I've been working on Air-Quality-Web, a web interface that displays air quality information. While I haven't blogged about it directly before, a number of posts (a, b, c, d) I've made here have been indirectly related.

Since the air quality data has to come from somewhere, I thought I'd blog a little about the wider infrastructure behind the air quality web interface. My web interface is actually just 1 small part of a much wider stack of software that's being developed as a group by Connected Humber.

Said stack is actually quite distributed, so let's start with a diagram:

From left to right:

  • As a group we've designed a PCB (mainly thanks to @BNNorman) that acts as the base for sensor nodes themselves - though a number of people have built their own hardware.
  • Multiple different pieces of software run on top of the various pieces of hardware we've developed - some people use ESP Easy, and others use custom firmware they've implemented themselves.
  • Embedded devices send the data over WiFi to our MQTT broker (LoRaWAN via The Things Network is currently under development), which currently runs in a Debian Virtual Machine rented from a cloud infrastructure provider.
  • Another Debian VM hosts a database loading script, which listens for MQTT messages sent to the broker. It adds the data contained within into a database, which runs on the same box.
  • A final box hosts the web server, which simultaneously hosts the PHP-based HTTP API and the client-side web interface. Both of these are currently located in this repository, but later down the line I'd like to figure out how to decouple them into their own separate repositories.

We can represent the flow of data here in a flowchart, to get a better idea as to how it all fits together:

As you can see, there are many areas of the project that can be worked on independently of each other - depending on what people feel most comfortable working on. Personally, I stick mainly to the HTTP API and the main web interface with a hand in advising on database design, but there are lots of other ways to get involved if you so choose!

Sensors always need building, designing, and programming, and the data generated is available via the public HTTP API (the docs for which can be found here) - so anyone can write their own application on top of the data collected by our sensors. Want a light on your desk (or even your hat) that changes colour depending on your local air quality? Go ahead!

Found this interesting? Comment below!

Ensure your SSH server is secure with SSH Check

We've got ssllabs.com for testing HTTPS servers to ensure they are setup to be secure, and personally I've been using it for years now (psst, starbeamrainbowlabs.com gets an A+!).

SSH servers are a very different story, however. While I've blogged about them before, I mainly focused on preventing unauthorised access to a server by methods such as password cracking attacks.

Now that I'm coming to the end of my Msc in Security and Distributed Computing, however, I've realised there's a crucial element missing here: the security of the connection itself. HTTPS isn't the only one with complicated cipher suites that it supports that need correctly configuring.

The solution here is to check the SSH server in the same way that we do for a HTTPS web server. For this though we need a tool to do this for us and tell us what's good and what's not about our configuration - which is where SSH Check comes in.

I discovered it recently, and it pretends to connect to an SSH server to gauge it's configuration - after which it quickly disconnects before the remote server asks it for credentials to login.

A screenshot of a test of the example ssh server

Because SSH allows for every stage of the encryption process to be configured individually, SSH Check tests 4 main areas:

  • The key exchange algorithm (the algorithm used to exchange the secret key for symmetric encryption going forwards)
  • The algorithms used in the server's host SSH keys (the key whose ID is shown to you when you connect asking you if you want to continue)
  • The encryption algorithm (the symmetrical encryption algorithm used after key exchange)
  • The MAC algorithm (the Message Authentication Code algorithm - used to ensure integrity of messages)

It displays whether each algorithm is considered safe or not, and which ones are widely considered to be either deprecated or contain backdoors. In addition, it also displays the technical names of each one so that you can easily reconfigure your SSH server to disable unsafe algorithms, which is nice (good luck deciphering the SSL Labs encryption algorithms list and matching it up to the list already configured in your web server......).

It also presents a bunch of other interesting information too, which is nice. It identified a number of potential issues with the way that I had SSH setup for starbeamrainbowlabs.com along with some suggested improvements, which I've now fixed.

If you have a server that you access via SSH, I recommend checking it with SSH Check - especially if you expose SSH publicly over the Internet.

Found this interesting? Got another testing tool you'd like to share? Comment below!

Monitoring HTTP server response time with collectd and a bit of bash

In the spirit of the last few posts I've been making here (A and B), I'd like to talk a bit about collectd, which I use to monitor the status of my infrastructure. Currently this consists of the server you've connected to in order to view this webpage, and a Raspberry Pi that acts as a home file server.

I realised recently that monitoring the various services that I run (such as my personal git server for instance) would be a good idea, as I'd rather like to know when they go down or act abnormally.

As a first step towards this, I decided to configure my existing collectd setup to monitor the response time of the HTTP endpoints of these services. Later on, I can then configure some alerts to message me when something goes down.

My first thought was to check the plugin list to see if there was one that would do the trick. As you might have guessed by the title of this post, however, such an easy solution would be too uninteresting and not worthy of writing a blog post.

Since such a plugin doesn't (yet?) exist, I turned to the exec plugin instead.

In short, it lets you write a program that writes to the standard output in the collectd plain text protocol, which collectd then interprets and adds to whichever data storage backend you have configured.

Since shebangs are a thing on Linux, I could technically choose any language I have an interpreter installed for, but to keep things (relatively) simple, I chose Bash, the language your local terminal probably speaks (unless it speaks zsh or fish instead).

My priorities were to write a script that is:

  1. Easy to reconfigure
  2. Ultra lightweight

Bash supports associative arrays, so I can cover point #1 pretty easily like this:

declare -A targets=(
    ["main_website"]="https://starbeamrainbowlabs.com/"
    ["git"]="https://git.starbeamrainbowlabs.com/"
    # .....
)

Excellent! Covering point #2 will be an on-going process that I'll need to keep in mind as I write this script. I found this GitHub repository a while back, which has served as a great reference point in the past. Here's hoping it'll be useful this time too!

It's important to note the structure of the script that we're trying to write. Collectd exec scripts have 2 main environment variables we need to take notice of:

  • COLLECTD_HOSTNAME - The hostname of the local machine
  • COLLECTD_INTERVAL - Interval at which we should collect data. Defined in collectd.conf.

The script should write to the standard output the values we've collected in the collectd plain text format every COLLECTD_INTERVAL. Collectd will automatically ensure that only 1 instance of our script is running at once, and will also automatically restart it if it crashes.

To run a command regularly at a set interval, we probably want a while loop like this:

while :; do
    # Do our stuff here

    sleep "${COLLECTD_INTERVAL}";
done

This is a great start, but it isn't really compliant with objective #2 we defined above. sleep is actually a separate command that spawns a new process. That's an expensive operation, since it has to allocate memory for a new stack and create a new entry in the process table.

We can avoid this by abusing the read command timeout, like this:

# Pure-bash alternative to sleep.
# Source: https://blog.dhampir.no/content/sleeping-without-a-subprocess-in-bash-and-how-to-sleep-forever
snore() {
    local IFS;
    [[ -n "${_snore_fd:-}" ]] || exec {_snore_fd}<> <(:);
    read ${1:+-t "$1"} -u $_snore_fd || :;
}

Thanks to bolt for this.

Next, we need to iterate over the array of targets we defined above. We can do that with a for loop:

while :; do
    for target in "${!targets[@]}"; do
        check_target "${target}" "${targets[${target}]}"
    done

    snore "${COLLECTD_INTERVAL}";
done

Here we call a function check_target that will contain our main measurement logic. We've changed sleep to snore too - our new subprocess-less sleep alternative.

Note that we're calling check_target for each target one at a time. This is important for 2 reasons:

  • We don't want to potentially skew the results by taking multiple measurements at once (e.g. if we want to measure multiple PHP applications that sit in the same process poll, or measure more applications than we have CPUs)
  • It actually spawns a subprocess for each function invocation if we push them into the background with the & operator. As I've explained above, we want to try and avoid this to keep it lightweight.

Next, we need to figure out how to do the measuring. I'm going to do this with curl. First though, we need to setup the function and bring in the arguments:

# $1 - target name
# $2 - url
check_target() {
    local target_name="${1}"
    local url="${2}";

    # ......
}

Excellent. Now, let's use curl to do the measurement itself:

curl -sS --user-agent "${user_agent}" -o /dev/null --max-time 5 -w "%{http_code}\n%{time_total}" "${url}"

This looks complicated (and it probably is to some extent), but let's break it down with the help of explainshell.

Part Meaning
-sS Squashes all output except for errors and the bits we want. Great for scripts like ours.
--user-agent Specifies the user agent string to use when making a request. All good internet citizens should specify a descriptive one (more on this later).
-o /dev/null We're not interested in the content we download, so this sends it straight to the bin.
--max-time 5 This sets a timeout of 5 seconds for the whole operation - after which curl will throw an error and return with exit code 28.
-w "%{http_code}\n%{time_total}" This allows us to pull out metadata about the request we're interested in. There's actually a whole range available, but for now I'm interested in how long it took and the response code returned
"${url}" Specifies the URL to send the request to. curl does actually support making more than 1 request at once, but utilising this functionality is out-of-scope for now (and we'd get skewed results because it re-uses connections - which is normally really helpful & performance boosting)

To parse the output we get from curl, I found the readarray command after going a bit array mad at the beginning of this post. It pulls every line of input into a new slot in an array for us - and since we can control the delimiter between values with curl, it's perfect for parsing the output. Let's hook that up now:

readarray -t result < <(curl -sS --user-agent "${user_agent}" -o /dev/null --max-time 5 -w "%{http_code}\n%{time_total}" "${url}");

The weird command < <(another_command); syntax is process substitution. It's a bit like the another_command | command syntax, but a bit different. We need it here because readarray parses the values into a new array variable in the current context, and if we use the a | b syntax here, we instantly lose access to the variable it creates because a subprocess is spawned (and readarray is a bash builtin) - hence the weird process substitution.

Now that we've got the output from curl parsed and ready to go, we need to handle failures next. This is a little on the nasty side, as by default bash won't give us the non-zero exit code from substituted processes. Hence, we need to tweak our already long arcane incantation a bit more:

readarray -t result < <(curl -sS --user-agent "${user_agent}" -o /dev/null --max-time 5 -w "%{http_code}\n%{time_total}\n" "${url}"; echo "${PIPESTATUS[*]}");

Thanks to this answer on StackOverflow for ${PIPESTATUS}. Now, we have array called result with 3 elements in it:

Index Value
0 The HTTP response code
1 The time taken in seconds
2 The exit code of curl

With this information, we can now detect errors and abort continuing if we detect one. We know there was an error if any of the following occur:

  • curl returned a non-zero exit code
  • The HTTP response code isn't 2XX or 3XX

Let's implement that in bash:

if [[ "${result[2]}" -ne 0 ]] || [[ "${result[0]}" -lt "200" ]] || [[ "${result[0]}" -gt "399" ]]; then
    return
fi

Again, let's break it down:

  • [[ "${result[2]}" -ne 0 ]] - Detect a non-zero exit code from curl
  • [[ "${result[0]}" -lt "200" ]] - Detect if the HTTP response code is less than 200
  • [[ "${result[0]}" -gt "399" ]] - Detect if the HTTP response code is greater than 399

In the future, we probably want to output a notification here of some sort instead of just simply silently returning, but for now it's fine.

Finally, we can now output the result in the right format for collectd to consume. Collectd operates on identifiers, values, and intervals. A bit of head-scratching and documentation reading later, and I determined the correct identifier format for the task. I wanted to have all the readings on the same graph so I could compare the different response times (just like the ping plugin does), so we want something like this:

bobsrockets.com/http_services/response_time-TARGET_NAME`

....where we replace bobsrockets.com with ${COLLECTD_HOSTNAME}, and TARGET_NAME with the name of the target we're measuring (${target_name} from above).

We can do this like so:

echo "PUTVAL \"${COLLECTD_HOSTNAME}/http_services/response_time-${target_name}\" interval=${COLLECTD_I
NTERVAL} N:${result[1]}";

Here's an example of it in action:

PUTVAL "/http_services/response_time-git" interval=300.000 N:0.118283
PUTVAL "/http_services/response_time-main_website" interval=300.000 N:0.112073

It does seem to run through the items in the array in a rather strange order, but so long as it does iterate the whole lot, I don't really care.

I'll include the full script at the bottom of this post, so all that's left to do is to point collectd at our new script like this in /etc/collectd.conf:

LoadPlugin  exec

# .....

<Plugin exec>
    Exec    "nobody:nogroup"        "/etc/collectd/http_response_times.sh"  "measure"
</Plugin>

I've added measure as an argument there for future-proofing, as it looks like we may have to run a separate instance of the script for sending notifications if I understand the documentation correctly (I need to do some research.....).

Very cool. It's taken a few clever tricks, but we've managed to write an efficient script for measuring http response times. We've made it more efficient by exploiting read timeouts and other such things. While we won't gain a huge amount of speed from this (bash is pretty lightweight already - this script is weighing in at just ~3.64MiB of private RAM O.o), it will all add up over time - especially considering how often this will be running.

In the future, I'll definitely want to take a look at implementing some alerts to notify me if a service is down - but that will be a separate post, as this one is getting quite long :P

Found this interesting? Got another way of doing this? Curious about something? Comment below!


Full Script

#!/usr/bin/env bash
set -o pipefail;

# Variables:
#   COLLECTD_INTERVAL   Interval at which to collect data
#   COLLECTD_HOSTNAME   The hostname of the local machine

declare -A targets=(
    ["main_website"]="https://starbeamrainbowlabs.com/"
    ["webmail"]="https://mail.starbeamrainbowlabs.com/"
    ["git"]="https://git.starbeamrainbowlabs.com/"
    ["nextcloud"]="https://nextcloud.starbeamrainbowlabs.com/"
)
# These are only done once, so external commands are ok
version="0.1+$(date +%Y%m%d -r $(readlink -f "${0}"))";

user_agent="HttpResponseTimeMeasurer/${version} (Collectd Exec Plugin; $(uname -sm)) bash/${BASH_VERSION} curl/$(curl --version | head -n1 | cut -f2 -d' ')";

# echo "${user_agent}"

###############################################################################

# Pure-bash alternative to sleep.
# Source: https://blog.dhampir.no/content/sleeping-without-a-subprocess-in-bash-and-how-to-sleep-forever
snore() {
    local IFS;
    [[ -n "${_snore_fd:-}" ]] || exec {_snore_fd}<> <(:);
    read ${1:+-t "$1"} -u $_snore_fd || :;
}

# Source: https://github.com/dylanaraps/pure-bash-bible#split-a-string-on-a-delimiter
split() {
    # Usage: split "string" "delimiter"
    IFS=$'\n' read -d "" -ra arr <<< "${1//$2/$'\n'}"
    printf '%s\n' "${arr[@]}"
}

# Source: https://github.com/dylanaraps/pure-bash-bible#get-the-number-of-lines-in-a-file
# Altered to operate on the standard input.
count_lines() {
    # Usage: lines <"file"
    mapfile -tn 0 lines
    printf '%s\n' "${#lines[@]}"
}

###############################################################################

# $1 - target name
# $2 - url
check_target() {
    local target_name="${1}"
    local url="${2}";

    readarray -t result < <(curl -sS --user-agent "${user_agent}" -o /dev/null --max-time 5 -w "%{http_code}\n%{time_total}\n" "${url}"; echo "${PIPESTATUS[*]}");

    # 0 - http response code
    # 1 - time taken
    # 2 - curl exit code

    # Make sure the exit code is non-zero - this includes if curl hits a timeout error
    # Also ensure that the HTTP response code is valid - any 2xx or 3xx response code is ok
    if [[ "${result[2]}" -ne 0 ]] || [[ "${result[0]}" -lt "200" ]] || [[ "${result[0]}" -gt "399" ]]; then
        return
    fi

    echo "PUTVAL \"${COLLECTD_HOSTNAME}/http_services/response_time-${target_name}\" interval=${COLLECTD_INTERVAL} N:${result[1]}";
}

while :; do
    for target in "${!targets[@]}"; do
        # NOTE: We don't use concurrency here because that spawns additional subprocesses, which we want to try & avoid. Even though it looks slower, it's actually more efficient (and we don't potentially skew the results by measuring multiple things at once)
        check_target "${target}" "${targets[${target}]}"
    done

    snore "${COLLECTD_INTERVAL}";
done

Own your code, Part 2: The curious case of the unreliable webhook

In the last post, I talked about how to setup your own Git server with Gitea. In this one, I'm going to take bit of a different tack - and talk about one of the really annoying problems I ran into when setting up my continuous integration server, Laminar CI.

Since I wanted to run the continuous integration server on a different machine to the Gitea server itself, I needed a way for the Gitea server to talk to the CI server. The natural choice here is, of course, a Webhook-based system.

After installing and configuring Webhook on the CI server, I set to work writing a webhook receiver shell script (more on this in a future post!). Unfortunately, it turned out that that Gitea didn't like sending to my CI server very much:

A ton of failed attempts at sending a webhook to the CI server

Whether it succeeded or not was random. If I hit the "Test Delivery" button enough times, it would eventually go through. My first thought was to bring up the Gitea server logs to see if it would give any additional information. It claimed that there was an i/o timeout communicating with the CI server:

Delivery: Post https://ci.bobsrockets.com/hooks/laminar-config-check: read tcp 5.196.73.75:54504->x.y.z.w:443: i/o timeout

Interesting, but not particularly helpful. If that's the case, then I should be able to get the same error with curl on the Gitea server, right?

curl https://ci.bobsrockets.com/hooks/testhook

.....wrong. It worked flawlessly. Every time.

Not to be beaten by such an annoying issue, I moved on to my next suspicion. Since my CI server is unfortunately behind NAT, I checked the NAT rules on the router in front of it to ensure that it was being exposed correctly.

Unfortunately, I couldn't find anything wrong here either! By this point, it was starting to get really rather odd. As a sanity check, I decided to check the server logs on the CI server, since I'm running Webhook behind Nginx (as a reverse-proxy):

5.196.73.75 - - [04/Dec/2018:20:48:05 +0000] "POST /hooks/laminar-config-check HTTP/1.1" 408 0 "-" "GiteaServer"

Now that's weird. Nginx has recorded a HTTP 408 error. Looking is up reveals that it's a Request Timeout error, which has the following definition:

The server did not receive a complete request message within the time that it was prepared to wait.

Wait what? Sounds to me like there's an argument going on between the 2 servers here - in which each server is claiming that the other didn't send a complete request or response.

At this point, I blamed this on a faulty HTTP implementation in Gitea, and opened an issue.

As a workaround, I ended up configuring Laminar to use a Unix socket on disk (as opposed to an abstract socket), forwarding it over SSH, and using a git hook to interact with it instead (more on how I managed this in a future post. There's a ton of shell scripting that I need to talk about first).

This isn't the end of this tail though! A month or two after I opened the issue, I wound up in the situation whereby I wanted to connect a GitHub repository to my CI server. Since I don't have shell access on github.com, I had to use the webhook.

When I did though, I got a nasty shock: The webhook deliveries exhibited the exact same random failures as I saw with the Gitea webhook. If I'd verified the Webhook server and cleared Gitea's HTTP implementation's name, then what else could be causing the problem?

At this point, I can only begin to speculate what the issue is. Personally, I suspect that it's a bug in the port-forwarding logic of my router, whereby it drops the first packet from a new IP address while it sets up a new NAT session to forward the packets to the CI server or something - so subsequent requests will go through fine, so long as they are sent within the NAT session timeout and from the same IP. If you've got a better idea, please comment below!

Of course, I really wanted to get the GitHub repository connected to my CI server, and if the only way I could do this was with a webhook, it was time for some request-wrangling.

My solution: A PHP proxy script running on the same server as the Gitea server (since it has a PHP-enabled web server set up already). If said script eats the request and emits a 202 Accepted immediately, then it can continue trying to get a hold of the webhook on the CI server 'till the cows come home - and GitHub will never know! Genius.

PHP-FPM (the fastcgi process manager; great alongside Nginx) makes this possible with the fastcgi_finish_request() method, which both flushes the buffer and ends the request to the client, but doesn't kill the PHP script - allowing for further processing to take place without the client having to wait.

Extreme caution must be taken with this approach however, as it can easily lead to a situation where the all the PHP-FPM processes are busy waiting on replies from the CI server, leaving no room for other requests to be fulfilled and a big messy pile-up in the queue forming behind them.

Warnings aside, here's what I came up with:

<?php

$settings = [
    "target_url" => "https://ci.bobsrockets.com/hooks/laminar-git-repo",
    "response_message" => "Processing laminar job proxy request.",
    "retries" => 3,
    "attempt_timeout" => 2 // in seconds, for a single attempt
];

$headers = "host: ci.starbeamrainbowlabs.com\r\n";
foreach(getallheaders() as $key => $value) {
    if(strtolower($key) == "host") continue;
    $headers .= "$key: $value\r\n";
}
$headers .= "\r\n";

$request_content = file_get_contents("php://input");

// --------------------------------------------

http_response_code(202);
header("content-type: text/plain");
header("content-length: " . strlen($settings["response_message"]));
echo($settings["response_message"]);

fastcgi_finish_request();

// --------------------------------------------

function log_message($msg) {
    file_put_contents("ci-requests.log", $msg, FILE_APPEND);
}

for($i = 0; $i < $settings["retries"]; $i++) {
    $start = microtime(true);

    $context = stream_context_create([
        "http" => [
            "header" => $headers,
            "method" => "POST",
            "content" => $request_content,
            "timeout" => $settings["attempt_timeout"]
        ]
    ]);

    $result = file_get_contents($settings["target_url"], false, $context);

    if($result !== false) {
        log_message("[" . date("r") . "] Queued laminar job in " . (microtime(true) - $start_time)*1000 . "ms");
        break;
    }


    log_message("[" . date("r") . "] Failed to laminar job after " . (microtime(true) - $start_time)*1000 . "ms.");
}

I've named it autowrangler.php. A few things of note here:

  • php://input is a special virtual file that's mapped internally by PHP to the client's request. By eating it with file_get_contents(), we can get the entire request body that the client has sent to us, so that we can forward it on to the CI server.
  • getallheaders() lets us get a hold of all the headers sent to us by the client for later forwarding
  • I use log_message() to keep a log of the successes and failures in a log file. So far I've got a ~32% failure rate, but never more than 1 failure in a row - giving some credit to my earlier theory I talked about above.

This ends the tale of the recalcitrant and unreliable webhook. Hopefully you've found this an interesting read. In future posts, I want to look at how I configured Webhook, the inner workings of the git hook I mentioned above, and the collection of shell scripts I've cooked to that make my CI server tick in a way that makes it easy to add new projects quickly.

Found this interesting? Run into this issue yourself? Found a better solution workaround? Comment below!

Own your Code, Part 1: Git Hosting - How did we get here?

Somewhat recently, I posted about how I fixed a nasty problem with an lftp upload. I mentioned that I'd been setting up continuous deployment for an application that I've been writing.

There's actually quite a bit of a story behind how I got to that point, so I thought I'd post about it here. Starting with code hosting, I'm going to show how I setup my own private git server, followed by Laminar (which, I might add, is not for everyone. It's actually quite involved), and finally I'll take a look at continuous deployment.

The intention is to do so in a manner that enables you to do something similar for yourself too (If you have any questions along the way, comment below!).

Of course, this is far too much to stuff into a single blog post - so I'll be splitting it up into a little bit of a mini-series.

Personally, I use git for practically all the code I write, so it makes sense for me to use services such as GitLab and GitHub for hosting these in a public place so that others can find them.

This is all very well, but I do find that I've acquired a number of private projects (say, for University work) that I can't / don't want to open-source. In addition, I'd feel a lot better if I had a backup mirror of the important code repositories I host on 3rd party sites - just in case.

This is where hosting one's own git server comes into play. I've actually blogged about this before, but since then I've moved from Go Git Service to Gitea, a fork of Gogs though a (rather painful; also this) migration.

This post will be more of a commentary on how I went about it, whilst giving some direction on how to do it for yourself. Every server is very different, which makes giving concrete instructions challenging. In addition, I ended up with a seriously non-standard install procedure - which I can't recommend! I need to get around to straightening a few things out at some point.....

So without further hesitation, let's setup Gitea as our Git server! To do so, we'll need an Nginx web server setup already. If you haven't, try following this guide and then come back here.

DNS

Next, you'll need to point a new subdomain at your server that's going to be hosting your Git server. If you've already got a domain name pointed at it (e.g. with A / AAAA records), I can recommend using a CNAME record that points at this pre-existing domain name.

For example, if I have a pair of records for control.bobsrockets.com:

A       control.bobsrockets.com.    1.2.3.4
AAAA    control.bobsrockets.com.    2001::1234:5678

...I could create a symlink like this:

CNAME   git.bobsrockets.com         control.bobsrockets.com.

(Note: For the curious, this isn't actually official DNS record syntax. It's just pseudo-code I invented on-the-fly)

Installation

With that in place, the next order of business is actually installing Gitea. This is relatively simple, but a bit of a pain - because native packages (e.g. sudo apt install ....) aren't a thing yet.

Instead, you download a release binary from the releases page. Once done, we can do some setup to get all our ducks in a row. When setting it up myself, I ended up with a rather weird configuration - as I actually started with a Go Git Service instance before Gitea was a thing (and ended up going through a rather painful) - so you should follow their guide and have a 'normal' installation :P

Once done, you should have Gitea installed and the right directory structure setup.

A note here is that if you're like me and you have SSH running on a non-standard port, you've got 2 choices. Firstly, you can alter the SSH_PORT directive in the configuration file (which should be called app.ini) to match that of your SSH server.

If you decide that you want it to run it's own inbuilt SSH server on port 22 (or any port below 1024), what the guide doesn't tell you is that you need to explicitly give the gitea binary permission to listen on a privileged port. This is done like so:

setcap 'cap_net_bind_service=+ep' gitea

Note that every time you update Gitea, you'll have to re-run that command - so it's probably a good idea to store it in a shell script that you can re-execute at will.

At this point it might also be worth looking through the config file (app.ini I mentioned earlier). There's a great cheat sheet that details the settings that can be customised - some may be essential to configuring Gitea correctly for your environment and use-case.

Updates

Updates to Gitea are, of course, important. GitHub provides an Atom Feed that you can use to keep up-to-date with the latest releases.

Later on this series, we'll take a look at how we can automate the process by taking advantage of cron, Laminar CI, and fpm - amongst other tools. I haven't actually done this yet as of the time of typing and we've got a looong way to go until we get to that point - so it's a fair ways off.

Service please!

We've got Gitea installed and we've considered updates, so the natural next step is to configure it as a system service.

I've actually blogged about this process before, so if you're interested in the details, I recommend going and reading that article.

This is the service file I use:

[Unit]
Description=Gitea
After=syslog.target
After=rsyslog.service
After=network.target
#After=mysqld.service
#After=postgresql.service
#After=memcached.service
#After=redis.service

[Service]
# Modify these two values and uncomment them if you have
# repos with lots of files and get an HTTP error 500 because
# of that
###
#LimitMEMLOCK=infinity
#LimitNOFILE=65535
Type=simple
User=git
Group=git
WorkingDirectory=/srv/git/gitea
ExecStart=/srv/git/gitea/gitea web
Restart=always
Environment=USER=git HOME=/srv/git

[Install]
WantedBy=multi-user.target

I believe I took it from here when I migrated from Gogs to Gitea. Save this as /etc/systemd/system/gitea.service, and then do this:

sudo systemctl daemon-reload
sudo systemctl start gitea.service

This should start Gitea as a system service.

Wiring it up

The next step now that we've got Gitea running is to reverse-proxy it with Nginx that we set up earlier.

Create a new file at /etc/nginx/conf.d/2-git.conf, and paste in something like this (not forgetting to customise it to your own use-case):

server {
    listen  80;
    listen  [::]:80;

    server_name git.starbeamrainbowlabs.com;
    return 301 https://$host$request_uri;
}

upstream gitea {
    server  [::1]:3000;
    keepalive 4; # Keep 4 connections open as a cache
}   

server {
    listen  443 ssl http2;
    listen  [::]:443 ssl http2;

    server_name git.starbeamrainbowlabs.com;
    ssl_certificate     /etc/letsencrypt/live/git.starbeamrainbowlabs.com-0001/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/git.starbeamrainbowlabs.com-0001/privkey.pem;

    add_header strict-transport-security    "max-age=31536000;";
    add_header access-control-allow-origin  https://nextcloud.starbeamrainbowlabs.com   always;
    add_header content-security-policy      "frame-ancestors http://*.starbeamrainbowlabs.com";

    #index  index.html index.php;
    #root   /srv/www;

    location / {
        proxy_pass          http://gitea;

        #proxy_set_header   x-proxy-server      nginx;
        #proxy_set_header   host                $host;
        #proxy_set_header   x-originating-ip    $remote_addr;
        #proxy_set_header   x-forwarded-for     $remote_addr;

        proxy_hide_header   X-Frame-Options;
    }

    location ~ /.well-known {
        root    /srv/letsencrypt;
    }

    #include /etc/nginx/snippets/letsencrypt.conf;

    #location = / {
    #   proxy_pass          http://127.0.0.1:3000;
    #   proxy_set_header    x-proxy-server      nginx;
    #   proxy_set_header    host                $host;
    #   proxy_set_header    x-originating-ip    $remote_addr;
    #   proxy_set_header    x-forwarded-for     $remote_addr;
    #}

    #location = /favicon.ico {
    #   alias /srv/www/favicon.ico;
    #}
}

You may have to comment out the listen 443 blocks and put in a listen 80 temporarily whilst configuring letsencrypt.

Then, reload Nginx: sudo systemctl reload nginx

Conclusion

Phew! We've looked at installing and setting up Gitea behind Nginx, and using a systemd service to automate the management of Gitea.

I've also talked a bit about how I set my own Gitea instance up and why.

In future posts, I'm going to talk about Continuous Integration, and how I setup Laminar CI. I'll also talk about alternatives for those who want something that comes with a few more batteries included.... :P

Found this interesting? Got stuck and need help? Spotted a mistake? Comment below!

shunction: Self-hosted Azure Functions!

It's not 1st April, but if it was - this post would be the perfect thing to release! It's about shunction, a parody project I've written. While it is a parody, hopefully it is of some use to someone :P

The other week, I discovered Azure Functions - thanks to Rob Miles. In the same moment, I thought that it would be fairly trivial to build a system by which you could self-host it - and shunction was born.

Built upon the lantern build engine, shunction lets you keep a folder of executable scripts and execute them at will. It supports 4 modes of operation:

  • adhoc - One-off runs
  • cron - Regularly scheduled execution
  • inotify - Execute when something changes on disk
  • http - Listen for HTTP requests

Of course, the system can be upgraded at a later date to support additional operating modes. The last 2 in the list start a persistent and long-running process, which will only exit if terminated.

Shunction is designed to run on Linux systems. Because of the shebang, any executable script (be it a shell script, a Python script, a Node.js script, or otherwise) is supported.

If you've got multiple tasks you want have your server running and want to keep them all in one place, then shunction would allow you to do so. For example, you could keep the task scripts in a git repository that you clone down onto the server, and then use shunction to execute them as needed.

How to quickly run TUI programs via SSH

Hello, and welcome to another blog post! I hope everyone had a lovely and restful Easter.

Very often, I want to run a command on a remote machine via SSH and leave it in a terminal in 1 corner of my screen whilst I work in another terminal on that same machine.

Up until now, I've always SSHed into the machine in question and then run the command manually:

user@local:~$ ssh bob@bobsrockets.com
# .....
bob@bobsrockets.com:~$ sudo htop

This is fine, but it takes a moment to connect & setup the terminal on the remote end. What if there was a way to specify the command to run remotely?

Well, it turns out there is. SSH lets you specify the command to run on the remote server instead of the default shell:

ssh sean@seanssatellites.io apt search beanstalk

Sadly, this doesn't always yield the results expected. Colour disappears from the output, and sometimes things like htop (ssh bill@billsboosters.co.uk htop) and sudo (ssh edgar@edsengineering.eu sudo apt update) break altogether:

Error opening terminal: unknown.

I can't remember how I figured it out, but I discovered that the issue is that when you specify the command instead of letting the default shell initialise, it treats it as some sort of 'script-mode', and doesn't allocate a pseudo-terminal on the remote machine.

Thankfully, there's a way to force it to allocate a pseudo-terminal. This is done with the -t flag:

ssh -t bob@bobsrockets.com sudo htop

This then enables interactive commands to work as intended, and causes colour to be displayed again :D

Found this useful? Got another great SSH tip? Comment below!

Fixing recursive uploads with lftp: The tale of the rogue symbolic link

I've been setting up continuous deployment recently for an application I'm working on, and as part of this process I'm uploading the release with sftp, using a restricted user account that is both chrooted (though I use a subfolder of the home directory to be extra-sure) and doesn't have shell access.

Since the application is written in PHP, I use composer to manage the server-side PHP library dependencies - which works very well. The problems start when I try to upload the whole thing to the server - so I thought I'd make a quick post here on how I fixed it.

In a previous build step, I generate an archive for the release, and put it in the continuous integration (CI) archive folder.

In the deployment phase, it unpacks this compressed archive and then uploads it to the production server with lftp, because I need to do some fiddling about that I can't do with regular sftp (anyone up for a tutorial on this? I'd be happy to write a few posts on this). However, I kept getting this weird error in the CI logs:

lftp: MirrorJob.cc:242: void MirrorJob::JobFinished(Job*): Assertion `transfer_count>0' failed.
./lantern-build-engine/lantern.sh: line 173:  5325 Aborted                 $command_name $@

Very strange indeed! Apparently, lftp isn't known for outputting especially useful error messages when used in an automated script like this. I tried everything. I rewrote, refactored, and completely turned the whole thing upside-down multiple times. This, as you might have guessed, took quite a while.

Commits aside, it was only when I refactored it to do the upload via the regular sftp command like this that it became apparent what the problem was:

sftp -i "${SSH_KEY_PATH}" -P "${deploy_ssh_port}" -o PasswordAuthentication=no "${deploy_ssh_user}@${deploy_ssh_host}" << SFTPCOMMANDS
mkdir ${deploy_root_dir}/www-new
put -r ${source_upload_dir}/* ${deploy_root_dir}/www-new
bye
SFTPCOMMANDS

Thankfully, sftp outputs much more helpful error messages. I saw this in the CI logs:

.....
Entering /tmp/tmp.ssR3j7vGhC-air-quality-upload//vendor/nikic/php-parser/bin
Entering /tmp/tmp.ssR3j7vGhC-air-quality-upload//vendor/bin
php-parse: not a regular file

The last line there instantly told me what I needed to know: It was failing to upload a symbolic link.

The solution here was simple: Unwind the symbolic links into hard links instead, and then I'll still get the benefit of a link on the local disk, but sftp will treat it as a regular file and upload a duplicate.

This is done like so:

find "${temp_dir}" -type l -exec bash -c 'ln -f "$(readlink -m "$0")" "$0"' {} \;

Thanks to SuperUser for the above (though I would have expected to find it on the Unix Stack Exchange).

If you'd like to see the full deployment script I've written, you can find it here.

There's actually quite a bit of context to how I ended up encountering this problem in the first place - which includes things like CI servers, no small amount of bash scripting, git servers, and remote deployment.

In the future, I'd like to make a few posts about the exploration I've been doing in these areas - perhaps along the lines of "how did we get here?", as I think they'd make for interesting reading.....

Art by Mythdael