Users and access control in the Mosquitto MQTT server

A while ago, I blogged about how to setup an MQTT server with Mosquitto. In this one, I want to talk about how to setup multiple user accounts and how to implement access control.

In this post, I'll assume that you've already followed my previous post to which I've linked above.

User accounts

User accounts are a great security measure, as they prevent anyone without a password from accessing your MQTT server. Thankfully, they are pretty easy to do too - you just need a user / password file, and a directive in the main mosquitto.conf file to get it to read from it.

First, let's create a new users file:

sudo touch /etc/mosquitto/mosquitto_users
sudo chown mosquitto:mosquitto /etc/mosquitto/mosquitto_users
sudo chmod 0640 /etc/mosquitto/mosquitto_users

Then you can create new users like this:

sudo mosquitto_passwd /etc/mosquitto/mosquitto_users new_username_1

...replacing new_username_1 with the username of the new account you want to create. Upon executing the above, it will prompt you to enter a new password. Personally I use Keepass2 for this purpose, but you can create good passwords on the command line directly too:

dd if=/dev/urandom bs=1 count=20 | base64 | tr -d '+/='
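If you find yourself doing this a lot, the one-liner above can be wrapped up in a small function. This is only a sketch: gen_password is a name I've made up for illustration, and the default length of 24 characters is an assumption rather than anything Mosquitto requires.

```shell
# gen_password is a hypothetical helper name - it isn't part of mosquitto.
# Usage: gen_password [length]   (default: 24; intended for lengths up to 64)
gen_password() {
    local length="${1:-24}"
    # Read 64 random bytes, base64 encode them, strip the characters that
    # might need escaping (and any newlines), then trim to the requested length
    head -c 64 /dev/urandom | base64 | tr -d '+/=\n' | head -c "$length"
    # Finish with a newline so that the terminal prompt isn't messed up
    echo
}
```

Stripping the +, /, and = characters means the result only contains letters and numbers, which avoids any awkward quoting when you paste it elsewhere.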

Now that we have a users file, we can tell mosquitto about it. Add the following to your /etc/mosquitto/mosquitto.conf file:

# Require a username / password to connect
allow_anonymous false
# ....which are stored in the following file
password_file /etc/mosquitto/mosquitto_users

This disables anonymous access, and tells mosquitto where the username / password file is.

In future if you want to delete a user, do that like this:

sudo mosquitto_passwd -D /etc/mosquitto/mosquitto_users new_username_1

Access control

Access control is similar to user accounts. First, we need an access control file - which describes who can access what - and then we need a directive in the mosquitto.conf file to tell Mosquitto about it. Let's start with that access control file. Mine is located at /etc/mosquitto/mosquitto_acls.

# Directives here affect anonymous users, but we've disabled anonymous access

user bob
topic read rockets/status

There are 2 parts to the ACL file. First, the user directive sets the current user to which any following topic directives apply.

The topic directive allows the current user to read, write, or readwrite (both at the same time) a given topic. MQTT as a protocol is built on the idea of publishing (writing) to or subscribing (reading from) topics. Mosquitto assumes that a user has no access at all unless 1 or more topic directives are present to allow access.

The topic directive is comprised of 3 parts. First, the word topic is the name of the directive.

Next, any 1 of the following words declares what kind of access is being granted:

• read: Read-only access
• write: Write-only access
• readwrite: Both read and write access

Finally, the name of the topic that is being affected by the access rule is given. This may include a hash symbol (#) as a wildcard. For example, rockets/status would affect only that specific topic, but space/# would affect all topics that start with space/.

Here are some more examples:

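# Allow read access to "my_app/news"
topic read my_app/news

# Allow write access to "rockets/status"
topic write rockets/status

# Allow read and write access to everything under "another_app/"
topic readwrite another_app/#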
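To make the matching rule a bit more concrete, here's a rough sketch of it in shell. It is emphatically not how Mosquitto does it internally: topic_matches is a hypothetical function I've invented for illustration, and it only understands a trailing # wildcard as described above.

```shell
# topic_matches is a hypothetical helper for illustration only. It only
# models exact matches and a trailing "#" wildcard - nothing else.
# Usage: topic_matches <pattern> <topic>
topic_matches() {
    local pattern="$1" topic="$2" prefix
    case "$pattern" in
        */'#')
            # Wildcard rule: drop the trailing "#", then do a prefix match
            prefix="${pattern%?}"
            case "$topic" in
                "$prefix"*) return 0 ;;
                *) return 1 ;;
            esac
            ;;
        *)
            # No wildcard: the topic has to match exactly
            [ "$topic" = "$pattern" ]
            ;;
    esac
}

topic_matches "space/#" "space/rockets/status" && echo "space/# covers space/rockets/status"
topic_matches "rockets/status" "rockets/fuel" || echo "rockets/status doesn't cover rockets/fuel"
```

Note that space/# doesn't cover something like spaceship/status here, because the slash is part of the prefix that gets compared.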

Once you've created your ACL file, add this to your mosquitto.conf (being careful to put it before any listener directives if you have TLS / MQTTS support enabled):

acl_file /etc/mosquitto/mosquitto_acls

After making changes above, you'll want to tell Mosquitto to reload the configuration file. Do that like this:

sudo systemctl reload mosquitto-mqtt.service

If your systemd service file doesn't support reloading, then a restart will do. Alternatively, add this to the [Service] section of your systemd service file:

ExecReload=/bin/kill -s HUP $MAINPID

if $programname == 'gossa' then stop

After that, I configured log rotate by putting this into /etc/logrotate.d/gossa:

/var/log/gossa/*.log {
daily
missingok
rotate 14
compress
delaycompress
notifempty
create 0640 root adm
postrotate
invoke-rc.d rsyslog rotate >/dev/null
endscript
}

Very similar to the configuration I used for RhinoReminds, which I blogged about here.

Lastly, I configured Nginx on the machine I'm running this on to reverse-proxy to Gossa:

server {
# ....
location /gossa {
proxy_pass http://[::1]:5700;
}
# ....
}

I've configured authentication elsewhere in my Nginx server block to protect my installation against unauthorised access (and you probably should too). All that's left to do is start Gossa and reload Nginx:

sudo systemctl daemon-reload
sudo systemctl start gossa
# Check that Gossa is running
sudo systemctl status gossa
# Test the Nginx configuration file changes before reloading it
sudo nginx -t
sudo systemctl reload nginx

Note that reloading Nginx is more efficient than restarting it, since it doesn't kill the process - it only reloads the configuration from disk. It doesn't matter here, but in a production environment that receives a high volume of traffic it's a great way to make configuration changes while avoiding dropping client connections.

In your web browser, you should see something like the image at the top of this post.

Found this interesting? Got another quick solution to an otherwise awkward issue? Comment below!

Setting up a Mosquitto MQTT server

I recently found myself setting up a mosquitto instance (yep, for this) due to a migration we're in the middle of doing and it got quite interesting, so I thought I'd post about it here.

This post is also partly documentation of what I did and why, just in case future people come across it and wonder how it's setup, though I have tried to make it fairly self-documenting.

At first, I started by doing sudo apt install mosquitto and seeing if it would work. I can't remember if it did or not, but it certainly didn't after I played around with the configuration files. To this end, I decided that enough was enough and I turned the entire configuration upside-down.

First up, I needed to disable the existing sysV init-based service that ships with the mosquitto package:

sudo systemctl stop mosquitto # Just in case
sudo systemctl disable mosquitto

Next, I wrote a new systemd service file:

[Unit]
Description=Mosquitto MQTT Broker
After=syslog.target rsyslog.target network.target

[Service]
Type=simple
PIDFile=/var/run/mosquitto/mosquitto.pid
User=mosquitto
PermissionsStartOnly=true
ExecStartPre=-/bin/mkdir /run/mosquitto
ExecStartPre=/bin/chown -R mosquitto:mosquitto /run/mosquitto
ExecStart=/usr/sbin/mosquitto --config-file /etc/mosquitto/mosquitto.conf
ExecReload=/bin/kill -s HUP $MAINPID

StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=mosquitto

[Install]
WantedBy=multi-user.target

This is broadly similar to the service file I developed in my earlier tutorial post, but it's slightly more complicated.

For one, I use PermissionsStartOnly=true and a series of ExecStartPre directives to allow mosquitto to create a PID file in a directory in /run. /run is a special directory on Linux for PID files and other such things, but normally only root can modify it. mosquitto will be running under the mosquitto user (surprise surprise), so we need to create a subdirectory for it and chown it so that it has write permissions.

A PID file is just a regular file on disk that contains the PID (Process IDentifier) number of the primary process of a system service. System service managers such as systemd and OpenRC use this number to manage the health of the service while it's running and send it various signals (such as to ask it to reload its configuration file).
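As a rough illustration of what a service manager does with that number, here's a minimal sketch that checks whether the process named in a PID file is still alive. pid_is_running is a name I've made up, and this is a simplification of what systemd actually does.

```shell
# pid_is_running is a hypothetical helper name for illustration.
# Usage: pid_is_running <path to PID file>
pid_is_running() {
    local pid_file="$1" pid
    # No readable PID file means there's nothing to check
    [ -r "$pid_file" ] || return 1
    pid="$(cat "$pid_file")"
    # Signal 0 doesn't actually send a signal - it just checks that the
    # process exists and that we'd be allowed to signal it
    kill -0 "$pid" 2>/dev/null
}

# Asking a service to reload its configuration is then just a different signal:
#   kill -s HUP "$(cat /run/mosquitto/mosquitto.pid)"
```

The commented-out kill -s HUP line is the same thing the ExecReload directive in the service file does, just with the PID read from the file by hand.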

With this in place, I then added an rsyslog definition at /etc/rsyslog.d/mosquitto.conf to tell it where to put the log files:

if $programname == 'mosquitto' then /var/log/mosquitto/mosquitto.log
if $programname == 'mosquitto' then stop

Thinking about it, I should probably check that a log rotation definition file is also in place.

Just in case, I then chowned the pre-existing log files to ensure that rsyslog could read & write to them:

sudo chown -R syslog: /var/log/mosquitto

Then, I filled out /etc/mosquitto/mosquitto.conf with a few extra directives and restarted the service. Here's the full configuration file:

# Place your local configuration in /etc/mosquitto/conf.d/
#
# A full description of the configuration file is at
# /usr/share/doc/mosquitto/examples/mosquitto.conf.example

# NOTE: We can't use tab characters here, as mosquitto doesn't like it.

pid_file /run/mosquitto/mosquitto.pid

# Persistence configuration
persistence true
persistence_location /var/lib/mosquitto/

# Not a file today, thanks
# Log files will actually end up at /var/log/mosquitto/mosquitto.log, but will go via syslog
# See /etc/rsyslog.d/mosquitto.conf
#log_dest file /var/log/mosquitto/mosquitto.log
log_dest syslog

include_dir /etc/mosquitto/conf.d

# Documentation: https://mosquitto.org/man/mosquitto-conf-5.html

# Require a username / password to connect
allow_anonymous false
# ....which are stored in the following file
password_file /etc/mosquitto/mosquitto_users

# Make a log entry when a client connects & disconnects, to aid debugging
connection_messages true

# TLS configuration
# Disabled at the moment, since we don't yet have a letsencrypt cert
# NOTE: I don't think that the sensors currently connect over TLS. We should probably fix this.
# TODO: Point these at letsencrypt
#cafile /etc/mosquitto/certs/ca.crt
#certfile /etc/mosquitto/certs/hostname.localdomain.crt
#keyfile /etc/mosquitto/certs/hostname.localdomain.key

As you can tell, I've still got some work to do here - namely the TLS setup. It's a bit of a chicken-and-egg problem, because I need the domain name to be pointing at the MQTT server in order to get a Let's Encrypt TLS certificate, but that'll break all the sensors using the current one..... I'm sure I'll figure it out.

But wait! We forgot the user accounts. Before I started the new service, I added some user accounts for client applications to connect with:

sudo mosquitto_passwd /etc/mosquitto/mosquitto_users username1
sudo mosquitto_passwd /etc/mosquitto/mosquitto_users username2

The mosquitto_passwd program prompts for a password - that way you don't end up with the passwords in your ~/.bash_history file.

With all that taken care of, I started the systemd service:

sudo systemctl daemon-reload
sudo systemctl start mosquitto-broker.service

Of course, I ended up doing a considerable amount of debugging in between all this - I've edited it down to make it more readable and fit better in a blog post :P

Lastly, because I'm paranoid, I double-checked that it was running with htop and netstat:


sudo netstat -peanut | grep -i mosquitto
tcp        0      0 0.0.0.0:1883            0.0.0.0:*               LISTEN      112        2676558    1234/mosquitto
tcp        0      0 x.y.z.w:1883           x.y.z.w:54657       ESTABLISHED 112        2870033    1234/mosquitto
tcp        0      0 x.y.z.w:1883           x.y.z.w:39365       ESTABLISHED 112        2987984    1234/mosquitto
tcp        0      0 x.y.z.w:1883           x.y.z.w:58428       ESTABLISHED 112        2999427    1234/mosquitto
tcp6       0      0 :::1883                 :::*                    LISTEN      112        2676559    1234/mosquitto


...no idea why it wants to connect to itself, but hey! Whatever floats its boat.

Own your Code, Part 1: Git Hosting - How did we get here?

Somewhat recently, I posted about how I fixed a nasty problem with an lftp upload. I mentioned that I'd been setting up continuous deployment for an application that I've been writing.

There's actually quite a bit of a story behind how I got to that point, so I thought I'd post about it here. Starting with code hosting, I'm going to show how I setup my own private git server, followed by Laminar (which, I might add, is not for everyone. It's actually quite involved), and finally I'll take a look at continuous deployment.

The intention is to do so in a manner that enables you to do something similar for yourself too (If you have any questions along the way, comment below!).

Of course, this is far too much to stuff into a single blog post - so I'll be splitting it up into a little bit of a mini-series.

Personally, I use git for practically all the code I write, so it makes sense for me to use services such as GitLab and GitHub for hosting these in a public place so that others can find them.

This is all very well, but I do find that I've acquired a number of private projects (say, for University work) that I can't / don't want to open-source. In addition, I'd feel a lot better if I had a backup mirror of the important code repositories I host on 3rd party sites - just in case.

This is where hosting one's own git server comes into play. I've actually blogged about this before, but since then I've moved from Go Git Service to Gitea, a fork of Gogs, through a (rather painful; also this) migration.

This post will be more of a commentary on how I went about it, whilst giving some direction on how to do it for yourself. Every server is very different, which makes giving concrete instructions challenging. In addition, I ended up with a seriously non-standard install procedure - which I can't recommend! I need to get around to straightening a few things out at some point.....

So without further hesitation, let's setup Gitea as our Git server! To do so, we'll need an Nginx web server setup already. If you haven't, try following this guide and then come back here.

DNS

Next, you'll need to point a new subdomain at your server that's going to be hosting your Git server. If you've already got a domain name pointed at it (e.g. with A / AAAA records), I can recommend using a CNAME record that points at this pre-existing domain name.

For example, if I have a pair of records for control.bobsrockets.com:

A       control.bobsrockets.com.    1.2.3.4
AAAA    control.bobsrockets.com.    2001::1234:5678

...I could create a symlink like this:

CNAME   git.bobsrockets.com         control.bobsrockets.com.

(Note: For the curious, this isn't actually official DNS record syntax. It's just pseudo-code I invented on-the-fly)

Installation

With that in place, the next order of business is actually installing Gitea. This is relatively simple, but a bit of a pain - because native packages (e.g. sudo apt install ....) aren't a thing yet.

Instead, you download a release binary from the releases page. Once done, we can do some setup to get all our ducks in a row. When setting it up myself, I ended up with a rather weird configuration - as I actually started with a Go Git Service instance before Gitea was a thing (and ended up going through a rather painful migration) - so you should follow their guide and have a 'normal' installation :P

Once done, you should have Gitea installed and the right directory structure setup.

A note here is that if you're like me and you have SSH running on a non-standard port, you've got 2 choices. Firstly, you can alter the SSH_PORT directive in the configuration file (which should be called app.ini) to match that of your SSH server.

If you decide that you want it to run its own inbuilt SSH server on port 22 (or any port below 1024), what the guide doesn't tell you is that you need to explicitly give the gitea binary permission to listen on a privileged port. This is done like so:

setcap 'cap_net_bind_service=+ep' gitea

Note that every time you update Gitea, you'll have to re-run that command - so it's probably a good idea to store it in a shell script that you can re-execute at will.
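Such a script might look something like the sketch below. grant_bind_capability and the DRY_RUN switch are both names I've invented for illustration, and the example path at the bottom is an assumption - adjust it to wherever your gitea binary lives.

```shell
# grant_bind_capability is a hypothetical helper name for illustration.
# It needs to be run as root. Set DRY_RUN=1 to print the command instead
# of actually running it.
# Usage: grant_bind_capability <path to gitea binary>
grant_bind_capability() {
    local binary="$1"
    if [ ! -f "$binary" ]; then
        echo "Error: $binary doesn't exist" >&2
        return 1
    fi
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty
    ${DRY_RUN:+echo} setcap 'cap_net_bind_service=+ep' "$binary"
}

# Example (assumed install path):
#   grant_bind_capability /srv/git/gitea/gitea
```

Checking that the binary exists first means a typo in the path gives you a readable error message, rather than whatever setcap decides to complain about.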

At this point it might also be worth looking through the config file (app.ini I mentioned earlier). There's a great cheat sheet that details the settings that can be customised - some may be essential to configuring Gitea correctly for your environment and use-case.

Updates to Gitea are, of course, important. GitHub provides an Atom Feed that you can use to keep up-to-date with the latest releases.

Later on in this series, we'll take a look at how we can automate the process by taking advantage of cron, Laminar CI, and fpm - amongst other tools. I haven't actually done this yet as of the time of typing and we've got a looong way to go until we get to that point - so it's a fair ways off.

We've got Gitea installed and we've considered updates, so the natural next step is to configure it as a system service.

This is the service file I use:

[Unit]
Description=Gitea
After=syslog.target
After=rsyslog.service
After=network.target
#After=mysqld.service
#After=postgresql.service
#After=memcached.service
#After=redis.service

[Service]
# Modify these two values and uncomment them if you have
# repos with lots of files and get an HTTP error 500 because
# of that
###
#LimitMEMLOCK=infinity
#LimitNOFILE=65535
Type=simple
User=git
Group=git
WorkingDirectory=/srv/git/gitea
ExecStart=/srv/git/gitea/gitea web
Restart=always
Environment=USER=git HOME=/srv/git

[Install]
WantedBy=multi-user.target

I believe I took it from here when I migrated from Gogs to Gitea. Save this as /etc/systemd/system/gitea.service, and then do this:

sudo systemctl daemon-reload
sudo systemctl start gitea.service

This should start Gitea as a system service.

Wiring it up

The next step now that we've got Gitea running is to reverse-proxy it with Nginx that we set up earlier.

Create a new file at /etc/nginx/conf.d/2-git.conf, and paste in something like this (not forgetting to customise it to your own use-case):

server {
listen  80;
listen  [::]:80;

server_name git.starbeamrainbowlabs.com;
return 301 https://$host$request_uri;
}

upstream gitea {
server  [::1]:3000;
keepalive 4; # Keep 4 connections open as a cache
}

server {
listen  443 ssl http2;
listen  [::]:443 ssl http2;

server_name git.starbeamrainbowlabs.com;
ssl_certificate     /etc/letsencrypt/live/git.starbeamrainbowlabs.com-0001/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/git.starbeamrainbowlabs.com-0001/privkey.pem;

#index  index.html index.php;
#root   /srv/www;

location / {
proxy_pass          http://gitea;

#proxy_set_header   host                $host;
#proxy_set_header   x-originating-ip    $remote_addr;
#proxy_set_header   x-forwarded-for     $remote_addr;

proxy_hide_header   X-Frame-Options;
}

location ~ /.well-known {
root    /srv/letsencrypt;
}

#include /etc/nginx/snippets/letsencrypt.conf;

#location = / {
#   proxy_pass          http://127.0.0.1:3000;
#   proxy_set_header    x-proxy-server      nginx;
#   proxy_set_header    host                $host;
#   proxy_set_header    x-originating-ip    $remote_addr;
#   proxy_set_header    x-forwarded-for     $remote_addr;
#}

#location = /favicon.ico {
#   alias /srv/www/favicon.ico;
#}
}

You may have to comment out the listen 443 blocks and put in a listen 80 temporarily whilst configuring letsencrypt.

Then, reload Nginx: sudo systemctl reload nginx

Conclusion

Phew! We've looked at installing and setting up Gitea behind Nginx, and using a systemd service to automate the management of Gitea.

I've also talked a bit about how I set my own Gitea instance up and why.

In future posts, I'm going to talk about Continuous Integration, and how I setup Laminar CI. I'll also talk about alternatives for those who want something that comes with a few more batteries included.... :P

Found this interesting? Got stuck and need help? Spotted a mistake? Comment below!

Automatically rotating log files on Linux

I'm rather busy at the moment with University, but I thought I'd post about Linux's log rotating system, which I've discovered recently. This post is best read as a follow-up to my earlier post, creating a system service with systemd, in which I talk about how to write a systemd service file - and how to send the output of your program to syslog - which will put it in /var/log for you.

Log rotating is the practice of automatically renaming and moving log files around at regular intervals - and keeping only so many log files at once. For example, I might define the following rules:

• Rotate the log files every week
• Keep 10 log files in total
• Compress log files past the 2nd one

This would yield me a set of log files like this, for instance:

dpkg.log
dpkg.log.1
dpkg.log.2.gz
dpkg.log.3.gz
dpkg.log.4.gz
dpkg.log.5.gz
dpkg.log.6.gz
dpkg.log.7.gz
dpkg.log.8.gz
dpkg.log.9.gz
dpkg.log.10.gz

When the logs are next rotated, the last one is deleted and all the rest are renamed sequentially - like 10 in the bed.
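The renaming dance can be sketched in a few lines of shell. To be clear, this is not how logrotate itself is implemented - rotate_logs is a hypothetical function that only mimics the naming scheme, and it skips compression entirely.

```shell
# rotate_logs is a hypothetical function that mimics the renaming scheme only.
# Usage: rotate_logs <log file> <number of old logs to keep>
rotate_logs() {
    local log="$1" keep="$2" i
    # The oldest log falls out of the bed
    rm -f "$log.$keep"
    # Shuffle everything else along by 1, starting with the oldest
    i=$(( keep - 1 ))
    while [ "$i" -ge 1 ]; do
        if [ -e "$log.$i" ]; then
            mv "$log.$i" "$log.$(( i + 1 ))"
        fi
        i=$(( i - 1 ))
    done
    # The live log becomes .1, and a fresh empty one takes its place
    if [ -e "$log" ]; then
        mv "$log" "$log.1"
    fi
    : > "$log"
}
```

Working from the oldest file downwards matters here: going the other way would overwrite each file with its younger neighbour before it had been moved out of the way.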

Compressing log files is good for saving space, but in order to read them again we have to fiddle about with zcat / gzip.
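The fiddling can be kept to a minimum with gzip's -f flag, which passes uncompressed files through untouched when writing to the standard output. Here's a small sketch - read_logs is a name I've made up, and the filenames in the example are assumed to follow the scheme above.

```shell
# read_logs is a hypothetical helper name for illustration.
# Usage: read_logs <file> [<file> ...]
read_logs() {
    # -c writes to stdout, -d decompresses, and -f makes gzip copy files
    # that aren't compressed through unchanged instead of complaining
    gzip -cdf -- "$@"
}

# Example, oldest first:
#   read_logs dpkg.log.2.gz dpkg.log.1 dpkg.log | less
```

This way you don't have to care which files in the rotation cycle have been compressed yet and which haven't.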

The log rotating system on Linux is a cron job that runs at regular intervals - it doesn't run as a system service. It's configured by a series of files in /etc/logrotate.d/ - 1 for each service that has log files that want rotating automatically. Here's an example definition file:

/var/log/rhinoreminds/rhinoreminds.log {
rotate 12
weekly
missingok
notifempty
compress
delaycompress
}

Basically you specify the filename first, and then a bunch of directives to tell it what to do inside { }. The above is for RhinoReminds, an XMPP reminder bot I've written, and defines the following:

• Keep 12 log files in the rotation cycle
• Rotate the logs every week
• It's ok if the log file doesn't exist
• Don't rotate the log file if it's empty
• Compress log files on rotation if they aren't already
• .....but delay this by 1 rotation cycle

Very cool! This should produce the following:

/var/log/rhinoreminds/rhinoreminds.log
/var/log/rhinoreminds/rhinoreminds.log.1
/var/log/rhinoreminds/rhinoreminds.log.2.gz
/var/log/rhinoreminds/rhinoreminds.log.3.gz
/var/log/rhinoreminds/rhinoreminds.log.4.gz
/var/log/rhinoreminds/rhinoreminds.log.5.gz
/var/log/rhinoreminds/rhinoreminds.log.6.gz
/var/log/rhinoreminds/rhinoreminds.log.7.gz
/var/log/rhinoreminds/rhinoreminds.log.8.gz
/var/log/rhinoreminds/rhinoreminds.log.9.gz
/var/log/rhinoreminds/rhinoreminds.log.10.gz
/var/log/rhinoreminds/rhinoreminds.log.11.gz
/var/log/rhinoreminds/rhinoreminds.log.12.gz

Setup your very own VPN in 10 minutes flat

Hey! Happy new year :-)

I've been looking to setup a personal VPN for a while, and the other week I discovered a rather brilliant project called PiVPN, which greatly simplifies the process of setting one up - and managing it thereafter.

It's been working rather well so far, so I thought I'd post about it so you can set one up for yourself too. But first though, we should look at the why. Why a VPN? What does it do?

Basically, a VPN lets you punch a great big hole in the network that you're connected to and appear as if you're actually on a network elsewhere. The extent to which this is the case varies depending on the purpose (for example, a University or business might setup a VPN that allows members to access internal resources, but doesn't route all traffic through the VPN), but the general principle is the same.

It's best explained with a diagram. Imagine you're at a Café:

Everyone on the Café's WiFi can see the internet traffic you're sending out. If any of it is unencrypted, then they can additionally see the content of said traffic - e.g. emails you send, web pages you load, etc. Even if it's encrypted, statistical analysis can reveal which websites you're visiting and more.

If you don't trust a network that you're connected to, then by utilising a VPN you can create an encrypted tunnel to another location that you do trust:

Then, all that the other users of the Café's WiFi will see is an encrypted stream of packets - all heading for the same destination. All they'll know is roughly how much traffic you're sending and receiving, but not to where.

This is the primary reason that I'd like my own VPN. I trust the network I've got setup in my own house, so it stands to reason that I'd like to setup a VPN server there, and pretend that my devices when I'm out and about are still at home.

In theory, I should be able to access the resources on my home network too when I'm using such a VPN - which is an added bonus. Other reasons do exist for using a VPN, but I won't discuss them here.

In terms of VPN server software, I've done a fair amount of research into the different options available. My main criteria are as follows:

• Fairly easy to install
• Easy to understand what it's doing once installed (transparency)
• Easy to manage

The 2 main technologies I came across were OpenVPN and IPSec. Each has its own strengths & weaknesses. An IPSec VPN is, apparently, more efficient - especially since it executes on the client in kernel-space instead of user-space. It's a lighter protocol, too - leading to less overhead. It's also much more likely to be detected and blocked when travelling through strict firewalls, making me slightly unsure about it.

OpenVPN, on the other hand, executes entirely in user-space on both the client and the server - leading to a slightly greater overhead (especially with the mitigations for the recent Spectre & Meltdown hardware bugs). It does, however, use TLS (though over UDP by default). This characteristic makes it much more likely it'll slip through stricter firewalls. I'm unsure if that's a quality that I'm actually after or not.

Ultimately, it's the ease of management that points the way to my final choice. Looking into it, with both choices there's complex certificate management to be done whenever you want to add a new client to the VPN. For example, with StrongSwan (an open-source IPSec VPN program), you've got to generate a number of certificates with a chain of rather long commands - and the users themselves have passwords stored in plain text in a file!

While I've got no problem with reading and understanding such commands, I do have a problem with rememberability. If I want to add a new client, how easy is that to do? How long would I have to spend re-reading documentation to figure out how to do it?

Sure, I could write a program to manage the configuration files for me, but that would also require maintenance - and probably take much longer than I anticipate to write.

I forget where I found it, but it is for this reason that I ultimately decided to choose PiVPN. It's a set of scripts that sets up and manages an OpenVPN installation. To this end, it provides a single command - pivpn - that can be used to add, remove, and list clients and their statistics. With a concise help text, it makes it easy to figure out how to perform common tasks utilising existing terminal skills by conforming to established CLI norms.

If you want to install it yourself, then simply do this:

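curl -L https://install.pivpn.io | bash

Piping a script from the internet straight into bash without reading it isn't a great idea though, so take a look at what it does first:

curl -L https://install.pivpn.io | less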

Once you're happy that it's not going to do anything malign to your system, proceed with the installation by executing the 1st command. It should guide you through a number of screens. Some important points I ran into:

• It asks you to install and enable unattended-upgrades. You should probably do this, but I ended up skipping this - as I've already got apticron setup and sending me regular emails - as I rather like to babysit the upgrade of packages on the main machines I manage. I might look into unattended-upgrades in the future if I acquire more servers than are comfortable to manage this way.
• Make sure you fully update your system before running the installation. I use this command: sudo apt update && sudo apt-get dist-upgrade && sudo apt-get autoclean && sudo apt-get autoremove
• Changing the port of the VPN isn't a bad idea, since PiVPN will automatically assemble .ovpn configuration files for you. I didn't end up doing this to start with, but I can always change it in the NAT rule I configured on my router later.
• Don't forget to allow OpenVPN through your firewall! For ufw users (like me), it's something like sudo ufw allow <port_number>/udp.
• Don't forget to setup a NAT rule / port forwarding on your router if said server doesn't have a public IP address (if it's IPv4 it probably doesn't). If you're confused on this point, comment below and I'll blog about it. It's..... a complicated topic.

If you'd like a more in-depth guide to setting up PiVPN, then I can recommend this guide. It's a little bit dated (PiVPN now uses elliptic-curve cryptography by default), but still serves to illustrate the process pretty well.

If you're confused about some of the concepts I've presented here - leave a comment below! I'm happy to explain them in more detail. Who knows - I might end up writing another blog post on the subject....

Backing up to AWS S3 with duplicity

The server that this website runs on backs up automatically to the Simple Storage Service, provided by Amazon Web Services. Such an arrangement is actually fairly cheap - only ~20p/month! I realised recently that although I've blogged about duplicity before (where I discussed using an external hard drive), I never covered how I fully automate the process here on starbeamrainbowlabs.com.

(Above: A bunch of hard drives. The original can be found here.)

It's fairly similar in structure to the way it works backing up to an external hard drive - just with a few different components here and there, as the script that drives this is actually older than the one that backs up to an external hard drive.

To start, we'll need an AWS S3 bucket. I'm not going to cover how to do this here, as the AWS interface keeps changing, and this guide will likely become outdated quickly. Instead, the AWS S3 documentation has an official guide on how to create one. Make sure it's private, as you don't want anyone getting a hold of your backups!

With that done, you should have both an access key and a secret. Note these down in a file called .backup-password in a new directory that will hold the backup script like this:

#!/usr/bin/env bash
PASSPHRASE="INSERT_RANDOM_PASSPHRASE_HERE";
AWS_ACCESS_KEY_ID="INSERT_AWS_ACCESS_KEY_HERE";
AWS_SECRET_ACCESS_KEY="INSERT_AWS_SECRET_KEY_HERE";

The PASSPHRASE here should be a long and unintelligible string of random characters, and will encrypt your backups. Note that down somewhere safe too - preferably in your password manager or somewhere else at least as secure.

If you're on Linux, you should also set the permissions on the .backup-password file to ensure nobody gets access to it who shouldn't. Here's how I did it:

sudo chown root:root .backup-password
sudo chmod 0400 .backup-password

This ensures that only the root user is able to read the file - and nobody can write to it. With our secrets generated and safely stored, we can start writing the backup script itself. Let's start by reading in the secrets:

#!/usr/bin/env bash
source /root/.backup-password
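Since the secrets file gets sourced by a script running as root, it might be worth checking its permissions before reading it. Here's a rough sketch of one way to do that. It isn't part of the script described in this post: the secrets_file_is_private function name is made up for illustration, and it assumes GNU stat (as found on most Linux systems).

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only if the file exists and grants no
# permissions at all to group or others (e.g. mode 0400 or 0600).
# Assumes GNU stat, which supports -c '%a' (octal permission bits).
secrets_file_is_private() {
    local file="$1" mode
    [ -f "$file" ] || return 1
    mode="$(stat -c '%a' "$file")" || return 1
    # Mask off everything except the group and other permission bits
    (( (8#$mode & 8#077) == 0 ))
}

# Usage sketch: bail out rather than source a file that others could read
# secrets_file_is_private /root/.backup-password || exit 1
# source /root/.backup-password
```

With a guard like this, an accidental chmod 0644 on the secrets file would stop the backup rather than go unnoticed.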

I stored my .backup-password file in /root. Next, let's export these values. This enables the subprocesses we invoke to access these environment variables:

export PASSPHRASE;
export AWS_ACCESS_KEY_ID;
export AWS_SECRET_ACCESS_KEY;
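To see why the export step matters, here's a small standalone demonstration. DEMO_SECRET is a made-up variable name used purely for illustration - it has nothing to do with duplicity itself.

```shell
#!/usr/bin/env bash

# DEMO_SECRET is a made-up variable name, used purely for illustration
unset DEMO_SECRET
DEMO_SECRET="example-value"

# Not exported yet, so a child process can't see it
before="$(bash -c 'echo "${DEMO_SECRET:-<not set>}"')"

export DEMO_SECRET

# Now it's part of the environment that child processes inherit
after="$(bash -c 'echo "${DEMO_SECRET:-<not set>}"')"

echo "Before export: $before"
echo "After export:  $after"
```

The first child process reports the variable as not set, while the second sees example-value. duplicity is in the same position as the child process here, which is why the secrets need exporting before it's invoked.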

Now it's time to do the backup itself! Here's what I do:

duplicity \
--full-if-older-than 2M \
--exclude /proc \
--exclude /sys \
--exclude /tmp \
--exclude /dev \
--exclude /mnt \
--exclude /var/cache \
--exclude /var/tmp \
--exclude /var/backups \
--exclude /srv/www-mail/rainloop/v \
--s3-use-new-style --s3-european-buckets --s3-use-ia \
/ s3://s3-eu-west-1.amazonaws.com/INSERT_BUCKET_NAME_HERE

Compressed version:

duplicity --full-if-older-than 2M --exclude /proc --exclude /sys --exclude /tmp --exclude /dev --exclude /mnt --exclude /var/cache --exclude /var/tmp --exclude /var/backups --exclude /srv/www-mail/rainloop/v --s3-use-new-style --s3-european-buckets --s3-use-ia / s3://s3-eu-west-1.amazonaws.com/INSERT_BUCKET_NAME_HERE

This might look long and complicated, but it's mainly due to the large number of directories that I'm excluding from the backup. The key options here are --full-if-older-than 2M and --s3-use-ia, which specify I want a full backup to be done every 2 months and to use the infrequent access pricing tier to reduce costs.

The other important bit here is to replace INSERT_BUCKET_NAME_HERE with the name of the S3 bucket that you created.
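If the long list of --exclude options bothers you, one possible alternative is to build them from a bash array. This is just a sketch of that idea, not how my script does it. The excluded_dirs and exclude_args names are made up, and the command is printed rather than executed so it's safe to try out.

```shell
#!/usr/bin/env bash

# Directories to leave out of the backup (the same list as above)
excluded_dirs=(
    /proc /sys /tmp /dev /mnt
    /var/cache /var/tmp /var/backups
    /srv/www-mail/rainloop/v
)

# Turn each directory into a pair of arguments: --exclude <dir>
exclude_args=()
for dir in "${excluded_dirs[@]}"; do
    exclude_args+=(--exclude "$dir")
done

# Print the command instead of running it.
# Remove the leading 'echo' to actually perform the backup.
echo duplicity \
    --full-if-older-than 2M \
    "${exclude_args[@]}" \
    --s3-use-new-style --s3-european-buckets --s3-use-ia \
    / "s3://s3-eu-west-1.amazonaws.com/INSERT_BUCKET_NAME_HERE"
```

Adding or removing an excluded directory then only means editing the array at the top.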

Backing up is all very well, but we want to remove old backups too - in order to avoid ridiculous bills (AWS are terrible for this - there's no way that you can set a hard spending limit! O.o). That's fairly easy to do:

duplicity remove-older-than 4M \
--force \
--s3-use-new-style --s3-european-buckets --s3-use-ia \
s3://s3-eu-west-1.amazonaws.com/INSERT_BUCKET_NAME_HERE

Again, don't forget to replace INSERT_BUCKET_NAME_HERE with the name of your S3 bucket. Here, I specify I want all backups older than 4 months (the 4M bit) to be deleted.

It's worth noting that duplicity may not actually be able to remove every backup older than 4 months, as it can only delete a full backup if there are no incremental backups that depend on it. To this end, you'll need to plan for potentially storing (and being charged for) an extra backup cycle's worth of data. In my case, that's an extra 2 months' worth of data.
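As a back-of-the-envelope check, the arithmetic looks something like this. The variable names are made up for illustration, and it's only a rough upper bound based on the 2 settings used in this post.

```shell
#!/usr/bin/env bash

retention_months=4       # from 'remove-older-than 4M'
full_backup_interval=2   # from '--full-if-older-than 2M'

# A chain (1 full backup + its incrementals) can only be removed once
# every backup in it is past the retention period, so plan for 1 extra
# cycle on top of the retention period itself.
worst_case_months=$(( retention_months + full_backup_interval ))

echo "Plan for up to ${worst_case_months} months' worth of backups"
```

With my settings, that works out at up to 6 months' worth of data sitting in the bucket at any one time.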

That's the backup part of the script complete. If you want, you could finish up here and have a fully-working backup script. Personally, I want to know how much data is in my S3 bucket - so that I can get an idea as to how much I'll be charged when the bill comes through - and also so that I can see if anything's going wrong.

Unfortunately, this is a bit fiddly. Basically, we have to utilise the AWS command-line interface to recursively list the entire contents of our S3 bucket in summarising mode in order to get it to tell us what we want to know. Here's how to do that:

aws s3 ls s3://INSERT_BUCKET_NAME_HERE --recursive --human-readable --summarize

Don't forget to replace INSERT_BUCKET_NAME_HERE with your bucket's name. The output from this is somewhat verbose, so I ended up writing an awk script to process it and output something nicer. Said awk script looks like this:

/^\s*Total\s+Objects/ { parts[i++] = $3 }
/^\s*Total\s+Size/ { parts[i++] = $3; parts[i++] = $4; }
END { print("AWS S3 Bucket Status:", parts[0], "objects, totalling " parts[1], parts[2]); }

If we put all that together, it should look something like this:

aws s3 ls s3://INSERT_BUCKET_NAME_HERE --recursive --human-readable --summarize | awk '/^\s*Total\s+Objects/ { parts[i++] = $3 } /^\s*Total\s+Size/ { parts[i++] = $3; parts[i++] = $4; } END { print("AWS S3 Bucket Status:", parts[0], "objects, totalling " parts[1], parts[2]); }'

...it's a bit of a mess. Perhaps I should look at putting that awk script in a separate file :P Anyway, here's some example output:

AWS S3 Bucket Status: 602 objects, totalling 21.0 GiB

Very nice indeed. To finish off, I'd rather like to know how long it took to do all this. Thankfully, bash has a built-in variable (SECONDS) that holds the number of seconds since the current shell was started, so it's just a case of formatting this into something readable:

echo "Done in $(($SECONDS / 3600))h $((($SECONDS / 60) % 60))m $(($SECONDS % 60))s.";

...I forget which Stack Overflow answer it was that showed this off, but if you know - please comment below and I'll update this to add credit. This should output something like this:

Done in 0h 12m 51s.
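If you'd like to check the arithmetic without waiting for a backup to run, the same calculation can be wrapped in a function that takes the number of seconds as an argument. This is a sketch only - format_duration is a made-up name, not something from my script.

```shell
#!/usr/bin/env bash

# Hypothetical helper: formats a number of seconds as "Xh Ym Zs"
format_duration() {
    local total="$1"
    echo "$(( total / 3600 ))h $(( (total / 60) % 60 ))m $(( total % 60 ))s"
}

format_duration 771          # prints "0h 12m 51s"
format_duration "$SECONDS"   # time elapsed since this shell started
```

771 seconds is 12 minutes and 51 seconds, which matches the example output above.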

Awesome! We've now got a script that backs up to AWS S3, deletes old backups, and tells us both how much space on S3 is being used and how long the whole process took.
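On the subject of tidying up that awk one-liner: one option might be to move it into a shell function that reads from standard input, which also makes it possible to try it out without touching AWS at all. The sketch below is an assumption-laden example rather than what my script does. summarise_bucket is a made-up name, the sample input is invented to match the example output earlier in this post, and I've swapped \s for [ \t] because \s is a GNU awk extension.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads 'aws s3 ls ... --summarize' output on stdin
# and prints a one-line summary. Uses [ \t] rather than \s, as \s is a
# GNU awk extension that other awk implementations may not understand.
summarise_bucket() {
    awk '
        /^[ \t]*Total[ \t]+Objects/ { parts[i++] = $3 }
        /^[ \t]*Total[ \t]+Size/ { parts[i++] = $3; parts[i++] = $4 }
        END {
            print "AWS S3 Bucket Status:", parts[0], "objects, totalling " parts[1], parts[2]
        }
    '
}

# Sample input, made up to match the example output earlier in this post
sample_output='
Total Objects: 602
   Total Size: 21.0 GiB'

echo "$sample_output" | summarise_bucket
```

In the real script, the aws s3 ls command would simply be piped into summarise_bucket in place of the sample input.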

I'm including the entire script at the bottom of this post. I've changed it slightly to add a single variable for the bucket name - so there's only 1 place you need to update: the AWS_S3_BUCKET_NAME variable near the top.

(Above: A Geopattern, tiled using the GNU Image Manipulation Program)


#!/usr/bin/env bash

# Make sure duplicity exists
# (the quotes matter: without them, an empty result from which would make the test pass)
test -x "$(which duplicity)" || exit 1;

# Pull in the password
. /root/.backup-password

AWS_S3_BUCKET_NAME="INSERT_BUCKET_NAME_HERE";

# Allow duplicity to access it
export PASSPHRASE;
export AWS_ACCESS_KEY_ID;
export AWS_SECRET_ACCESS_KEY;

# Actually do the backup
# Backup strategy:
# 1 x backup per week:
#     1 x full backup per 2 months
#     incremental backups in between
# S3 Bucket URI: https://${AWS_S3_BUCKET_NAME}/
echo [ $(date +%F%r) ] Performing backup.
duplicity --full-if-older-than 2M \
    --exclude /proc \
    --exclude /sys \
    --exclude /tmp \
    --exclude /dev \
    --exclude /mnt \
    --exclude /var/cache \
    --exclude /var/tmp \
    --exclude /var/backups \
    --exclude /srv/www-mail/rainloop/v \
    --s3-use-new-style --s3-european-buckets --s3-use-ia \
    / s3://s3-eu-west-1.amazonaws.com/${AWS_S3_BUCKET_NAME}

# Remove old backups
# You have to plan for 1 extra full backup cycle when
# calculating space requirements - duplicity only
# removes a backup if it won't invalidate those further
# along the chain - the oldest backup will always be
# a full one.
echo [ $(date +%F%r) ] Backup complete. Removing old volumes.
duplicity remove-older-than 4M \
    --force \
    --encrypt-key F2A6D8B6 \
    --s3-use-new-style --s3-european-buckets --s3-use-ia \
    s3://s3-eu-west-1.amazonaws.com/${AWS_S3_BUCKET_NAME}
echo [ $(date +%F%r) ] Cleanup complete.

aws s3 ls s3://${AWS_S3_BUCKET_NAME} --recursive --human-readable --summarize | awk '/^\s*Total\s+Objects/ { parts[i++] = $3 } /^\s*Total\s+Size/ { parts[i++] = $3; parts[i++] = $4; } END { print("AWS S3 Bucket Status:", parts[0], "objects, totalling " parts[1], parts[2]); }'

echo "Done in $(($SECONDS / 3600))h $((($SECONDS / 60) % 60))m $(($SECONDS % 60))s.";

Write an XMPP bot in half an hour

Recently I've looked at using AI to extract key information from natural language, and creating a system service with systemd. The final piece of the puzzle is to write the bot itself - and that's what I'm posting about today.

Since not only do I use XMPP for instant messaging already but it's an open federated standard, I'll be building my bot on top of it for maximum flexibility. To talk over XMPP programmatically, we're going to need a library. Thankfully, I've located just such a library which appears to work well enough, called S22.Xmpp. Especially nice is the comprehensive documentation that makes development go much more smoothly.

With our library in hand, let's begin! Our first order of business is to get some scaffolding in place to parse out the environment variables we'll need to log in to an XMPP account.
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

using S22.Xmpp;
using S22.Xmpp.Client;
using S22.Xmpp.Im;

namespace XmppBotDemo {
    public static class MainClass {
        // Needed later
        private static XmppClient client;

        // Settings
        private static Jid ourJid = null;
        private static string password = null;

        public static int Main(string[] args) {
            // Read in the environment variables
            ourJid = new Jid(Environment.GetEnvironmentVariable("XMPP_JID"));
            password = Environment.GetEnvironmentVariable("XMPP_PASSWORD");

            // Ensure they are present
            if (ourJid == null || password == null) {
                Console.Error.WriteLine("XMPP Bot Demo");
                Console.Error.WriteLine("=============");
                Console.Error.WriteLine("");
                Console.Error.WriteLine("Usage:");
                Console.Error.WriteLine("    ./XmppBotDemo.exe");
                Console.Error.WriteLine("");
                Console.Error.WriteLine("Environment Variables:");
                Console.Error.WriteLine("    XMPP_JID         Required. Specifies the JID to login with.");
                Console.Error.WriteLine("    XMPP_PASSWORD    Required. Specifies the password to login with.");
                return 1;
            }

            // TODO: Connect here

            return 0;
        }
    }
}

Excellent! We're reading in & parsing 2 environment variables: XMPP_JID (the username), and XMPP_PASSWORD. It's worth noting that you can call these environment variables anything you like! I chose those names as they describe their contents well.

It's also worth mentioning that it's important to use environment variables for secrets: passing them as command-line arguments causes them to be much more visible to other users of the system!

Let's connect to the XMPP server with our newly read-in credentials:

// Create the client instance
client = new XmppClient(ourJid.Domain, ourJid.Node, password);

client.Error += errorHandler;
client.SubscriptionRequest += subscriptionRequestHandler;
client.Message += messageHandler;

client.Connect();

// Wait for a connection
while (!client.Connected)
    Thread.Sleep(100);

Console.WriteLine($"[Main] Connected as {ourJid}.");

// Wait forever.
// (Assumption: blocking the main thread like this is one way to do it)
Thread.Sleep(Timeout.Infinite);

// TODO: Automatically reconnect to the server when we get disconnected.

Cool! Here, we create a new instance of the XmppClient class, and attach 3 event handlers, which we'll look at later. We then connect to the server, wait until the connection is established, and write a message to the console. It looks like S22.Xmpp spins up a new thread, so unfortunately we can't catch any errors it throws with a traditional try-catch statement. Instead, we'll have to be really careful to catch any exceptions we throw accidentally - otherwise we'll get disconnected!

It does appear that XmppClient catches some errors though, which trigger the Error event - so we should attach an event handler to that.

/// <summary>
/// Handles any errors thrown by the XMPP client engine.
/// </summary>
private static void errorHandler(object sender, ErrorEventArgs eventArgs) {
    Console.Error.WriteLine($"Error: {eventArgs.Reason}");
    Console.Error.WriteLine(eventArgs.Exception);
}

Before a remote contact is able to talk to our bot, they will send us a subscription request - which we'll need to either accept or reject. This is also done via an event handler. It's the SubscriptionRequest one this time:

/// <summary>
/// Handles requests to talk to us.
/// </summary>
/// <remarks>
/// Only allow people to talk to us if they are on the same domain we are.
/// You probably don't want this for production, but for developmental purposes
/// it offers some measure of protection.
/// </remarks>
/// <param name="from">The JID of the remote user who wants to talk to us.</param>
/// <returns>Whether we're going to allow the requester to talk to us or not.</returns>
public static bool subscriptionRequestHandler(Jid from) {
    Console.WriteLine($"[Handler/SubscriptionRequest] {from} is requesting access, I'm saying {(from.Domain == ourJid.Domain?"yes":"no")}");
    return from.Domain == ourJid.Domain;
}

This simply allows anyone on our own domain to talk to us. For development purposes this will offer us some measure of protection, but for production you should probably implement a whitelisting or logging system here.

The other interesting thing we can do here is send a user a chat message to either welcome them to the server, or explain why we rejected their request. To do this, we need to write a pair of utility methods, as sending chat messages with S22.Xmpp is somewhat over-complicated:

#region Message Senders

/// <summary>
/// Sends a chat message to the specified JID.
/// </summary>
/// <param name="to">The JID to send the message to.</param>
/// <param name="message">The message to send.</param>
private static void sendChatMessage(Jid to, string message)
{
    //Console.WriteLine($"[Bot/Send/Chat] Sending {message} -> {to}");
    client.SendMessage(
        to, message,
        null, null, MessageType.Chat
    );
}

/// <summary>
/// Sends a chat message in direct reply to a given incoming message.
/// </summary>
/// <param name="originalMessage">Original message.</param>
/// <param name="reply">Reply.</param>
private static void sendChatReply(Message originalMessage, string reply)
{
    //Console.WriteLine($"[Bot/Send/Reply] Sending {reply} -> {originalMessage.From}");
    // Assumption: the reply goes to the original sender, on the same
    // thread as their message (mirroring sendChatMessage above)
    client.SendMessage(
        originalMessage.From, reply,
        null, originalMessage.Thread, MessageType.Chat
    );
}

#endregion


The difference between these 2 methods is that one sends a reply directly to a message that we've received (like a threaded reply), and the other simply sends a message directly to another contact.

Now that we've got all of our ducks in a row, we can write the bot itself! This is done via the Message event handler. For this demo, we'll write a bot that echoes any messages sent to it in reverse:

/// <summary>
/// Handles incoming messages.
/// </summary>
private static void messageHandler(object sender, MessageEventArgs eventArgs) {
    Console.WriteLine($"[Bot/Handler/Message] {eventArgs.Message.Body.Length} chars from {eventArgs.Jid}");
    char[] messageCharArray = eventArgs.Message.Body.ToCharArray();
    Array.Reverse(messageCharArray);
    sendChatReply(
        eventArgs.Message,
        new string(messageCharArray)
    );
}

Excellent! That's our bot complete. The full program is at the bottom of this post.

Of course, this is a starting point - not an ending point! A number of issues with this demo stand out. There isn't a whitelist, and putting the whole program in a single file doesn't sound like a good idea. The XMPP logic should probably be refactored out into a separate file, in order to keep the input settings parsing separate from the bot itself.

Other issues that probably need addressing include better error handling and more - but fixing them all here would rather complicate the example.

Edit: The code is also available in a git repository if you'd like to clone it down and play around with it :-)

Found this interesting? Got a cool use for it? Still confused? Comment below!

Complete Program

using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

using S22.Xmpp;
using S22.Xmpp.Client;
using S22.Xmpp.Im;

namespace XmppBotDemo {
    public static class MainClass {
        private static XmppClient client;

        private static Jid ourJid = null;
        private static string password = null;

        public static int Main(string[] args) {
            // Read in the environment variables
            ourJid = new Jid(Environment.GetEnvironmentVariable("XMPP_JID"));
            password = Environment.GetEnvironmentVariable("XMPP_PASSWORD");

            // Ensure they are present
            if (ourJid == null || password == null) {
                Console.Error.WriteLine("XMPP Bot Demo");
                Console.Error.WriteLine("=============");
                Console.Error.WriteLine("");
                Console.Error.WriteLine("Usage:");
                Console.Error.WriteLine("    ./XmppBotDemo.exe");
                Console.Error.WriteLine("");
                Console.Error.WriteLine("Environment Variables:");
                Console.Error.WriteLine("    XMPP_JID         Required. Specifies the JID to login with.");
                Console.Error.WriteLine("    XMPP_PASSWORD    Required. Specifies the password to login with.");
                return 1;
            }

            // Create the client instance
            client = new XmppClient(ourJid.Domain, ourJid.Node, password);

            client.Error += errorHandler;
            client.SubscriptionRequest += subscriptionRequestHandler;
            client.Message += messageHandler;

            client.Connect();

            // Wait for a connection
            while (!client.Connected)
                Thread.Sleep(100);

            Console.WriteLine($"[Main] Connected as {ourJid}.");

            // Wait forever.
            // (Assumption: blocking the main thread like this is one way to do it)
            Thread.Sleep(Timeout.Infinite);

            // TODO: Automatically reconnect to the server when we get disconnected.

            return 0;
        }

        #region Event Handlers

        /// <summary>
        /// Handles requests to talk to us.
        /// </summary>
        /// <remarks>
        /// Only allow people to talk to us if they are on the same domain we are.
        /// You probably don't want this for production, but for developmental purposes
        /// it offers some measure of protection.
        /// </remarks>
        /// <param name="from">The JID of the remote user who wants to talk to us.</param>
        /// <returns>Whether we're going to allow the requester to talk to us or not.</returns>
        public static bool subscriptionRequestHandler(Jid from) {
            Console.WriteLine($"[Handler/SubscriptionRequest] {from} is requesting access, I'm saying {(from.Domain == ourJid.Domain?"yes":"no")}");
            return from.Domain == ourJid.Domain;
        }

        /// <summary>
        /// Handles incoming messages.
        /// </summary>
        private static void messageHandler(object sender, MessageEventArgs eventArgs) {
            Console.WriteLine($"[Handler/Message] {eventArgs.Message.Body.Length} chars from {eventArgs.Jid}");
            char[] messageCharArray = eventArgs.Message.Body.ToCharArray();
            Array.Reverse(messageCharArray);
            sendChatReply(
                eventArgs.Message,
                new string(messageCharArray)
            );
        }

        /// <summary>
        /// Handles any errors thrown by the XMPP client engine.
        /// </summary>
        private static void errorHandler(object sender, ErrorEventArgs eventArgs) {
            Console.Error.WriteLine($"Error: {eventArgs.Reason}");
            Console.Error.WriteLine(eventArgs.Exception);
        }

        #endregion

        #region Message Senders

        /// <summary>
        /// Sends a chat message to the specified JID.
        /// </summary>
        /// <param name="to">The JID to send the message to.</param>
        /// <param name="message">The message to send.</param>
        private static void sendChatMessage(Jid to, string message)
        {
            //Console.WriteLine($"[Rhino/Send/Chat] Sending {message} -> {to}");
            client.SendMessage(
                to, message,
                null, null, MessageType.Chat
            );
        }
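        /// <summary>
        /// Sends a chat message in direct reply to a given incoming message.
        /// </summary>
        /// <param name="originalMessage">Original message.</param>
        /// <param name="reply">Reply.</param>
        private static void sendChatReply(Message originalMessage, string reply)
        {
            // Assumption: the reply goes to the original sender, on the same
            // thread as their message (mirroring sendChatMessage above)
            client.SendMessage(
                originalMessage.From, reply,
                null, originalMessage.Thread, MessageType.Chat
            );
        }

        #endregion
    }
}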