Starbeamrainbowlabs

Stardust
Blog


Archive


Mailing List Articles Atom Feed Comments Atom Feed Twitter Reddit Facebook

Tag Cloud

3d 3d printing account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blender blog bookmarklet booting bug hunting c sharp c++ challenge chrome os cluster code codepen coding conundrums coding conundrums evolved command line compilers compiling compression containerisation css dailyprogrammer data analysis debugging demystification distributed computing dns docker documentation downtime electronics email embedded systems encryption es6 features ethics event experiment external first impressions freeside future game github github gist gitlab graphics hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs learning library linux lora low level lua maintenance manjaro minetest network networking nibriboard node.js open source operating systems optimisation own your code pepperminty wiki performance phd photos php pixelbot portable privacy problem solving programming problems project projects prolog protocol protocols pseudo 3d python reddit redis reference release releases rendering resource review rust searching secrets security series list server software sorting source code control statistics storage svg systemquery talks technical terminal textures thoughts three thing game three.js tool tutorial tutorials twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 worldeditadditions xmpp xslt

NSD, Part 2: Dynamic DNS

Hey there! In the last post, I showed you how to setup nsd, the Name Server Daemon, an authoritative DNS server to serve records for a given domain. In this post, I'm going to talk through how to extend that configuration to support Dynamic DNS.

Normally, if you query, say, the A or AAAA records for a domain or subdomain like git.starbeamrainbowlabs.com, it will return the same IP address that you manually set in the DNS zone file, or if you use some online service then the value you manually set there. This is fine if your IP address does not change, but becomes problematic if your IP address may change unpredictably.

The solution, as you might have guessed, lies in dynamic DNS. Dynamic DNS is a fancy word for some kind of system where the host system that a DNS record points to (e.g. compute.bobsrockets.com) informs the DNS server about changes to its IP address.

This is done by making a network request from the host system to some kind of API that automatically updates the DNS server - usually over HTTP (though anything else could work too, but please make sure it's encrypted!).

You may already be familiar with using a HTTP API to inform your cloud-based registrar (e.g. Cloudflare, Gandi, etc) of IP address changes, but in this post we're going to set dynamic DNS up with the nsd server we configured in the previous post mentioned above.

The first order of business is to find some software to do this. You could also write a thing yourself (see also setting up a systemd service). There are several choices, but I went with dyndnsd (I may update this post if I ever write my own daemon for this).

Next, you need to determine what subdomain you'll use for dynamic dns. Since DNS is hierarchical, an entire subdomain is required - you can't just do dynamic DNS for, say, wiki.bobsrockets.com - since dyndnsd will manage it's own DNS zone file, all dynamic DNS hostnames will be under that subdomain - e.g. wiki.dyn.bobsrockets.com.

Configuring the server

For the server, I will be assuming that the dynamic dns daemon will be running on the same server as the nsd daemon.

For this tutorial, we'll be setting it up unencrypted. This is a security risk if you are setting it up to accept requests over the Internet rather than a local trusted network! Notes on how to fix this at the end of this post.

Since this is a Ruby-based program (which I do generally recommend avoiding since Ruby is generally an inefficient language to write a program in I've observed), first we need to install gem, the Ruby package manager:

sudo apt install ruby ruby-rubygems ruby-dev

Then, we can install the gem Ruby package manager:

sudo gem install dyndnsd

Now, we need to configure it. dyndnsd is configured using a YAML (ew) configuration file. It's probably best to show an example configuration file and explain it afterwards:

# listen address and port
host: "0.0.0.0"
port: 5354
# The internal database file. We'll create this in a moment.
db: "/var/lib/dyndnsd/db.json"
# enable debug mode?
debug: false
# all hostnames are required to be cool-name.dyn.bobsrockets.com
domain: "dyn.bobsrockets.com"
# configure the updater, here we use command_with_bind_zone, params are updater-specific
updater:
  name: "command_with_bind_zone"
  params:
    zone_file: "/etc/dyndnsd/zones/dyn.bobsrockets.com.zone"
    command: "systemctl reload nsd"
    ttl: "5m"
    dns: "bobsrockets.com."
    email_addr: "bob.bobsrockets.com"
# Users with the hostnames they are allowed to create/update
users:
  computeuser: # <--- Username
    password: "alongandrandomstring"
    hosts:
      - compute1.dyn.bobsrockets.com
  computeuser2:
    password: "anotherlongandrandomstring"
    hosts:
      - compute2.dyn.bobsrockets.com
      - compute3.dyn.bobsrockets.com

...several things to note here that I haven't already noted in comments.

  • zone_file: "/etc/nsd/zones/dyn.bobsrockets.com.zone": This is the path to the zone file dyndnsd should update.
  • dns: "bobsrockets.com.": This is the fully-qualified hostname with a dot at the end of the DNS server that will be serving the DNS records (i.e. the nsd server).
  • email_addr: "bob.bobsrockets.com": This sets the email address of the administrator of the system, but the @ at sign is replaced with a dot .. If your email address contains a dot . in the user (e.g. bob.rockets@example.com), then it won't work as expected here.

Also important here is that although when dealing with domains like this it is less confusing to always require a dot . at the end of fully qualified domain names, this is not always the case here.

Once you've written the config file,, create the directory /etc/dyndnsd and write it to /etc/dyndnsd/dyndnsd.yaml.

With the config file written, we now need to create and assign permissions to the data directory it will be using. Do that like so:

sudo useradd --no-create-home --system --home /var/lib/dyndnsd dyndnsd
sudo mkdir /var/lib/dyndnsd
sudo chown dyndnsd:dyndnsd /var/lib/dyndnsd

Also, we need to create the zone file and assign the correct permissions so that it can write to it:

sudo mkdir /etc/dyndnsd/zones
sudo chown dyndnsd:dyndnsd /etc/dyndnsd/zones
# symlink the zone file into the nsd zones directory. This way dyndns isn't allowed to write to all of /etc/nsd/zones - just the 1 zone file it is supposed to update.
sudo ln -s /etc/dyndnsd/zones/dyn.bobsrockets.com.zone /etc/nsd/zones/dyn.bobsrockets.com.zone

Now, we can write a systemd service file to run dyndnsd for us:

[Unit]
Description=dyndnsd: Dynamic DNS record updater
Documentation=https://github.com/cmur2/dyndnsd

[Service]
User=dyndnsd
Group=dyndnsd
ExecStart=/usr/local/bin/dyndnsd /etc/dyndnsd/dyndnsd.yaml
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=dyndnsd

[Install]
WantedBy=multi-user.target

Save this to /etc/systemd/system/dyndnsd.service. Then, start the daemon like so:

sudo systemctl daemon-reload
sudo systemctl enable --now dyndnsd.service

Finally, don't forget to update your firewall to allow requests through to dyndnsd. For UFW, do this:

sudo ufw allow 5354/tcp comment dyndnsd

That completes the configuration of dyndnsd on the server. Now we just need to update the nsd config file to tell it about the new zone.

nsd's config file should be at /etc/nsd/nsd.conf. Open it for editing, and add the following to the bottom:

zone:
    name: dyn.bobsrockets.com
    zonefile: dyn.bobsrockets.com.zone

...and you're done on the server!

Configuring the client(s)

For the clients, all that needs doing is configuring them to make regular requests to the dyndnsd server to keep it appraised of their IP addresses. This is done by making a HTTP request, so we can test it with curl like this:

curl http://computeuser:alongandrandomstring@bobsrockets.com:5354/nic/update?hostname=compute1.dyn.bobsrockets.com

...where computeuser is the username, alongandrandomstring is the password, and compute1.dyn.bobsrockets.com is the hostname it should update.

The server will be able to tell what the IP address is it should set for the subdomain compute1.dyn.bobsrockets.com by the IP address of the client making the request.

The simplest way of automating this is using cron. Add the following cronjob (sudo crontab -e to edit the crontab):

*/5 * * * *     curl -sS http://computeuser:alongandrandomstring@bobsrockets.com:5354/nic/update?hostname=compute1.dyn.bobsrockets.com

....and that's it! It really is that simple. Windows users will need to setup a scheduled task instead and install curl, but that's outside the scope of this post.

Conclusion

In this post, I've given a whistle-stop tour of setting up a simple dynamic dns server. This can be useful if a host as a dynamic IP address on a local network but it still needs a (sub)domain for some reason.

Note that this is not suitable for untrusted networks! For example, setting dyndnsd to accept requests over the Internet is a Bad Idea, as this simple setup is not encrypted.

If you do want to set this up over an untrusted network, you must encrypt the connection to avoid nasty DNS poisoning attacks. Assuming you already have a working reverse proxy setup on the same machine (e.g. Nginx), you'll need to add a new virtual host (a server { } block in Nginx) that reverse-proxies to your dyndnsd daemon and sets the X-Real-IP HTTP header, and then ensure port 5354 is closed on your firewall to prevent direct access.

This is beyond this scope of this post and slightly different depending on your setup, but if there's the demand I can blog about how to do this.

Sources and further reading

The NSD Authoritative DNS Server: What, why, and how

In a previous blog post, I explained how to setup unbound, a recursive resolving DNS server. I demonstrated how to setup a simple split-horizon DNS setup, and forward DNS requests to an upstream DNS server - potentially over DNS-over-TLS.

Recently, for reasons that are rather complicated, I found myself in an awkward situation which required an authoritative DNS server - and given my love of explaining complicated and rather niche concepts here on my blog, I thought this would be a fabulous opportunity to write a 2-part series :P

In this post, I'm going to outline the difference between a recursive resolver and an authoritative DNS server, and explain why you'd want one and how to set one up. I'll explain how it fits as a part of a wider system.

Go grab your snacks - you'll be learning more about DNS than you ever wanted to know....

DNS in a (small) nutshell

As I'm sure you know if you're reading this, DNS stands for the Domain Name System. It translates domain names (e.g. starbeamrainbowlabs.com.) into IP addresses (e.g. 5.196.73.75, or 2001:41d0:e:74b::1). Every network-connected system will make use of a DNS server at one point or another.

DNS functions on records. These define how a given domain name should be resolved to it's corresponding IP address (or vice verse, but that's out-of-scope of this post). While there are many different types of DNS record, here's a quick reference for the most common one's you'll encounter when reading this post.

  • A: As simple as it gets. An A record defines the corresponding IPv4 address for a domain name.
  • AAAA: Like an A record, but for IPv6.
  • CNAME: An alias, like a symlink in a filesystem [Linux] or a directory junction [Windows]
  • NS: Specifies the domain name of the authoritative DNS server that holds DNS records for this domain. See more on this below.

A tale of 2 (DNS) servers

Consider your laptop, desktop, phone, or other device you're reading this on right now. Normally (if you are using DHCP, which is a story for another time), your router (which usually acts as the DHCP server on most home networks) will tell you what DNS server(s) to use.

These servers that your device talks to is what's known as a recursive resolving DNS server. These DNS servers do not have any DNS records themselves: their entire purpose is to ask other DNS servers to resolve queries for them.

At first this seems rather counterintuitive. Why bother when you can have a server that actually hosts the DNS records themselves and just ask that every time instead?

Given the size of the Internet today, this is unfortunately not possible. If we all used the same DNS server that hosted all DNS records, it would be drowned in DNS queries that even the best Internet connection would not be abel to handle. It would also be a single point of failure - bringing the entire Internet crashing down every time maintenance was required.

To this end, a more scaleable system was developed. By having multiple DNS servers between users and the authoritative DNS servers that actually hold the real DNS records, we can ensure the system scales virtually infinitely.

The next question that probably comes to mind is where the name recursive resolvers DNS server comes from. This name comes from the way that these recursive DNS servers ask other DNS servers for the answer to a query, instead of answering based on records they hold locally (though most recursive resolving DNS servers also have a cache for performance, but this is also a tale for another time).

Some recursive resolving DNS servers - such as the one built into your home router - simply asks 1 or 2 upstream DNS servers - usually either provided by your ISP or manually set by you (I recommend 1.1.1.1/1.0.0.1), but others are truly recursive.

Take peppermint.mooncarrot.space. for example. If we had absolutely no idea where to start resolving this domain, we would first ask a DNS root server for help. Domain names are hierarchical in nature - sub.example.com. is a subdomain of example.com.. The same goes then that mooncarrot.space. is a subdomain of space., which is itself a subdomain of ., the DNS root zone. It is no accident that all the domain names in this blog post have a dot at the end of them (try entering starbeamrainbowlabs.com. into your browser, and watch as your browser auto-hides the trailing dot .).

In this way, if we know the IP address of a DNS root server (e.g. 193.0.14.129, or 2001:7fd::1), we can recurse through this hierarchical tree to discover the IP address associated with a domain name we want to resolve

First, we'd ask a root server to tell us the authoritative DNS server for the space. domain name. We do this by asking it for the NS record for the space. domain.

Once we know the address of the authoritative DNS server for space., we can ask it to give us the NS record for mooncarrot.space. for us. We may repeat this process a number of times - I'll omit the specific details of this for brevity (if anyone's interested, I can write a full deep dive post into this, how it works, and how it's kept secure - comment below) - and then we can finally ask the authoritative DNS server we've tracked down to resolve the domain name peppermint.mooncarrot.space. to an IP address for us (e.g. by asking for the associated A or AAAA record).

Authoritative DNS servers

With this in mind, we can now move on to the main purpose of this post: setting up an authoritative DNS server. As you might have guessed by now, the purpose of an authoritative DNS server is to hold records about 1 or more domain names.

While most of the time the authoritative DNS server for your domain name will be either your registrar or someone like Cloudflare, there are a number of circumstances in which it can be useful to run your own authoritative DNS server(s) and not rely on your registrar:

  • If you need more control over the DNS records served for your domain than your registrar provides
  • Serving complex DNS records for a domain name on an internal network (split-horizon DNS)
  • Setting up your own dynamic DNS system (i.e. where you dynamically update the IP address(es) that a domain name resolves to via an API call)

Other situations certainly exist, but these are 2 that come to mind at the moment (comment below if you have any other uses for authoritative DNS servers).

The specific situation I found myself was a combination of the latter 2 points here, so that's the context in which I'll be talking.

To set one up, we first need some software to do this. There are a number of DNS servers out there:

  • Bind9 [recursive; authoritative]
  • Unbound [recursive; not really authoritative; my favourite]
  • Dnsmasq [recursive]
  • systemd-resolved [recursive; it always breaks for me so I don't use it]

As mentioned Unbound is my favourite, so for this post I'll be showing you how to use it's equally cool sibling, nsd (Name Server Daemon).

The Name Server Daemon

Now that I've explained what an authoritative DNS server is and why it's important, I'll show you how to install and configure one, and then convince another recursive resolving DNS server that's under your control to ask your new authoritative DNS server instead of it's default upstream to resolve DNS queries for a given domain name.

It goes without saying that I'll be using Linux here. If you haven't already, I strongly recommend using Linux for hosting a DNS server (or any other kind of server). You'll have a bad day if you don't.

I will also be assuming that you have a level of familiarity with the Linux terminal. If you don't learn your terminal and then come back here.

nsd is available in all major distributions of Linux in the default repositories. Adjust as appropriate for your distribution:

sudo apt install nsd

nsd has 2 configuration files that are important. First is /etc/nsd/nsd.conf, which configures the nsd daemon itself. Let's do this one first. If there's an existing config file here, move it aside and then paste in something like this:

server:
    port: 5353

    server-count: 1
    username: nsd

    logfile: "/var/log/nsd.log"
    pidfile: "/run/nsd.pid"

    # The zonefile directive(s) below is prefixed by this path
    zonesdir: /etc/nsd/zones

zone:
    name: example.com
    zonefile: example.com.zone

...replace example.com with the domain name that you want the authoritative DNS server to serve DNS records for. You can also have multiple zone: blocks for different (sub)domains - even if those domain names are subdomains of others.

For example, I could have a zone: block for both example.com and dyn.example.com. This can be useful if you want to run your own dynamic DNS server, which will write out a full DNS zone file (a file that contains DNS records) without regard to any other DNS records that might have been in that DNS zone.

Replace also 5353 with the port you want nsd to listen on. In my case I have my authoritative DNS server running on the same box as the regular recursive resolver, so I've had to move the authoritative DNS server aside to a different port as dnsmasq (the recursive DNS server I have running on this particular box) has already taken port 53.

Next up, create the directory /etc/nsd/zones, and then open up example.com.zone for editing inside that new directory. In here, we will put the actual DNS records we want nsd to serve.

The format of this file is governed by RFC1035 section 5 and RFC1034 section 3.6.1, but the nsd docs provide a simpler example. See also the wikipedia page on DNS zone files.

Here's an example:

; example.com.
$TTL 300
example.com. IN     SOA    a.root-servers.net. admin.example.com. (
                2022090501  ; Serial
                3H          ; refresh after 3 hours
                1H          ; retry after 1 hour
                1W          ; expire after 1 week
                1D)         ; minimum TTL of 1 day

; Name Server
IN  NS  dns.example.com.

@                   IN A        5.196.73.75
example.com.        IN AAAA     2001:41d0:e:74b::1
www                 IN CNAME    @
ci                  IN CNAME    @

Some notes about the format to help you understand it:

  • Make sure ALL your fully-qualified domain names have the trailing dot at the end otherwise you'll have a bad day.
  • $TTL 300 specifies the default TTL (Time To Live, or the time DNS records can be cached for) in seconds for all subsequent DNS records.
  • Replace example.com. with your domain name.
  • admin.example.com. should be the email address of the person responsible for the DNS zone file, with the @ replaced with a dot instead.
  • dns.example.com. in the NS record must be set to the domain name of the authoritative DNS server serving the zone file.
  • @ IN A 5.196.73.75 is the format for defining an A record (see the introduction to this blog post) for example.com. - @ is automatically replaced with the domain name in question - in this case example.com.
  • When declaring a record, if you don't add the trailing dot then it is assumed you're referring to a subdomain of the domain this DNS zone file is for - e.g. if you put www it assumes you mean www.example.com.

Once you're done, all that's left for configuring nsd is to start it up for the first time (and on boot). Do that like so:

sudo systemctl restart nsd
sudo systemctl enable nsd

Now, you should be able to query it to test it. I like to use dig for this:

dig -p 5353 +short @dns.example.com example.com

...this should return a result based on the DNS zone file you defined above. Replace 5353 with the port number your authoritative DNS server is running on, or omit -p 5353 altogether if it's running on port 53.

Try it out by updating your DNS zone file and reloading nsd: sudo systemctl reload nsd

Congratulations! You now have an authoritative DNS server under your control! This does not mean that it will be queried by any other DNS servers on your network though - read on.....

Integration with the rest of your network

The final part of this post will cover integrating an authoritative DNS server with another DNS server on your network - usually a recursive one. How you do this will vary depending on the target DNS server you want to convince to talk to your authoritative DNS server.

For Unbound:

I've actually covered this in a previous blog post. Simply update /etc/unbound/unbound.conf with a new block like this:

forward-zone:
    name: "example.com."
    forward-addr: 127.0.0.1@5353

...where example.com. is the domain name to forward for (WITH THE TRAILING DOT; and all subdomains thereof), 127.0.0.1 is the IP address of the authoritative DNS server, and 5353 is the port number of the authoritative DNS server.

Then, restart Unbound like so:

sudo systemctl restart unbound

For dnsmasq:

Dnsmasq's main config file is located at /etc/dnsmasq.conf, but there may be other config files located in /etc/dnsmasq.d/ that might interfere. Either way, update dnsmasq's config file with this directive:

server=/example.com./127.0.0.1#5353

...where example.com. is the domain name to forward for (WITH THE TRAILING DOT; and all subdomains thereof), 127.0.0.1 is the IP address of the authoritative DNS server, and 5353 is the port number of the authoritative DNS server.

If there's another server=/example.com./... directive elsewhere in your dnsmasq config, it may override your new definition.

Then, restart dnsmasq like so:

sudo systemctl restart dnsmasq

If there's another DNS server that I haven't included here that you use, please leave a comment on how to reconfigure it to forward a specific domain name to a different DNS server.

Conclusion

In this post, I've talked about the difference between an authoritative DNS server and a recursive resolving DNS server. I've shown why authoritative DNS servers are useful, and alluded to reasons why running your own authoritative DNS server can be beneficial.

In the second post in this 2-part miniseries, I'm going to go into detail on dynamic DNS, why it's useful, and how to set up a dynamic dns server.

As always, this blog post is a starting point - not an ending point. DNS is a surprisingly deep subject: from DNS root hint files to mDNS (multicast DNS) to the various different DNS record types, there are many interesting and useful things to learn about it.

After all, it's always DNS..... especially when you don't think it is.

Sources and further reading

Cluster Series List

Hey there - I hope you've had a happy Christmas and a great new year! I'm writing this before both, so I'm still looking forward to it :D

It's been a while since I did a series list here. This time, it's for my Cluster series of blog posts, which has reached 11 parts so far! Despite publishing this series list, this series is not over! I intend to post more parts to this series in the future. I'm just posting this as a convenient place to point people interested in my cluster setup.

For those not in the know, I have a cluster of Raspberry Pis which provide my primary source of compute power for continually running services on my home network - and this series documents how I've got it setup. I also have a NAS which provides redundant (and backed up, of course) storage (see here for the setup, and here for the backups), and finally a GPU server which when I look back at the archives I apparently have yet to blog about - oops!

Here's the full series list - I'll update this list as I post more parts to this series.

If I forget to update this post, please do get in touch by leaving a comment below. You can also find all the posts related to my cluster by looking at the cluster tag here on my blog.

lnav basics tutorial

Last year, I blogged about lnav. lnav is a fantastic tool for analysing log files, and after getting a question from CrimsonTome I thought I'd write up a longer-form tutorial on the basics of using it, as I personally find it exceedingly useful.

A screenshot of lnav at work

I'll be using an Ubuntu Server 20.04 instance for this tutorial, but anything Linuxy will work just fine. As mentioned in my previous post, it's available in the default repositories for your distribution. For apt-based systems, install like so:

sudo apt install lnav

Adjust for your own package manager. For example, pacman-based distributions should do this:

sudo pacman -S lnav

lnav operates on 1 or more input files. It's common to use logrotate to rotate log files, so this is what I'd recommend to analyse all your logs of a particular type in 1 go (here I analyse generic syslog logs):

lnav /var/log/syslog*

On your system you may need to sudo that. Once you've got lnav started, you may need to wait a moment for it to parse all the log files - especially if you have multi-million line logfiles.

After it's finished loading, we can get to analysing the logs at hand. The most recent logs appear at the bottom, and you'll notice that lnav will have coloured various parts of each log message - the reason for this will become apparently later on. lnav should also livestream log lines from disk too.

Use the arrow keys or scroll up / down to navigate log messages.

lnav operates via a command pallette system, which if you use GitHub's [Atom IDE] (https://atom.io/) or Sublime Text (which is apparently where the feature originated) may already be familiar to you. In lnav's case, it's also crossed with a simple shell. Let's start with the most important command: :filter-out.

To execute a command, simply start typing. Commands in lnav are prefixed with a colon :. :filter-out takes a regular expression as it's only argument and filters all log lines which match the given regular expression out and hides them. Sticking with our earlier syslog theme, here's an example:

:filter-out kernel:

You'll notice that once you've finished typing :filter-out, lnav will show you some help in a pane at the bottom of the screen showing you how to use that command.

:filter-out has a twin that's also useful to remember: :filter-in. Unlike :filter-out, :filter-in does the opposite - anything that doesn't match the specified pattern is hidden from view. Very useful if you know what kind of log messages you're looking for, and they are a (potentially very small) subset of a much larger and more unstructured log file.

:filter-in dovecot:

To delete all existing filters and reset the view, hit Ctrl + R.

lnav has many other built-in commands. Check out the full reference here: https://docs.lnav.org/en/latest/commands.html.

The other feature that lnav comes with is also the most powerful: SQLite3 support. By parsing common log file formats (advanced users can extend lnav by defining their own custom formats, but the specifics of how to do this are best left to the lnav documentation), it can enable you to query your log files by writing arbitrary SQLite queries!

To understand how to query a file, first hit the p key. This will show you how lnav has parsed the log line at the top of the screen (scroll as normal to look at different lines, and hit p again to hide). Here's an example:

Using this information, we can then make an SQL query against the data. Press semicolon ; to open the SQL query prompt, and then enter something like this:

SELECT * FROM syslog_log WHERE log_procname == "gitea";

....hit the enter key when you're done composing your query, and the results should then appear! You can scroll through them just like you do with the regular log viewer - you just can't use :filter-in and :filter-out until you leave the query results window with the q key (this would be a really useful feature though!).

If you're running lnav on your Nginx logs (located in /var/log/nginx/ by default), then I find this query to be of particular use:

SELECT COUNT(cs_referer) AS count, cs_referer FROM access_log GROUP BY cs_referer ORDER BY COUNT(cs_referer) DESC

That concludes this basic tutorial on lnav. There are many more features that lnav offers:

  • :filter-expr for filtering the main view by SQL query
  • Analysing files on remote hosts over SSH
  • Search logs for a given string (press / and start typing)
  • Too many others to list here

Check out the full documentation here: https://docs.lnav.org/

Tips for training (large numbers of) AI models

As part of my PhD, I'm training AI models. The specifics as to what for don't particularly matter for this post (though if you're curious I recommend my PhD update blog post series). Over the last year or so, I've found myself training a lot of AI models, and dealing with a lot of data. In this post, I'm going to talk about some of the things I've found helpful and some of the things things I've found that are best avoided. Note that this is just a snapshot of my current practices now - this will probably gradually change over time.

I've been working with Tensorflow.js and Tensorflow for Python on various Linux systems. If you're on another OS or not working with AI then what I say here should still be somewhat relevant.

Datasets

First up: a quick word on datasets. While this post is mainly about AI models, datasets are important too. Keeping them organised is vitally important. Keeping all the metadata that associated with them is also vitally important. Keeping a good directory hierarchy is the best way to achieve this.

I also recommend sticking with a standard format that's easy to parse using your preferred language - and preferably lots of other languages too. Json Lines is my personal favourite format for data - potentially compressed with Gzip if the filesize of is very large.

AI Models

There are multiple facets to the problem of wrangling AI models:

  1. Code that implements the model itself and supporting code
  2. Checkpoints from the training process
  3. Analysis results from analysing such models

All of these are important for different reasons - and are also affected by where it is that you're going to be training your model.

By far the most important thing I recommend doing is using Git with a remote such as GitHub and committing regularly. I can't stress enough how critical this is - it's the best way to both keep a detailed history of the code you've written and keep a backup at the same time. It also makes working on multiple computers easy. Getting into the habit of using Git for any project (doesn't matter what it is) will make your life a lot easier. At the beginning of a programming session, pull down your changes. Then, as you work, commit your changes and describe them properly. Finally, push your changes to the remote after committing to keep them backed up.

Coming in at a close second is implementing is a command line interface with the ability to change the behaviour of your model. This includes:

  • Setting input datasets
  • Specifying output directories
  • Model hyperparameters (e.g. input size, number of layers, number of units per layer, etc)

This is invaluable for running many different variants of your model quickly to compare results. It is also very useful when training your model in headless environments, such as on High Performance Computers (HPCs) such as Viper that my University has.

For HPCs that use Slurm, a great tip here is that when you call sbatch on your job file (e.g. sbatch path/to/jobfile.job), it will preserve your environment. This lets you pass in job-specific parameters by writing a script like this:

#!/usr/bin/env bash
#SBATCH -J TwImgCCT
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --gres=gpu:1
#SBATCH -o %j.%N.%a.out
#SBATCH -e %j.%N.%a.err
#SBATCH -p gpu05,gpu
#SBATCH --time=5-00:00:00
#SBATCH --mem=25600
# 25600 = 25GiB memory required

# Viper use Trinity ClusterVision: https://clustervision.com/trinityx-cluster-management/ and https://github.com/clustervision/trinityX
module load utilities/multi
module load readline/7.0
module load gcc/10.2.0
module load cuda/11.5.0

module load python/anaconda/4.6/miniconda/3.7

echo ">>> Installing requirements";
conda run -n py38 pip install -r requirements.txt;
echo ">>> Training model";
/usr/bin/env time --verbose conda run -n py38 src/my_model.py ${PARAMS}
echo ">>> exited with code $?";

....which you can call like so:

PARAMS="--size 4 --example 'something else' --input path/to/file --output outputs/20211002-resnet" sbatch path/to/jobfile.job

You may end up finding you have rather a lot of code behind your model - especially for data preprocessing depending on your dataset. To handle this, I go by 2 rules of thumb:

  1. If a source file of any language is more than 300 lines long, it should be split into multiple files
  2. If a collection of files do a thing together rather nicely, they belong in a separate Git repository.

To elaborate on these, having source code files become very long makes them difficult to maintain, understand, and re-use in future projects. Splitting them up makes your life much easier.

Going further, modularising your code is also an amazing paradigm to work with. I've broken many parts of my various codebases I've implemented for my PhD out as open-source projects on npm (the Node Package Manager) - most notably applause-cli, terrain50, terrain50-cli, nimrod-data-downloader, and twitter-academic-downloader.

By making them open-source, I'm not only making my research and methods more transparent and easier for others to independently verify, but I'm also allowing others to benefit from them (and potentially improve them) too! As they say, there's no need to re-invent the wheel.

Eventually, I will be making the AI models I'm implementing for my PhD open-source too - but this will take some time as I want to ensure that the models actually work before doing so (I've got 1 model I implemented fully and documented too, but in the end it has a critical bug that means the whole thing is useless.....).

Saving checkpoints from the training process of your model is also essential. I recommend doing so at the end of each epoch. As part of this, it's also useful to have a standard format for your output artefacts from the training process. Ideally, these artefacts can be used to identify precisely what dataset and hyperparameters that model and checkpoints were trained with.

At the moment, my models output something like this:

+ output_dir/
    + summary.txt       Summary of the layers of the model and their output shapes
    + metrics.tsv       TSV file containing training/validation loss/accuracy and epoch numbers
    + settings.toml     The TOML settings that the model was trained with
    + checkpoints/      Directory containing the checkpoints - 1 per epoch
        + checkpoint_e1_val_acc0.699.hdf5   Example checkpoint filename [Tensorflow for Python]
        + 0/            OR, if using Tensorflow.js instead of Tensorflow for Python, 1 directory per checkpoint
    + this_run.log      Logfile for this run [depends on where the program is being executed]

settings.toml leads me on to settings files. Personally I use TOML for mine, and I use 2 files:

  • settings.default.toml - Contains all the default values of the settings, and is located alongside the code for my model
  • example.toml - Custom settings that override values in the default settings file can be specified using my standard --config CLI argument.

Having a config file is handy when you have multiple dataset input files that rarely change. Generally speaking you want to ensure that you minimise the number of CLI arguments that you have to specify when running your model, as then it reduces cognitive load when you're training many variants of a model at once (I've found that wrangling dozens of different dataset files and model variants is hard enough to focus on and keep organised :P).

Analysis results are the final aspect here that it's important to keep organised - and the area in which I have the least experience. I've found it's important to keep track of which model checkpoint it was that the analysis was done with and which dataset said model was trained on. Keeping the entire chain of dataflow clear and easy to follow is difficult because the analysis one does is usually ad-hoc, and often has to be repeated many times on different model variants.

For this, so far I generate statistics and some graphs on the command line. If you're not already familiar with the terminal / command line of your machine, I can recommend checking out my earlier post Learn Your Terminal, which has a bunch of links to tutorials for this. In addition, jq is an amazing tool for manipulating JSON data. It's not installed by default on most systems, but it's available in most default repositories and well worth the install.

For some graphs, I use Gnuplot. Usually though this is only for more complex plots, as it takes a moment to write a .plt file to generate the graph I want in it.

I'm still looking for a good tool that makes it easy to generate basic graphs from the command line, so please get in touch if you've found one.

I'm also considering integrating some of the basic analysis into my model training program itself, such that it generates e.g. confusion matrices automatically as part of the training process. matplotlib seems to do the job here for plotting graphs in Python, but I have yet to find an equivalent library for Javascript. Again, if you've found one please get in touch by leaving a comment below.

Conclusion

In this post, I've talked about some of the things I've found helpful so far while I've been training models. From using Git to output artefacts to implementing command line interfaces and wrangling datasets, implementing the core AI model itself is actually only a very small part of an AI project.

Hopefully this post has given you some insight into the process of developing an AI model / AI-powered system. While I've been doing some of these things since before I started my PhD (like Git), others have taken me a while to figure out - so I've noted them down here so that you don't have to spend ages figuring out the same things!

If you've got some good tips you'd like to share on developing AI models (or if you've found the tips here in this blog post helpful!), please do share them below.

Securing your port-forwarded reverse proxy

Recently, I answered a question on Reddit about reverse proxies, and said answer was long enough and interesting enough to be tidied up and posted here.

The question itself is concerning port forwarded reverse proxies and internal services:

Hey everyone, I've been scratching my head over this for a while.

If I have internal services which I've mapped a subdomain like dashboard.domain.com through NGINX but haven't enabled the CNAME on my DNS which would map my dashboard.domain.com to my DDNS.

To me this seems like an external person can't access my service because dashboard.domain.com wouldn't resolve to an IP address but I'm just trying to make sure that this is the case.

For my internal access I have a local DNS that maps my dashboard.domain.com to my NGINX.

Is this right?

--u/Jhonquil

So to answer this question, let's first consider an example network architecture:

So we have a router sitting between the Internet and a server running Nginx.

Let's say you've port forwarded to your Nginx instance on 80 & 443, and Nginx serves 2 domains: wiki.bobsrockets.com and dashboard.bobsrockets.com. wiki.bobsrockets.com might resolve both internally and externally for example, while dashboard.bobsrockets.com may only resolve internally.

In this scenario, you might think that dashboard.bobsrockets.com is safe from people accessing it outside, because you can't enter dashboard.bobsrockets.com into a web browser from outside to access it.

Unfortunately, that's not true. Suppose an attacker catches wind that you have an internal service called dashboard.bobsrockets.com running (e.g. through crt.sh, which makes certificate transparency logs searchable). With this information, they could for example modify the Host header of a HTTP request like this with curl:

curl --header "Host: dashboard.bobsrockets.com" http://wiki.bobsrockets.com/

....which would cause Nginx to return dashboard.bobsrockets.com to the external attacker! The same can also be done with HTTPS with a bit more work.

That's no good. To rectify this, we have 2 options. The first is to run 2 separate reverse proxies, with all the internal-only content on the first and the externally-viewable stuff on the second. Most routers that offer the ability to port forward also offer the ability to do transparent port translation too, so you could run your external reverse proxy on ports 81 and 444 for example.

This can get difficult to manage though, so I recommend the following:

  1. Force redirect to HTTPS
  2. Then, use HTTP Basic Authentication like so:
server {
    # ....
    satisfy any;
    allow   192.168.0.0/24; # Your internal network IP address block
    allow   10.31.0.0/16; # Multiple blocks are allowed
    deny    all;
    auth_basic              "Example";
    auth_basic_user_file    /etc/nginx/.passwds;

    # ....
}

This allows connections from your local network through no problem, but requires a username / password for access from outside.

For your internal services, note that you can get a TLS certificate for HTTPS for services that run inside by using Let's Encrypt's DNS-01 challenge. No outside access is required for your internal services, as the DNS challenge is completed by automatically setting (and then removing again afterwards) a DNS record, which proves that you have ownership of the domain in question.

Just because a service is running on your internal network doesn't mean to say that running HTTPS isn't a good idea - defence in depth is absolutely a good idea.

NAS Backups, Part 2: Btrfs send / receive

Hey there! In the first post of this series, I talked about my plan for a backup NAS to complement my main NAS. In this part, I'm going to show the pair of scripts I've developed to take care of backing up btrfs snapshots.

The first script is called snapshot-send.sh, and it:

  1. Calculates which snapshot it is that requires sending
  2. Uses SSH to remote into the backup NAS
  3. Pipes the output of btrfs send to snapshot-receive.sh on the backup NAS that is called with sudo

Note there that while sudo is used for calling snapshot-receive.sh, the account it uses to SSH into the backup NAS, it doesn't have completely unrestricted sudo access. Instead, a sudo rule is used to restrict it to allow only specific commands to be called (without a password, as this is intended to be a completely automated and unattended system).

The second script is called snapshot-receive.sh, and it receives the output of btrfs send and pipes it to btrfs receive. It also has some extra logic to delete old snapshots and stuff like that.

Both of these are designed to be command line programs in their own right with a simple CLI, and useful error / help messages to assist in understanding it when I come back to it to fix an issue or extend it after many months.

snapshot-send.sh

As described above, snapshot-send.sh sends btrfs snapshot to a remote host via SSH and the snapshot-receive.sh script.

Before we continue and look at it in detail, it is important to note that snapshot-send.sh depends on btrfs-snapshot-rotation. If you haven't already done so, you should set that up first before setting up my scripts here.

If you have btrfs-snapshot-rotation setup correctly, you should have something like this in your crontab:

# Btrfs automatic snapshots
0 * * * *       cronic /root/btrfs-snapshot-rotation/btrfs-snapshot /mnt/some_btrfs_filesystem/main /mnt/some_btrfs_filesystem/main/.snapshots hourly 8
0 2 * * *       cronic /root/btrfs-snapshot-rotation/btrfs-snapshot /mnt/some_btrfs_filesystem/main /mnt/some_btrfs_filesystem/main/.snapshots daily 4
0 2 * * 7       cronic /root/btrfs-snapshot-rotation/btrfs-snapshot /mnt/some_btrfs_filesystem/main /mnt/some_btrfs_filesystem/main/.snapshots weekly 4

I use cronic there to reduce unnecessary emails. I also have a subvolume there for the snapshots:

sudo btrfs subvolume create /mnt/some_btrfs_filesystem/main/.snapshots

Because Btrfs does not take take a snapshot of any child subvolumes when it takes a snapshot, I can use this to keep all my snapshots organised and associated with the subvolume they are snapshots of.

If done right, ls /mnt/some_btrfs_filesystem/main/.snapshots should result in something like this:

2021-07-25T02:00:01+00:00-@weekly  2021-08-17T07:00:01+00:00-@hourly
2021-08-01T02:00:01+00:00-@weekly  2021-08-17T08:00:01+00:00-@hourly
2021-08-08T02:00:01+00:00-@weekly  2021-08-17T09:00:01+00:00-@hourly
2021-08-14T02:00:01+00:00-@daily   2021-08-17T10:00:01+00:00-@hourly
2021-08-15T02:00:01+00:00-@daily   2021-08-17T11:00:01+00:00-@hourly
2021-08-15T02:00:01+00:00-@weekly  2021-08-17T12:00:01+00:00-@hourly
2021-08-16T02:00:01+00:00-@daily   2021-08-17T13:00:01+00:00-@hourly
2021-08-17T02:00:01+00:00-@daily   last_sent_@daily.txt
2021-08-17T06:00:01+00:00-@hourly

Ignore the last_sent_@daily.txt there for now - it's created by snapshot-send.sh so that it can remember the name of the snapshot it last sent. We'll talk about it later.

With that out of the way, let's start going through snapshot-send.sh! First up is the CLI and associated error handling:

#!/usr/bin/env bash
set -e;

dir_source="${1}";
tag_source="${2}";
tag_dest="${3}";
loc_ssh_key="${4}";
remote_host="${5}";

if [[ -z "${remote_host}" ]]; then
    echo "This script sends btrfs snapshots to a remote host via SSH.
The script snapshot-receive must be present on the remote host in the PATH for this to work.
It pairs well with btrfs-snapshot-rotation: https://github.com/mmehnert/btrfs-snapshot-rotation
Usage:
    snapshot-send.sh <snapshot_dir> <source_tag_name> <dest_tag_name> <ssh_key> <user@example.com>

Where:
    <snapshot_dir> is the path to the directory containing the snapshots
    <source_tag_name> is the tag name to look for (see btrfs-snapshot-rotation).
    <dest_tag_name> is the tag name to use when sending to the remote. This must be unique across all snapshot rotations sent.
    <ssh_key> is the path to the ssh private key
    <user@example.com> is the user@host to connect to via SSH" >&2;
    exit 0;
fi

# $EUID = effective uid
if [[ "${EUID}" -ne 0 ]]; then
    echo "Error: This script must be run as root (currently running as effective uid ${EUID})" >&2;
    exit 5;
fi

if [[ ! -e "${loc_ssh_key}" ]]; then
    echo "Error: When looking for the ssh key, no file was found at '${loc_ssh_key}' (have you checked the spelling and file permissions?)." >&2;
    exit 1;
fi
if [[ ! -d "${dir_source}" ]]; then
    echo "Error: No source directory located at '${dir_source}' (have you checked the spelling and permissions?)" >&2;
    exit 2;
fi

###############################################################################

Pretty simple stuff. snapshot-send.sh is called like so:

snapshot-send.sh /absolute/path/to/snapshot_dir SOURCE_TAG DEST_TAG_NAME path/to/ssh_key user@example.com

A few things to unpack here.

  • /absolute/path/to/snapshot_dir is the path to the directory (i.e. btrfs subvolume) containing the snapshots we want to read, as described above.
  • SOURCE_TAG: Given the directory (subvolume) name of a snapshot (e.g. 2021-08-17T02:00:01+00:00-@daily), then the source tag is the bit at the end after the at sign @ - e.g. daily.
  • DEST_TAG_NAME: The tag name to give the snapshot on the backup NAS. Useful, because you might have multiple subvolumes you snapshot with btrfs-snapshot-rotation and they all might have snapshots with the daily tag.
  • path/to/ssh_key: The path to the (unencrypted!) SSH key to use to SSH into the remote backup NAS.
  • user@example.com: The user and hostname of the backup NAS to SSH into.

This is a good time to sort out the remote user we're going to SSH into (we'll sort out snapshot-receive.sh and the sudo rules in the next section below).

Assuming that you already have a Btrfs filesystem setup and automounting on boot on the remote NAS, do this:


sudo useradd --system --home /absolute/path/to/btrfs-filesystem/backups backups
sudo groupadd backup-senders
sudo usermod -a -G backup-senders backups
cd /absolute/path/to/btrfs-filesystem/backups
sudo mkdir .ssh
sudo touch .ssh/authorized_keys
sudo chown -R backups:backups .ssh
sudo chmod -R u=rwX,g=rX,o-rwx .ssh

Then, on the main NAS, generate the SSH key:


mkdir -p /root/backups && cd /root/backups
ssh-keygen -t ed25519 -C backups@main-nas -f /root/backups/ssh_key_backup_nas_ed25519

Then, copy the generated SSH public key to the authorized_keys file on the backup NAS (located at /absolute/path/to/btrfs-filesystem/backups/.ssh/authorized_keys).

Now that's sorted, let's continue with snapshot-send.sh. Next up are a few miscellaneous functions:


# The filepath to the last sent text file that contains the name of the snapshot that was last sent to the remote.
# If this file doesn't exist, then we send a full snapshot to start with.
# We need to keep track of this because we need this information to know which
# snapshot we need to parent the latest snapshot from to send snapshots incrementally.
filepath_last_sent="${dir_source}/last_sent_@${tag_source}.txt";

## Logs a message to stderr.
# $*    The message to log.
log_msg() {
    echo "[ $(date +%Y-%m-%dT%H:%M:%S) ] remote/${HOSTNAME}: >>> ${*}";
}

## Lists all the currently available snapshots for the current source tag.
list_snapshots() {
    find "${dir_source}" -maxdepth 1 ! -path "${dir_source}" -name "*@${tag_source}" -type d;
}

## Returns an exit code of 0 if we've sent a snapshot, or 1 if we haven't.
have_sent() {
    if [[ ! -f "${filepath_last_sent}" ]]; then
        return 1;
    else
        return 0;
    fi
}

## Fetches the directory name of the last snapshot sent to the remote with the given tag name.
last_sent() {
    if [[ -f "${filepath_last_sent}" ]]; then
        cat "${filepath_last_sent}";
    fi
}

# Runs snapshot-receive on the remote host.
do_ssh() {
    ssh -o "ServerAliveInterval=900" -i "${loc_ssh_key}" "${remote_host}" sudo snapshot-receive "${tag_dest}";
}

Particularly of note is the filepath_last_sent variable - this is set to the path to that text file I mentioned earlier.

Other than that it's all pretty well commented, so let's continue on. Next, we need to determine the name of the latest snapshot:

latest_snapshot="$(list_snapshots | sort | tail -n1)";
latest_snapshot_dirname="$(dirname "${latest_snapshot}")";

With this information in hand we can compare it to the last snapshot name we sent. We store this in the text file mentioned above - the path to which is stored in the filepath_last_sent variable.

if [[ "$(dirname "${latest_snapshot_dirname}")" == "$(cat "${filepath_last_sent}")" ]]; then
    if [[ -z "${FORCE_SEND}" ]]; then
        echo "We've sent the latest snapshot '${latest_snapshot_dirname}' already and the FORCE_SEND environment variable is empty or not specified, skipping";
        exit 0;
    else
        echo "We've sent it already, but sending it again since the FORCE_SEND environment variable is specified";
    fi
fi

If the latest snapshot has the same name as the one we last send, we exit out - unless the FORCE_SEND environment variable is specified (to allow for an easy way to fix stuff if it goes wrong on the other end).

Now, we can actually send the snapshot to the remote:


if ! have_sent; then
    log_msg "Sending initial snapshot $(dirname "${latest_snapshot}")";
    btrfs send "${latest_snapshot}" | do_ssh;
else
    parent_snapshot="${dir_source}/$(last_sent)";
    if [[ ! -d "${parent_snapshot}" ]]; then
        echo "Error: Failed to locate parent snapshot at '${parent_snapshot}'" >&2;
        exit 3;
    fi

    log_msg "Sending incremental snapshot $(dirname "${latest_snapshot}") parent $(last_sent)";
    btrfs send -p "${parent_snapshot}" "${latest_snapshot}" | do_ssh;
fi

have_sent simply determines if we have previously sent a snapshot before. We know this by checking the filepath_last_sent text file.

If we haven't, then we send a full snapshot rather than an incremental one. If we're sending an incremental one, then we find the parent snapshot (i.e. the one we last sent). If we can't find it, we generate an error (it's because of this that you need to store at least 2 snapshots at a time with btrfs-snapshot-rotation).

After sending a snapshot, we need to update the filepath_last_sent text file:

log_msg "Updating state information";
basename "${latest_snapshot}" >"${filepath_last_sent}";
log_msg "Snapshot sent successfully";

....and that concludes snapshot-send.sh! Once you've finished reading this blog post and testing your setup, put your snapshot-send.sh calls in a script in /etc/cron.daily or something.

snapshot-receive.sh

Next up is the receiving end of the system. The CLI for this script is much simpler, on account of sudo rules only allowing exact and specific commands (no wildcards or regex of any kind). I put snapshot-receive.sh in /usr/local/sbin and called it snapshot-receive.

Let's get started:

#!/usr/bin/env bash

# This script wraps btrfs receive so that it can be called by non-root users.
# It should be saved to '/usr/local/sbin/snapshot-receive' (without quotes, of course).
# The following entry needs to be put in the sudoers file:
# 
# %backup-senders   ALL=(ALL) NOPASSWD: /usr/local/sbin/snapshot-receive TAG_NAME
# 
# ....replacing TAG_NAME with the name of tag you want to allow. You'll need 1 line in your sudoers file per tag you want to allow.
# Edit your sudoers file like this:
# sudo visudo

# The ABSOLUTE path to the target directory to receive to.
target_dir="CHANGE_ME";

# The maximum number of backups to keep.
max_backups="7";

# Allow only alphanumeric characters in the tag
tag="$(echo "${1}" | tr -cd '[:alnum:]-_')";

snapshot-receive.sh only takes a single argument, and that's the tag it should use for the snapshot being received:


sudo snapshot-receive DEST_TAG_NAME

The target directory it should save snapshots to is stored as a variable at the top of the file (the target_dir there). You should change this based on your specific setup. It goes without saying, but the target directory needs to be a directory on a btrfs filesystem (preferable raid1, though as I've said before btrfs raid1 is a misnomer). We also ensure that the tag contains only safe characters for security.

max_backups is the maximum number of snapshots to keep. Any older snapshots will be deleted.

Next, ime error handling:

###############################################################################

# $EUID = effective uid
if [[ "${EUID}" -ne 0 ]]; then
    echo "Error: This script must be run as root (currently running as effective uid ${EUID})" >&2;
    exit 5;
fi

if [[ -z "${tag}" ]]; then
    echo "Error: No tag specified. It should be specified as the 1st and only argument, and may only contain alphanumeric characters." >&2;
    echo "Example:" >&2;
    echo "    snapshot-receive TAG_NAME_HERE" >&2;
    exit 4;
fi

Nothing too exciting. Continuing on, a pair of useful helper functions:


###############################################################################

## Logs a message to stderr.
# $*    The message to log.
log_msg() {
    echo "[ $(date +%Y-%m-%dT%H:%M:%S) ] remote/${HOSTNAME}: >>> ${*}";
}

list_backups() {
    find "${target_dir}/${tag}" -maxdepth 1 ! -path "${target_dir}/${tag}" -type d;
}

list_backups lists the snapshots with the given tag, and log_msg logs messages to stdout (not stderr unless there's an error, because otherwise cronic will dutifully send you an email every time the scripts execute). Next up, more error handling:

###############################################################################

if [[ "${target_dir}" == "CHANGE_ME" ]]; then
    echo "Error: target_dir was not changed from the default value." >&2;
    exit 1;
fi

if [[ ! -d "${target_dir}" ]]; then
    echo "Error: No directory was found at '${target_dir}'." >&2;
    exit 2;
fi

if [[ ! -d "${target_dir}/${tag}" ]]; then
    log_msg "Creating new directory at ${target_dir}/${tag}";
    mkdir "${target_dir}/${tag}";
fi

We check:

  • That the target directory was changed from the default CHANGE_ME value
  • That the target directory exists

We also create a subdirectory for the given tag if it doesn't exist already.

With the preamble completed, we can actually receive the snapshot:

log_msg "Launching btrfs in chroot mode";

time nice ionice -c Idle btrfs receive --chroot "${target_dir}/${tag}";

We use nice and ionice to reduce the priority of the receive to the lowest possible level. If you're using a Raspberry Pi (I have a Raspberry Pi 4 with 4GB RAM) like I am, this is important for stability (Pis tend to fall over otherwise). Don't worry if you experience some system crashes on your Pi when transferring the first snapshot - I've found that incremental snapshots don't cause the same issue.

We also use the chroot option there for increased security.

Now that the snapshot is transferred, we can delete old snapshots if we have too many:

backups_count="$(echo -e "$(list_backups)" | wc -l)";

log_msg "Btrfs finished, we now have ${backups_count} backups:";
list_backups;

while [[ "${backups_count}" -gt "${max_backups}" ]]; do
    oldest_backup="$(list_backups | sort | head -n1)";
    log_msg "Maximum number backups is ${max_backups}, requesting removal of backup for $(dirname "${oldest_backup}")";

    btrfs subvolume delete "${oldest_backup}";

    backups_count="$(echo -e "$(list_backups)" | wc -l)";
done

log_msg "Done, any removed backups will be deleted in the background";

Sorted! The only thing left to do here is to setup those sudo rules. Let's do that now. Execute sudoedit /etc/sudoers, and enter the following:

%backup-senders ALL=(ALL) NOPASSWD: /usr/local/sbin/snapshot-receive TAG_NAME

Replace TAG_NAME with the DEST_TAG_NAME you're using. You'll need 1 entry in /etc/sudoers for each DEST_TAG_NAME you're using.

We assign the rights to the backup-senders group we created earlier, of which the user we are going to SSH in with is a member. This make the system more flexible should we want to extend it later.

Warning: A mistake in /etc/sudoers can leave you unable to use sudo! Make sure you have a root shell open in the background and that you test sudo again after making changes to ensure you haven't made a mistake.

That completes the setup of snapshot-receive.sh.

Conclusion

With snapshot-send.sh and snapshot-receive.sh, we now have a system for transferring snapshots from 1 host to another via SSH. If combined with full disk encryption (e.g. with LUKS), this provides a secure backup system with a number of desirable qualities:

  • The main NAS can't access the backups on the backup NAS (in case fo ransomware)
  • Backups are encrypted during transfer (via SSH)
  • Backups are encrypted at rest (LUKS)

To further secure the backup NAS, one could:

  • Disable SSH password login
  • Automatically start / shutdown the backup NAS (though with full disk encryption when it boots up it would require manual intervention)

At the bottom of this post I've included the full scripts for you to copy and paste.

As it turns out, there will be 1 more post in this series, which will cover generating multiple streams of backups (e.g. weekly, monthly) from a single stream of e.g. daily backups on my backup NAS.

Sources and further reading

Full scripts

snapshot-send.sh

#!/usr/bin/env bash
set -e;

dir_source="${1}";
tag_source="${2}";
tag_dest="${3}";
loc_ssh_key="${4}";
remote_host="${5}";

if [[ -z "${remote_host}" ]]; then
    echo "This script sends btrfs snapshots to a remote host via SSH.
The script snapshot-receive must be present on the remote host in the PATH for this to work.
It pairs well with btrfs-snapshot-rotation: https://github.com/mmehnert/btrfs-snapshot-rotation
Usage:
    snapshot-send.sh <snapshot_dir> <source_tag_name> <dest_tag_name> <ssh_key> <user@example.com>

Where:
    <snapshot_dir> is the path to the directory containing the snapshots
    <source_tag_name> is the tag name to look for (see btrfs-snapshot-rotation).
    <dest_tag_name> is the tag name to use when sending to the remote. This must be unique across all snapshot rotations sent.
    <ssh_key> is the path to the ssh private key
    <user@example.com> is the user@host to connect to via SSH" >&2;
    exit 0;
fi

# $EUID = effective uid
if [[ "${EUID}" -ne 0 ]]; then
    echo "Error: This script must be run as root (currently running as effective uid ${EUID})" >&2;
    exit 5;
fi

if [[ ! -e "${loc_ssh_key}" ]]; then
    echo "Error: When looking for the ssh key, no file was found at '${loc_ssh_key}' (have you checked the spelling and file permissions?)." >&2;
    exit 1;
fi
if [[ ! -d "${dir_source}" ]]; then
    echo "Error: No source directory located at '${dir_source}' (have you checked the spelling and permissions?)" >&2;
    exit 2;
fi

###############################################################################

# The filepath to the last sent text file that contains the name of the snapshot that was last sent to the remote.
# If this file doesn't exist, then we send a full snapshot to start with.
# We need to keep track of this because we need this information to know which
# snapshot we need to parent the latest snapshot from to send snapshots incrementally.
filepath_last_sent="${dir_source}/last_sent_@${tag_source}.txt";

## Logs a message to stderr.
# $*    The message to log.
log_msg() {
    echo "[ $(date +%Y-%m-%dT%H:%M:%S) ] remote/${HOSTNAME}: >>> ${*}";
}

## Lists all the currently available snapshots for the current source tag.
list_snapshots() {
    find "${dir_source}" -maxdepth 1 ! -path "${dir_source}" -name "*@${tag_source}" -type d;
}

## Returns an exit code of 0 if we've sent a snapshot, or 1 if we haven't.
have_sent() {
    if [[ ! -f "${filepath_last_sent}" ]]; then
        return 1;
    else
        return 0;
    fi
}

## Fetches the directory name of the last snapshot sent to the remote with the given tag name.
last_sent() {
    if [[ -f "${filepath_last_sent}" ]]; then
        cat "${filepath_last_sent}";
    fi
}

do_ssh() {
    ssh -o "ServerAliveInterval=900" -i "${loc_ssh_key}" "${remote_host}" sudo snapshot-receive "${tag_dest}";
}

latest_snapshot="$(list_snapshots | sort | tail -n1)";
latest_snapshot_dirname="$(dirname "${latest_snapshot}")";

if [[ "$(dirname "${latest_snapshot_dirname}")" == "$(cat "${filepath_last_sent}")" ]]; then
    if [[ -z "${FORCE_SEND}" ]]; then
        echo "We've sent the latest snapshot '${latest_snapshot_dirname}' already and the FORCE_SEND environment variable is empty or not specified, skipping";
        exit 0;
    else
        echo "We've sent it already, but sending it again since the FORCE_SEND environment variable is specified";
    fi
fi

if ! have_sent; then
    log_msg "Sending initial snapshot $(dirname "${latest_snapshot}")";
    btrfs send "${latest_snapshot}" | do_ssh;
else
    parent_snapshot="${dir_source}/$(last_sent)";
    if [[ ! -d "${parent_snapshot}" ]]; then
        echo "Error: Failed to locate parent snapshot at '${parent_snapshot}'" >&2;
        exit 3;
    fi

    log_msg "Sending incremental snapshot $(dirname "${latest_snapshot}") parent $(last_sent)";
    btrfs send -p "${parent_snapshot}" "${latest_snapshot}" | do_ssh;
fi


log_msg "Updating state information";
basename "${latest_snapshot}" >"${filepath_last_sent}";
log_msg "Snapshot sent successfully";

snapshot-receive.sh

#!/usr/bin/env bash

# This script wraps btrfs receive so that it can be called by non-root users.
# It should be saved to '/usr/local/sbin/snapshot-receive' (without quotes, of course).
# The following entry needs to be put in the sudoers file:
# 
# %backup-senders   ALL=(ALL) NOPASSWD: /usr/local/sbin/snapshot-receive TAG_NAME
# 
# ....replacing TAG_NAME with the name of tag you want to allow. You'll need 1 line in your sudoers file per tag you want to allow.
# Edit your sudoers file like this:
# sudo visudo

# The ABSOLUTE path to the target directory to receive to.
target_dir="CHANGE_ME";

# The maximum number of backups to keep.
max_backups="7";

# Allow only alphanumeric characters in the tag
tag="$(echo "${1}" | tr -cd '[:alnum:]-_')";

###############################################################################

# $EUID = effective uid
if [[ "${EUID}" -ne 0 ]]; then
    echo "Error: This script must be run as root (currently running as effective uid ${EUID})" >&2;
    exit 5;
fi

if [[ -z "${tag}" ]]; then
    echo "Error: No tag specified. It should be specified as the 1st and only argument, and may only contain alphanumeric characters." >&2;
    echo "Example:" >&2;
    echo "    snapshot-receive TAG_NAME_HERE" >&2;
    exit 4;
fi

###############################################################################

## Logs a message to stderr.
# $*    The message to log.
log_msg() {
    echo "[ $(date +%Y-%m-%dT%H:%M:%S) ] remote/${HOSTNAME}: >>> ${*}";
}

list_backups() {
    find "${target_dir}/${tag}" -maxdepth 1 ! -path "${target_dir}/${tag}" -type d;
}

###############################################################################

if [[ "${target_dir}" == "CHANGE_ME" ]]; then
    echo "Error: target_dir was not changed from the default value." >&2;
    exit 1;
fi

if [[ ! -d "${target_dir}" ]]; then
    echo "Error: No directory was found at '${target_dir}'." >&2;
    exit 2;
fi

if [[ ! -d "${target_dir}/${tag}" ]]; then
    log_msg "Creating new directory at ${target_dir}/${tag}";
    mkdir "${target_dir}/${tag}";
fi

log_msg "Launching btrfs in chroot mode";

time nice ionice -c Idle btrfs receive --chroot "${target_dir}/${tag}";

backups_count="$(echo -e "$(list_backups)" | wc -l)";

log_msg "Btrfs finished, we now have ${backups_count} backups:";
list_backups;

while [[ "${backups_count}" -gt "${max_backups}" ]]; do
    oldest_backup="$(list_backups | sort | head -n1)";
    log_msg "Maximum number backups is ${max_backups}, requesting removal of backup for $(dirname "${oldest_backup}")";

    btrfs subvolume delete "${oldest_backup}";

    backups_count="$(echo -e "$(list_backups)" | wc -l)";
done

log_msg "Done, any removed backups will be deleted in the background";

NAS Backups, Part 1: Overview

After building my nice NAS, the next thing on my sysadmin todo list was to ensure it is backed up. In this miniseries (probably 2 posts), I'm going to talk about the backup NAS that I've built. As of the time of typing, it is sucessfully backing up my Btrfs subvolumes.

In this first post, I'm going to give an overview of the hardware I'm using and the backup system I've put together. In future posts, I'll go into detail as to how the system works, and which areas I still need to work on moving forwards.

Personally, I find that the 3-2-1 backup strategy is a good rule of thumb:

  • 3 copies of the data
  • 2 different media
  • 1 off-site

What this means is tha you should have 3 copies of your data, with 2 local copies and one remote copy in a different geographical location. To achieve this, I have solved first the problem of the local backup copy, since it's a lot easier and less complicated than the remote one. Although I've talked about backups before (see also), in this case my solution is slightly different - partly due to the amount of data involved, and partly due to additional security measures I'm trying to get into place.

Hardware

For hardware, I'm using a Raspberry Pi 4 with 4GB RAM (the same as the rest of my cluster), along with 2 x 4 TB USB 3 external hard drives. This is a fairly low-cost and low-performance solution. I don't particularly care how long it takes the backup to complete, and it's relatively cheap to replace if it fails without being unreliable (Raspberry Pis are tough little things).

Here's a diagram of how I've wired it up:

(Can't see the above? Try a direct link to the SVG. Created with drawio.)

I use USB Y-cables to power the hard rives directly from the USB power supply, as the Pi is unlikely to be able to supply enough power for mulitple external hard drives on it's own.

Important Note: As I've discovered with a different unrelated host on my network, if you do this you can back-power the Pi through the USB Y cable, and potentially corrupt the microSD card by doing so. Make sure you switch off the entire USB power supply at once, rather than unplug just the Pi's power cable!

For a power supply, I'm using an Anker 10 port device (though I bought through Amazon, since I wasn't aware that Anker had their own website) - the same one that powers my Pi cluster.

Strategy

To do the backup itself I'm using the fact that I store my data in Btrfs subvolumes and the btrfs send / btrfs receive commands to send my subvolumes to the remote backup host over SSH. This has a number of benefits:

  1. The host doing the backing up has no access to the resulting backups (so if it gets infected it can't also infect the backups)
  2. The backups are read-only Btrfs snapshots (so if the backup NAS gets infected my backups can't be altered without first creating a read-write snapshot)
  3. Backups are done incrementally to save time, but a full backup is done automatically on the first run (or if the local metadata is missing)

While my previous backup solution using Restic for the server that sent you this web page has point #3 on my list above, it doesn't have points 1 and 2.

Restic does encrypt backups at rest though, which the system I'm setting up doesn't do unless you use LUKS to encrypt the underlying disks that Btrfs stores it's data on. More on that in the future, as I have tentative plans to deal with my off-site backup problem using a similar technique to that which I've used here that also encrypts data at rest when a backup isn't taking place.

In the next post, I'll be diving into the implementation details for the backup system I've created and explaining it in more detail - including sharing the pair of scripts that I've developed that do the heavy lifting.

Automatically downloading emails and extracting their attachments

I have an all-in-one printer that's also a scanner - specifically the Epson Ecotank 4750 (though annoyingly the automated document feeder doesn't support duplex). While it's a great printer (very eco-friendly, and the inks last for ages!), my biggest frustration with it is that it doesn't scan directly to an SMB file share (i.e. a Windows file share). It does support SANE though, which allows you to use it through a computer.

This is ok, but the ability to scan directly from the device itself without needing to use a computer was very convenient, so I set out to remedy this. The printer does have a cloud feature they call "Epson Connect", which allows one to upload to various cloud services such as Google Drive and Box, but I don't want to upload potentially sensitive data to such services.

Fortunately, there's a solution at hand - email! The printer in question also supports scanning to a an email address. Once the scanning process is complete, then it sends an email to the preconfigured email address with the scanned page(s) attached. It's been far too long since my last post about email too, so let's do something about that.

Logging in to my email account just to pick up a scan is clunky and annoying though, so I decided to automate the process to resolve the issue. The plan is as follows:

  1. Obtain a fresh email address
  2. Use IMAP IDLE to instantly download emails
  3. Extract attachments and save them to the output directory
  4. Discard the email - both locally and remotely

As some readers may be aware, I run my own email server - hence the reason why I wrote this post about email previously, so I reconfigured it to add a new email address. Many other free providers exist out there too - just make sure you don't use an account you might want to use for anything else, since our script will eat any emails sent to it.

Steps 2, 3, and 4 there took some research and fiddling about, but in the end I cooked up a shell script solution that uses fetchmail, procmail (which is apparently unmaintained, so I should consider looking for alternatives), inotifywait, and munpack. I've also packaged it into a Docker container, which I'll talk about later in this post.

To illustrate how all of these fit together, let's use a diagram:

A diagram showing how the whole process fits together - explanation below.

fetchmail uses IMAP IDLE to hold a connection open to the email server. When it receives notification of a new email, it instantly downloads it and spawns a new instance of procmail to handle it.

procmail writes the email to a temporary directory structure, which a separate script is watching with inotifywait. As soon as procmail finishes writing the new email to disk, inotifywait triggers and the email is unpacked with munpack. Any attachments found are moved to the output directory, and the original email discarded.

With this in mind, let's start drafting up a script. The first order of the day is configuring fetchmail. This is done using a .fetchmailrc file - I came up with this:

poll bobsrockets.com protocol IMAP port 993
    user "user@bobsrockets.com" with pass "PASSWORD_HERE"
    idle
    ssl

...where user@bobsrockets.com is the email address you want to watch, bobsrockets.com is the domain part of said email address (everything after the @), and PASSWORD_HERE is the password required to login.

Save this somewhere safe with tight file permissions for later.

The other configuration file we'll need is one for procmail. let's do that one now:

CORRECTHOME=/tmp/maildir
MAILDIR=$CORRECTHOME/

:0
Mail/

Replace /tmp/maildir with the temporary directory you want to use to hold emails in. Save this as procmail.conf for later too.

Now we have the mail config files written, we need to install some software. I'm using apt on Debian (a minideb Docker container actually), so you'll need to adapt this for your own system if required.

sudo apt install ca-certificates fetchmail procmail inotify-tools mpack
# or, if you're using minideb:
install_packages ca-certificates fetchmail procmail inotify-tools mpack

fetchmail is for some strange reason extremely picky about the user account it runs under, so let's update the pre-created fetchmail user account to make it happy:

groupadd --gid 10000 fetchmail
usermod --uid 10000 --gid 10000 --home=/srv/fetchmail --uid=10000 --gi=10000 fetchmail
chown fetchmail:fetchmail /srv/fetchmail

fetchmail now needs that config file we created earlier. Let's update the permissions on that:

chmod 10000:10000 path/to/.fetchmailrc

If you're running on bare metal, move it to the /srv/fetchmail directory now. If you're using Docker, keep reading, as I recommend that this file is mounted using a Docker volume to make the resulting container image more reusable.

Now let's start drafting a shell script to pull everything together. Let's start with some initial setup:

#!/usr/bin/env bash

if [[ -z "${TARGET_UID}" ]]; then
    echo "Error: The TARGET_UID environment variable was not specified.";
    exit 1;
fi
if [[ -z "${TARGET_GID}" ]]; then
    echo "Error: The TARGET_GID environment variable was not specified.";
    exit 1;
fi
if [[ "${EUID}" -ne 0 ]]; then
    echo "Error: This Docker container must run as root because fetchmail is a pain, and to allow customisation of the target UID/GID (although all possible actions are run as non-root users)";
    exit 1;
fi

dir_mail_root="/tmp/maildir";
dir_newmail="${dir_mail_root}/Mail/new";
target_dir="/mnt/output";

fetchmail_uid="$(id -u "fetchmail")";
fetchmail_gid="$(id -g "fetchmail")";

temp_dir="$(mktemp --tmpdir -d "imap-download-XXXXXXX")";
on_exit() {
    rm -rf "${temp_dir}";
}
trap on_exit EXIT;

log_msg() {
    echo "$(date -u +"%Y-%m-%d %H:%M:%S") imap-download: $*";
}

This script will run as root, and fetchmail runs as UID 10000 and GID 10000, The reasons for this are complicated (and mostly have to do with my weird network setup). We look for the TARGET_UID and TARGET_GID environment variables, as these define the uid:gid we'll be setting files to before writing them to the output directory.

We also determine the fetchmail UID/GID dynamically here, and create a second temporary directory to work with too (the reasons for which will become apparent).

Before we continue, we need to create the directory procmail writes new emails to. Not because procmail won't create it on its own (because it will), but because we need it to exist up-front so we can watch it with inotifywait:

mkdir -p "${dir_newmail}";
chown -R "${fetchmail_uid}:${fetchmail_gid}" "${dir_mail_root}";

We're running as root, but we'll want to spawn fetchmail (and other things) as non-root users. Technically, I don't think you're supposed to use sudo in non-interactive scripts, and it's also not present in my Docker container image. The alternative is the setpriv command, but using it is rather complicated and annoying.

It's more powerful than sudo, as it allows you to specify not only the UID/GID a process runs as, but also the capabilities the process will have too (e.g. binding to low port numbers). There's a nasty bug one has to work around if one is using Docker too, so given all this I've written a wrapper function that abstracts all of this complexity away:

# Runs a process as another user.
# Ref https://github.com/SinusBot/docker/pull/40
# $1    The UID to run the process as.
# $2    The GID to run the process as.
# $3-*  The command (including arguments) to run
run_as_user() {
    run_as_uid="${1}"; shift;
    run_as_gid="${1}"; shift;
    if [[ -z "${run_as_uid}" ]]; then
        echo "run_as_user: No target UID specified.";
        return 1;
    fi
    if [[ -z "${run_as_gid}" ]]; then
        echo "run_as_user: No target GID specified.";
        return 2;
    fi

    # Ref https://github.com/SinusBot/docker/pull/40
    # WORKAROUND for `setpriv: libcap-ng is too old for "all" caps`, previously "-all" was used here
    # create a list to drop all capabilities supported by current kernel
    cap_prefix="-cap_";
    caps="$cap_prefix$(seq -s ",$cap_prefix" 0 "$(cat /proc/sys/kernel/cap_last_cap)")";

    setpriv --inh-caps="${caps}" --reuid "${run_as_uid}" --clear-groups --regid "${run_as_gid}" "$@";
    return "$?";
}

With this in hand, we can now wrap fetchmail and procmail in a function too:

do_fetchmail() {
    log_msg "Starting fetchmail";

    while :; do
        run_as_user "${fetchmail_uid}" "${fetchmail_gid}" fetchmail --mda "/usr/bin/procmail -m /srv/procmail.conf";

        exit_code="$?";
        if [[ "$exit_code" -eq 127 ]]; then
            log_msg "setpriv failed, exiting with code 127";
            exit 127;
        fi 

        log_msg "Fetchmail exited with code ${exit_code}, sleeping 60 seconds";
        sleep 60
    done
}

In short this spawns fetchmail as the fetchmail user we configured above, and also restarts it if it dies. If setpriv fails, it returns an exit code of 127 - so we catch that and don't bother trying again, as the issue likely needs manual intervention.

To finish the script, we now need to setup that inotifywait loop I mentioned earlier. Let's setup a shell function for that:


do_attachments() {
    while :; do # : = infinite loop
        # Wait for an update
        # inotifywait's non-0 exit code forces an exit for some reason :-/
        inotifywait -qr --event create --format '%:e %f' "${dir_newmail}";

        # Process new email here
    done
}

Processing new emails is not particularly difficult, but requires a sub loop because:

  • More than 1 email could be written at a time
  • Additional emails could slip through when we're processing the last one
while read -r filename; do

    # Process each email

done < <(find "${dir_newmail}" -type f);

Finally, we need to process each email we find in turn. Let's outline the steps we need to take:

  1. Move the email to that second temporary directory we created above (since the procmail directory might not be empty)
  2. Unpack the attachments
  3. chown the attach

Let's do this in chunks. First, let's move it to the temporary directory:

log_msg "Processing email ${filename}";

# Move the email to a temporary directory for processing
mv "${filename}" "${temp_dir}";

The filename environment variable there is the absolute path to the email in question, since we used find and passed it an absolute directory to list the contents of (as opposed to a relative path).

To find the filepath we moved it to, we need to do this:

filepath_temp="${temp_dir}/$(basename "${filename}")"

This is important for the next step, where we unpack it:

# Unpack the attachments
munpack -C "${temp_dir}" "${filepath_temp}";

Now that we've unpacked it, let's do a bit of cleaning up, by deleting the original email file and the .desc description files that munpack also generates:

# Delete the original email file and any description files
rm "${filepath_temp}";
find "${temp_dir}" -iname '*.desc' -delete;

Great! Now we have the attachments sorted, now all we need to do is chown them to the target UID/GID and move them to the right place.

chown -R "${TARGET_UID}:${TARGET_GID}" "${temp_dir}";
chmod -R a=rX,ug+w "${temp_dir}";

I also chmod the temporary directory too to make sure that the permissions are correct, because otherwise the mv command is unable to read the directory's contents.

Now to actually move all the attachments:

# Move the attachment files to the output directory
while read -r attachment; do
    log_msg "Extracted attachment ${attachment}";
    chmod 0775 "${attachment}";
    run_as_user "${TARGET_UID}" "${TARGET_GID}" mv "${attachment}" "${target_dir}";
done < <(find "${temp_dir}" -type f);

This is rather overcomplicated because of an older design, but it does the job just fine.

With that done, we've finished the script. I'll include the whole script at the bottom of this post.

Dockerification

If you're running on bare metal, then you can skip to the end of this post. Because I have a cluster, I want to be able to run this thereon. Since said cluster works with Docker containers, it's natural to Dockerise this process.

The Dockerfile for all this is surprisingly concise:

(Can't see the above? View it on my personal Git server instead)

To use this, you'll need the following files alongside it:

It exposes the following Docker volumes:

  • /mnt/fetchmailrc: The fetchmailrc file
  • /mnt/output: The target output directory

All these files can be found in this directory on my personal Git server.

Conclusion

We've strung together a bunch of different programs to automatically download emails and extract their attachments. This is very useful as for ingesting all sorts of different files. Things I haven't covered:

  • Restricting it to certain source email addresses to handle spam
  • Restricting the file types accepted (the file command is probably your friend)
  • Disallowing large files (most 3rd party email servers do this automatically, but in my case I don't have a limit that I know of other than my hard disk space)

As always, this blog post is both a reference for my own use and a starting point for you if you'd like to do this for yourself.

If you've found this useful, please comment below! I find it really inspiring / motivating to learn how people have found my posts useful and what for.

Sources and further reading

run.sh script

(Can't see the above? Try a this link, or alternatively this one (bash))

Servers demystified

Something I see a lot of around the Internet are people who think that you need to purchase a big (often rack-mounted) "server" in order to host things like websites, email, game servers, and more (exhibit a). Quite often, they turn to ebay to purchase used enterprise rack mounted servers too.

I want to take a moment here to write up my thoughts here on why that is almost never the correct approach for a home user to take to host such applications at home, and what the (much better) alternatives are to serve as a reference post I can direct people to who need educating about this important issue.

What is a "server"?

A server can mean 2 things: a physical computer whose primary role is to act as a server, and server applications, which serve content to other users elsewhere - be it phones, laptops, desktops, etc.

A lot of people new to the field don't realise it, but any computer can take on the role of a server - you don't need any fancy hardware. The things that a computer does is defined by the software it runs - not the hardware that it is built from.

Does a server need a graphics card (GPU)?

No. It really doesn't. It's extremely unlikely that for a general purpose server you would need a GPU. Another related myth here is that you need a GPU in your server if you're running a game server. This is also false. Most of the time a server is going to be running headlessly (i.e. without a monitor) - so it really doesn't need a GPU in order to function effectively.

The following tasks however may require a GPU:

  • Serious Machine Learning / Artificial Intelligence workloads
  • 3D Rendering (e.g. Blender)
  • Live video streaming (video transcoding does not always utilise the GPU, as far as I can tell - make sure you check the documentation for your video editing software before buying any hardware)

Web servers, game servers, email servers, and other application servers do not use and cannot make use of a GPU. Programs need to be specially designed to support GPUs.

I need to purchase a license for Windows Server. Windows 10 isn't enough.

This is false. If you prefer Windows, then a regular old Windows 10 machine will be just fine for most home server use-cases. Windows Server provides additional features for enterprise that you are unlikely to need.

Personally, I recommend running a distribution of Linux though such as Ubuntu Server.

The problems with used hardware

Of particular frustration is the purchasing of old used (often rack mountable) servers from eBay and other auction sites. The low prices might be attractive, but such servers will nearly always have a number of issues:

  1. The CPU and other components will frequently be 10+ years old, and draw lots of electricity
  2. The fans will be very loud - sounding like a jet is taking off inside your house
  3. They often don't come with hard drives, and often have custom drive bays that require purchasing expensive drives to fill

Awkward issues to be sure! Particularly of note here is the electricity problem. Very old devices draw orders of magnitude more power than newer ones - leading to a big electricity bill. It will practically always be cheaper to purchase a newer more expensive machine - it'll pay for itself in dramatically lower electricity bills.

What are the alternatives?

Many far more suitable alternatives exist. They fall into 2 categories:

  1. Renting from a hosting company
  2. Buying a physical device

I'll be talking through both of these options below.

Renting from a hosting company

If you'd rather not have any hardware of your own locally, you can always rent a server from a hosting company. These come in 1 flavours:

  • Virtual Private Servers (VPS): A virtual machine running on the hosting company's infrastructure. Often easier to scale to multiple machines.
  • Dedicated servers: Bare-metal hardware running in a hosting company's datacentre somewhere. Useful if you've outgrown a VPS.

Example providers include OVH, Kimsufi (dedicated servers), Digital Ocean, and many more.

Things to watch out for when choosing one include:

  • How can you get support if you have an issue?
  • What network speeds are provided? Are there any data caps?
  • How much hard drive space do they come with? You often can't get any additional hard drive space once you've bought it without switching to a new host.
  • How many CPU cores does it have (or, if you want to run a game server, what's the clock speed)?
  • How much RAM does it have?
  • How much is it per month?

Buying a physical device

If you'd rather buy a physical device (beware that email servers cannot be effectively hosted on a residential Internet connection), then I can recommend either looking into one of these 2:

  1. An Intel NUC or other Mini PC in the same form factor
  2. A Raspberry Pi (or, for more advanced users, I've heard good things about a Rock Pi, but haven't tried it myself)

Both options are quiet, reasonably priced, and will draw orders of magnitude less power than a big rack mounted server.

A notable caveat here is that if you intend to run a game server, you'll want to check the CPU architecture it runs on, as it may not be compatible with the Raspberry Pi (which has an ARM chip built it - which can be either arm64 or armv7l - I use the official Debian CPU architecture codes here to avoid ambiguity).

Other alternatives here include old laptops and desktops you already have lying around at home. Make sure they aren't too old though, because otherwise you'll run afoul of point #1 in my list of problems there above.

Conclusion

In this post, I've busted some common myths about serves. I've also taken a quick look some appropriate hardware that you can buy or rent to use as a server.

If you're in the market for a server, don't be fooled by low prices for used physical servers. Rather, either rent one from a hosting company, or buy a Mini PC or Raspberry Pi instead. It'll run quieter and use less power too.

Other common questions I see are how to get started with running various different applications on a server. This is out of scope of this article, but there are plenty of tutorials out there on how to do this.

Often you'll need some basic Linux terminal skills to follow along though - I've written a blog post about how you can get started with the terminal already. I also on occasion post tutorials here on this blog on how to setup various applications - these are usually tagged with tutorial and server.

Other sites have excellent tutorials on to setup all manner of different applications - I'll leave a bunch of links at the end of this post.

If this this post has helped demystify servers for you, please consider sharing it with others to clear up their misconceptions too.

Sources and Further Reading

Art by Mythdael