Mailing List Articles Atom Feed Comments Atom Feed Twitter Reddit Facebook

Tag Cloud

3d account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blog bookmarklet booting bug hunting c sharp c++ challenge chrome os code codepen coding conundrums coding conundrums evolved command line compilers compiling compression css dailyprogrammer debugging demystification distributed computing documentation downtime electronics email embedded systems encryption es6 features event experiment external first impressions future game github github gist gitlab graphics hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs learning library linux lora low level lua maintenance manjaro network networking nibriboard node.js operating systems performance photos php pixelbot portable privacy problem solving programming problems projects prolog protocol protocols pseudo 3d python reddit redis reference releases resource review rust searching secrets security series list server software sorting source code control statistics storage svg technical terminal textures three thing game three.js tool tutorial twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 xmpp xslt

Animated PNG for all!

I recently discovered that Animated PNGs are now supported by most major browsers:

Data on support for the apng feature across the major browsers from

I stumbled across the concept of an animated PNG a number of years ago (on actually if I remember right!), but at the time browser support was very bad (read: non-existent :P) - so I moved on to other things.

I ended up re-discovering it a few weeks ago (also through!), and since browser support is so much better now I decided that I just had to play around with it.

It hasn't disappointed. Traditional animated GIFs (Graphics Interchange Format for the curious) are limited to 256 colours, have limited transparency support (a pixel is either transparent, or it isn't - there's no translucency support), and don't compress terribly well.

Animated PNGs (Portable Network Graphics), on the other hand, let you enjoy all the features of a regular PNG (as many colours as you want, translucent pixels, etc.) while also supporting animation, and compressing better as well! Best of all, if a browser doesn't support the animated PNG standard, they will just see and render the first frame as a regular PNG and silently ignore the rest.

Let's take it out for a spin. For my test, I took an image and created a 'panning' animation from one side of it to the other. Here's the image I've used:

(Credit: The background on the download page for Mozilla's Firefox Nightly builds. It isn't available on the original source website anymore (and I've lost the link), but can still be found on various wallpaper websites.)

The first task is to generate the frames from the original image. I wrote a quick shell script for this purpose. Firstly, I defined a bunch of variables:

#!/usr/bin/env bash
set -e; # Crash if we hit an error


# The maximum number of frames
end_x=960; end_y=450;
start_x=0; start_y=450;

crop_size_x=960; crop_size_y=540;

mkdir -p ./frames;
Variable Meaning
input_file The input file to generate frames from
output_file The file to write the animated png to
max_frame The number of frames (plus 1) to generate
start_x The starting position to pan from on the x axis
start_y The starting position to pan from on the y axis
end_x The ending position to pan to on the x axis
end_y The ending position to pan to on the y axis
crop_size_x The width of the cropped frames
crop_size_y The height of the cropped frames

It's worth noting here that it's probably a good idea to implement a proper CLI to this script at this point, but since it's currently only a one-off script I didn't bother. Perhaps in the future I'll come back and improve it if I find myself needing it again for some other purpose.

With the parameters set up (and a temporary directory created - note that you should use mktemp -d instead of the approach I've taken here!), we can then use a for loop to repeatedly call ImageMagick to crop the input image to generate our frames. This won't run in parallel unfortunately, but since it's only a few frames it shouldn't take too long to render. This is only a quick shell script after all :P

for ((frame=0; frame <= "${max_frame}"; frame++)); do
    this_x="$(calc -p "${start_x}+((${end_x}-${start_x})*(${frame}/${max_frame}))")";
    this_y="$(calc -p "${start_y}+((${end_y}-${start_y})*(${frame}/${max_frame}))")";
    percent="$(calc -p "round((${frame}/${max_frame})*100, 2)")";

    convert "${input_file}" -crop "${crop_size_x}x${crop_size_y}+${this_x}+${this_y}" "frames/Frame-$(printf "%02d" "${frame}").jpeg";

    echo -ne "${frame} / ${max_frame} (${percent}%)          \r";
echo ""; # Move to the line after the progress indicator

This looks complicated, but it really isn't. The for loop iterates over each of the frame numbers. We do some maths to calculate the (x, y) co-ordinates of the frame we want to extract, and then get ImageMagick's convert command to do the cropping for us. Finally we write a quick progress indicator out to stdout (\r resets the cursor to the beginning of the line, but doesn't go down to the next one).

The maths there is probably better represented like this:



Much better :-) In short, we convert the current frame number into a percentage (between 0 and 1) of how far through the animation we are and use that as a multiplier to find the distance between the starting and ending points.

I use the calc command-line tool (in the apcalc package on Ubuntu) here to do the calculations, because the bash built-in result=$(((4 + 5))) only supports integer-based maths, and I wanted floating-point support here.

With the frames generated, we only need to stitch them together to make the final animated png. Unfortunately, an easy-to-use tool does not yet exist (like gifsicle for GIFs) - so we'll have to make-do with ffpmeg (which, while very powerful, has a rather confusing CLI!). Here's how I stitched the frames together:

ffmpeg -r 10 -i frames/Frame-%02d.jpeg -plays 0 "${output_file}";
  • -r - The frame rate
  • -i - The input filename
  • -plays - The number of times to loop (0 = infinite; defaults to no looping if omitted)
  • "${output_file}" - The output file

Here's the final result:

(Filesize: ~2.98MiB)

Of course, it'd be cool to compare it to a traditional animated gif. Let's do that too! First, we'll need to convert the frames to gif (gifsicle, our tool of choice, doesn't support anything other than GIFs as far as I'm aware):

mogrify -format gif frames/*.jpeg

Easy-peasy! mogrify is another tool from ImageMagick that makes such in-place conversions trivial. Note that the frames themselves are stored as JPEGs because I was experiencing an issue whereby each of the frames apparently had a slightly different colour palette, and ffmpeg wasn't smart enough to correct for this - choosing instead to crash.

With the frames converted, we can make our animated GIF like so:

gifsicle --optimize --colors=256 --loopcount=10 --delay=10 frames/*.gif >Output.gif;

Lastly, we probably want to delete the intermediate frames:

rm -r ./frames

Here's the animated GIF version:

(Filesize: ~3.28MiB)

Woah! That's much bigger.

A graph comparing the sizes of the APNG and GIF versions of the animation.

(Generated (and then extracted & edited with the Firefox developer tools) from Meta-Chart)

It's ~9.7% bigger in fact! Though not a crazy amount, smaller resulting files are always good. I suspect that this effect will stack the more frames you have. Others have tested this too, finding pretty similar results to those that I've found here - though it does of course depend on your specific scenario.

With that observation, I'll end this blog post. The next time you think about inserting an animation into a web page or chat window, consider making it an Animated PNG.

Found this interesting? Found a cool CLI tool for manipulating APNGs? Having trouble? Comment below!

Sources and Further Reading

Keeping the Internet free and open for years to come - #ForTheWeb

I've recently come across It sounds obvious, but certain qualities of the web that we take for granted aren't, in fact, universal. Things like a lack of censorship (China & their great firewall; Cuba; Venezuela), consumer privacy (US Mobile Carriers; Google; Google/Android Antitrust), and fair pricing (AT&T; BT) are rather a problem in more places than is noticeable at first glance.

The Contract for the Web is a set of principles - backed by the famous Tim Berners-Lee - for a free, open, and fair Internet. The aim is to build a full contract based on these principles to guide the evolution of the web for year to come.

Personally, I first really started to use the web to learn how to program. By reading (lots) of tutorials freely available on the web, I learnt to build websites, write Javascript, and more!

Since then, I've used the web to share the things I've been learning on this blog, stay up-to-date with the latest technology news, research new ideas, and play games that deliver amazing experiences.

The web (and, by extension, the Internet - the web refers to just HTTP + HTTPS) for me represents freedom of information. Freedom to express (and learn about) new thoughts and ideas without fear of being censored. Freedom to communicate with anyone in the world - regardless of physical distances.

It is for this reason that I've signed the Contract for the Web. It's my hope that this effort will ensure that the Internet becomes more open and neutral going forwards, so that everyone can experience the benefits of the open web for a long time to come.

Markov Chains Part 4: Test Data

With a shiny-new markov chain engine (see parts 1, 2, and 3), I found that I had a distinct lack of test data to put through it. Obviously this was no good at all, so I decided to do something about it.

Initially, I started with a list of HTML colours (direct link; 8.6KiB), but that didn't produce very good output:

MarkovGrams/bin/Debug/MarkovGrams.exe markov-w --wordlist wordlists/Colours.txt --length 16

I see a few problems here. Firstly, it's treating each word as it's entity, where in fact I'd like it to generate n-grams on a line-by-line basis. Thankfully, this is easy enough with my new --no-split option:

MarkovGrams/bin/Debug/MarkovGrams.exe markov-w --wordlist wordlists/Colours.txt --no-split --length 16
med carrylight b
jungin pe red dr
ureelloufts blue
uamoky bluellemo
trinaterry aupph
utatellon reep g
radolitter brast
bian reep mardar
ght burnse greep

Hrm, that's still rather unreadable. What if we make the n-grams longer by bumping the order?

MarkovGrams/bin/Debug/MarkovGrams.exe markov-w --wordlist wordlists/Colours.txt --length 16 --order 4
on fuchsia blue 
rsity of carmili
e blossom per sp
ulean red au lav
as green yellowe
ly gray aspe
disco blus
berry pine blach

Better, but it looks like it's starting the generation process from inside the middle of words. We can fix that with my new --start-uppercase option, which ensures that each output always stars with an n-gram that begins with a capital letter. Unfortunately the wordlist is all lowercase:

air force blue
alice blue
alizarin crimson
american rose
android green
anti-flash white

This is an issue. The other problem is that with an order of 4 the choice-point ratio is dropping quite low - I got a low of just ~0.97 in my testing.

The choice-point ratio is a measure I came up with of the average number of different directions the engine could potential go in at each step of the generation process. I'd like to keep this number consistently higher than 2, at least - to ensure a good variety of output.

Greener Pastures

Rather than try fix that wordlist, let's go in search of something better. It looks like the CrossCode Wiki has a page that lists all the items in the entire game. That should do the trick! The only problem is extracting it though. Let's use a bit of bash! We can use curl to download the HTML of the page, and then xidel to parse out the text from the <a> tags inside tables. Here's what I came up with:

curl | xidel --data - --css "table a"

This is a great start, but we've got blank lines in there, and the list isn't sorted alphabetically (not required, but makes it look nice :P). Let's fix that:

curl | xidel --data - --css "table a" | awk "NF > 0" | sort

Very cool. Tacking wc -l on the end of the pipe chain I can we've got ourselves a list of 527(!) items! Here's a selection of input lines:

Rough Branch
Raw Meat
Shady Monocle
Blue Grass
Tengu Mask
Crystal Plate
Humming Razor
Everlasting Amber
Tracker Chip
Lawkeeper's Fist

Let's run it through the engine. After a bit of tweaking, I came up with this:

cat wordlists/Cross-Code-Items.txt | MarkovGrams/bin/Debug/MarkovGrams.exe markov-w --start-uppercase --no-split --length 16 --order 3
Capt Keboossauci
Fajiz Keblathfin
King Steaf Sharp
Stintze Geakralt
Fruisty 'olipe F
Apper's TN
Prow Peptumn's C
Rus Recreetan Co
Veggiel Spiragma
Laver's Bolden M

That's quite interesting! With a choice-point ratio of ~5.6 at an order of 3, we've got a nice variable output. If we increase the order to 4, we get ~1.5 - ~2.3:

Edgy Hoo
Junk Petal Goggl
Red Metal Need C
Samurai Shel
Krystal Wated Li
Sweet Residu
Raw Stomper Thor
Purple Fruit Dev

It appears to be cutting off at the end though. Not sure what we can do about that (ideas welcome!). This looks interesting, but I'm not done yet. I'd like it to work on word-level too!

Going up a level

After making some pretty extensive changes, I managed to add support for this. Firstly, I needed to add support for word-level n-gram generation. Currently, I've done this with a new GenerationMode enum.

public enum GenerationMode

Under the hood I've just used a few if statements. Fortunately, in the case of the weighted generator, only the bottom method needed adjusting:

/// <summary>
/// Generates a dictionary of weighted n-grams from the specified string.
/// </summary>
/// <param name="str">The string to n-gram-ise.</param>
/// <param name="order">The order of n-grams to generate.</param>
/// <returns>The weighted dictionary of ngrams.</returns>
private static void GenerateWeighted(string str, int order, GenerationMode mode, ref Dictionary<string, int> results)
    if (mode == GenerationMode.CharacterLevel) {
        for (int i = 0; i < str.Length - order; i++) {
            string ngram = str.Substring(i, order);
            if (!results.ContainsKey(ngram))
                results[ngram] = 0;
    else {
        string[] parts = str.Split(" ".ToCharArray());
        for (int i = 0; i < parts.Length - order; i++) {
            string ngram = string.Join(" ", parts.Skip(i).Take(order)).Trim();
            if (ngram.Trim().Length == 0) continue;
            if (!results.ContainsKey(ngram))
                results[ngram] = 0;

Full code available here. After that, the core generation algorithm was next. The biggest change - apart from adding a setting for the GenerationMode enum - was the main while loop. This was a case of updating the condition to count the number of words instead of the number of characters in word mode:

(Mode == GenerationMode.CharacterLevel ? result.Length : result.CountCharInstances(" ".ToCharArray()) + 1) < length

A simple ternary if statement did the trick. I ended up tweaking it a bit to optimise it - the above is the end result (full code available here). Instead of counting the words, I count the number fo spaces instead and add 1. That CountCharInstances() method there is an extension method I wrote to simplify things. Here it is:

public static int CountCharInstances(this string str, char[] targets)
    int result = 0;
    for (int i = 0; i < str.Length; i++) {
        for (int t = 0; t < targets.Length; t++)
            if (str[i] == targets[t]) result++;
    return result;

Recursive issues

After making these changes, I needed some (more!) test data. Inspiration struck: I could run it recipe names! They've quite often got more than 1 word, but not too many. Searching for such a list proved to be a challenge though. My first thought was BBC Food, but their terms of service disallow scraping :-(

A couple of different websites later, and I found the Recipes Wikia. Thousands of recipes, just ready and waiting! Time to get to work scraping them. My first stop was, naturally, the sitemap (though how I found in the first place I really can't remember :P).

What I was greeted with, however, was a bit of a shock:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="">
<!-- Generation time: 26ms -->
<!-- Generation date: 2018-10-25T10:14:26Z -->

Like who has a sitemap of sitemaps, anyways?! We better do something about this: Time for some more bash! Let's start by pulling out those sitemaps.

curl | xidel --data - --css "loc"

Easy peasy! Next up, we don't want that bottom one - as it appears to have a bunch of discussion pages and other junk in it. Let's strip it out before we even download it!

curl | xidel --data - --css "loc" | grep -i NS_0

With a list of sitemaps extract from the sitemap (completely coconuts I tell you) extracted, we need to download them all in turn and extract the page urls therein. This is, unfortunately, where it starts to get nasty. While a simple xargs call downloads them all easily enough (| xargs -n1 -I{} curl "{}" should do the trick), this outputs them all to stdout, and makes it very difficult for us to parse them.

I'd like to avoid shuffling things around on the file system if possible, as this introduces further complexity. We're not out of options yet though, as we can pull a subshell out of our proverbial hat:

curl | xidel --data - --css "loc" | grep -i NS_0 | xargs -n1 -I{} sh -c 'curl {} | xidel --data - --css "loc"'

Yay! Now we're getting a list of urls to all the pages on the entire wiki:

One problem though: We want recipes names, not urls! Let's do something about that. Our next special guest that inhabits our bottomless hat is the illustrious sed. Armed with the mystical power of find-and-replace, we can make short work of these urls:

... | sed -e 's/^.*\///g' -e 's/_/ /g'

The rest of the command is omitted for clarity. Here I've used 2 sed scripts: One to strip everything up to the last forward slash /, and another to replace the underscores _ with spaces. We're almost done, but there are a few annoying hoops left to jump through. Firstly, there are A bunch of unfortunate escape sequences lying around (I actually only discovered this when the engine started spitting out random ones :P). Also, there are far too many page names that contain the word Nutrient, oddly enough.

The latter is easy to deal with. A quick grep sorts it out:

... | grep -iv "Nutrient"

The former is awkward and annoying. As far as I can tell, there's no command I can call that will decode escape sequences. To this end, I wound up embedding some Python:

... | python -c "import urllib, sys; print urllib.unquote(sys.argv[1] if len(sys.argv) > 1 else[0:-1])"

This makes the whole thing much more intimidating that it would otherwise be. Lastly, I'd really like to sort the list and save it to a file. Compared to the above, this is chicken feed!

| sort >Dishes.txt

And there we have it. Bash is very much like lego bricks when you break it down. The trick is to build it up step-by-step until you've got something that does what you want it to :)

Here's the complete command:

curl | xidel --data - --css "loc" | grep -i NS_0 | xargs -n1 -I{} sh -c 'curl {} | xidel --data - --css "loc"' | sed -e 's/^.*\///g' -e 's/_/ /g' | python -c "import urllib, sys; print urllib.unquote(sys.argv[1] if len(sys.argv) > 1 else[0:-1])" | grep -iv "Nutrient" | sort >Dishes.txt

After all that effort, I think we deserve something for our troubles! With ~42K(!) lines in the resulting file (42,039 to be exact as of the last time I ran the monster above :P), the output (after some tweaking, of course) is pretty sweet:

cat wordlists/Dishes.txt | mono --debug MarkovGrams/bin/Debug/MarkovGrams.exe markov-w --words --start-uppercase --length 8
Lemon Lime Ginger
Seared Tomatoes and Summer Squash
Cabbage and Potato au
Endive stuffed with Lamb and Winter Vegetable
Stuffed Chicken Breasts in Yogurt Turmeric Sauce with
Blossoms on Tomato
Morning Shortcake with Whipped Cream Graham
Orange and Pineapple Honey
Mango Raspberry Bread
Tempura with a Southwestern
Rice Florentine with
Cabbage Slaw with Pecans and Parmesan
Pork Sandwiches with Tangy Sweet Barbecue
Tea with Lemongrass and Chile Rice
Butterscotch Mousse Cake with Fudge
Fish and Shrimp -
Cucumber Salad with Roast Garlic Avocado
Beans in the Slow
Apple-Cherry Vinaigrette Salad
California Avocado Chinese Chicken Noodle Soup with Arugula

...I really need to do something about that cutting off issue. Other than that, I'm pretty happy with the result! The choice-point ratio is really variable, but most of the time it's hovering around ~2.5-7.5, which is great! The output if I lower the order from 3 to 2 isn't too bad either:

Salata me Htapodi kai
Hot 'n' Cheese Sandwich with Green Fish and
Poisson au Feuilles de Milagros
Valentines Day Cookies with Tofu with Walnut Rice
Up Party Perfect Turkey Tetrazzini
Olives and Strawberry Pie with Iceberg Salad with
Mashed Sweet sauce for Your Mood with Dried
Zespri Gold Corn rice tofu, and Corn Roasted
California Avocado and Rice Casserole with Dilled Shrimp
Egyptian Tomato and Red Bell Peppers, Mango Fandango

This gives us a staggering average choice-point ratio of ~125! Success :D

One more level

After this, I wanted to push the limits of the engine, so see what it's capable of. The obvious choice here is Shakespeare's Complete Works (~5.85MiB). Pushing this through the engine required some work, as ~30 seconds is far too slow - namely optimising the pipeline as much as possible.

The Mono Profiler helped a lot here. With it, I discovered that string.StartsWith() is really slow. Like, ridiculously slow (though this is relative, since I'm calling it hundreds of thousand of times), as it's culture-aware. In our case, we can't be bothering with any of that, as it's not relevant anyway. The easiest solution is to write another extension method:

public static bool StartsWithFast(this string str, string target) {
    if (str.Length < target.Length) return false;
    return str.Substring(0, target.Length) == target;

string.Substring() is faster, so by utilising this instead of the regular string.StartsWith() yields us a perfectly gigantic boost! Next up, I noticed that I can probably parallelize the Linq query that builds the list of possible n-grams we can choose from next, so that it runs on all the CPU cores:

Parallel.ForEach(ngrams, (KeyValuePair<string, double> ngramData) => {
    if (!ngramData.Key.StartsWithFast(nextStartsWith)) return;
    if (!convNextNgrams.TryAdd(ngramData.Key, ngramData.Value))
        throw new Exception("Error: Failed to add to staging ngram concurrent dictionary");

Again, this netted a another huge gain. With this and a few other architectural changes, I was able to chop the time down to a mere ~4 seconds (for a standard 10 words)! In the future, I might experiment with selective unmanaged code via the unsafe keyword to see if I can do any better.

For now, it's fast enough to enjoy some random Shakespeare on-demand:

What should they since that to that tells me Nero
He doth it turn and and too, gentle
Ha! you shall not come hither; since that to provoke
ANTONY. No further, sir; a so, farewell, Sir
Bona, joins with
Go with your fingering,
From fairies and the are like an ass which is
The very life-blood of our blood, no, not with the
Alas, why is here-in which
Transform'd and weak'ned? Hath Bolingbroke

Very interesting. The choice-point ratios sit at ~20 and ~3 for orders 3 and 4 respectively, though I got as high as 188 for an order of 3 during my testing. Definitely plenty of test data here :P


My experiments took me to several other places - which, if I included them all here, would result in an post much much longer than this! I scripted the download of several other wordlists in (direct link, 4.2KiB), if you're interested, with ready-downloaded copies in the wordlists folder of the repository.

I would advise reading the table in the README that gives credit to where I sourced each list, because of course I didn't put any of them together myself - I just wrote the script :P

Particularly of note is the Starbound list, which contains a list of all the blocks and items from the game Starbound. I might post about that one separately, as it ended up being a most interesting adventure.

In the future, I'd like to look at implementing a linguistic drift algorithm, to try and improve the output of the engine. The guy over at Here Dragons Abound has a great post on the subject, which you should definitely read if you're interested.

Found this interesting? Got an idea for another wordlist I can push though my engine? Confused by something? Comment below!

Sources and Further Reading

How to set up a WebDav share with Nginx

I've just been setting up a WebDav share on a raspberry pi 3 for my local network (long story), and since it was a bit of a pain to set up (and I had to combine a bunch of different tutorials out there to make mine work), I thought I'd share how I did it here.

I'll assume you have a raspberry pi all set up and up-to-date in headless mode, with a ufw for your firewall (if you need help with this, post in the comments below or check out the Raspberry Pi Stack Exchange). To start with, we need to install the nginx-full package:

sudo apt update
sudo apt install  nginx-full

Note that we need the nginx-full package here, because the nginx-extras or just simply nginx packages don't include the required additional webdav support modules. Next, we need to configure Nginx. Nginx's configuration files live at /etc/nginx/nginx.conf, and in /etc/nginx/conf.d. I did something like this for my nginx.conf:

user www-data;
worker_processes 4;
pid /run/;

events {
    worker_connections 768;
    # multi_accept on;

http {

    # Basic Settings

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    # server_tokens off;

    # server_names_hash_bucket_size 64;
    # server_name_in_redirect off;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # SSL Settings

    ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
    ssl_prefer_server_ciphers on;

    # Logging Settings

    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;

    # Gzip Settings

    gzip on;

    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    # Virtual Host Configs

    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;

Not many changes here. Then, I created a file called 0-webdav.conf in the conf.d directory, and this is what I put in it:

server {
    listen 80;
    listen [::]:80;


    auth_basic              realm_name;
    auth_basic_user_file    /etc/nginx/.passwords.list;

    dav_methods     PUT DELETE MKCOL COPY MOVE;
    dav_ext_methods PROPFIND OPTIONS;
    dav_access      user:rw group:rw all:r

    client_body_temp_path   /tmp/nginx/client-bodies;
    client_max_body_size    0;
    create_full_put_path    on;

    root /mnt/hydroplans;

Now this is where the magic happens. The dav_access directive tells nginx to allow everyone to read, but only logged in users to write to the share. This isn't actually particularly relevant, because of the auth_basic and auth_basic_user_file directives, which tell nginx to require people to login to the share before they are allowed to access it.

It's also important to note that I found that Windows (10, at least), didn't like the basic authentication - even though Ubuntu's Nautilus accepted it just fine - so I had to comment that bit out :-(

If you do still want authentication (hey! May you'll have better luck than I :P), then you'll need to set up the passwords file. Here's how you create it:

echo -n 'helen:' | sudo tee /etc/nginx/.passwords.list
openssl passwd -apr1 | sudo tee -a /etc/nginx/.passwords.list 

The above creates a user called helen, and asks you to type a password. If you're adding another user to the file, simply change the first tee to be tee -a to avoid overwriting the first one.

With that all configured, it's time to test the configuration file, and, if we're lucky, restart nginx!

sudo nginx -t
sudo systemctl restart nginx

That's all you should need to do to set up a simple WebDav share. Remember that this is a starting point, and not an ending point - there are a few big holes in the above that you'll need to address, depending on your use case (for example, I haven't included the setup of https / encryption - try letsencrypt for that).

Here are the connection details for the above for a few different clients:

  • Ubuntu / Nautilus: (Go to "Other Locations" and paste this into the "Connect to Server" box) dav://
  • Windows: (Go to "Map Network Drive" and paste this in)

Did this work for you? Have any problems? Got instructions for a WebDav client not listed here? Let me know in the comments!

Website Integrations (Mini!) Series List

Since I ended up doing a mini-series on the various website integrations I implemented, I thought that you might find a (mini-)series list useful. Here it is:

That's just about all I've got to mention here. Do you have any suggestions and / or requests on what I should blog about? Let me know below in the comments!

Website Integrations #3: Twitter cards

The twitter logo. (The posts featured in the above images are this one about my new Raspberry Pi 3, and the latest coding conundrums evolved post).

You have arrived in the third of three parts in my mini-series on how I implemented rich snippets. In the last two parts I tackled open graph and becoming an oEmbed provider. In this part, I'll be talking a bit about twitter cards, and how I implemented them.

Twitter's take on the problem seems to be much simpler than Facebook's, which makes for easy implementing :D Like in the other two protocols too, they decided to have multiple different types of, well, in this case, cards. I decided to implement the summary card type. Like open graph, it adds a bunch of <meta /> tags to the header. Sigh. Anyway, here are the property names I needed to implement:

  • twitter:card - The type of card. In my case this is set to summary
  • twitter:site - This one's confusing. Although it's called 'site', it should actually be set to your own twitter handle - mine is @SBRLabs.
  • twitter:title - The title of the content. Practically identical to open graph's og:title.
  • twitter:description - The description of the content. The same as og:description.
  • twitter:image - A url pointing to an image that should be displayed next to the title and description. Unlike Facebook's open graph, twitter appears to support https urls here with no problem at all.

Since after implementing open graph I already had 90% of the infrastructure and calculations in place already, throwing together values for the above wasn't too difficult. Here's an example set of twitter card <meta /> tags generated by the updated code:

<meta property="twitter:card" content="summary" />
<meta property="twitter:site" content="@SBRLabs" />
<meta property="twitter:title" content="Running Prolog on Linux" />
<meta property="twitter:description" content="Hello! I hope you had a nice restful Easter. I've been a bit busy this last 6 months, but I've got a holiday at the moment, and I've just received a .... (click to read more)" />
<meta property="twitter:image" content="" />

Easy peasy. Next up was testing time. Thankfully, Twitter made this easy too by providing an official testing tool. Interestingly, they whitelist domains based on whether the webmaster has run a url through their tool - so if you want twitter cards to show up, make sure you plug at least one of your website's page urls through their tool.

After a few tweaks, I got this:

The whitelisted message from the twitter card tester.

With that, my work was complete. This brings us to the end of my mini-series on rich-snippet integrations (unless I've missed a protocol O.o Comment below if I have)! I hope you've found it useful. If you have (or even if you haven't!) please let me know in the comments below :D

Website Integrations #2: oEmbed

Welcome to part 2 of this impromptu miniseries! In this second part of three, I'll be showing you a little about how I set up and tested a simple oEmbed provider for my blog posts - I've seen lots of oEmbed client information out there, but not much in the way of provider (or server) implementations.

If you haven't read part one about the open graph protocol yet, then you might find it interesting.

oEmbed is a bit different to open graph in that instead of throwing a bunch of meta tags into your <head />, you instead use a special <link /> element that points interested parties in the direction of some nice tasty json. Personally, I find this approach to be more sensible and easier to handle - the kind of thing you'd expect from an open standard.

To start with, I took a read of their specification, as I did with open graph. It doesn't have as many examples as I'd have liked, and I had to keep jumping around, but it's certainly not the worst I've seen.

oEmbed is built on the idea of providers (that's me!) and consumers (the programs and website you use). Providers, erm, provide machine-readable information about urls passed to them, and consumers take this information provided to them and display it to the user in a manner they think is appropriate.

To start with, I created a new PHP file to act as my provider over at and took a look at the different oEmbed types available - oEmbed has a type system of sorts, similar to open graph. I decided on link - while a rich would look cool, it would be almost impossible to test with every client out there, and I can't guarantee how the html would be rendered or what space it would have either.

With that decided, I made a list of the properties that I'd need to include in the json response:

  • version - The version of oEmbed. Currently 1.0 as of the time of typing.
  • type - The oEmbed type. I chose link here.
  • title - The title of the page
  • author_name - The name of the author
  • author_url - A link to the author's homepage.
  • provider_name - The provider's name.
  • provider_url - A link to the provider's homepage. I chose my blog index, since this script will only serve my blog.
  • cache_age - How long consumers should cache the response for. I put 1 hour (3600 seconds) here, since I usually correct mistakes after posting that I've missed, and I want them to go out fairly quickly.
  • thumbnail_url - A link to a suitable thumbnail picture.
  • thumbnail_width - The width of the thumbnail image, in pixels.
  • thumbnail_height - The width of the thumbnail image, in pixels.

Then I looked at the data I'd be getting from the client. It all comes in the form of GET parameters:

  • format - Either json or xml. Personally, I only support json.
  • url - The url to send oEmbed information for.

With all the information close at hand, I spent a happy hour or so writing code, and ended up with a script that outputs something like this:

    "version": "1.0",
    "type": "link",
    "title": "Website Integrations #1: Open Graph",
    "author_name": "Starbeamrainbowlabs",
    "author_url": "https:\/\/\/",
    "provider_name": "Stardust | Starbeamrainbowlabs' Blog",
    "provider_url": "https:\/\/\/blog\/",
    "cache_age": 3600,
    "thumbnail_url": "https:\/\/\/images\/logos\/open-graph.png",
    "thumbnail_width": 300,
    "thumbnail_height": 300

(See it for yourself!)

Though the specification includes requirements for satisfying 2 extra GET parameters, maxwidth and maxheight, I chose to ignore them since writing a dynamic thumbnail rescaling script is both rather complicated and requires a not insignificant amount of processing power every time it is used.

After finishing the oEmbed script, I turned my attention to one final detail: The special <link /> tag required for auto-discovery. A quick bit of PHP in the article page renderer adds something like this to the header:

<link rel="alternate" type="application/json+oembed" href="" />

and with that, my oEmbed provider implementation is complete - but it still needs testing! Unfortunately, testing tool for oEmbed are few and far between, but I did manage to find a few:

  • oEmbed Tester - A basic testing tool. Appears to work well for the most part - except the preview. Not sure why it says "Preview not available." all the time.
  • Iframely URL Debugger - Actually a testing tool for some commercial tool or other, but it still appears to accurately test not only oEmbed, but open graph and twitter cards (more on them in the next post!) too!

After testing and fixing a few bugs, my oEmbed provider was complete! Next time, I'll be taking a look at twitter's take on the subject: Twitter cards.

Found this interesting? Comment below! Share it with a friend!

Website Integrations #1: Open Graph

The logo of the Open Graph protocol.

These days, if you share a link to a website or a blog post with a friend or on a social networking site, sometimes the link expands to a preview of the link you've just posted. Personally, I find this behaviour to be quite helpful, as it lets me get an idea as to what it is that I'm about to click on.

Unfortunately, when it comes to the code behind these previews, there are no less than 3(!) different protocols that you need to implement in order to get it to work, since facebook, twitter, and the rest of the web community haven't been talking to each other quite like they should have been.

Anyway, after implementing these 3 protocols and having a bit of trouble with them, I thought I'd write up a mini-series on the process I went through, the problems I encountered, and how I solved them. In this post, I'm going to explain Facebook's Open Graph protocol.

I decided that I'd implement these 3 protocols on my home page and each blog post's page. Open Graph was the easiest - all it requires is a bunch of meta tags. These tags are split into 2 parts - the common tags, which all page types should have, and the type-specific tags, which depend on the type of page you're implementing them on. Here's the list of common tags I implemented:

  • og:title - The title of your page
  • og:description - A short description of your page
  • og:image, og:image:url, and og:image:secure_url - The url of an image that would fit as a preview for the page
  • og:url - The url of the page (not sure why this is required, since you have to know the url in order to require the page... :P Perhaps it's to help with deduplication - I'm not sure)

These were fairly easy for my home page:

<meta property="og:title" content="Starbeamrainbowlabs" />
<meta property="og:description" content="Hi! I am a computer science student who is in their second year at Hull University. I started out teaching myself about various web technologies, and then I managed to get a place at University, where I am now." />
<meta property="og:image" content="" />
<meta property="og:image:url" content="" />
<meta property="og:image:secure_url" content="" />
<meta property="og:url" content="" />

When I went to test it using Facebook's official testing tool, the biggest problem I had was that the image wouldn't show up - no matter what I did. I eventually found this stackoverflow answer which explained that Facebook doesn't support https urls in anything other than the og:image:secure_url meta tag (even though they say they do) - so changing the urls to regular http solved the problem.

Next, I took a look at the type-specific tags. There's a whole bunch of them (check out this section of the spec) - I decided on the profile type for the index page of my website here:

<meta property="og:type" content="profile" />

The profile type has a few extra specific meta tags that need setting too, so I added those:

<meta property="profile:first_name" content="Starbeamrainbowlabs" />
<meta property="profile:last_name" content="Tjovik" />
<meta property="profile:username" content="Starbeamrainbowlabs" />

With that done, I turned my attention to my blog posts. Since the page is rendered in PHP (and typing out all those meta tags was a rather annoying), I created a teensy little framework to output the meta tags for me

$metaTags = [];
$metaTags["property"] = "value";

$renderedMetaTags = "";
foreach($metaTags as $metaKey => $metaValue)
    $renderedMetaTags .= "\t<meta property=\"$metaKey\" content=\"$metaValue\" />";

Now I can add as many meta tags as I like, with a fraction of the typing - and it looks neater too :D With that done, I implemented the basic meta tags. Here's some example output from the last post:

<meta property="og:title" content="4287 Reasons why your comments weren't posted" />
<meta property="og:description" content="I don't get a lot of real comments on here from what I can tell, as you've probably noticed. I don't particularly mind (though it's always awesome whe.... (click to read more)" />
<meta property="og:image" content="" />
<meta property="og:image:url" content="" />
<meta property="og:image:secure_url" content="" />
<meta property="og:url" content="" />

That wasn't too tough. Next, I looked at the list of types again, and chose the article type for my blog posts.

<meta property="og:type" content="article" />

Like the profile type earlier, the article type also comes with a few type-specific meta tags (what they mean by not fitting into a 'vertical' I have no idea). I decided not to implement all the type-specific meta tags available here, since not all of them were practical to implement. Here's some more example output for the new tags:

<meta property="article:author" content="" />
<meta property="article:published_time" content="2017-04-08T12:56:46+01:00" />

Unfortunately, the article published time is really awkward to get hold of actually (even though it's outputted at the bottom of every article) , so I went with the 'last modified' time instead. The published time is marked up with html microdata Hopefully it doesn't cause too many issues later - though I can always change it :P

With that (and a final test), it looked like my Open Graph implementation was working as intended. Next time, I'll show you how I implemented a simple oEmbed provider.

Useful Links

Pepperminty Wiki Turns 2!

Pepperminty Wiki turns 2(!) :D

2 years ago today, I decided that I'd build a wiki. At the time, I was unsatisfied with all the currently solutions out there - they were either too bulky, old and unmaintained, or just otherwise not quite right. I decided to build something different: An entire wiki in a single file. One single file that you can drop onto your web server and have it just work.

Although I've had to rework and rewrite a few things along the way, development has generally gone ok. I've found that as it's grown, I've needed to change the way I design it slightly - it's been a great learning experience!

In May this year, I managed to get Pepperminty Wiki into Awesome Self Hosted, a list of cool pieces of software that you can host on your own server - a list that has ~12,000 stars on GitHub.

Fast forward to the present, and I find myself with an awesome wiki - that's still all contained in a single file. It's powered by Parsedown. It supports multiple users, page protection, sub pages, full text search (!), customisable themes, tags, redirects, and more! It's also got a whole host of configurable settings system - allowing you to customise how your wiki works to your liking. Best of all, it's got a flexible module based system - so anyone can come along and write a new module to extend it's functionality.

If this sounds like the kind of thing you'd like to use yourself, just head over to the getting your own copy section on it's page (for the lazy, the online downloader is here).

Going forwards, I'd like to a commenting system to let people comment on pages on a wiki. I'd like to add edit previews. I'd like to add a GUI for the settings file. I've got so many ideas that it's difficult to choose which to do next :D

Thank you, everyone for the last 2 years. Here's to another 2 amazing fun filled years! I don't intend to stop development any time soon :D

An easier way to debug PHP

A nice view of some trees taken by Mythdael I think, the awesome person who designed this website :D

Recently at my internship I've been writing quite a bit of PHP. The language itself is OK (I mean it does the job), but it's beginning to feel like a relic of a bygone era - especially when it comes to debugging. Up until recently I've been stuck with using echo() and var_dump() calls all over the place in order to figure out what's going on in my code - that's the equivalent of debugging your C♯ ACW with Console.WriteLine() O.o

Thankfully, whilst looking for an alternative, I found xdebug. Xdebug is like visual studio's debugging tools for C♯ (albeit a more primitive form). They allow you to add breakpoints and step though your PHP code one line at a time - inspecting the contents of variables in both the local and global scope as you go. It improves the standard error messages generated by PHP, too - adding stack traces and colour to the output in order to make it much more readable.

Best of all, I found a plugin for my primary web development editor atom. It's got some simple (ish) instructions on how to set up xdebug too - it didn't take me long to figure out how to put it to use.

I'll assume you've got PHP and Nginx already installed and configured, but this tutorial looks good (just skip over the MySQL section) if you haven't yet got it installed. This should work for other web servers and configurations too, but make sure you know where your php.ini lives.

XDebug consists of 2 components: The PHP extension for the server, and the client that's built into your editor. Firstly, you need to install the server extension. I've recorded an asciicast (terminal recording) to demonstrate the process:

(Above: An asciinema recording demonstrating how to install xdebug. Can't see it? Try viewing it on

After that's done, you should be able to simply install the client for your editor (I use php-debug for atom personally), add a breakpoint, load a php page in your web browser, and start debugging!

If you're having trouble, make sure that your server can talk directly to your local development machine. If you're sitting behind any routers or firewalls, make sure they're configured to allow traffic though on port 9000 and configured to forward it on to your machine.

Art by Mythdael