

Securing a Linux Server Part 2: SSH

Wow, it's been a while since I posted something in this series! Last time, I took a look at the Uncomplicated Firewall, and how you can use it to control the traffic coming in (and going out) of your server. This time, I'm going to take a look at steps you can take to secure another vitally important part of most servers: SSH. It's used by servers and their administrators across the world to talk to one another, and if someone who isn't supposed to be there manages to get in, they could do all kinds of damage!

The first, and easiest, thing we can do to improve security is to prevent the root user from logging in. If you haven't done so already, you should create a new user on your server, set a good password, and give it superuser privileges. Log in with the new user account, and then edit /etc/ssh/sshd_config, finding the line that says something like

PermitRootLogin yes

....and change it to

PermitRootLogin no

Once done, restart the ssh server. Your config might be slightly different (e.g. it might be PermitRootLogin without-password) - but the principle is the same. This adds an extra barrier to getting into your server, as now attackers must not only guess your password, but your username as well (some won't even bother, and keep trying to login to the root account :P).
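
If you're not sure how to restart it, one of the following should do the trick, depending on your init system (on Debian-based systems the service is usually just called ssh - the name may vary elsewhere):

sudo systemctl restart ssh
# ...or, on older init systems:
sudo service ssh restart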

Next, we can move SSH to a non-standard port. Some might argue that this isn't a good security measure to take and that it doesn't actually make your server more secure, but I find that it's still a good measure to take for 2 reasons: defence in depth, and preventing excessive CPU load from all the dumb bots that try to get in on the default port. With that, let's make another modification to /etc/ssh/sshd_config. Make sure you test at every step you take - if you lock yourself out, you'll have a hard time getting back in again....

Port 22

Change the 22 in the above to any other free port number (ideally something in the 1024-65535 range, so that it doesn't clash with another well-known service). Next, make sure you've allowed the new port through your firewall! If you're using ufw, my previous post (link above) gives a helpful guide on how to do this. Once done, restart your SSH server again - and try logging in on the new port before you close your current session. That way, if you make a mistake, you can fix it through your existing session.
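
For example, if you'd picked port 22022 (just a stand-in - use whatever port you actually chose), a test login from a second terminal might look like this:

ssh -p 22022 your_user@your.server.address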

Once you're confident that you've got it right, you can close port 22 on your firewall.

So we've created a new user account with a secure password (tip: use a password manager if you have trouble remembering it :-)), disabled root login, and moved the ssh port to another port number that's out of the way. Is there anything else we can do? Turns out there is.

Passwords are not the only way we can authenticate against an SSH server. Public-private keypairs can be used too - and are much more secure - and convenient - than passwords if used correctly. You can generate your own public-private keypair like so:

ssh-keygen -t ed25519

It will ask you a few questions, such as a password to encrypt the private key on disk, and where to save it. Once done, we need to tell ssh to use the new public-private keypair. This is fairly easy to do, actually (though it took me a while to figure out how!). Simply edit ~/.ssh/config (or create it if it doesn't exist), and create (or edit) an entry for your ssh server, making it look something like this:

Host            {server_name}
    HostName        {server_address}
    Port            {port_number}
    IdentityFile    {path/to/private/keyfile}

It's the IdentityFile line that's important. The Host and Port lines simply make it such that you can type ssh {server_name} (or whatever your server is called) and it will figure out the hostname and port number for you.
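
One thing worth pointing out: the server needs a copy of your new public key in the target user's ~/.ssh/authorized_keys file before key-based logins will work. While password logins are still enabled, ssh-copy-id can do this for you - the port and address below are placeholders, and the key path assumes the ed25519 default from earlier:

ssh-copy-id -i ~/.ssh/id_ed25519.pub -p {port_number} your_user@your.server.address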

With a public-private keypair now in use, there's just one step left: disable password-based logins. I'd recommend trialling the keypair for a while first to make sure you haven't messed anything up - because once you disable password logins, if you lose your private key, you won't be getting back in again any time soon!

Again, open /etc/ssh/sshd_config for editing. Find the line that starts with PasswordAuthentication, and comment it out with a hash symbol (#), if it isn't already. Directly below that line, add PasswordAuthentication no.
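
The relevant bit of /etc/ssh/sshd_config should end up looking something like this:

#PasswordAuthentication yes
PasswordAuthentication no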

Once done, restart ssh for a final time, and check it works. If it does, congratulations! You've successfully secured your SSH server (to the best of my knowledge, of course). Got a tip I haven't covered here? Found a mistake? Let me know in a comment below!

Compilers 101: Build your own flex + bison compiler in a few easy(?) steps

So you want to build your own compiler? Great! Don't know where to start? This guide should help! At University, we're building our own compiler for a custom programming language invented by our lecturer with a pair of GNU tools by the name of flex and bison - which I've blogged about before. Since that post, I've learnt a ton about how the whole process works, so I thought I'd write up a more coherent blog post on the subject :P

A diagram explaining the process of building a compiler. Explained below.

Stage 1: Planning

The whole process starts with railroad diagrams (also known as syntax diagrams) of the language you want to write a compiler for. Having an accurate set of railroad diagrams is essential to understanding precisely how the language is put together, which is rather useful for the next step.

That next step is converting the railroad diagrams into plain BNF (Backus-Naur Form). Unfortunately, bison doesn't support EBNF-like notation at the current time, so only plain-old BNF will do.

Stage 2: Lexing

With your railroad diagrams converted into BNF, you can start writing code! The first chunk of code that needs writing is the lexer. Lexing is what flex is good at - and involves converting the input source code into lexemes - discrete sequences of characters that match a particular pattern and can be assigned a particular category name, turning it into a token. Perhaps an example would help. Consider the following:

void do_awesome_stuff(int a, string b) {
    /* Code here */
}

The above can be turned into a sequence of tokens, not unlike the following (ignoring whitespace tokens, of course):

TYPE: void
IDENTIFIER: do_awesome_stuff
OPEN_BRACKET: (
TYPE: int
IDENTIFIER: a
COMMA: ,
TYPE: string
IDENTIFIER: b
CLOSE_BRACKET: )
OPEN_BRACE: {
COMMENT: /* Code here */
CLOSE_BRACE: }

See? We can identify 8 token types in the source string: TYPE, IDENTIFIER, COMMA, OPEN_BRACKET, CLOSE_BRACKET, OPEN_BRACE, COMMENT, and CLOSE_BRACE. These types and the rules to match them can be found by analysing a combination of the railroad diagrams and the BNF you created earlier.
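
To give you a rough idea of what the flex side of things might look like, here's an (untested) sketch of some rules that could match those token types - the exact token names and patterns are assumptions that will depend on your language and on your bison %token declarations:

"void"|"int"|"string"       { return TYPE; }
[a-zA-Z_][a-zA-Z0-9_]*      { return IDENTIFIER; }
","                         { return COMMA; }
"("                         { return OPEN_BRACKET; }
")"                         { return CLOSE_BRACKET; }
"{"                         { return OPEN_BRACE; }
"}"                         { return CLOSE_BRACE; }
"/*"([^*]|\*+[^*/])*\*+"/"  { return COMMENT; }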

Stage 3: Parser the first

With a lexer in hand, we can now look at writing the parser. This is done in two stages. The parser itself, and upgrading said parser to generate a parse tree.

Let's talk about the parser first. The parser can be largely created simply by running a few regular-expression find and replace rules on your BNF, actually. From there, it's just a case of adding the header and the footer to complete the document.

Let's take a look at some example BNF:

<instructions> ::= START <lines> END

<lines> ::= <lines> <line>
    | <line>

<line> ::= <command>

<command> ::= <cmd_name> <number>

<cmd_name> ::= FD
    | BK
    | LT
    | RT

<number> ::= <number> <digit>
    | <digit>

<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

The above matches something like the following:

FD 100
RT 180
FD 125
LT 90
BK 50

Very interesting (a virtual cookie is available for anyone who gets the reference as to what this grammar represents!). Let's look at converting that into something bison can understand:

instructions : START lines END

lines : lines line
    | line

line : command

command : cmd_name number

cmd_name : FD
    | BK
    | LT
    | RT

number : number digit
    | digit

digit : 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

That's looking much better already! Simply by using the regular expression substitutions:

  1. <([a-z_]+)> -> $1
  2. ::= -> :
  3. \n\n -> \n\t;\n\n

....we can get most of the way towards something that bison can understand. Next, we need to refactor it a bit to tell bison which tokens are coming from the lexer (which I'll leave up to you to write as an exercise), so that it doesn't get them confused with the rules defined in the BNF(-like) grammar.

Let's get rid of the rule for number and digit first, since we can do those in the lexer quite easily. Next, let's add a %token definition to the top to tell it which are coming from the lexer. It's good practice to define everything that comes from the lexer in uppercase, and everything that's a rule that exists only in bison in lowercase:

%token START END FD BK LT RT NUMBER

%start instructions

We've also defined a start symbol - the symbol that, when bison reaches it, tells it that it has completed the parsing process, since bison is a bottom-up parser.

Lastly, we need to reference the lexer itself. Thankfully that's easy too by appending to your new bison file:


#include "lex.yy.c"

Very nice. Don't forget about the newline at the end of the file - flex and bison will complain if it isn't present! Here's the completed bison file:

%token START END FD BK LT RT NUMBER

%start instructions

%%

instructions : START lines END
    ;

lines : lines line
    | line
    ;

line : command
    ;

command : cmd_name NUMBER
    ;

cmd_name : FD
    | BK
    | LT
    | RT
    ;

%%

#include "lex.yy.c"

With a brand-new bison file completed, there's just one component of the parser left - a plain-old C file that calls it. Let's create one of those quickly:


#include <stdio.h>

int yyparse(void);

int main(void)
{
    # if YYDEBUG == 1
    extern int yydebug;
    yydebug = 1;
    #endif

    return yyparse();
}

void yyerror(char *error_message)
{
    fprintf(stderr, "Error: %s\nExiting\n", error_message);
}

The highlighted lines enable a special debugging mode built into bison if the standard compile-time symbol YYDEBUG is specified - and bison is run with a few special parameters. Here's the sequence of commands needed to compile this:

flex lexer.l
bison -tv parser.y
gcc -Wall -Wextra -g main.c -lfl -ly -DYYDEBUG -D_XOPEN_SOURCE=700

The gcc command is probably a bit long-winded, but it does several useful things for us:

  • Shows additional warnings just in case we've made a mistake that might be an issue later (-Wall -Wextra)
  • Includes additional debugging information in the output file to allow debugging with gdb (the GNU Debugger) if necessary (-g)
  • Fixes strange errors on some systems (-D_XOPEN_SOURCE=700)

If you're on a Windows system, you may need to remove the -ly flag - it tells gcc that we'll be referencing the bison library, and it appears to be required on the Linux machines I use.

Stage 4: Parser again

Congratulations on getting this far! You've now got a lexer and a parser - so it's time to put them to use. This is done by utilising the parser to build a parse tree - a tree of nodes that represent the input. Here's an example tree:

An example parse tree.

As you can see, each high-level node references one or more lower-level nodes, and the structure of the tree represents the first 2 lines of the example input above. The nodes in yellow are lexical tokens that come directly from flex - these are called terminals, or leaf nodes. The ones in purple come from the bison rules (which we derived from the BNF we wrote at the beginning of this post) - and are non-terminals, or tree nodes.

With this in mind, let's introduce another feature or two of bison. Firstly, let's take a look at revising that %token declaration we created above:

%token<val_num> NUMBER

The important bit here is the <val_num>. Here we tell bison that a value should be attached to the token NUMBER - and that it will be of type int. After telling bison that it should expect a value, we need to give it a place to put it. Let's write some more code to go just below the %token declarations:

%union {
    int val_num;
}

There we go! Excellent - we've got a place to put it. Now we just need to alter the lexer to convert the token value to an int and put it there. That's not too tough, thankfully - but if you're having trouble with it, here's a hint:

{number}        { yylval.val_num = atoi(yytext); return(NUMBER); }

Now we have it passing numbers correctly, let's look briefly at generating that parse tree. I'm not going to give the game away - just a few helpful hints as to what you need to do here - otherwise it's not as fun :P

Generating the parse tree can be considered both the most challenging part of the experience (especially if you don't know what you're doing) and the easiest to deal with at the same time. Knowing your stuff and your end goal before you start makes the whole process a lot easier.

The first major step is to create a struct that can represent a type of node in your parse tree. It might be useful to store several properties here - such as the node type (an enum will come in handy here), a spot for the value of a lexical token (or a reference to it in a symbol table if you have one), and references to child nodes in the parse tree.
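
As a rough sketch (the names here are only suggestions - nothing about them is dictated by bison), such a struct might look something like this:

typedef struct TreeNodeStruct TreeNodeStruct;
typedef TreeNodeStruct* TreeNode;

struct TreeNodeStruct {
    int node_type;      // Which kind of node this is - e.g. a value from an enum such as NODE_COMMAND
    int value;          // The value of a lexical token, if this node holds one
    TreeNode left;      // Left child of this node, or NULL
    TreeNode right;     // Right child of this node, or NULL
};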

The second major step of the process is to create a utility method that creates a new node of the tree on the heap, and then revise the bison file to get each rule to create new nodes in such a way that a complete parse tree has been built by the time it reaches the start symbol (or top node of the tree - which, in the case of the above, is instructions). For the purposes of this post, I'll be using a method with this prototype:

TreeNode create_node(int item, int node_type, TreeNode left, TreeNode right);

Your tree node struct and subsequent creation method may vary. With this in hand, we can revise the bison rules we created above to create these nodes we've been talking about. Here's a quick pointer on how to revise the rule for command above:

command : cmd_name NUMBER   { $$ = create_node($2, NODE_COMMAND, $1, NULL); }

This might look a bit strange, but let's break it down. The bit in curly braces is some (almost) plain C code that creates the node and returns a pointer to it to bison. The $$ is the return value for that node - which, I might add, reminds me of something I forgot above. We need to tell bison about our new tree node data type and which rules should return it:

%type<val_tnode> instructions lines line command cmd_name

/* And in %union { ... } ..... */
TreeNode val_tnode;

This is almost the same as the %token<val_num> we did before, but this time we're defining the return value of a rule to be this type - not a token. With that little interlude out of the way, let's return to the code above. $1 and $2 refer to the first and second items in the rule definition respectively - and hold the types that we defined above in the %token and %type directives. Since bison is a bottom-up parser, by the time this code executes, all of the node's child nodes have (hopefully) already been created - and we just have to tie them all up together with a new node. In the case of my little example above, $1 is of type TreeNode, and $2 is of type int (that is, if I didn't make any mistakes further up!).

Stage 5: Blasting off to code generation and beyond

Phew! That's a lot of work. If you've read this far, thank you and well done! It's been a long journey for both you the reader and me the writer, but you're almost done.

While conceptually simple when broken down, the whole process actually gets rather involved - especially when writing the BNF and the parser (the latter of which can be a particular pain due to shift/reduce and reduce/reduce errors), and the amount of code and head scratching you've got to do to get to this point is enormous. My best advice is to take it slow and don't rush - you'll only cause more problems for yourself if you try and jump the gun. Make sure that each stage works as you intend before you continue - back-pedalling to fix an issue can be particularly annoying as it can be bothersome to work out which stage the bug is actually in.

The last step of the whole process is to actually do something with the parse tree we've worked so hard to create. Thankfully, that's not too difficult - as we can put some additional code in the { } block of the starting symbol to call methods that will do things like perform some optimisation, print the tree to the console - or generate some sweet code. While the actual generation of code is beyond the scope of this article, I may end up posting about some optimisation techniques you can use on a parse tree after I've finished fiddling with float handling, symbol tables, and initial code generation in my ACW (Assessed Course Work).

Found this useful? Found a bug in this post? Got a suggestion? Comment below! Since I don't have any real analytics on this blog besides the server logs, I've no idea if you've read it really unless you comment :P

Sources and Further Reading

TeleConsole Client is available on NuGet!

Some cool-looking white circuits on a blue background from the NuGet website. (Above: Some cool-looking circuits that feature on the NuGet website)

Hey! After a large amount of research, I've finally figured out how to create a simple NuGet package. Since I ended up using TeleConsole in a recent ACW and it doesn't have any dependencies (making packaging easier!), I decided to use it to test the system.

I think it's turned out rather well actually - you can find it on NuGet here.

Since it's been such a complicated process, rather than talking about TeleConsole itself, I'd like to give a sort-of tutorial on how I did it instead (if you'd like to read more about TeleConsole, I posted about it when I first released it here).

To start with, you'll need to install NuGet. Once done, the next step is to create a .nuspec file for your project. It goes in the same directory as the .csproj file for the project you want to publish on NuGet. I haven't yet figured out how to reference another project in the same solution and have it work with NuGet, but I should imagine it's mostly automatic. Here's the .nuspec file for TeleConsole:

<?xml version="1.0"?>
    <releaseNotes>Initial nuget release.</releaseNotes>
    <copyright>Copyright 2017</copyright>
    <tags>Debugging Networking Console Remote</tags>

As you can see, it's actually a fairly simple format - based on XML of course, since C♯ seems to love it for some reason. The bits in $ signs are special - they are references to the corresponding values in the .csproj file. You can do it for <version> too - but I was experiencing issues with it not picking this up correctly, so I'm specifying it manually. More information about this can be found on the Microsoft website - links are available at the bottom of this post.

With that done, the next step is to package it into a .nupkg file, which is basically just a renamed .zip file with a specific structure. The nuget command-line application has the ability to do this for us - but it doesn't work on Linux without an extra argument (thanks to @ArtOfSettling on GitHub for discovering this!):

nuget pack -MsbuildPath /usr/lib/mono/msbuild/15.0/bin/

...Windows users don't need to include the -MsbuildPath /usr/lib/mono/msbuild/15.0/bin/ bit. This command will output a .nupkg file, which you can then upload to nuget.org, once you're signed in.
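
If you'd rather skip the web upload form, the nuget command-line client can push the package for you too - something like the following should work (the filename is a placeholder for whatever the pack step produced, and the API key is one you generate from your account settings on the NuGet website):

nuget push YourPackage.1.0.0.nupkg your-api-key-here -Source https://api.nuget.org/v3/index.json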

Sources and Further Reading

Deep dive: Email, Trust, DKIM, SPF, and more

Lots of parcels (Above: Lots of parcels. Hopefully you won't get this many through the door at once..... Source)

Now that I'm on holiday, I've got some time to write a few blog posts! As I've promised a few people a post on the email system, that's what I'll look at in this post. I'm going to take you on a deep dive through the email system and trust. We'll be journeying through the fields of DKIM signatures, and climb the SPF mountain. We'll also investigate why the internet needs to take this journey in the first place, and look at some of the challenges one faces when setting up their own mail server.

Hang on to your hats, ladies and gentlemen! If you get to the end, give yourself a virtual cookie :D

Before we start though, I'd like to mention that I'll be coming at this from the perspective of my own email server that I set up myself. Let me introduce to you the cast: Postfix (the SMTP MTA), Dovecot (the IMAP MDA), rspamd (the spam filter), and OpenDKIM (the thing that deals with DKIM signatures).

With that out of the way, let's begin! We'll start off by mapping out the journey a typical email undertakes.

The path a typical email takes. See the explanation below.

Let's say Bob Kerman wants to send Bill an email. Here's what happens:

  1. Bob writes the email and hits send. His email client connects to his email server, logs in, and asks the server to deliver the message for him.
  2. Bob's mail server takes the email and reads the To header to see who it's addressed to (in this case, Bill), figures out where Bill's mail server is located, connects to it, and asks it to deliver Bob's message to Bill. Bill's mail server takes the email and files it in Bill's inbox.
  3. Bill connects to his mail server and retrieves Bob's message.

Of course, this is simplified in several places. Bob's mail server will obviously need to do a few DNS lookups to find Bill's mail server and fiddle with the headers of Bob's message a bit (such as adding a Received header etc.), and Bill's mail server won't just accept the message for delivery without checking out the server it came from first. How does it check, though? What's preventing some random server from pretending to be Bob's mail server and sending an imposter?

Until relatively recently, the answer was, well, nothing really. Anyone could send an email to anyone else without having to prove that they could indeed send email in the name of a domain. Try it out for yourself by telnetting to a mail server on port 25 (unencrypted SMTP) and typing in something like this:

HELO frank.example
MAIL FROM: <someone-important@bobs-rockets.example>
RCPT TO: <bill@bills-boosters.example>
DATA

Hello! This is an email to remind you.....
.

Oh, my! Frank can connect to any mail server and pretend that somebody else entirely is sending a message to Bill! Mail servers that allow this are called open relays, and today they usually find themselves on several blacklists within minutes. Ploys like these are easy to foil, thankfully (by only accepting mail for your own domains), but it still leaves the problem of what to do about random people connecting to your mail server and delivering spam to your inbox that claims to be from someone they aren't supposed to be sending mail for.

In response, some mail servers demanded things like the IP that connects to send an email must reverse to the domain name that they want to send email from. Clever, but when you remember that anyone can change their own PTR records, you realise that it's just a minor annoyance to the determined spammer, and another hurdle to the legitimate person in setting up their own mail server!

Clearly, a better solution is needed. Time to introduce our first destination: SPF. SPF stands for Sender Policy Framework, and defines a mechanism by which a mail server can determine which IP addresses a domain allows mail to be sent from in its name. It's a TXT record that sits at the root of a domain. It looks something like this:

v=spf1 a mx ptr ip4: ip6:2001:41d0:e:74b::1 -all

The above is the SPF TXT record for my domain. It's quite simple, really - let's break it down.


v=spf1

This just defines the version of the SPF standard. There's only one version so far, so we include this to state that this record is an SPF version 1 record.

a mx ptr

This says that the domain that the sender claims to be from must have an a and an mx record that matches the IP address that's sending the email. It also says that the ptr record associated with the sender's IP must resolve to the domain the sender claims to be sending from, as described above (it does help with dealing with infected machines and such).

ip4: ip6:2001:41d0:e:74b::1

This bit says that the IPv4 and IPv6 addresses listed here are explicitly allowed to send mail in the name of the domain.

After all of the above, this bit isn't strictly necessary, but it says that all the IP addresses found in the a records of the other domains listed are also allowed to send mail in the name of the domain.


-all

Lastly, this says that if you're not on the list, then your message should be rejected! Other variants on this include ~all (which says "put it in the spam box instead"), and +all (which says "accept it anyway", though I can't see how that's useful :P).

As you can see, SPF allows a mail server to verify if a given client is indeed allowed to send an email in the name of any particular domain name. For a while, this worked a treat - until a new problem arose.

Many of the mail servers on the internet didn't (and many probably still don't!) support encryption when connecting to one another and delivering mail, as certificates were expensive and difficult to get hold of (nowadays we've got LetsEncrypt, who give out certificates for free!). The encryption used when mail servers connect to one another is practically identical to that used in HTTPS - so if done correctly, the identity of the remote server can be verified and the emails exchanged encrypted - provided the world's certification authorities aren't corrupted, of course.

Since most emails weren't encrypted when in transit, a new problem arose: man-in-the-middle attacks, whereby an email is altered by one or more servers in the delivery chain. Thinking about it - this could still happen today even with encryption, if any one server along an email's route is compromised. To this end, another mechanism was desperately needed - one that would allow the receiving mail server to verify that an email's content / headers hadn't been surreptitiously altered since it left the origin mail server - potentially preventing awkward misunderstandings.

Enter stage left: DKIM! DKIM stands for DomainKeys Identified Mail - which, in short, means that it provides a method by which a receiving mail server can cryptographically prove that a message hasn't been altered during transit.

It works by having a public-private keypair, in which the public key can only decrypt things, but the private key is capable of encrypting things. A hash of the email's headers / content is computed and encrypted with the private key. Then the encrypted hash is attached to the email in the DKIM-Signature header.

The receiving mail server does a DNS lookup to find the public key, and decrypts the hash. It then computes its own hash of the email headers / content, and compares it against the decrypted hash. If they match, then the email hasn't been fiddled with along the way!

Of course, not all the headers in the email are hashed - only a specific subset are included in the hash, since some headers (like Received and X-Spam-Result) are added and altered during transit. If you're interested in implementing DKIM yourself - DigitalOcean have a smashing tutorial on the subject, which should adapt easily to whatever system you're running yourself.
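
If you're curious, you can poke at these DNS records yourself with dig. Both example.com and the selector mail below are placeholders - the real selector is whatever appears after s= in a message's DKIM-Signature header:

# SPF lives in a TXT record at the root of the domain
dig +short TXT example.com
# The DKIM public key lives at {selector}._domainkey.{domain}
dig +short TXT mail._domainkey.example.com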

With both of those in place, Bill's mail server can now verify that the sending server is allowed to send email on behalf of Bob's domain, and that the message content hasn't been tampered with since it left Bob's mail server. It can also catch the Franks of the world in the act of trying to deliver spam in someone else's name!

There is, however, one last piece of the puzzle left to reveal. With all this in place, how do you know if your mail was actually delivered? Is it possible to roll SPF and DKIM out gradually so that you can be sure you've done it correctly? This can be a particular issue for businesses and larger email server setups.

This is where DMARC comes in. It's a standard that lets you specify an email address you'd like to receive DMARC reports at, which contain statistics as to how many messages receiving mail servers got that claimed to be from you, and what they did with them. It also lets you specify what percentage of messages should be subject to DMARC filtering, so you can roll everything out slowly. Finally, it lets you specify what should happen to messages that fail either SPF, DKIM, or both - whether they should be allowed anyway (for testing purposes), quarantined, or rejected.

DMARC policies get specified (yep, you guessed it!) in a DNS record. Unlike SPF though, they go in as a TXT record at _dmarc.yourdomain.com, substituting yourdomain.com for your own domain name. Here's an example:

v=DMARC1; p=none; rua=mailto:{your-reporting-address}

This is just a simple example - you can get much more complex ones than this! Let's go through it step by step.


v=DMARC1

Nothing to see here - just a version number, as in SPF.


p=none

This is the policy of what should happen to messages that fail. In this example we've used none, so messages that fail will still pass right on through. You can set it to quarantine or even reject as you gain confidence in your setup.

rua=mailto:{your-reporting-address}

This specifies where you want DMARC reports to be sent. Each mail server that receives mail from your mail server will bundle up statistics and send them once a day to this address. The format is in XML (which won't be particularly easy to read), but there are free DMARC record parsers out there on the internet that you can use to decode the reports, like dmarcian.

That completes the puzzle. If you're still reading, then congratulations! Post in the comments and say hi :D We've climbed the SPF mountain and discovered how email servers validate who is allowed to send mail in the name of another domain. We've visited the DKIM signature fields and seen how the content of email can be checked to see if it's been altered during transit. Lastly, we took a stroll down DMARC lane to see how it's possible to be sure what other servers are doing with your mail, and how a large email server setup can implement DMARC, DKIM, and SPF more easily.

Of course, I'm not perfect - if there's something I've missed or got wrong, please let me know! I'll try to correct it as soon as possible.

Lastly, this is, as always, a starting point - not an ending point. An introduction if you will - it's up to you to research each technology more thoroughly - especially if you're thinking of implementing them yourself. I'll leave my sources at the bottom of this post if you'd like somewhere to start looking :-)

Sources and Further Reading

Semi-automated backups with duplicity and an external drive

A bunch of hard drives. (Above: A bunch of hard drives. The original can be found here.)

Since I've recently got myself a raspberry pi to act as a server, I naturally needed a way to back it up. Not seeing anything completely to my tastes, I ended up putting something together that did the job for me. For this I used an external hard drive, duplicity, sendxmpp (sudo apt install sendxmpp), and a bit of bash.

Since it's gone rather well for me so far, I thought I'd write a blog post on how I did it. It still needs some tidying up, of course - but it works in its current state, and perhaps it will help someone else put together their own system!

Step 1: Configuring the XMPP server

I use XMPP as my primary instant messaging server, so it's only natural that I'd want to integrate the system in with it to remind me when to plug in the external drive, and so that it can tell me when it's done and what happened. Since I use prosody as my XMPP server, I can execute the following on the server:

sudo prosodyctl adduser backups@your.xmpp.server

...and then enter a random password for the new account. From there, I set up a new private persistent multi-user chatroom for the messages to filter into, and set my client to always notify when a message is posted.

After that, it was a case of creating a new config file in a format that sendxmpp will understand, containing the JID of the new account and its password:

backups@your.xmpp.server thesecurepassword

Step 2: Finding the id of the drive partition

With the XMPP side of things configured, next I needed a way to detect whether the drive was plugged in or not. Thankfully all partitions have a unique id built-in, which you can use to see if it's plugged in or not. It's easy to find, too:

sudo blkid

The above will list all available partitions and their UUID - the unique id I mentioned. With that in hand, we can now check if it's plugged in or not with a cleverly crafted use of the readlink command:

readlink /dev/disk/by-uuid/${partition_uuid} 1>/dev/null 2>&1;
partition_found=$?;
if [[ "${partition_found}" -eq "0" ]]; then
    echo "It's plugged in!";
else
    echo "It's not plugged in :-(";
fi

Simple, right? readlink has an exit code of 0 if it managed to read the symbolic link in /dev/disk/by-uuid ok, and 1 if it didn't. The symbolic links in /dev/disk/by-uuid are helpfully created automatically for us :D From here, we can take it a step further to wait until the drive is plugged in:

# Wait until the drive is available
while true
do
    readlink "/dev/disk/by-uuid/${partition_uuid}" 1>/dev/null 2>&1;

    if [[ "$?" -eq 0 ]]; then
        break;
    fi

    sleep 1;
done

Step 3: Mounting and unmounting the drive

Raspberry Pis don't mount drives automatically, so we'll have to do that ourselves. Thankfully, it's not so tough:

# Create the folder to mount the drive into
mkdir -p ${backup_drive_mount_point};
# Mount it in read-write mode
mount "/dev/disk/by-uuid/${partition_uuid}" "${backup_drive_mount_point}" -o rw;

# Do backup thingy here

# Sync changes to disk
sync;
# Unmount the drive
umount "${backup_drive_mount_point}";

Make sure you've got the ntfs-3g package installed if you want to back up to an NTFS volume (Raspberry Pis don't come with it by default!).

Step 4: Backup all teh things!

There are more steps involved in getting to this point than I thought there were, but if you've made it this far, then congrats! Have a virtual cookie :D 🍪

The next part is what you probably came here for: duplicity itself. I've had an interesting time getting this to work so far, actually. It's probably easier if I show you the duplicity commands I came up with first.

# Create the archive & temporary directories
mkdir -p /mnt/data_drive/.duplicity/{archives,tmp}/{os,data_drive}
# Do a new backup
PASSPHRASE=${encryption_password} duplicity --full-if-older-than 2M --archive-dir /mnt/data_drive/.duplicity/archives/os --tempdir /mnt/data_drive/.duplicity/tmp/os --exclude /proc --exclude /sys --exclude /tmp --exclude /dev --exclude /mnt --exclude /var/cache --exclude /var/tmp --exclude /var/backups / file://${backup_drive_mount_point}/duplicity-backups/os/
PASSPHRASE=${data_drive_encryption_password} duplicity --full-if-older-than 2M --archive-dir /mnt/data_drive/.duplicity/archives/data_drive --tempdir /mnt/data_drive/.duplicity/tmp/data_drive /mnt/data_drive --exclude '**.duplicity/**' file://${backup_drive_mount_point}/duplicity-backups/data_drive/

# Remove old backups
PASSPHRASE=${encryption_password} duplicity remove-older-than 6M --force --archive-dir /mnt/data_drive/.duplicity/archives/os file:///${backup_drive_mount_point}/duplicity-backups/os/
PASSPHRASE=${data_drive_encryption_password} duplicity remove-older-than 6M --force --archive-dir /mnt/data_drive/.duplicity/archives/data_drive file:///${backup_drive_mount_point}/duplicity-backups/data_drive/

Path names have been altered for privacy reasons. The first duplicity command in the above was fairly straightforward - back up everything, except a few folders with cache files / temporary / weird stuff in them (like /proc).

I ended up having to specify the archive and temporary directories here to be on another disk because the Raspberry Pi I'm running this on has a rather... limited capacity on its internal micro SD card, so the default location for both isn't a good idea.

The second duplicity call is a little more complicated. It backs up the data disk I have attached to my Raspberry Pi to the external drive I've got plugged in that we're backing up to. The awkward bit comes when you realise that the archive and temporary directories are located on this same data-disk that we're trying to back up. To this end, I eventually found (through lots of fiddling) that you can exclude a folder in duplicity via the --exclude '**.duplicity/**' syntax. I've no idea why it's different when you're not backing up the root of the filesystem, but it is (--exclude ./.duplicity/ didn't work, and neither did /mnt/data_drive/.duplicity/).

The final two duplicity calls just clean up and remove old backups that are older than 6 months, so that the drive doesn't fill up too much :-)

Step 5: What? Where? Who?

We've almost got every piece of the puzzle, but there's still one left: letting us know what's going on! This is a piece of cake in comparison to the above:

function xmpp_notify {
        echo "$1" | sendxmpp --file "${xmpp_config_file}" --resource "${xmpp_resource}" --tls --chatroom "${xmpp_target_chatroom}"
}

Easy! All we have to do is point sendxmpp at our config file we created waaay in step #1, and tell it where the chatroom is that we'd like it to post messages in. With that, we can put all the pieces of the puzzle together:

#!/usr/bin/env bash

source .backup-settings

function xmpp_notify {
    echo "$1" | sendxmpp --file "${xmpp_config_file}" --resource "${xmpp_resource}" --tls --chatroom "${xmpp_target_chatroom}"
}

xmpp_notify "Waiting for the backup disk to be plugged in.";

# Wait until the drive is available
while true
do
    readlink "${backup_drive_dev}" 1>/dev/null 2>&1;

    if [[ "$?" -eq 0 ]]; then
        break;
    fi

    sleep 1;
done

xmpp_notify "Backup disk detected - mounting";

mkdir -p ${backup_drive_mount_point};

mount "${backup_drive_dev}" "${backup_drive_mount_point}" -o rw

xmpp_notify "Mounting complete - performing backup";

# Create the archive & temporary directories
mkdir -p /mnt/data_drive/.duplicity/{archives,tmp}/{os,data_drive}

echo '--- Root Filesystem ---' >/tmp/backup-status.txt
# Create the archive & temporary directories
mkdir -p /mnt/data_drive/.duplicity/{archives,tmp}/{os,data_drive}
# Do a new backup
PASSPHRASE=${encryption_password} duplicity --full-if-older-than 2M --archive-dir /mnt/data_drive/.duplicity/archives/os --tempdir /mnt/data_drive/.duplicity/tmp/os --exclude /proc --exclude /sys --exclude /tmp --exclude /dev --exclude /mnt --exclude /var/cache --exclude /var/tmp --exclude /var/backups / file://${backup_drive_mount_point}/duplicity-backups/os/ >>/tmp/backup-status.txt 2>&1
echo '--- Data Disk ---' >>/tmp/backup-status.txt
PASSPHRASE=${data_drive_encryption_password} duplicity --full-if-older-than 2M --archive-dir /mnt/data_drive/.duplicity/archives/data_drive --tempdir /mnt/data_drive/.duplicity/tmp/data_drive /mnt/data_drive --exclude '**.duplicity/**' file://${backup_drive_mount_point}/duplicity-backups/data_drive/ >>/tmp/backup-status.txt 2>&1

xmpp_notify "Backup complete!"
cat /tmp/backup-status.txt | sendxmpp --file "${xmpp_config_file}" --resource "${xmpp_resource}" --tls --chatroom "${xmpp_target_chatroom}"
rm /tmp/backup-status.txt

xmpp_notify "Performing cleanup."

PASSPHRASE=${encryption_password} duplicity remove-older-than 6M --force --archive-dir /mnt/data_drive/.duplicity/archives/os file:///${backup_drive_mount_point}/duplicity-backups/os/
PASSPHRASE=${data_drive_encryption_password} duplicity remove-older-than 6M --force --archive-dir /mnt/data_drive/.duplicity/archives/data_drive file:///${backup_drive_mount_point}/duplicity-backups/data_drive/

umount "${backup_drive_mount_point}";

xmpp_notify "Done! Backup completed. You can now remove the backup disk."

I've tweaked a few of the pieces to get them to work better together, and created a separate .backup-settings file to store all the settings in.

That completes my backup script! Found this useful? Got an improvement? Use a different strategy? Post a comment below!

An (unscientific) Introduction to I2C

I've recently bought an LCD display for a project. Since I don't have many pins to play with, I ended up buying an I2C-driven display to cut the data pins down to just 2: one for the data line itself (SDA), and one for the clock (SCL) that keeps everything on the bus in sync.

It's taken me some time to get to grips with the idea of I2C, so I thought I'd write up what I've learnt so far here, along with some helpful hints if you run into problems yourself.

In effect, I2C is a wire protocol that allows multiple devices to talk to each other over a single pair of cables. Every I2C device has a 7 bit hardware address burned into it that is used to address it - much like the Internet Protocol when it comes to it, actually. Devices can send messages to one another using these addresses - though not all at the same time, obviously!

If you want to talk directly over I2C with a device, then Wire.h is the library you want to use. Normally though, devices will come with their own library that utilises Wire.h and communicates with it for you.
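
To give you an idea of what talking to the bus directly looks like, here's a minimal sketch that just sends a single byte to a device using Wire.h. The address 0x27 is an assumption (it happens to be common for I2C LCD backpacks), so substitute whatever address your device actually uses:

#include <Wire.h>

void setup() {
    Wire.begin();                   // Join the bus as the master
    Wire.beginTransmission(0x27);   // Start talking to the device at address 0x27
    Wire.write(0x00);               // Send a single byte
    Wire.endTransmission();         // Finish up and release the bus
}

void loop() { }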

As a good first test to see if I2C is working, I found an I2C scanner that scans for connected devices. Since the address space is so limited, it doesn't take long at all:

/* --------------------------------------
 * i2c_scanner
 * Version 1
 *  This program (or code that looks like it)
 *  can be found in many places.
 *  For example on the forum.
 *  The original author is not know.
 * Version 2, Juni 2012, Using Arduino 1.0.1
 *  Adapted to be as simple as possible by user Krodal
 * Version 3, Feb 26  2013
 *  V3 by louarnold
 * Version 4, March 3, 2013, Using Arduino 1.0.3
 *  by user Krodal.
 *  Changes by louarnold removed.
 *  Scanning addresses changed from 0...127 to 1...119,
 *  according to the i2c scanner by Nick Gammon
 * Version 5, March 28, 2013
 *  As version 4, but address scans now to 127.
 *  A sensor seems to use address 120.
 * Version 6, November 27, 2015.
 *  Added waiting for the Leonardo serial communication.
 * This sketch tests the standard 7-bit addresses
 *  Devices with higher bit address might not be seen properly.
 */

#include <Wire.h>

void setup()
{
    Wire.begin();

    Serial.begin(9600);
    while (!Serial);             // Leonardo: wait for serial monitor
    Serial.println("\nI2C Scanner");
}

void loop()
{
    byte error, address;
    int nDevices;

    Serial.println("Scanning...");

    nDevices = 0;
    for(address = 1; address < 127; address++ )
    {
        // The i2c_scanner uses the return value of
        // the Write.endTransmission to see if
        // a device did acknowledge to the address.
        Wire.beginTransmission(address);
        error = Wire.endTransmission();

        if (error == 0)
        {
            Serial.print("I2C device found at address 0x");
            if (address<16)
                Serial.print("0");
            Serial.print(address, HEX);
            Serial.println("  !");

            nDevices++;
        }
        else if (error==4)
        {
            Serial.print("Unknown error at address 0x");
            if (address<16)
                Serial.print("0");
            Serial.println(address, HEX);
        }
    }
    if (nDevices == 0)
        Serial.println("No I2C devices found\n");
    else
        Serial.println("done\n");

    delay(5000);           // wait 5 seconds for next scan
}

As the initial comment mentions, I can't claim ownership of this code! I got it from here.

With the code in mind, it's time to look at the circuit design.

A simple I2C circuit connecting an Arduino Uno v3 and an LCD display.

(Above: A simple I2C circuit. Credits for the images go to their original authors.)

The above connects an Arduino Uno version 3 with a simple LCD display via a breadboard to allow for expansion to connect future devices. The power (the red and blue cables) link the 5V and GND pins from the Arduino to the appropriate pins on the back of the LCD (the image of an LCD I found didn't have the pins showing :P), and the I2C pins (green and yellow) connect the SDA and SCL pins on the Arduino to the LCD display.

With the circuit done, that completes the system! All that remains now is to build something cool with the components we've put together :D

The final product of the above IRL!

Access your home linux box from anywhere with SSH tunnels

An abstract tunnel that doesn't hold much relevant to the blog post :P

(Header by GDJ from Source page)

....and other things! Recently, I bought a Raspberry Pi 3. Now that the rest of the components have arrived, I've got a rather nice little home server that's got a 1 terabyte WD PiDrive attached to it to provide lots of lovely shared storage, which is rather nice.

However, within a few weeks I was faced with a problem. How do I access my new box to configure it from my internship when I'm on lunch? Faced with such a challenge, I did what anyone would, and took to the internet to find a solution.

It didn't take long. A while ago I heard about these things called 'SSH tunnels', which, while not designed for a high throughput, are more than adequate for a low-intensity SSH connection that runs a few kilobytes a second in either direction. After reading this excellent answer by erik on the Unix & Linux StackExchange, I had an understanding of how SSH tunnels work, and was ready to put together a solution. You should go and read that answer if you'd like to understand SSH tunnels too - it explains it much better than I ever could :P

With that knowledge in hand, I went about planning the SSH tunnel. I already have a server with a public IP address (it's hosting this website!), so I needed a reverse tunnel to allow me to access a port local to my linux box at home (called elessar - a virtual cookie for anyone who gets the reference!) from that remote server.

Important! Ask yourself whether it's moral and ethical to set up an ssh tunnel before you think about following along with this article! If you find yourself behind a firewall or something similar, then the chances are that it's there for a good reason - and you might get into trouble if you try and circumvent it. I won't be held responsible for any loss or damages of any description caused by the reading of this post.

First job: create a limited account on the remote server for elessar to SSH into. That's easy:

sudo useradd --system ssh-tunnel

Then, with a few quick lines in /etc/ssh/sshd_config:

Match User ssh-tunnel
    ForceCommand echo 'This account can only be used for ssh tunnelling.'

....we can prevent the ssh-tunnel user from being abused to gain shell access to the server (let me know if there are any further measures I can put in place here).

Now that I had a user account to ssh in as, I could set up a public / private keypair to authenticate with, and cook up an SSH command for elessar that would set up the appropriate tunnel. After fiddling around a bit, I came up with this that did the job:

ssh -TN -R30582:localhost:5724

Very cool. So with that command executing on elessar, I could ssh into elessar from the remote server! In short, it sets up a tunnel that will make port 30582 on the remote server tunnel through to port 5724 on elessar - the port on elessar that has SSH running on it - without allocating a pseudo-tty to save resources. ssh's man page can explain the options in more detail if you're interested.

Having an SSH command that would set up the tunnel is nice, but it's not very useful, since I have to execute it first before I can actually SSH into elessar from afar.

The solution was actually a little bit complicated. First, I wrote a simple systemd service file (systemd is what I have installed, since it's vanilla raspbian - this should be easily adaptable to other systems and setups) to start the SSH tunnel automagically on boot:

[Unit]
Description=SSH tunnel from the remote server to the local ssh server.

[Service]
ExecStart=/usr/bin/ssh -TN -R30582:localhost:5724

[Install]
WantedBy=multi-user.target


I quickly realised that there were a few flaws with this approach. Firstly, it tried to start the SSH connection before my router had connected to the internet, since my router starts faster than the box that initialises the fibre connection to my ISP. Secondly, it fails to retry when the connection dies.

The first problem can be solved relatively easily, by wrapping the ssh command in a clever bit of shell scripting:

/bin/sh -c 'until ping -c1 &>/dev/null && sleep 5; do :; done && /usr/bin/ssh -TN -R30582:localhost:5724

The above tries to ping the remote server every 5 seconds until it succeeds, and only then does it attempt to open the SSH connection. This solves the first problem. To solve the second, we need to look at autossh. Autossh is a small tool that monitors an ssh connection in a variety of configurable ways and restarts the connection if it ever dies for whatever reason. You can install it with your favourite package manager:

sudo apt install autossh

Substitute apt with whatever package manager you use on your system. With it installed, we can use a command like this:

autossh -o "UserKnownHostsFile /home/ssh-tunnel/.ssh/known_hosts" -o "IdentityFile /home/ssh-tunnel/.ssh/ssh-tunnel_ed25519" -o "PubkeyAuthentication=yes" -o "PasswordAuthentication=no" -o "ServerAliveInterval 900" -TN -R30582:localhost:5724 -p 7261

to automatically start our ssh tunnel, and restart it if anything goes wrong. Note all the extra settings I had to specify here. Even though I had many of them specified in ~/.ssh/config for the ssh-tunnel user, because of systemd's weird environment when it starts a service, I found I had to specify everything on the command line with absolute paths (ugh).

Basically, the above tells autossh where the known_hosts file is (important for automation!), that it should only attempt public / private keypair authentication and not password authentication, that it should check the server's still there every 15 minutes, and all the other things we figured out above.

Finally, I combined the solutions I came up with for both problems, which left me with this:

[Unit]
Description=SSH tunnel from the remote server to the local ssh server.

[Service]
ExecStart=/bin/sh -c 'until ping -c1 &>/dev/null && sleep 5; do :; done && /usr/bin/autossh -o "UserKnownHostsFile /home/pi/.ssh/known_hosts" -o "IdentityFile /home/pi/.ssh/ssh-tunnel_ed25519" -o "PubkeyAuthentication=yes" -o "PasswordAuthentication=no" -o "ServerAliveInterval 900" -TN -R30582:localhost:5724 -p 7261'

[Install]
WantedBy=multi-user.target


Here's a version that utilises the -f parameter of autossh to put the autossh into the background, which eliminates the sh parent process:

[Unit]
Description=SSH tunnel from the remote server to the local ssh server.

[Service]
ExecStartPre=/bin/mkdir -p /var/run/sbrl-ssh-tunnel
ExecStartPre=-/bin/chown ssh-tunnel:ssh-tunnel /var/run/sbrl-ssh-tunnel
ExecStart=/bin/sh -c 'until ping -c1 &>/dev/null && sleep 5; do :; done && /usr/bin/autossh -f -o "UserKnownHostsFile /home/pi/.ssh/known_hosts" -o "IdentityFile /home/pi/.ssh/ssh-tunnel_ed25519" -o "PubkeyAuthentication=yes" -o "PasswordAuthentication=no" -o "ServerAliveInterval 900" -TN -R30582:localhost:5724 -p 7261'

[Install]
WantedBy=multi-user.target
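
With the unit file saved (as, say, /etc/systemd/system/ssh-tunnel.service - the filename is up to you), it can be enabled and started like this:

sudo systemctl daemon-reload
sudo systemctl enable ssh-tunnel.service
sudo systemctl start ssh-tunnel.service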


I ended up further modifying the above to set up an additional tunnel to allow elessar to send emails via the postfix email server that's running on Let me know if you'd be interested in a tutorial on this!

Sources and Further Reading

How to set up a WebDav share with Nginx

I've just been setting up a WebDav share on a raspberry pi 3 for my local network (long story), and since it was a bit of a pain to set up (and I had to combine a bunch of different tutorials out there to make mine work), I thought I'd share how I did it here.

I'll assume you have a raspberry pi all set up and up-to-date in headless mode, with ufw as your firewall (if you need help with this, post in the comments below or check out the Raspberry Pi Stack Exchange). To start with, we need to install the nginx-full package:

sudo apt update
sudo apt install nginx-full

Note that we need the nginx-full package here, because the nginx-extras or just simply nginx packages don't include the required additional webdav support modules. Next, we need to configure Nginx. Nginx's configuration files live at /etc/nginx/nginx.conf, and in /etc/nginx/conf.d. I did something like this for my nginx.conf:

user www-data;
worker_processes 4;
pid /run/nginx.pid;

events {
    worker_connections 768;
    # multi_accept on;
}

http {

    # Basic Settings

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    # server_tokens off;

    # server_names_hash_bucket_size 64;
    # server_name_in_redirect off;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # SSL Settings

    ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
    ssl_prefer_server_ciphers on;

    # Logging Settings

    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;

    # Gzip Settings

    gzip on;

    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    # Virtual Host Configs

    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}

Not many changes here. Then, I created a file called 0-webdav.conf in the conf.d directory, and this is what I put in it:

server {
    listen 80;
    listen [::]:80;


    auth_basic              realm_name;
    auth_basic_user_file    /etc/nginx/.passwords.list;

    dav_methods     PUT DELETE MKCOL COPY MOVE;
    dav_ext_methods PROPFIND OPTIONS;
    dav_access      user:rw group:rw all:r;

    client_body_temp_path   /tmp/nginx/client-bodies;
    client_max_body_size    0;
    create_full_put_path    on;

    root /mnt/hydroplans;
}

Now this is where the magic happens. The dav_access directive tells nginx to allow everyone to read, but only logged in users to write to the share. This isn't actually particularly relevant, because of the auth_basic and auth_basic_user_file directives, which tell nginx to require people to login to the share before they are allowed to access it.

It's also important to note that I found that Windows (10, at least), didn't like the basic authentication - even though Ubuntu's Nautilus accepted it just fine - so I had to comment that bit out :-(

If you do still want authentication (hey! Maybe you'll have better luck than I did :P), then you'll need to set up the passwords file. Here's how you create it:

echo -n 'helen:' | sudo tee /etc/nginx/.passwords.list
openssl passwd -apr1 | sudo tee -a /etc/nginx/.passwords.list 

The above creates a user called helen, and asks you to type a password. If you're adding another user to the file, simply change the first tee to be tee -a to avoid overwriting the first one.

With that all configured, it's time to test the configuration file, and, if we're lucky, restart nginx!

sudo nginx -t
sudo systemctl restart nginx

That's all you should need to do to set up a simple WebDav share. Remember that this is a starting point, and not an ending point - there are a few big holes in the above that you'll need to address, depending on your use case (for example, I haven't included the setup of https / encryption - try letsencrypt for that).
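
Before pointing a full client at it, it can be worth poking the share with curl to check that everything's wired up correctly. Here's a rough example - the hostname, username, and password are placeholders, and you can drop the --user bit entirely if you disabled authentication:

# Upload a file
curl --user helen:password -T ./test.txt http://your-pi.local/test.txt
# List the contents of a directory
curl --user helen:password -X PROPFIND -H "Depth: 1" http://your-pi.local/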

Here are the connection details for the above for a few different clients:

  • Ubuntu / Nautilus: (Go to "Other Locations" and paste this into the "Connect to Server" box) dav://
  • Windows: (Go to "Map Network Drive" and paste this in)

Did this work for you? Have any problems? Got instructions for a WebDav client not listed here? Let me know in the comments!

Website Integrations #3: Twitter cards

The twitter logo. (The posts featured in the above images are this one about my new Raspberry Pi 3, and the latest coding conundrums evolved post).

You have arrived in the third of three parts in my mini-series on how I implemented rich snippets. In the last two parts I tackled open graph and becoming an oEmbed provider. In this part, I'll be talking a bit about twitter cards, and how I implemented them.

Twitter's take on the problem seems to be much simpler than Facebook's, which makes for easy implementing :D Like in the other two protocols too, they decided to have multiple different types of, well, in this case, cards. I decided to implement the summary card type. Like open graph, it adds a bunch of <meta /> tags to the header. Sigh. Anyway, here are the property names I needed to implement:

  • twitter:card - The type of card. In my case this is set to summary
  • twitter:site - This one's confusing. Although it's called 'site', it should actually be set to your own twitter handle - mine is @SBRLabs.
  • twitter:title - The title of the content. Practically identical to open graph's og:title.
  • twitter:description - The description of the content. The same as og:description.
  • twitter:image - A url pointing to an image that should be displayed next to the title and description. Unlike Facebook's open graph, twitter appears to support https urls here with no problem at all.

Since I already had 90% of the infrastructure and calculations in place after implementing open graph, throwing together values for the above wasn't too difficult. Here's an example set of twitter card <meta /> tags generated by the updated code:

<meta property="twitter:card" content="summary" />
<meta property="twitter:site" content="@SBRLabs" />
<meta property="twitter:title" content="Running Prolog on Linux" />
<meta property="twitter:description" content="Hello! I hope you had a nice restful Easter. I've been a bit busy this last 6 months, but I've got a holiday at the moment, and I've just received a .... (click to read more)" />
<meta property="twitter:image" content="" />

Easy peasy. Next up was testing time. Thankfully, Twitter made this easy too by providing an official testing tool. Interestingly, they whitelist domains based on whether the webmaster has run a url through their tool - so if you want twitter cards to show up, make sure you plug at least one of your website's page urls through their tool.
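Before you run a page through their tool, it's worth a quick check that the tags are actually being emitted in the first place. A one-liner sketch, with example.com/some-post/ standing in for one of your own urls:

curl -s https://example.com/some-post/ | grep -i 'twitter:'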

After a few tweaks, I got this:

The whitelisted message from the twitter card tester.

With that, my work was complete. This brings us to the end of my mini-series on rich-snippet integrations (unless I've missed a protocol O.o Comment below if I have)! I hope you've found it useful. If you have (or even if you haven't!) please let me know in the comments below :D

Profiling PHP with XDebug

(This post is a fork of a draft version of a tutorial / guide I originally wrote whilst at my internship.)

The PHP and xdebug logos.

Since I've been looking into xdebug's profiling function recently, I was tasked with writing up a guide on how to set it up and use it, from start to finish - and I thought I'd share it here too.

While I've written about xdebug before in my An easier way to debug PHP post, I didn't end up covering the profiling function - I had difficulty getting it to work properly. I've managed to get it working now - this post documents how I did it. While this is written for a standard Debian server, the instructions can easily be applied to other servers.

For the uninitiated, xdebug is an extension to PHP that aids in the debugging of PHP code. It consists of 2 parts: The php extension on the server, and a client built into your editor. With these 2 parts, you can create breakpoints, step through code and more - though these functions are not the focus of this post.

To start off, you need to install xdebug. SSH into your web server with a sudo-capable account (or just use root, though that's bad practice!), and run the following command:

sudo apt install php-xdebug

Windows users will need to download it from here and put it in their PHP extension directory. Users of other linux distributions and windows may need to enable xdebug in their php.ini file manually (windows users will need extension=xdebug.dll; linux systems reference xdebug.so instead).
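You can also do a quick check from the terminal - though bear in mind that the command-line version of PHP sometimes reads a different php.ini to the one your web server uses, so the phpinfo() check below is still worth doing:

php -m | grep -i xdebug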

Once done, xdebug should be loaded and working correctly. You can verify this by looking at the php information page. To see this page, put the following in a php file and request it in your browser:

<?php phpinfo(); ?>
If it's been enabled correctly, you should see something like this somewhere on the resulting page:

Part of the php information page, showing the xdebug section.

With xdebug setup, we can now begin configuring it. Xdebug gets configured in php.ini, PHP's main configuration file. Under Virtualmin each user has their own php.ini because PHP is loaded via CGI, and it's usually located at ~/etc/php.ini. To find it on your system, check the php information page as described above - there should be a row with the name "Loaded Configuration File":

The 'loaded configuration file' directive on the php information page.
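Alternatively, php --ini will print the same information for the command-line version of PHP - just remember that, as noted above, it may be a different file to the one your web server is actually using:

php --ini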

Once you've located your php.ini file, open it in your favourite editor (or type sensible-editor php.ini if you want to edit over SSH), and put something like this at the bottom:

[xdebug]
; Keep the profiler off by default....
xdebug.profiler_enable = 0
; ....and only turn it on for requests that supply the secret trigger key
xdebug.profiler_enable_trigger = 1
xdebug.profiler_enable_trigger_value = "insert_secret_key_here"
; Where the profiling output files should end up - this path is just an example
xdebug.profiler_output_dir = "/tmp/xdebug-profiles"
Obviously, you'll want to customise the above. The xdebug.profiler_enable_trigger_value directive defines a secret key we'll use later to turn profiling on. If nothing else, make sure you change this! Profiling slows everything down a lot, and could easily bring your whole server down if this secret key falls into the wrong hands (that said, simply having xdebug loaded in the first place slows things down too, even if you're not using it - so you may want to set up a separate server for development work that has xdebug installed, if you haven't already). If you're not sure what to set it to, here's a bit of bash I used to generate my random password:

dd if=/dev/urandom bs=8 count=4 status=none | base64 |  tr -d '=' | tr '+/' '-_'

The xdebug.profiler_output_dir directive lets you change the folder that xdebug saves the profiling output files to - make sure that the folder you specify here is writable by the user that PHP is executing as. If you've got a lot of profiling to do, you may want to consider changing the output filename, since xdebug uses a rather unhelpful filename by default. The property you want to change here is xdebug.profiler_output_name - and it supports a number of special % substitutions, which are documented here. I can recommend something like phpprofile.%t-%u.%p-%H.%R.cachegrind - it includes a timestamp and the request uri for identification purposes, while still sorting chronologically. Remember that xdebug will overwrite the output file if you don't include something that differentiates it from request to request!
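As a concrete (but hypothetical) example: if you pointed xdebug.profiler_output_dir at /tmp/xdebug-profiles and PHP runs as www-data (the usual web server user on a stock Debian setup - under Virtualmin's CGI mode it'll be the individual user instead), you'd prepare the folder like this:

sudo mkdir -p /tmp/xdebug-profiles
# Make sure the user PHP executes as can write to it
sudo chown www-data:www-data /tmp/xdebug-profiles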

With the configuration done, we can now move on to actually profiling something :D This is actually quite simple: just add the XDEBUG_PROFILE GET (or POST!) parameter to the url that you want to test in your browser, setting it to the secret key you chose above. Here are some examples (with example.com standing in for your own domain, and insert_secret_key_here for your secret key):

https://example.com/?XDEBUG_PROFILE=insert_secret_key_here
https://example.com/some-page.php?existing_parameter=value&XDEBUG_PROFILE=insert_secret_key_here
Adding this parameter to a request will cause xdebug to profile that request, and spit out a cachegrind file according to the settings we configured above. This file can then be analysed in your favourite editor, or, if it doesn't have support, with an external program like qcachegrind (Windows) or kcachegrind (everyone else).

If you need to profile just a single AJAX request or similar, most browsers' developer tools let you copy a request as a curl or wget command (Chromium-based browsers and Firefox can both do this; Firefox also has an 'edit and resend' option), allowing you to resend the request with the XDEBUG_PROFILE GET parameter added.
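You can of course also trigger a profiling run straight from the terminal. A sketch, assuming the example configuration above (example.com is, as ever, a placeholder):

# Trigger a profiled request - the response body itself isn't interesting here
curl -s 'https://example.com/slow-page.php?XDEBUG_PROFILE=insert_secret_key_here' > /dev/null
# Open the most recent output file in kcachegrind
kcachegrind "$(ls -t /tmp/xdebug-profiles/* | head -n1)"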

If you need to profile everything - including all subrequests (only those that pass through PHP, of course) - then you can set the XDEBUG_PROFILE parameter as a cookie instead, and it will cause profiling to be enabled for everything on the domain you set it on. Here's a bookmarklet that sets the cookie:

javascript:(function(){document.cookie='XDEBUG_PROFILE='+'insert_secret_key_here'+';expires=Mon, 05 Jul 2100 00:00:00 GMT;path=/;';})();


Replace insert_secret_key_here with the secret key you created for the xdebug.profiler_enable_trigger_value property in your php.ini file above, create a new bookmark in your browser, paste it in (making sure that your browser doesn't auto-remove the javascript: at the beginning), and then click on it when you want to enable profiling.
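The cookie approach works outside the browser too - for instance, curl can send it for you if you'd rather not tack the GET parameter onto every url (example.com is a placeholder again):

curl -s -b 'XDEBUG_PROFILE=insert_secret_key_here' https://example.com/ > /dev/null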

Was this helpful? Got any questions? Let me know in the comments below!
