Archive


## Bridging the gap between XMPP and shell scripts

In a previous post, I set up a semi-automated backup system for my Raspberry Pi using duplicity, sendxmpp, and an external drive. It's been working fabulously for a while now, but unfortunately the other week sendxmpp suddenly stopped working with no obvious explanation. Given the long list of arguments I had to pass it:

sendxmpp --file "${xmpp_config_file}" --resource "${xmpp_resource}" --tls --chatroom "${xmpp_target_chatroom}" ...........

....and the fact that I've had to tweak said arguments on a number of occasions, I thought it was time to switch it out for something better suited to the task at hand.

Unfortunately, finding such a tool proved to be a challenge. I even asked on Reddit - but nobody had anything that fit the bill (xmpp-bridge wouldn't compile correctly - and didn't support multi-user chatrooms anyway, and xmpppy was broken too).

If you're unsure as to what XMPP is, I'd recommend checking out either this or this tutorial. They both give a great introduction to what it is, what it does, and how it works - and the rest of this post will make much more sense if you read one of them first :-)

To this end, I finally gave in and wrote my own tool, which I've called xmppbridge. It's a global Node.JS script that uses the simple-xmpp package to forward the standard input to a given JID over XMPP - which can optionally be a group chat.

In this post, I'm going to look at how I put it together, some of the issues I ran into along the way, and how I solved them. If you're interested in how to install and use it, then the package page on npm will tell you everything you need to know:

xmppbridge on npm

### Architectural Overview

The script consists of 3 files:

• index.sh - Calls the main script with ES6 modules enabled
• index.mjs - Parses the command-line arguments and environment variables out, and provides a nice CLI
• XmppBridge.mjs - The bit that actually captures input from stdin and sends it via XMPP

Let's look at each of these in turn - starting with the command-line interface.

### CLI Parsing

The CLI itself is relatively simple - and follows a paradigm I've used extensively in C♯ (although somewhat modified of course to get it to work in Node.JS, and without fancy ANSI colouring etc.).

#!/usr/bin/env node
"use strict";

import XmppBridge from './XmppBridge.mjs';

const settings = {
	jid: process.env.XMPP_JID,
	destination_jid: null,
	is_destination_groupchat: false,
	password: process.env.XMPP_PASSWORD
};

let extras = [];
// The first 2 items are the node binary and the script path, so skip them
for(let i = 2; i < process.argv.length; i++) {
	if(!process.argv[i].startsWith("-")) {
		extras.push(process.argv[i]);
		continue;
	}
	
	switch(process.argv[i]) {
		case "-h":
		case "--help":
			// ........
			break;
		
		// ........
		
		default:
			console.error(`Error: Unknown argument '${process.argv[i]}'.`);
			process.exit(2);
			break;
	}
}

We start with a shebang, telling Linux-based systems to execute the script with Node.JS. Following that, we import the XmppBridge class that's located in XmppBridge.mjs (we'll come back to this later). Then, we define an object to hold our settings - and pull in the environment variables along with defining some defaults for other parameters.

With that set up, we can then parse the command-line arguments themselves - using the exact same paradigm I've used time and time again in C♯.
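To make the paradigm concrete, here's a minimal, self-contained sketch of the same loop wrapped in a function. The `--destination` and `--groupchat` flag names are assumptions I've made up for illustration (they aren't necessarily the flags xmppbridge accepts), and it throws instead of calling process.exit() so that it's easy to experiment with:

```javascript
// Hypothetical sketch: the flag names below are illustrative assumptions,
// not necessarily the ones xmppbridge itself accepts.
function parse_args(argv, env) {
	const settings = {
		jid: env.XMPP_JID,
		destination_jid: null,
		is_destination_groupchat: false,
		password: env.XMPP_PASSWORD
	};
	const extras = [];
	
	// argv[0] is the node binary and argv[1] is the script path, so start at 2
	for(let i = 2; i < argv.length; i++) {
		// Anything that doesn't look like a flag is kept as an extra
		if(!argv[i].startsWith("-")) {
			extras.push(argv[i]);
			continue;
		}
		
		switch(argv[i]) {
			case "--destination": // hypothetical flag that consumes the next item
				settings.destination_jid = argv[++i];
				break;
			case "--groupchat": // hypothetical boolean flag
				settings.is_destination_groupchat = true;
				break;
			default:
				throw new Error(`Unknown argument '${argv[i]}'.`);
		}
	}
	return { settings, extras };
}
```

Calling parse_args(process.argv, process.env) would then give back the populated settings object, along with any non-flag arguments in extras.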

Once the command-line arguments are parsed, we validate the final settings to ensure that the user hasn't left any required parameters undefined:

for(let environment_variable of ["XMPP_JID", "XMPP_PASSWORD"]) {
	if(typeof process.env[environment_variable] == "undefined") {
		console.error(`Error: The environment variable ${environment_variable} wasn't found.`);
		process.exit(1);
	}
}

if(typeof settings.destination_jid != "string") {
	console.error("Error: No destination jid specified.");
	process.exit(5);
}

That's basically all that index.mjs does. All that's really left is passing the parameters to an instance of XmppBridge:

const bridge = new XmppBridge(
	settings.destination_jid,
	settings.is_destination_groupchat
);
bridge.start(settings.jid, settings.password);

### Shebang Trouble

Because I've used ES6 modules here, currently Node must be informed of this via the --experimental-modules CLI argument like this:

node --experimental-modules ./index.mjs

If we're going to make this a global command-line tool via the bin directive in package.json, then we're going to have to ensure that this flag gets passed to Node and not our program. While we could alter the shebang, that comes with the awkward problem that not all systems (in fact relatively few) support using both env and passing arguments. For example, this:

#!/usr/bin/env node --experimental-modules

Wouldn't work, because on Linux everything after the interpreter path is handed to env as a single argument - so env doesn't recognise that --experimental-modules is actually a command-line argument and not part of the binary name that it should search for. I did see some Linux systems support env -S to enable this functionality, but it's hardly portable and doesn't even appear to work all the time anyway - so we'll have to look for another solution.

Another way we could do it is by dropping the env entirely. We could do this:

#!/usr/local/bin/node --experimental-modules

...which would work fine on my system, but probably not on anyone else's if they haven't installed Node to the same place. Sadly, we'll have to throw this option out the window too. We've still got some tricks up our sleeve though - namely writing a bash wrapper script that will call node telling it to execute index.mjs with the correct arguments.
After a little bit of fiddling, I came up with this:

#!/usr/bin/env bash
install_dir="$(dirname "$(readlink -f "$0")")";
exec node --experimental-modules "${install_dir}/index.mjs" "$@"

2 things are at play here. Firstly, we have to deduce where the currently executing script actually lies - as npm uses a symbolic link to allow a global command-line tool to be 'found'. Said symbolic link gets put in /usr/local/bin/ (which is, by default, in the user's PATH), and links to where the script is actually installed to.

To figure out which directory we've been installed to (and hence the location of index.mjs), we need to dereference the symbolic link and strip the index.sh filename away. This can be done with a combination of readlink -f (dereferences the symbolic link), dirname (gets the parent directory of a given file path), and $0 (holds the path to the currently executing script in most circumstances) - which, in the case of the above, gets put into the install_dir variable.

The other issue is passing all the existing command-line arguments to index.mjs unchanged. We do this with a combination of $@ (which refers to all the arguments passed to this script except the script name itself) and exec (which replaces the currently executing process with a new one - in this case it replaces the bash shell with node).
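For comparison, the same dereference-then-strip logic can be sketched in Node itself. This isn't part of xmppbridge (the wrapper has to be a shell script precisely so that it can pass flags to node) - the resolve_install_dir name is one I've made up to illustrate what the readlink -f and dirname combination does:

```javascript
// CommonJS requires are used so that this runs as a plain .js file
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: resolves the directory a (possibly symlinked) script lives in
function resolve_install_dir(script_path) {
	// realpathSync follows every symbolic link in the path, much like readlink -f
	const real_path = fs.realpathSync(script_path);
	// dirname strips the filename away, leaving the containing directory
	return path.dirname(real_path);
}
```

Given the path to the symbolic link that npm creates, this should return the directory that the package was actually installed to.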

This approach lets us customise the CLI arguments, while still providing global access to our script. Here's an extract from xmppbridge's package.json showing how I specify that I want index.sh to be a global script:

{
.....

"bin": {
"xmppbridge": "./index.sh"
},

.....
}

### Bridging the Gap

Now that we've got Node calling our script correctly and the arguments parsed out, we can actually bridge the gap. This is as simple as some glue code between simple-xmpp and readline. simple-xmpp is an npm package that makes programmatic XMPP interaction fairly trivial (though I did have to look at examples in the GitHub repository to figure out how to send a message to a multi-user chatroom).

readline is a Node built-in that allows us to read the standard input line-by-line. It does other things too (and is great for interactive scripts amongst other things), but that's a tale for another time.
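As a rough illustration of the line-by-line reading described above, here's a small standalone sketch that collects every line from a stream into an array. The collect_lines function is a name I've invented for this example - it isn't part of xmppbridge or readline:

```javascript
// CommonJS require is used so that this runs as a plain .js file
const readline = require("node:readline");

// Hypothetical helper: resolves with an array of every line read from input_stream
function collect_lines(input_stream) {
	return new Promise((resolve) => {
		const lines = [];
		const reader = readline.createInterface({
			input: input_stream,
			terminal: false // Treat the input as a plain stream, not an interactive TTY
		});
		// Fired once for each complete line of input
		reader.on("line", (line) => lines.push(line));
		// Fired when the input stream ends
		reader.on("close", () => resolve(lines));
	});
}
```

For instance, collect_lines(process.stdin).then(console.log) would print everything piped into the script as an array of lines once the standard input closes.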

The first task is to create a new class for this to live in:

"use strict";

import xmpp from 'simple-xmpp';

class XmppBridge {

/**
* Creates a new XmppBridge instance.
* @param   {string}    in_destination_jid  The JID to send stdin to.
* @param   {Boolean}   in_is_groupchat     Whether the destination JID is a group chat or not.
*/
constructor(in_destination_jid, in_is_groupchat) {
// ....
}
}

export default XmppBridge;

Very cool! That was easy. Next, we need to store those arguments and connect to the XMPP server in the constructor:

this.destination_jid = in_destination_jid;
this.is_destination_groupchat = in_is_groupchat;

this.client = xmpp;
this.client.on("online", this.on_connect.bind(this));
this.client.on("error", this.on_error.bind(this));
this.client.on("chat", ((_from, _message) => {
// noop
}).bind(this));

I ended up having to define a chat event handler - even though it's pointless, as I ran into a nasty crash if I didn't do so (I suspect that this use-case wasn't considered by the original package developer).

The next area of interest is that online event handler. Note that I've bound the method to the current this context - this is important, as it wouldn't be able to access the class instance's properties otherwise. Let's take a look at the code for that handler:
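To see why the binding matters, here's a tiny standalone demonstration using Node's built-in EventEmitter. The Greeter class and the event names are made up purely for this example:

```javascript
// CommonJS require is used so that this runs as a plain .js file
const { EventEmitter } = require("node:events");

// Hypothetical class used only to demonstrate the effect of .bind()
class Greeter {
	constructor(name) {
		this.name = name;
		this.seen = [];
	}
	
	on_event(payload) {
		// Relies on `this` being the Greeter instance
		this.seen.push(`${this.name}:${payload}`);
	}
}

const emitter = new EventEmitter();
const greeter = new Greeter("bridge");

// Bound: `this` inside on_event is the Greeter instance
emitter.on("bound", greeter.on_event.bind(greeter));
// Unbound: EventEmitter calls the listener with `this` set to the emitter itself
emitter.on("unbound", greeter.on_event);
```

Emitting the bound event appends to greeter.seen as expected, whereas emitting the unbound event throws a TypeError - because the emitter has no seen property to push to.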

console.log(`[XmppBridge] Connected as ${data.jid}.`);
if(this.is_destination_groupchat) {
	this.client.join(`${this.destination_jid}/bot_${data.jid.user}`);
}
this.stdin = readline.createInterface({
	input: process.stdin,
	output: process.stdout,
	terminal: false
});
this.stdin.on("line", this.on_line_handler.bind(this));
this.stdin.on("close", this.on_stdin_close_handler.bind(this));

This is the point at which we open the standard input and start listening for things to send. We don't do it earlier, as we don't want to end up in a situation where we try sending something before we're connected!

If we're supposed to be sending to a multi-user chatroom, this is also the point at which it joins said room. This is required as you can't send a message to a room that you haven't joined. The resource (the bit after the forward slash /), for a group chat, specifies the nickname that you want to give to yourself when joining. Here, I automatically set this to the user part of the JID that we used to log in, prefixed with bot_.

The connection itself is established in the start method:

start(jid, password) {
	this.client.connect({
		jid,
		password
	});
}

And every time we receive a line of input, we execute the send() method:

on_line_handler(line_text) {
	this.send(line_text);
}

I used a full method here, as initially I had some issues and wanted to debug which methods were being called. That send method looks like this:

send(message) {
	this.client.send(
		this.destination_jid,
		message,
		this.is_destination_groupchat
	);
}

The last event handler worth mentioning is the close event handler on the readline interface:

on_stdin_close_handler() {
	this.client.disconnect();
}

This just disconnects from the XMPP server so that Node can exit cleanly.

That basically completes the script. In total, the entire XmppBridge.mjs class file is 72 lines. Not bad going!

You can install this tool for yourself with sudo npm install -g xmppbridge.
I've documented how to use it in the README, so I'd recommend heading over there if you're interested in trying it out.

Found this interesting? Got a cool use for XMPP? Comment below!

## Setup your very own VPN in 10 minutes flat

Hey! Happy new year :-)

I've been looking to set up a personal VPN for a while, and the other week I discovered a rather brilliant project called PiVPN, which greatly simplifies the process of setting one up - and managing it thereafter. It's been working rather well so far, so I thought I'd post about it so you can set one up for yourself too.

But first though, we should look at the why. Why a VPN? What does it do?

Basically, a VPN lets you punch a great big hole in the network that you're connected to and appear as if you're actually on a network elsewhere. The extent to which this is the case varies depending on the purpose (for example, a University or business might set up a VPN that allows members to access internal resources, but doesn't route all traffic through the VPN), but the general principle is the same.

It's best explained with a diagram. Imagine you're at a Café:

Everyone on the Café's WiFi can see the internet traffic you're sending out. If any of it is unencrypted, then they can additionally see the content of said traffic - e.g. emails you send, web pages you load, etc. Even if it's encrypted, statistical analysis can reveal which websites you're visiting and more.

If you don't trust a network that you're connected to, then by utilising a VPN you can create an encrypted tunnel to another location that you do trust:

Then, all that the other users of the Café's WiFi will see is an encrypted stream of packets - all heading for the same destination. All they'll know is roughly how much traffic you're sending and receiving, but not to where.

This is the primary reason that I'd like my own VPN.
I trust the network I've got set up in my own house, so it stands to reason that I'd like to set up a VPN server there, and pretend that my devices when I'm out and about are still at home. In theory, I should be able to access the resources on my home network too when I'm using such a VPN - which is an added bonus. Other reasons do exist for using a VPN, but I won't discuss them here.

In terms of VPN server software, I've done a fair amount of research into the different options available. My main criteria are as follows:

• Fairly easy to install
• Easy to understand what it's doing once installed (transparency)
• Easy to manage

The 2 main technologies I came across were OpenVPN and IPSec. Each has its own strengths & weaknesses. An IPSec VPN is, apparently, more efficient - especially since it executes on the client in kernel-space instead of user-space. It's a lighter protocol, too - leading to less overhead. It's also much more likely to be detected and blocked when travelling through strict firewalls, making me slightly unsure about it.

OpenVPN, on the other hand, executes entirely in user-space on both the client and the server - leading to a slightly greater overhead (especially with the mitigations for the recent Spectre & Meltdown hardware bugs). It does, however, use TLS (though over UDP by default). This characteristic makes it much more likely it'll slip through stricter firewalls. I'm unsure if that's a quality that I'm actually after or not.

Ultimately, it's the ease of management that points the way to my final choice. Looking into it, with both choices there's complex certificate management to be done whenever you want to add a new client to the VPN. For example, with StrongSwan (an open-source IPSec VPN program), you've got to generate a number of certificates with a chain of rather long commands - and the users themselves have passwords stored in plain text in a file!
While I've got no problem with reading and understanding such commands, I do have a problem with rememberability. If I want to add a new client, how easy is that to do? How long would I have to spend re-reading documentation to figure out how to do it?

Sure, I could write a program to manage the configuration files for me, but that would also require maintenance - and probably take much longer than I anticipate to write.

I forget where I found it, but it is for this reason that I ultimately decided to choose PiVPN. It's a set of scripts that sets up and manages an OpenVPN installation. To this end, it provides a single command - pivpn - that can be used to add, remove, and list clients and their statistics. With a concise help text, it makes it easy to figure out how to perform common tasks utilising existing terminal skills by conforming to established CLI norms.

If you want to install it yourself, then simply do this:

curl -L https://install.pivpn.io | bash

Of course, simply downloading and executing a random script from the Internet is never a good idea. Let's read it first:

curl -L https://install.pivpn.io | less

Once you're happy that it's not going to do anything malign to your system, proceed with the installation by executing the 1st command. It should guide you through a number of screens. Some important points I ran into:

• The static IP address it talks about is the IP address of your server on the local network. The installation asks about the public IP address in a later step. If you've already got a static IP set up on your server (and you probably have), then you don't need to worry about this.
• It asks you to install and enable unattended-upgrades. You should probably do this, but I ended up skipping it - I've already got apticron set up and sending me regular emails, as I rather like to babysit the upgrade of packages on the main machines I manage.
I might look into unattended-upgrades in the future if I acquire more servers than are comfortable to manage this way.
• Make sure you fully update your system before running the installation. I use this command: sudo apt update && sudo apt-get dist-upgrade && sudo apt-get autoclean && sudo apt-get autoremove
• Changing the port of the VPN isn't a bad idea, since PiVPN will automatically assemble .ovpn configuration files for you. I didn't end up doing this to start with, but I can always change it in the NAT rule I configured on my router later.
• Don't forget to allow OpenVPN through your firewall! For ufw users (like me), it's something like sudo ufw allow <port_number>/udp.
• Don't forget to set up a NAT rule / port forwarding on your router if said server doesn't have a public IP address (if it's IPv4 it probably doesn't). If you're confused on this point, comment below and I'll blog about it. It's..... a complicated topic.

If you'd like a more in-depth guide to setting up PiVPN, then I can recommend this guide. It's a little bit dated (PiVPN now uses elliptic-curve cryptography by default), but still serves to illustrate the process pretty well.

If you're confused about some of the concepts I've presented here - leave a comment below! I'm happy to explain them in more detail. Who knows - I might end up writing another blog post on the subject....

## Enabling ANSI Escape Codes on Windows 10

In a piece of assessed coursework (ACW) I've done recently, I built a text-based user interface rendering engine. Great fun, but when I went to run it on Windows - all I got was garbage in the console window! I found this strange, since support was announced a year or 2 back. They've even got an extensive documentation page on the subject!

(Above: ANSI escape sequences rendering on Windows. Hooray!
Source: This forum post on the Intel® Developer Zone Forums)

The problem lay in the fact that unlike Linux, you actually need to enable it by calling an unmanaged function in the Windows API and flipping an undocumented bit. Sounds terrible? It is.

Thankfully, due to the way the .NET runtime (be it Microsoft's official implementation or Mono) handles references to DLLs, it's fairly easy to write a method that flips the appropriate bit in a portable fashion, which I'd like to document in this blog post for future reference.

Firstly, let's set up a method that only executes on Windows. That's easily achieved by checking Environment.OSVersion:

if(Environment.OSVersion.Platform.ToString().ToLower().Contains("win"))
{
	IConsoleConfigurer configurer = new WindowsConsoleConfiguerer();
	configurer.SetupConsole();
}

Here, we inspect the platform we're running on, and if it contains the substring win, then we can assume that we're on Windows.

Then, in order to keep the unmanaged code calls as loosely coupled and as far from the main program as possible, I've put the bit-flipping code itself in a separate class and referenced it via an interface. This is probably overkill, but at least this way if I run into any further compilation issues it won't be too difficult to refactor it into a separate class library and load it via reflection.

Said interface needs only a single method:

internal interface IConsoleConfigurer
{
	void SetupConsole();
}

....I've marked this as internal, as it's not (currently) going to be used outside the assembly it's located in. If that changes in the future, I can always mark it public instead.

The implementation of this interface is somewhat more complicated. Here it is:

/// <summary>
/// Configures the console correctly so that it processes ANSI escape sequences.
/// </summary>
internal class WindowsConsoleConfiguerer : IConsoleConfigurer
{
	const int STD_OUTPUT_HANDLE = -11;
	const uint ENABLE_VIRTUAL_TERMINAL_PROCESSING = 4;

	[DllImport("kernel32.dll", SetLastError = true)]
	static extern IntPtr GetStdHandle(int nStdHandle);

	[DllImport("kernel32.dll")]
	static extern bool GetConsoleMode(IntPtr hConsoleHandle, out uint lpMode);

	[DllImport("kernel32.dll")]
	static extern bool SetConsoleMode(IntPtr hConsoleHandle, uint dwMode);

	public void SetupConsole() {
		IntPtr handle = GetStdHandle(STD_OUTPUT_HANDLE);
		uint mode;
		GetConsoleMode(handle, out mode);
		mode |= ENABLE_VIRTUAL_TERMINAL_PROCESSING;
		SetConsoleMode(handle, mode);
	}
}

In short, the DllImport attributes and the extern keywords tell the .NET runtime that the method should be imported directly from a native DLL - not a .NET assembly.

The SetupConsole() method, that's defined by the IConsoleConfigurer interface above, then calls the native methods that we've imported - and because the .NET runtime only imports DLLs when they are first utilised, it compiles and runs just fine on Linux too :D

Found this helpful? Still having issues? Got a better way of doing this? Comment below!

## Question: How do you recover a deleted file that's been overwritten?

Answer: With the greatest of difficulty.

The blog post following this one in a few days' time is, ironically, about backing things up. However, I actually ended up losing the entire post during the upload process to my server (it replaced both the source and destination files with an empty file!). I'd already saved it to disk, but still almost lost it anyway......

Recovery of deleted files is awkward at best. It relies on the fact that when you 'delete' something, it doesn't erase it from disk at all - it just deallocates the sectors on disk that it was taking up and re-enters them into the free pool of space - which is then re-used at will.

The most important thing to remember when you've just lost something is to not touch anything.
Shut down your computer, and, if you're not confident enough yourself, call someone who knows what they're doing to help you out.

The best way to recover a file is to boot into a live CD. This is a CD (or flash drive) that holds an (or multiple!) entire operating system(s). This way, no additional writing is done to the disk containing the deleted file - which could potentially corrupt it.

After fiddling about with this (I had to update my bootable flash drive, as Ubuntu 15.10 is out of support and I couldn't download the extundelete tool, which I'll mention shortly), I found that I'd hit a dead-end. I was using the extundelete tool (sudo apt install extundelete), and it claimed that it couldn't restore the file because it had been reallocated. Here's the command I used:

sudo extundelete --restore-file /absolute/path/to/file /dev/sda7

I suspect that it was getting confused because I had a file by that name on disk that was now empty.

Anyway, after doing something else for a while, I had an idea. Since my blog posts are just text files on disk, shouldn't it be on my disk somewhere? Could I locate it at all?

As it turns out, the answer is yes. Remembering a short sentence from the post I'd just written, I started a brute-force search of my disk:

sudo dd if=/dev/sda7 | strings | grep -i "AWS S3"

This has several components to it. Explain Shell is great at providing an explanation of each bit in turn. Here's a short summary:

• dd - This reads in the entire contents of a partition and pushes it into the following command. Find the partition name with lsblk.
• strings - This extracts all runs of printable characters from the input stream.
• grep - This searches (case-insensitively with -i) for a specified string in the input

I started to get results - a whole line from the blog post that had supposedly been deleted and overwritten! This wasn't really enough though.
Taking a longer snippet to reduce the noise in the output, I tried again:

sudo dd if=/dev/sda7 | strings | grep -i -C100 "To start, we'll need an AWS S3 bucket"

This time, I added -C100. This tells grep that I want to see 100 lines before and after any lines that contain the specified search string. With this, I managed to recover enough of the blog post to quickly re-edit and upload it. It did appear to remove blank lines and the back-ticks at the end of a code block, but they are easy to replace.

Note to self: Always copy first when crossing file system boundaries, and delete later. Don't move all in one go!

## How to set up a shared PDF printer on your local network

I've recently ended up setting up a PDF printer on my local network in an effort to transfer some pictures out of a ridiculous i-device (I tell you, Apple's iOS is the worst for being a walled garden). Since the process for doing so wasn't entirely obvious, I'm documenting it in this blog post to remind myself for later. If you find it useful, please let me know in the comments below!

Firstly, you'll need a machine running Linux. Any distribution will do, but I'll be using an apt-based distribution, so you may need to alter some of the commands here to suit your system.

Next, we need to install the cups (which stands for the Common Unix Printing System) PDF printer driver. It comes with a lot of junk if you're not careful, so here I use --no-install-recommends to avoid installing any unnecessary packages.

sudo apt install printer-driver-cups-pdf --no-install-recommends

If you've got a firewall running (which you really should - see this post of mine for more information on that), then you'll need to open port 631 for TCP traffic to allow people to print. If you're using ufw, then this should do the trick:

sudo ufw allow cups

If not, then you may need to specify the port number explicitly:

sudo ufw allow 631/tcp

With the printer installed, we next need to open it to the world.
Before that though, we should make some changes to the configuration file, which is located at /etc/cups-pdf.conf.

Firstly, I wanted to put the resulting PDFs into my file server's shared folder. This is achieved by editing the Out and AnonDirName settings. They should already be present in the configuration file - it's just a matter of changing their values:

Out /absolute/path/to/output/dir
AnonDirName /absolute/path/to/output/dir

I also wanted to customise the user account and permissions that it saves the PDFs with. I did this through the AnonUser and AnonUMask settings - which should also be present by default:

AnonUser username
AnonUMask 0007

The umask is basically an inverted permission octal: each bit set in the umask is a permission that gets taken away, so 0007 strips all access from 'other' users while leaving the owner and group untouched. I found a good calculator online to do it for me :P (Don't forget the preceding 0 - it's important!)

Finally, I experienced an issue whereby cups kept overwriting the same file again and again because the iPad wasn't smart enough to send the photos to print with their actual filenames - instead opting to send them all as Photo.pdf. Thankfully though, cups-pdf has the Label option (also specified by default) that ensures that output filenames don't clash. Setting it to 1 instead of 0 solved the problem for me:

Label 1

Note that some of these properties may be prefixed with a hash (#). You'll need to remove this in order for it to take effect.

With the new PDF printer configured, it's time to open it up to our local network. Here's how to do that:

sudo cupsctl --share-printers
sudo lpadmin -p pdf -o printer-is-shared=true

Note that if you want to open it up to more than your local subnet you'll need to do some additional configuration - such as configuring authentication, for instance. Such things are beyond the scope of this blog post, but if there's the demand (comment below!) I can certainly investigate writing something up.

Found this useful? Got a better / different solution? Comment below!
## Job Scheduling on Linux

Scheduling jobs to happen at a later time on a Linux based machine can be somewhat confusing. Confused by 5 4 8-10/4 6/4 *? Baffled by 5 */4 * * *? All will be revealed!

### cron

Scheduling jobs on a Linux machine can be done in several ways. Let's start with cron - the primary program that orchestrates the whole proceeding. Its name comes from the Greek word Chronos, which means time. By filling in a crontab (read cron-table), you can tell it what to do when. It's essentially a time-table of jobs you'd like it to run.

Your Linux machine should come with cron installed already. You can check if cron is installed and running by entering this command into your terminal:

if [[ "$(pgrep -c cron)" -gt 0 ]]; then echo "Cron is installed :D"; else echo "Cron is not installed :-("; fi

If it isn't installed or running, then you'll have to investigate why this isn't the case. The most common reason is that it isn't installed. It's normally in the official repositories for most distributions - on Debian-based systems sudo apt install cron should suffice. Arch-based users may need to check that the system service is enabled, and enable it manually if not.

With cron set up and ready to go, we can start adding jobs to it. This is done by way of a crontab, as explained above. Each user has their own crontab such that they can each configure their own individual sets of jobs. To edit it, type this:

crontab -e

This will open your favourite editor with your crontab ready for editing (if you'd like to change your editor, do sudo update-alternatives --config editor or change the EDITOR environment variable). You should see a bunch of lines like this:

# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command

I'd advise you keep this for future reference - just in case you find yourself in a pinch later - so scroll down to the bottom and start adding your jobs there.

Let's look at the syntax for telling cron about a job next. This is best done by example:

0 1 * * 7   cd /root && /root/run-backup

This job, as you might have guessed, runs a custom backup script. It's one I wrote myself, but that's a story for another time (comment below if you'd like me to post about that). What we're interested in is the bit at the beginning: 0 1 * * 7. Scheduling a cron job is done by specifying 5 space-separated values. In the case of the above, the job will run at 1am every Sunday morning. The order is as follows:

• Minute
• Hour
• Day of the Month
• Month
• Day of the week

For each of these values, a number of different specifiers can be used. For example, specifying an asterisk (*) will cause the job to run at every interval of that column - e.g. every minute or every hour. If you want to run something on every minute of the day (such as a logging or monitoring script), use * * * * *. Be aware of the system resources you can use up by doing that though!

Specifying a number will restrict it to a specific time in an interval. For example, 10 * * * * will run the job at 10 minutes past every hour, and 22 3 * * * will run a job at 03:22 in the morning every day (I find such times great for maintenance jobs).

Sometimes, every hour or every minute is too often. Cron can handle this too! For example 3 */2 * * * will run a job at 3 minutes past every second hour. You can alter this at your leisure: The value after the forward slash (/) decides the interval (i.e. */3 would be every third, */15 would be every 15th, etc.).

The last column, the day of the week, is an alternative to the day of the month column. It lets you specify, as you may assume, the day of the week a job should run on. This can be specified in 2 ways: with the numbers 0-6 (0 is Sunday, and most cron implementations also accept 7 for Sunday, as in the backup example above), or with 3-letter short codes such as MON or SAT. For example, 6 20 * * WED runs at 6 minutes past 8 in the evening on Wednesday, and 0 */4 * * 0 runs every 4th hour on a Sunday.
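To make the specifier rules a bit more concrete, here's a rough bash sketch that expands a single cron field into the list of values it matches. The expand_cron_field function is a hypothetical helper of my own for illustration (it isn't part of cron), and it only understands *, plain numbers, ranges, steps, and comma-separated lists of those - not names like MON.

```bash
#!/usr/bin/env bash
# expand_cron_field SPEC MIN MAX
# Prints the values a single cron field matches, separated by spaces.
# Handles: *  N  A-B  */S  A-B/S  and comma-separated lists of those.
# Hypothetical helper for illustration; it does not handle names like MON.
expand_cron_field() {
    local spec="$1" min="$2" max="$3"
    local part range step start end value
    local -a parts=() result=()

    IFS=',' read -ra parts <<< "$spec"
    for part in "${parts[@]}"; do
        range="${part%%/*}"          # everything before the slash
        step=1
        if [[ "$part" == */* ]]; then
            step="${part##*/}"       # everything after the slash
        fi
        (( step > 0 )) || return 1   # a zero step would never terminate

        if [[ "$range" == "*" ]]; then
            start="$min"; end="$max"
        elif [[ "$range" == *-* ]]; then
            start="${range%%-*}"; end="${range##*-}"
        else
            start="$range"; end="$range"
        fi

        for (( value = start; value <= end; value += step )); do
            result+=("$value")
        done
    done

    echo "${result[*]}"
}

expand_cron_field '*/15' 0 59     # minute field:       0 15 30 45
expand_cron_field '*/4' 0 23      # hour field:         0 4 8 12 16 20
expand_cron_field '8-10/4' 1 31   # day of month field: 8
```

The second and third arguments are the lowest and highest values the column allows, which is why the same */4 means something different in the hour column than it would in the minute column.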

The combinations are endless! Since it can be a bit confusing combining all the options to get what you want, crontab.guru is great for piecing cron-job specifications together. It describes your cron-job spec in plain English for you as you type!

(Above: crontab.guru displaying a random cronjob spec)

### What if I turn my computer off?

Ok, so cron is all very well, but what if you turn your machine off? Well, if cron isn't running at the time a job should be run, then it won't get executed. For those of us who don't leave their laptops on all the time, all is not lost! It's time to introduce the second piece of software at our disposal.

Enter stage left: anacron. Built to be a complement to cron, anacron sets up 3 folders:

• /etc/cron.daily
• /etc/cron.weekly
• /etc/cron.monthly

Any executable scripts in these folders will be run at daily, weekly, and monthly intervals respectively by anacron, and it respects the hash-bang (that #! line at the beginning of the script) too!

Most server systems do not come with anacron pre-installed, though it should be present in your distribution's official repositories. Once you've installed it, edit root's crontab (with sudo crontab -e if you can't remember how) and add a job that executes anacron every hour like so:

# Run anacron every hour
5 * * * *   /usr/sbin/anacron

This is important, as anacron does not in itself run all the time like cron does (this behaviour is called a daemon in the Linux world) - it needs a helping hand to get it to run.

If you've got more specific requirements, then anacron also has its own configuration file you can edit. It's found at /etc/anacrontab, and has a different syntax. In the anacron table, jobs follow the following pattern:

• period - The interval, in days, that the job should run
• delay - The offset, in minutes, that the job should run at
• job identifier - A textual identifier (without spaces, of course) that identifies the job
• command - The command that should be executed

You'll notice that there are 3 jobs specified already - one for each of the 3 folders mentioned above. You can specify your own jobs too. Here's an example:

# Do the weekly backup
7   20  run-backup  cd /root/data-shape-backup && ./do-backup;

The above job runs every 7 days, with an offset of 20 minutes. Note that I've included a comment (the line starting with a hash #) to remind myself as to what the job does - I'd recommend you always include such a comment for your own reference - whether you're using cron, anacron, or otherwise.
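If the four columns are hard to keep straight, this little bash sketch splits a job line up and labels each field. describe_anacron_job is a hypothetical helper name I've made up for illustration; anacron does its own parsing and doesn't use anything like this.

```bash
#!/usr/bin/env bash
# describe_anacron_job LINE
# Splits one anacrontab job line into its four fields and labels them.
# Hypothetical helper for illustration only.
describe_anacron_job() {
    local period delay job_id job_command
    # read splits on whitespace; the last variable soaks up the rest of the line
    read -r period delay job_id job_command <<< "$1"
    echo "period:  every ${period} day(s)"
    echo "delay:   ${delay} minute(s)"
    echo "job id:  ${job_id}"
    echo "command: ${job_command}"
}

describe_anacron_job '7   20  run-backup  cd /root/data-shape-backup && ./do-backup;'
```

Only the first three columns are split on whitespace, so the command itself can contain as many spaces as it likes.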

I'd also recommend that you test your anacron configuration file after editing it to ensure it's valid. This is done like so:

anacron -T

#### I'm not an administrator, can I still use this?

Sure you can! If you've got anacron installed (you could even compile it from source locally if you haven't) and want to specify some jobs for your local account, then that's easily done too. Just create an anacrontab file anywhere you please, and then in your regular crontab (crontab -e), tell anacron where you put it like this:

# Run anacron every hour
5 * * * *   /usr/sbin/anacron -t "path/to/anacrontab"
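One thing to watch out for: anacron keeps a timestamp for each job in a spool directory, and the system-wide one usually isn't writable by ordinary users. As far as I'm aware, anacron's -S flag lets you point it at a spool directory of your own. Here's a hedged sketch of a per-user setup - the setup_user_anacron function, the directory layout, and the example job are all hypothetical, so adjust to taste.

```bash
#!/usr/bin/env bash
# setup_user_anacron BASE_DIR
# Creates a private anacrontab and spool directory under BASE_DIR, then prints
# the crontab line that would run anacron against them.
# The layout and the example job are hypothetical.
setup_user_anacron() {
    local base_dir="$1"
    mkdir -p "${base_dir}/spool" || return 1

    # Quoted 'EOF' so the contents are written out exactly as typed
    cat > "${base_dir}/anacrontab" <<'EOF'
# period  delay  job-identifier  command
1         10     hello-anacron   date >> /tmp/anacron-test.log
EOF

    # -t selects the table, -S the directory anacron stores its timestamps in
    echo "5 * * * *   /usr/sbin/anacron -t \"${base_dir}/anacrontab\" -S \"${base_dir}/spool\""
}

# Demo against a throwaway directory; point it somewhere permanent for real use
demo_dir="$(mktemp -d)"
setup_user_anacron "${demo_dir}"
```

The line it prints is the one to paste into your own crontab with crontab -e.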

Good point. cron and anacron are great for repeating jobs, but what if you want to set up a one-off job to auto-disable your firewall before enabling it just in case you accidentally lock yourself out? Thankfully, there's even an answer for this use-case too: atd.

atd is similar to cron in that it runs a daemon in the background, but instead of executing jobs specified in a crontab, you tell it when you want it to execute a series of commands, and then enter the commands themselves. For example:

$ at now + 10 minutes
warning: commands will be executed using /bin/sh
at> echo -e "Testing"
at> uptime
at> <EOT>
job 4 at Thu Jul 12 14:36:00 2018

In the above, I tell it to run the job 10 minutes from now, and enter a pair of commands. To end the command list, I hit CTRL + D on an empty line. The output of the job will be emailed to me automatically if I've got that set up (cron and anacron also do this).

Specifying a time can be somewhat fiddly, but it's also quite flexible:

• at tomorrow
• at now + 5 hours
• at 16:06
• at next month
• at 2018 09 25

....and so on. Listing the current scheduled jobs is also just as easy:

atq

This will output a list of scheduled jobs that haven't been run yet. You can't see any jobs that aren't created by you unless you're root (use sudo), though. You can use the job ids listed here to cancel a job too:

# Remove job id 4:
atrm 4

### Conclusion

That just about concludes this whirlwind tour of job scheduling on Linux systems. We've looked at how to schedule jobs with cron, and how to use anacron to ensure our jobs get run even when the target machine isn't turned on all the time. We've also looked at one-time jobs with atd, and how to manage the job queue.

As usual, this is a starting point - not an ending point! Job scheduling is just the beginning. From here, you can look at setting up automated backups. You could investigate setting up an email server, and how that integrates with cron. You can utilise cron to perform maintenance for your next great web (or other!) application. The possibilities are endless!

Found this useful? Still confused? Comment below!

## Read / Write Disk Performance Testing in Bash

Recently I needed to quickly (and non-destructively) test the read / write performance of a flash drive of mine. Naturally, I turned my attention to my terminal. This post is me documenting what I did so that I can remember for next time :P

Firstly, to test the speed of a disk, we need some data to test with.
Since lots of small files will inevitably cause slowdowns due to the overhead of writing the file metadata and inode information for each one, it makes the most sense to use one gigantic file rather than tons of small ones. Here's what I did to generate a 1 GiB file filled with zeroes:

dd if=/dev/zero of=/tmp/testfile.bin bs=1M count=1024

Cool. Next, we need to copy it to the target disk and measure the time it took. Then, since we know the size of the file (1073741824 bytes, to be exact), we can calculate the speed at which the copy took place. Here's my first attempt:

time dd if=/tmp/testfile.bin >testfile.bin

If you run this, you might find that it doesn't take very long at all, and you get a speed of something like ~250MiB / sec! While impressive, I seriously doubt that my flash drive has that kind of speed behind it. Typically, flash memory takes longer to write to than to read from - and I'm pretty sure that it can't read that fast either. So what's going on?

Well, it turns out that Linux is caching the disk write operations in a buffer, and then doing them in the background for us. Whilst fine for ordinary operation, this doesn't give us an accurate representation of how fast it's actually writing to the disk. Thankfully, there's something we can do about this: Use the sync command. sync will flush all cached write operations to disk for us, giving us the actual time it took to write the 1 GiB file to disk. Here's the altered command:

sync; time sh -c 'dd if=/tmp/testfile.bin >testfile.bin; sync'

Very cool! Now, we can just take the time it took and do some simple maths to calculate the write speed of our disk: divide the 1024 MiB we wrote by the elapsed time in seconds. If the copy took 64 seconds, for example, that works out at 16 MiB / sec.

What about the read speed though? Well, to test that, we'll first need to clear out the page cache - another one of Linux's (many) caches that holds portions of files that have recently been accessed for faster retrieval - because as before, we're not interested in the speed of the cache!
Here's how to do that:

echo 1 | sudo tee /proc/sys/vm/drop_caches

With the correct cache cleared, we can test the read speed accurately. Here's how I did it:

time dd if=testfile.bin of=/dev/null

Fairly simple, right? At a later date I might figure out a way of automating this, but for the occasional use now and again this works just fine :)

Found this useful? Got a better way of doing it? Want to say hi? Post in the comments below!

## Rendering LaTeX documents to PDF: Attempt #2

It was all going rather well, actually - until I discovered that pandoc doesn't support regular bibliographies / references. Upon discovering this, I ended up with a bit of a problem. Thankfully, the answer lay in pdflatex - but getting to the point where I could use it without having it crash on me (which, by the way, it can't accomplish properly - it gives an exit code of 0 when crashing! O.o) was not a trivial journey.

This blog post is a follow up to my first post on rendering LaTeX documents with pandoc, and is my attempt to document what I did to get it to work.

To start with, I installed texlive properly. Here's how to do that on apt-based systems:

sudo apt install texlive-latex-extra --no-install-recommends

The --no-install-recommends flag is useful here to avoid ~450MiB of useless documentation (in PDF form, apparently) being dumped to your hard drive. I've also got an arch-based system (it's actually Artix Linux, that I've blogged about) which I've done this on, so here's the install command for those kinds of systems:

sudo pacman -S texlive-latexextra

Once that's installed, we can use it to render our LaTeX document to PDF. Upon discussing my issues with my Lecturer at University, I discovered that you actually have to run 3 commands in succession in order to render a single PDF. Here they are:

bibtex filename
pdflatex --output-directory=. filename.tex
pdflatex --output-directory=. filename.tex

The first one compiles the bibliography using BibTeX.
If it isn't installed already, you might need to search your distribution's repositories and install it. Next, we run the LaTeX file through pdflatex from TeXLive not once but twice - as it apparently needs to resolve the references on the first pass (why it can't do them all in one pass I have no idea :P). It's also worth noting that the bibtex command doesn't like you to append the filename extension - it does it automatically, apparently. One caveat: bibtex reads the .aux file that pdflatex writes, so on a completely fresh document you'll probably need to run pdflatex once before the bibtex step.

That's about everything I've got on the process so far. If you've got anything else to add, please let me know in the comments below (I'm rather new to this whole LaTeX thing....)

### Further Reading

## Rendering LaTeX documents to PDF on Linux (and maybe Windows too)

I'm starting to write another report for University, and unlike other reports, this one apparently has to be in a rather specific format. To that end, I've got two choices, apparently: Use the provided Word / LibreOffice template, or use a LaTeX template instead. After the trouble and frustration I had with LibreOffice for my previous report, I've naturally decided that using the LaTeX template might be a good idea. After downloading it, I ended up doing some research and troubleshooting to get it to render properly to a PDF. Now that I've figured it out, I thought I'd share it here for anyone else who ends up experiencing difficulties or is unsure on how it's done.

The way I'm going to be using it is with a tool called Pandoc. First, install it like so:

sudo apt install pandoc texlive-fonts-recommended

Adjust as necessary for your distribution - Windows users will need to read the download instructions. The texlive-fonts-recommended package is ~66MiB(!), but it contains a bunch of fonts that are needed when you're rendering LaTeX documents, apparently.
With the dependencies installed, here's the command to convert a LaTeX document to a PDF:

pandoc -s input.tex -o output.pdf

Replace input.tex with the path to your input file, and output.pdf with the desired path to the output file. I haven't figured out how to set the font to sans-serif yet, but I'll probably make another post about it when I do.

Found this helpful? Still having issues? Let me know below! I don't have analytics on here, so that's the only way I'll know if anyone reads this :-)

### Sources and Further Reading

## Jump around a filesystem with a bit of bash

(Banner remixed from images found on openclipart)

I've seen things like jump, which allow you to bookmark places on your system so that you can return to them faster. The trouble is, I keep forgetting to use it. I open the terminal and realise that I need to be in a specific directory, and forget to bookmark it once I cd to it - or I forget that I bookmarked it and cd my way there anyway :P

To solve the problem, I thought I'd try implementing my own simplified system, under the names teleport, telepeek, and telepick. Obviously, we'll have to put these scripts in something like .bash_aliases as functions - otherwise they won't cd in the terminal itself. Let's start with teleport:

function teleport() {
    cd "$(find . -type d | grep -iP "$@" | head -n1)";
}

Not bad for a first attempt! Basically, it does a find to list all the subdirectories in the current directory, filters the results with the specified regex, and changes directory to the first result returned. Here's an example of how it's used:

~$ teleport 'pep.*mint'
~/Documents/code/some/path/pepperminty-wiki/ $

We can certainly improve it though. Let's start by removing that head call:

function teleport() {
    cd "$(find . -type d | grep -m1 -iP "$@")";
}

What about all those Permission denied messages that pop up when you're jumping around places where you don't have permission to look inside every directory? Let's suppress those too:

function teleport() {
    cd "$(find . -type d 2>/dev/null | grep -m1 -iP "$@")";
}

Much better. With a teleport command in hand, it might be nice to inspect the list of directories the find + grep combo finds. For that, let's invent a telepeek variant:

function telepeek() {
    find . -type d 2>/dev/null | grep -iP "$@" | less
}

Very cool. It doesn't have line numbers though, and they're useful. Let's fix that:

function telepeek() {
    find . -type d 2>/dev/null | grep -iP "$@" | less -N
}

Better, but I'd prefer them to be highlighted so that I can tell them apart from the directory paths. For that, we've got to change our approach to the problem:

function telepeek() {
    find . -type d 2>/dev/null | grep -iP "$@" | cat -n | sed 's/^[ 0-9]*[0-9]/\o033[34m&\o033[0m/' | less -R
}

By using a clever combination of cat -n to add the line numbers and a strange sed recipe (which I found in a comment on this Stack Overflow answer) to highlight the numbers themselves, we can get the result we want.
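That sed recipe is less strange once it's pulled apart, so here's a small sketch of just the numbering and highlighting step on its own. number_and_highlight is a hypothetical name of my own, and I've swapped sed's \o033 escape for a printf-generated escape character so that it doesn't rely on that particular sed extension.

```bash
#!/usr/bin/env bash
# number_and_highlight
# Reads lines on stdin, numbers them with cat -n, and colours the numbers blue.
# Hypothetical helper name; printf supplies the ESC byte instead of sed's \o033.
number_and_highlight() {
    local esc
    esc="$(printf '\033')"
    # ^[ 0-9]*[0-9]  the padding and digits that cat -n puts at the line start
    # &              the text that was matched, i.e. the number itself
    # ESC[34m        switch to a blue foreground; ESC[0m resets the colour
    cat -n | sed "s/^[ 0-9]*[0-9]/${esc}[34m&${esc}[0m/"
}

printf '%s\n' './Documents' './Documents/code' | number_and_highlight
```

Piping the result into less -R, as telepeek does, tells less to pass the colour codes through to the terminal instead of printing them as raw characters.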

This telepeek command has given me an idea. Why not ask for an index to jump to after going to the trouble of displaying line numbers and jump to that directory? Let's cook up a telepick command!

function telepick() {
    telepeek "$@";
    read -p "jump to index: " line_number;
    cd "$(find . -type d 2>/dev/null | grep -iP "$@" | sed "${line_number}q;d")";
}

That wasn't too hard. By using a few different commands rather like lego bricks, we can very easily create something that does what we want with minimal effort. The read -p "jump to index: " line_number bit fetches the index that the user wants to jump to, and sed comes to the rescue again to pick out the line number we're interested in with sed "${line_number}q;d".

Update April 20th 2018: I've updated the approach here to support spaces everywhere by adding additional quotes, and utilising $@ instead of $1.
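If you'd like something a touch more defensive, here's one possible variation that keeps the find + grep pipeline in a single place and checks the index before jumping. The _teleport_matches and _teleport_nth helpers are hypothetical names of my own, this version takes a single pattern argument, and it uses grep -E rather than -P so that it also works where grep lacks PCRE support - treat it as a sketch rather than a drop-in replacement.

```bash
#!/usr/bin/env bash
# List directories below the current one whose path matches a regex.
# Hypothetical helper; uses grep -E instead of the -P used elsewhere in this post.
_teleport_matches() {
    find . -type d 2>/dev/null | grep -iE -- "$1"
}

# Print the Nth match (1-based) for a pattern, or fail if there isn't one.
_teleport_nth() {
    local pattern="$1" index="$2" target
    [[ "$index" =~ ^[1-9][0-9]*$ ]] || return 1
    target="$(_teleport_matches "$pattern" | sed "${index}q;d")"
    [[ -n "$target" ]] && printf '%s\n' "$target"
}

# Show numbered matches, ask for an index, and cd there if it's valid.
telepick() {
    local line_number target
    _teleport_matches "$1" | cat -n
    read -r -p "jump to index: " line_number
    target="$(_teleport_nth "$1" "$line_number")" || {
        echo "No such index: ${line_number}" >&2
        return 1
    }
    cd "$target" || return 1
}
```

Typing something that isn't a valid index now prints an error and leaves you where you were, instead of handing an empty string to cd.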

Art by Mythdael