Archive


## NAS Backups, Part 1: Overview

After building my nice NAS, the next thing on my sysadmin todo list was to ensure it is backed up. In this miniseries (probably 2 posts), I'm going to talk about the backup NAS that I've built. As of the time of typing, it is successfully backing up my Btrfs subvolumes.

In this first post, I'm going to give an overview of the hardware I'm using and the backup system I've put together. In future posts, I'll go into detail as to how the system works, and which areas I still need to work on moving forwards.

Personally, I find that the 3-2-1 backup strategy is a good rule of thumb:

• 3 copies of the data
• 2 different media
• 1 off-site

What this means is that you should have 3 copies of your data, with 2 local copies and one remote copy in a different geographical location. To achieve this, I have first solved the problem of the local backup copy, since it's a lot easier and less complicated than the remote one. Although I've talked about backups before (see also), in this case my solution is slightly different - partly due to the amount of data involved, and partly due to additional security measures I'm trying to put in place.

### Hardware

For hardware, I'm using a Raspberry Pi 4 with 4GB RAM (the same as the rest of my cluster), along with 2 x 4 TB USB 3 external hard drives. This is a fairly low-cost and low-performance solution. I don't particularly care how long the backup takes to complete, and the hardware is relatively cheap to replace if it fails - without being unreliable (Raspberry Pis are tough little things).

Here's a diagram of how I've wired it up:

(Can't see the above? Try a direct link to the SVG. Created with drawio.)

I use USB Y-cables to power the hard drives directly from the USB power supply, as the Pi is unlikely to be able to supply enough power for multiple external hard drives on its own.

Important Note: As I've discovered with a different unrelated host on my network, if you do this you can back-power the Pi through the USB Y cable, and potentially corrupt the microSD card by doing so. Make sure you switch off the entire USB power supply at once, rather than unplug just the Pi's power cable!

For a power supply, I'm using an Anker 10 port device (though I bought mine through Amazon, since I wasn't aware that Anker had their own website) - the same one that powers my Pi cluster.

### Strategy

To do the backup itself, I'm taking advantage of the fact that I store my data in Btrfs subvolumes, using the btrfs send / btrfs receive commands to send my subvolumes to the remote backup host over SSH. This has a number of benefits:

1. The host doing the backing up has no access to the resulting backups (so if it gets infected it can't also infect the backups)
2. The backups are read-only Btrfs snapshots (so if the backup NAS gets infected my backups can't be altered without first creating a read-write snapshot)
3. Backups are done incrementally to save time, but a full backup is done automatically on the first run (or if the local metadata is missing)
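
To make that concrete, here's a rough sketch of what sending a subvolume over SSH looks like. The paths and the backup-nas hostname here are made up, and this isn't the exact pair of scripts I'll be sharing in the next post:

# First run: create a read-only snapshot and stream the whole thing to the backup host
sudo btrfs subvolume snapshot -r /mnt/storage/documents /mnt/storage/.sent/documents-1
sudo btrfs send /mnt/storage/.sent/documents-1 | ssh backup-nas sudo btrfs receive /mnt/backups/documents

# Subsequent runs: send only the difference relative to the previous snapshot
sudo btrfs subvolume snapshot -r /mnt/storage/documents /mnt/storage/.sent/documents-2
sudo btrfs send -p /mnt/storage/.sent/documents-1 /mnt/storage/.sent/documents-2 | ssh backup-nas sudo btrfs receive /mnt/backups/documents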

While my previous backup solution (using Restic for the server that sent you this web page) has point #3 on my list above, it doesn't have points 1 and 2.

Restic does encrypt backups at rest though, which the system I'm setting up doesn't do unless you use LUKS to encrypt the underlying disks that Btrfs stores its data on. More on that in the future: I have tentative plans to deal with my off-site backup problem using a technique similar to the one I've used here, which also encrypts data at rest when a backup isn't taking place.
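
For reference, putting LUKS underneath Btrfs is conceptually straightforward - something along these lines (a sketch only: /dev/sdX is a placeholder, and luksFormat will destroy anything already on the target disk):

sudo cryptsetup luksFormat /dev/sdX          # Encrypt the raw disk (prompts for a passphrase)
sudo cryptsetup luksOpen /dev/sdX backups    # Unlock it, exposing /dev/mapper/backups
sudo mkfs.btrfs /dev/mapper/backups          # Put Btrfs on top of the unlocked mapping
sudo mount /dev/mapper/backups /mnt/backups  # Mount as normal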

In the next post, I'll be diving into the implementation details for the backup system I've created and explaining it in more detail - including sharing the pair of scripts that I've developed that do the heavy lifting.

## Installing Python, Keras, and Tensorflow from source

I found myself in the interesting position recently of needing to compile Python from source. The reasoning behind this is complicated, but it boils down to a need to use Python with Tensorflow / Keras for some natural language processing AI, as Tensorflow.js isn't going to cut it for the next stage of my PhD.

The target upon which I'm aiming to be running things currently is Viper, my University's high-performance computer (HPC). Unfortunately, the version of Python on said HPC is rather old, which necessitated obtaining a later version. Since I obviously don't have sudo permissions on Viper, I couldn't use the default system package manager. Incredibly, pre-compiled Python binaries are not distributed for Linux either, which meant that I ended up compiling from source.

I am going to be assuming that you have a directory at $HOME/software in which we will be working. In there, there should be a number of subdirectories:

• bin: For binaries, already added to your PATH
• lib: For library files - we'll be configuring this correctly in this guide
• repos: For git repositories we clone

Make sure you have your snacks - this was a long ride to figure out and write - and it's an equally long ride to follow. I recommend reading this all the way through before actually executing anything, to get an overall idea as to the process you'll be following and the assumptions I've made to keep this post a reasonable length.

### Setting up

Before we begin, we need some dependencies:

• gcc - The compiler
• git - For checking out the cpython git repository
• readline - An optional dependency of cpython (presumably for the REPL)

On Viper, we can load these like so:

module load utilities/multi
module load gcc/10.2.0
module load readline/7.0

### Compiling openssl

We also need to clone the openssl git repo and build it from source:

cd ~/software/repos
git clone git://git.openssl.org/openssl.git;    # Clone the git repo
cd openssl;                                     # cd into it
git checkout OpenSSL_1_1_1-stable;              # Checkout the latest stable branch (do git branch -a to list all branches; Python will complain at you during build if you choose the wrong one and tell you what versions it supports)
./config;                                       # Configure openssl ready for compilation
make -j "$(nproc)"                              # Build openssl
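
Optionally - this isn't strictly required, and it takes a while - openssl ships with a test suite you can run at this point to check the build:

make test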

With openssl compiled, we need to copy the resulting binaries to our ~/software/lib directory:

cp lib*.so* ~/software/lib;
# We're done, cd back to the parent directory
cd ..;

To finish up openssl, we need to update some environment variables to let the C++ compiler and linker know about it, but we'll talk about those after dealing with another dependency that Python requires.

### Compiling libffi

libffi is another dependency of Python that's needed if you want to use Tensorflow. To start, go to the libffi GitHub releases page in your web browser, and copy the URL for the latest release file. It should look something like this:

cd ~/software/lib
curl -OL URL_HERE

Note that we do it this way, because otherwise we'd have to run the autogen.sh script which requires yet more dependencies that you're unlikely to have installed.

Then extract it and delete the tar.gz file:

tar -xzf libffi-3.3.tar.gz
rm libffi-3.3.tar.gz

Now, we can configure and compile it:

./configure --prefix=$HOME/software
make -j "$(nproc)"

Before we install it, we need to create a quick alias:

cd ~/software;
ln -s lib lib64;
cd -;

libffi for some reason likes to install to the lib64 directory, rather than our pre-existing lib directory, so creating an alias makes it so that it installs to the right place.

### Updating the environment

Now that we've dealt with the dependencies, we need to update our environment so that the compiler knows where to find them. Do that like so:

export LD_LIBRARY_PATH="$HOME/software/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}";
export LDFLAGS="-L$HOME/software/lib -L$HOME/software/include $LDFLAGS";
export CPPFLAGS="-I$HOME/software/include -I$HOME/software/repos/openssl/include -I$HOME/software/repos/openssl/include/openssl $CPPFLAGS"

It is also advisable to update your ~/.bashrc with these settings, as you may need to come back and recompile a different version of Python in the future.
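
For example, the relevant exports could live directly in ~/.bashrc, or in a small file that ~/.bashrc sources. This is just a sketch - adjust the paths if your layout differs:

# Persist the toolchain environment (sourced from ~/.bashrc)
export PATH="$HOME/software/bin:$PATH";
export LD_LIBRARY_PATH="$HOME/software/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}";
export LDFLAGS="-L$HOME/software/lib -L$HOME/software/include $LDFLAGS";
export CPPFLAGS="-I$HOME/software/include -I$HOME/software/repos/openssl/include -I$HOME/software/repos/openssl/include/openssl $CPPFLAGS";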

Personally, I have a file at ~/software/setup.sh which I run with source $HOME/software/setup.sh in my ~/.bashrc file to keep things neat and tidy.

### Compiling Python

Now that we have openssl and libffi compiled, we can turn our attention to Python. First, clone the cpython git repo:

git clone https://github.com/python/cpython.git
cd cpython;

Then, checkout the latest tag. This essentially checks out the latest stable release:

git checkout "$(git tag | grep -ivP '[ab]|rc' | tail -n1)"

Important: If your intention is to use Tensorflow, check the Tensorflow install page for supported Python versions. It's probable that it doesn't yet support the latest version of Python, so you might need to checkout a different tag here. For some reason, Python is really bad at propagating new versions out to the community quickly.

Before we can start the compilation process, we need to configure it. We're going for performance, so execute the configure script like so:

./configure --with-lto --enable-optimizations --with-openssl=/absolute/path/to/openssl_repo_dir

Replace /absolute/path/to/openssl_repo_dir with the absolute path to the above openssl repo.

Now, we're ready to compile Python. Do that like so:

make -j "$(nproc)"

This will take a while, but once it's done it should have built Python successfully. For a sanity check, we can also test it like so:

make -j "$(nproc)" test

The Python binary compiled should be called simply python, and be located in the root of the git repository. Now that we've compiled it, we need to make a few tweaks to ensure that our shell uses our newly compiled version by default and not the older version from the host system. Personally, I keep my ~/bin folder under version control, so I install host-specific software to ~/software, and put ~/software/bin in my PATH like so:

export PATH=$HOME/software/bin:$PATH

With this in mind, we need to create some symbolic links in ~/software/bin that point to our new Python installation:

cd $HOME/software/bin;
ln -s relative/path/to/python_binary python
ln -s relative/path/to/python_binary python3
ln -s relative/path/to/python_binary python3.9

Replace relative/path/to/python_binary with the relative path to the Python binary we compiled above.

To finish up the Python installation, we need to get pip up and running, the Python package manager. We can do this using the inbuilt ensurepip module, which can bootstrap a pip installation for us:

python -m ensurepip --user

This bootstraps pip into our local user directory. This is probably what you want, since if you try to install it directly (i.e. without --user), the shebang incorrectly points to the system's version of Python, which doesn't exist.
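
As a quick sanity check (my own addition - not part of the original process), you can confirm that pip now belongs to the freshly compiled interpreter:

python -m pip --version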

Then, update your ~/.bash_aliases and add the following:

export LD_LIBRARY_PATH=/absolute/path/to/openssl_repo_dir/lib:$LD_LIBRARY_PATH;
alias pip='python -m pip'
alias pip3='python -m pip'

...replacing /absolute/path/to/openssl_repo_dir with the path to the openssl git repo we cloned earlier.

The next stage is to use virtualenv to locally install the Python packages that we want to use for our project. This is good practice, because it keeps our dependencies locally installed to a single project, so they don't clash with different versions in other projects. Before we can use virtualenv though, we have to install it:

pip install virtualenv

Unfortunately, Python / pip is not very clever at detecting the actual Python installation location, so in order to actually use virtualenv, we have to use a wrapper script - because the shebang in the main ~/.local/bin/virtualenv entrypoint does not use /usr/bin/env to auto-detect the Python binary location. Save the following to ~/software/bin (or any other location that's in your PATH ahead of ~/.local/bin):

#!/usr/bin/env bash
exec python ~/.local/bin/virtualenv "$@"

For example:

# Write the script to disk
nano ~/software/bin/virtualenv;
# chmod it to make it executable
chmod +x ~/software/bin/virtualenv

### Installing Keras and tensorflow-gpu

With all that out of the way, we can finally use virtualenv to install Keras and tensorflow-gpu. Let's create a new directory and create a virtual environment to install our packages in:

mkdir tensorflow-test
cd tensorflow-test;
virtualenv "$PWD";
source bin/activate;

Now, we can install Tensorflow & Keras:

pip install tensorflow-gpu

It's worth noting here that Keras is a dependency of Tensorflow. Tensorflow has a number of alternate package names you might want to install instead depending on your situation:

• tensorflow: Stable tensorflow without GPU support - i.e. it runs on the CPU instead.
• tf-nightly-gpu: Nightly tensorflow for the GPU. Useful if your version of Python is newer than the version of Python supported by Tensorflow

Once you're done in the virtual environment, exit it like this:

deactivate

Phew, that was a huge amount of work! Hopefully this sheds some light on the maddeningly complicated process of compiling Python from source. If you run into issues, you're welcome to comment below and I'll try to help you out - but you might be better off asking the Python community instead, as they've likely got more experience with Python than I have.

### Sources and further reading

## applause-cli: A Node.js CLI handling library

Continuing in the theme of things I've forgotten to talk about, I'd like to post about another package I released a little while ago. I've been building a number of command line interfaces for my PhD, so I thought it would be best to use a library for this function. I found clap, but it didn't quite do what I wanted - so I wrote my own inspired by it.

Soon enough I needed to use the code in several different projects, so I abstracted the logic out and called it applause-cli, which you can now find on npm. It has no dependencies, and it allows you to define a set of arguments and have it parse out the values from a given input array of items automatically. Here's an example of how it works:

import Program from 'applause-cli';

let program = new Program("path/to/package.json");
program.argument("food", "Specifies the food to find.", "apple")
    .argument("count", "The number of items to find", 1, "number");
program.parse(process.argv.slice(2)); // Might return { food: "banana", count: 6 }

I even have automated documentation generated with documentation and uploaded to my website via Continuous Integration: https://starbeamrainbowlabs.com/code/applause-cli/. I've worked pretty hard on the documentation for this library - it even has integrated examples to show you how to use each function!

The library can also automatically generate help output from the provided information when the --help argument is detected - though I have yet to improve the output if a subcommand is called (e.g. mycommand dostuff --help) - this is on my todo list :-) Here's an example of the help text it automatically generates:

If this looks like something you'd be interested in using, I recommend checking out the npm package here: https://www.npmjs.com/package/applause-cli

For the curious, applause-cli is open-source under the MPL-2.0 licence. Find the code here: https://github.com/sbrl/applause-cli.

## NAS, Part 4: Time machines | Automatic snapshotting with btrfs-snapshot

In the last part in this series, I compared ZFS with Btrfs. I ended up choosing Btrfs because it was easier to install and came with a number of advantages. Since last time, I've now put Btrfs to work and have about ~1.3 TiB of data stored in it (much of which is from various devices across the network automatically backing up to it). Before we continue, here's a list of the parts in the series so far:

In this post, I'm going to talk about the automatic snapshotting I've set up.
Btrfs supports creating snapshots, which are defined as subvolumes that are seeded with data from another subvolume (boundaries between subvolumes are not crossed). Most of the time, these are created to be read-only. In addition, because of the copy-on-write system Btrfs uses, a snapshot takes no disk space on its own (other than that required to store the fact that it exists) - it only starts to consume disk space when files that it contains are modified in the original subvolume.

To this end, we can efficiently keep a rotating series of snapshots to serve as an initial safety net should someone accidentally delete a file. Of course, we can't assume that snapshots will be ok as the only backup (I use Restic for that - I'm in the process of reconfiguring it for my new setup) - but they are still useful things to have.

To take a Btrfs snapshot, you can do this:

sudo btrfs subvolume snapshot -r path/to/source_subvolume path/to/target

The problem here, of course, is that you also need a way to delete old snapshots too. While I could roll my own solution for this, I figured that someone had already solved this problem - so it might save me some effort if I looked for a pre-existing solution first.

After doing a bit of searching without success, I asked on Reddit, and the helpful folks there gave me a number of suggestions. Of these 3, snapper seemed to be the most popular. From some reading, it appeared to be powerful and flexible - at the cost of being easy to understand. btrbk seemed to be feature-packed too, but in the end I decided on btrfs-snapshot.

btrfs-snapshot is designed to be used with cron. For example, I have something like this for one of my subvolumes in the root user's crontab:

0 * * * *    /root/btrfs-snapshot-rotation/btrfs-snapshot path/to/subvolume path/to/subvolume/.snapshots hourly 8
0 2 * * *    /root/btrfs-snapshot-rotation/btrfs-snapshot path/to/subvolume path/to/subvolume/.snapshots daily 4
0 2 * * 7    /root/btrfs-snapshot-rotation/btrfs-snapshot path/to/subvolume path/to/subvolume/.snapshots weekly 4

Given a subvolume at path/to/subvolume, it creates the following snapshots in a nested subvolume at path/to/subvolume/.snapshots (which needs to be created manually: sudo btrfs subvolume create path/to/subvolume/.snapshots):

• 8 x hourly snapshots
• 4 x daily snapshots
• 4 x weekly snapshots

I find the system beautifully simple and easy to understand. This is important for me in a system like this, as it has to be easy for me to understand when I inevitably come back to it months or even years later when I've forgotten how it works. The arguments to btrfs-snapshot are easy to understand, and are in the form path/to/source path/to/target tag_name number_of_snapshots_to_keep. This has the added bonus that if a user deletes a file accidentally in our shared drive, they can retrieve it on their own from the .snapshots directory - without my intervention.

With this in place and the data (mostly) moved over, my NAS project is almost complete. The final task I have left to do is to set up a proper backup system with Restic to either a remote (e.g. Backblaze B2) or offline location (such as an external HDD). The latter might prove to be a problem though, since the maximum amount of data I can store right now is 5.5 TiB and is only going to grow from there. Portable external hard drives I've seen online don't appear to go up that high, so I suspect I'll need to choose another plan.
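
As an aside, here's roughly what that self-service restore from the .snapshots directory looks like in practice. The snapshot name below is made up - the real names depend on the tag and timestamp that btrfs-snapshot assigns:

# See which snapshots currently exist
ls path/to/subvolume/.snapshots/
# Copy the accidentally deleted file back out of a read-only snapshot
cp path/to/subvolume/.snapshots/hourly_2021-05-01_120001/documents/report.odt path/to/subvolume/documents/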
Should I encounter some interesting issues when setting this final backup step up, I'll make an additional post in this series. If not though, this will probably be the last entry in this series. If you have any questions about my setup, please comment below! I'll do my best to answer any questions.

## Digitising old audio CDs on a Linux Server

A number of people I know own a number of audio / music CDs. This is great, but unfortunately laptops increasingly aren't coming with an optical drive any more, which makes listening to said CDs challenging. To this end, making a digital copy to add to their personal digital music collections would be an ideal solution.

Recently, I built a new storage NAS (which I'm still in the process of deciding on a filesystem for, but I think I might be going with btrfs + raid1), and the Fractal Design Node 804 case I used has a dedicated space for a slimline DVD writer (e.g. like the one you might find in a car). I've found this to be rather convenient for making digital copies of old audio CDs, and wanted to share the process by which I do it in case you'd like to do it too.

To start, I'm using Ubuntu Server 20.04. This may work on other distributions too, but there are a whole bunch of packages you'll need to install - the names and commands for which you may need to convert for your distribution.

To make the digital copies, we'll be using abcde. I can't find an updated website for it, but it stands for "A Better CD Encoder". It neatly automates much of the manual labor of digitising CDs - including the downloading of metadata from the Internet. To tidy things up after abcde has run to completion, we'll be using ffmpeg for conversion and eyeD3 for mp3 metadata manipulation. To get started, let's install some stuff!

sudo apt install --no-install-recommends abcde
sudo apt install ffmpeg mkcue eyed3 flac glyrc cdparanoia imagemagick

Lots of dependencies here. Many of them are required by abcde for various features we'll be making use of.

Next, insert the audio CD into the DVD drive. abcde assumes your DVD drive is located at /dev/sr0 I think, so if it's different you'll have to adjust the flags you pass to it. Once done, we can call abcde and get it to make a digital copy of our CD. I recommend here that you cd to a new blank directory, as abcde creates 1 subdirectory of the current working directory for each album it copies. When you're ready, start abcde:

abcde -o flac -B -b

Here, we call abcde and ask it to save the digital copy as flac files. The reason we do this and not mp3 directly is that I've observed abcde gets rather confused with the metadata that way. By saving to flac files first, we can ensure the metadata is saved correctly. The arguments above do the following:

• -o flac: Save to flac files
• -B: Automatically embed the album art into the saved music files if possible
• -b: Preserve the relative volume differences between tracks in the album (if replaygain is enabled, which by default I don't think it is)

It will ask you a number of questions interactively. Once you've answered them, it will get to work copying the audio from the CD. When it's done, everything should be good to go! However flac files can be large, so something more manageable is usually desired. For this, we can mass-convert our flac files to MP3.
This can be done like so:

find -iname '*.flac' -type f -print0 | nice -n20 xargs -P "$(nproc)" --null --verbose -n1 -I{} sh -c 'old="{}"; new="${old%.*}.mp3"; ffmpeg -i "${old}" -ab 320k -map_metadata 0 -id3v2_version 3 "${new}";';

There's a lot to unpack here! Before I do though, let's turn it into a bash function real quick which we can put in ~/.bash_aliases for example to make it easy to invoke in the future:

# Usage:
#     flac2mp3
#     flac2mp3 path/to/directory
flac2mp3() {
    dir="${1}";
    if [[ -z "${dir}" ]]; then dir="."; fi
    find "${dir}" -iname '*.flac' -type f -print0 | nice -n20 xargs -P "$(nproc)" --null --verbose -n1 -I{} sh -c 'old="{}"; new="${old%.*}.mp3"; ffmpeg -i "${old}" -ab 320k -map_metadata 0 -id3v2_version 3 "${new}";';
}

Ah, that's better. Now, let's deconstruct it and figure out how it works. First, we have a dir variable which, by default, is set to the current working directory.

Next, we use the one-liner from before to mass-convert all flac files in the target directory recursively to mp3. It's perhaps easier to digest if we separate it out into multiple lines:

find "${dir}" -iname '*.flac' -type f -print0    # Recursively find all flac files, delimiting them with NULL (\0) characters
    | nice -n20         # Run the pipeline at a lower scheduling priority so it doesn't hog the machine
    xargs               # For each line of input, execute a command
        --null          # Lines are delimited by NULL (\0) characters
        --verbose       # Print the command that is about to be executed
        -P "$(nproc)"   # Parallelise across as many cores as the machine has
        -n1             # Only pass 1 line to the command to be executed
        -I{}            # Replace {} with the filename in question
        sh -c '         # Run this command
            old="{}";                   # The flac filename
            new="${old%.*}.mp3";        # Replace the .flac file extension with .mp3
            ffmpeg                      # Call ffmpeg to convert it to mp3
                -i "${old}"             # Input the flac file
                -ab 320k                # Encode to 320kbps, the max supported by ffmpeg
                -id3v2_version 3        # Set the metadata tags version (may not be necessary)
                -c:v copy -disposition:v:0 attached_pic    # Copy the album art if it exists
                "${new}";               # Output to mp3
        ';              # End of command to be executed

Obviously it won't actually work when exploded and commented like this, but hopefully it gives a sense of how it functions. I recommend checking that the album art has been transferred over. The -c:v copy -disposition:v:0 attached_pic bit in particular is required to ensure this happens (see this Unix Stack Exchange answer to a question I asked).

Sometimes abcde is unable to locate album art too, so you may need to find and download it yourself. If so, then this one-liner may come in handy:

find . -type f -iname '*.mp3' -print0 | xargs -0 -P "$(nproc)" eyeD3 --add-image "path/to/album_art.jpeg:FRONT_COVER:";

Replace path/to/album_art.jpeg with the path to the album art. Wrapping it in a bash function ready for ~/.bash_aliases makes it easier to use:

mp3cover() {
cover="${1}"; dir="${2}";

if [[ -z "${cover}" ]] || [[ -z "${dir}" ]]; then
echo "Usage:" >&2;
echo "    mp3cover path/to/cover_image.jpg path/to/album_dir";
return 0;
fi

find "${dir}" -type f -iname '*.mp3' -print0 | xargs -0 -P "$(nproc)" eyeD3 --add-image "${cover}:FRONT_COVER:";
}

Use it like this:

mp3cover path/to/cover_image.jpg path/to/album_dir

By this point, you should have successfully managed to make a digital copy of an audio CD. If you're experiencing issues, comment below and I'll try to help out. Note that if you experience any issues with copy protection (I think this is only DVDs / films and not audio CDs, which I don't intend to investigate), I can't and won't help you, because it's there for a reason (even if I don't like it) and it's illegal to remove it - so please don't comment in this specific case.

## Resizing Encrypted LVM Partitions on Linux

I found recently that I needed to resize some partitions on my new laptop, as the Ubuntu installer helpfully decided to create only a 1GB swap partition, which is nowhere near enough for hibernation (you need a swap partition that's at least as big as your computer's RAM in order to hibernate). Unfortunately resizing my swap partition didn't allow me to hibernate successfully in the end, but I thought I'd still document the process here for future reference should I need to do it again.

The key problem with resizing one's root partition is that you can't resize it without unmounting it, and you can't unmount it without turning off your computer. To get around this, we need to use a live distribution of Ubuntu. It doesn't actually matter how you boot into this - personally my preferred method is a multiboot USB flash drive, but you could just as well flash the latest Ubuntu ISO to a flash drive directly.

Before you start though, it's worth mentioning that you really should have a solid backup strategy. While everything will probably be fine, there is a chance that you'll make a mistake and wind up losing a lot of data. My favourite website that illustrates this is The Tao of Backup. Everyone who uses a computer (technically minded or not) should read it. Another way to remember it is the 3-2-1 rule: 3 backups, in 2 locations, with 1 off-site (i.e. in a different physical location).

Anyway, once you've booted into a live Ubuntu environment, open the terminal, and start a root shell. Your live distribution should come with LUKS and LVM already, but just in case it doesn't, execute the following:

sudo apt update && sudo apt install -y lvm2 cryptsetup

I've talked about LVM recently when I was setting up an LVM-managed partition on an extra data hard drive for my research data. If you've read that post, then the process here may feel a little familiar to you. In this case, we're interacting with a pre-existing LVM setup that's encrypted with LUKS instead of setting up a new one. The overall process looks a bit like this:

With this in mind, let's get started. The first order of business is unlocking the LUKS encryption on the drive. This is done like so:

sudo modprobe dm-crypt
sudo cryptsetup luksOpen /dev/nvme0n1p3 crypt1

The first command there ensures that the LUKS kernel module is loaded if it isn't already, and the second unlocks the LUKS-encrypted drive. Replace /dev/nvme0n1p3 with the path to your LVM partition - e.g. /dev/sda1 for instance. The second command will prompt you for the password to unlock the drive.

It's worth mentioning here before continuing the difference between physical partitions and LVM partitions. Physical partitions are those found in the partition table on the physical disk itself, that you may find in a partition manager like GParted.
LVM partitions - for the purpose of this blog post - are those exposed by LVM. They are virtual partitions that don't have a physical counterpart on disk and are handled internally by LVM. As far as I know, you can't easily ask LVM where it stores them on disk - this is calculated and managed automatically for you.

In order to access our logical LVM partitions, the next step is to bring up LVM. To do this, we need to get LVM to re-scan the available physical partitions, since we've just unlocked the one we want it to use:

sudo vgscan --mknodes

Then, we activate it:

sudo vgchange -ay

At this point, we can now do our maintenance and make any changes we need to. A good command to remember here is lvdisplay, which lists all the available LVM partitions and their paths:

sudo lvdisplay

In my case, I have /dev/vgubuntu/root and /dev/vgubuntu/swap_1. tldr-pages (for which I'm a maintainer) has a number of great LVM-related pages that were contributed relatively recently which are really helpful here.

For example, to resize a logical LVM partition to be a specific size, do something like this:

sudo lvresize -L 32G /dev/vgubuntu/root

To extend a partition to fill all the remaining available free space, do something like this:

sudo lvextend -l +100%FREE /dev/vgubuntu/root

After resizing a partition, don't forget to run resize2fs. It ensures that the ext4 filesystem on top matches the same size as the logical LVM partition:

sudo resize2fs /dev/vgubuntu/root

In all of the above, replace /dev/vgubuntu/root with the path to your logical LVM partition in question, of course.

Once you're done making changes, we need to stop LVM and close the LUKS encrypted disk to ensure all the changes are saved properly and to avoid any issues. This is done like so:

sudo vgchange -an
sudo cryptsetup luksClose crypt1

With that, you're done! You can now reboot / shutdown from inside the live Ubuntu environment and boot back into your main operating system.

All done! Found this helpful? Encountering issues? Comment below! It really helps my motivation.

## Run a Program on your dedicated Nvidia graphics card on Linux

I've got a new laptop, and in it I have an Nvidia graphics card. The initial operating system installation went ok (Ubuntu 20.10 Groovy Gorilla is my Linux distribution of choice here), but after I'd done a bunch of configuration and initial setup tasks it inevitably came around to the point in time that I wanted to run an application on my dedicated Nvidia graphics card. Doing so is actually really quite easy - it was figuring out the how that was the hard bit! To that end, this is a quick post to document how to do this so I don't forget next time.

Before you continue, I'll assume here that you're using the Nvidia proprietary drivers. If not, consult your distribution's documentation on how to install and enable them.

With Nvidia cards, it depends greatly on what you want to run as to how you go about doing this. For CUDA applications, you need to install the nvidia-cuda-toolkit package:

sudo apt install nvidia-cuda-toolkit

...and then there will be an application-specific menu you'll need to navigate to tell it which device to use.

For traditional graphical programs, the process is different. In the NVIDIA X Server Settings, go to PRIME Profiles, and then ensure NVIDIA On-Demand mode is selected. If not, select it and then reboot.
Then, to launch a program on your Nvidia dedicated graphics card, do this:

__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia command_name arguments

In short, the __NV_PRIME_RENDER_OFFLOAD environment variable must be set to 1, and the __GLX_VENDOR_LIBRARY_NAME environment variable must be set to nvidia. If you forget the latter here or try using DRI_PRIME=1 instead of both of these (as I advise in my previous post about AMD dedicated graphics cards), you'll get an error like this:

libGL error: failed to create dri screen
libGL error: failed to load driver: nouveau

The solution is to set the above environment variables instead of DRI_PRIME=1. Thanks to this Ask Ubuntu post for the answer!

Given that I know I'll be wanting to use this regularly, I've created a script in my bin folder for it. It looks like this:

#!/usr/bin/env bash
export __NV_PRIME_RENDER_OFFLOAD=1;
export __GLX_VENDOR_LIBRARY_NAME=nvidia;

$@;

Save the above to somewhere in your PATH (e.g. your bin folder, if you have one). Don't forget to run chmod +x path/to/file to mark it executable.
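
For example, if you saved the script above as ~/bin/nvidia-run (the name is entirely up to you), usage would look something like this. glxinfo comes from the mesa-utils package and is handy for checking which GPU is actually doing the rendering:

# Run a program on the dedicated card
nvidia-run path/to/some_program

# Verify that the Nvidia card is the one being used
nvidia-run glxinfo | grep "OpenGL renderer"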

Found this useful? Does this work on other systems and setups too? Comment below!

## Stitching videos from frames with ffmpeg (and audio/video editing tricks)

ffmpeg is awesome. In case you haven't heard, ffmpeg is a command-line tool for editing and converting audio and video files. It's taken me a while to warm up to it (the command-line syntax is pretty complicated), but since I've found myself returning to it again and again I thought I'd blog about it so you can use it too (and also for my own reference :P).

I have 2 common tasks I perform with ffmpeg:

1. Converting audio/video files to a more efficient format
2. Stitching PNG images into a video file

I've found that most tasks will fall into 1 of the 2 boxes. Sometimes I want to do something more complicated, but I'll usually just look that up and combine the flags I find there with the ones I usually use for one of the 2 above tasks.

## Stitching PNG images into a video

I don't often render an animation, but when I do I always render to individual images - that way if it crashes half way through I can more easily restart where I left off (and also divide the workload more easily across multiple machines if I need it done quickly).

Individual image files are all very well, but they have 2 problems:

1. They take up lots of space
2. You can't easily watch them like a video

Solving #1 is relatively easy with optipng (sudo apt install optipng), for example:

find . -iname "*.png" -print0 | xargs -0 -P4 -n1 optipng -preserve

I cooked that one up a while ago and I've had it saved in my favourites for ages. It finds all png images in the current directory (recursively) and optimises them with optipng.

To resolve #2, we can use ffmpeg as I mentioned earlier in this post. Here's one I use:

ffmpeg -r 24 -i path/to/frame_%04d.png -c:v libvpx-vp9 -b:v 2000k -crf 31 path/to/output.webm

A few points on this one:

• -r 24: This sets the frames per second.
• -i path/to/frame_%04d.png: The path to the png images to stitch. %04d refers to 4 successive digits in a row that count up from 1 - e.g. 0001, 0002, etc. %d would mean 1, 2, ...., and %02d for example would mean 01, 02, 03....
• -c:v libvpx-vp9: The codec. We're exporting to webm, and vp9 is the latest available codec for webm video files.
• -b:v 2000k: The bitrate. Higher values mean more bits will be used per second of video to encode detail. Raise to increase quality.
• -crf 31: The encoding quality. Should be a number between 0 and 63 - higher means lower quality and a lower filesize.
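
Once the encode has finished, it's worth a quick check that the result is what you expect. ffprobe (bundled with ffmpeg) prints the container and stream details, including the frame rate - this check is my own habit rather than part of the original command:

ffprobe -hide_banner path/to/output.webm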

## Converting audio / video files

I've talked about downmuxing audio before, so in this post I'm going to be focusing on video files instead. The above command for encoding to webm can be used with minimal adjustment to transcode videos from 1 format to another:

ffmpeg -hide_banner -i "path/to/input.avi" -c:v libvpx-vp9 -c:a libopus -crf 31 -b:v 0 "path/to/output.webm";

In this one, we handle the audio as well as the video all in 1 go, encoding the audio to use the opus codec, and the video as before. This is particularly useful if you've got a bunch of old video files generated by an old camera. I've found that often old cameras like to save videos as raw uncompressed AVI files, and that I can reclaim a significant amount of disk space if I transcode them to something more efficient.

Naturally, I cooked up a one-liner that finds all relevant video files recursively in a given directory and transcodes them to 1 standard efficient format:

find . -iname "*.AVI" -print0 | nice -n20 xargs --verbose -0 -n1 -I{} sh -c 'old="{}"; new="${old%.*}.webm"; ffmpeg -hide_banner -i "${old}" -c:v libvpx-vp9 -c:a libopus -crf 30 -b:v 0 "${new}" && rm "${old}"';

Let's break this down. First, let's look at the commands in play:

• find: Locates the target files to transcode
• nice: Runs the following command at a lower priority, so it doesn't hog the machine
• xargs: Runs the following command over and over again for every line of input
• sh: A shell, to execute multiple commands in sequence
• ffmpeg: Does the transcoding
• rm: Deletes the old file if transcoding completes successfully
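
Before letting it loose on real files, a dry run can be reassuring (my own addition, not part of the original one-liner) - the same find / xargs plumbing, but it only prints the planned conversions instead of transcoding or deleting anything:

find . -iname "*.AVI" -print0 | xargs --verbose -0 -n1 -I{} sh -c 'old="{}"; new="${old%.*}.webm"; echo "${old} -> ${new}"';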

The old="{}"; new="${old%.*}.webm"; bit is a bit of gymnastics to pull in the path to the target file (the -I{} tells xargs where to substitute it in) and determine the new filepath by replacing the file extension with .webm.

I've found that this does the job quite nicely, and I can set it running and go off to do something else while it happily sits there and transcodes all my old videos, reclaiming lots of disk space in the process.

Found this useful? Got a cool command for processing audio/video/image files in bulk? Comment below!

## Partitioning and mounting a new disk using LVM

As I've been doing my PhD, I've been acquiring quite a lot of data that needs storing. To this end, I have acquired a new 2 TiB hard drive in my Lab PC. Naturally, this necessitates formatting it so that I can use it. Since I've been using LVM (Logical Volume Management) for my OS disk, I decided to use it for my new disk too.

Unfortunately, I don't currently have GUI access to my Lab PC - instead for the past few months I've been limited to SSH access (which is still much better than nothing at all), so I can't really use any GUI tool to do this for me. This provided me with a perfect opportunity to get into LVM through the terminal instead. As it turns out, it's not actually that bad.

In this post, I'm going to take you through the process of formatting a fresh disk: from creating a partition table to mounting the LVM logical volume.

Let's start by partitioning the disk. For this, we'll use the fdisk CLI tool (install it on Debian-based systems with sudo apt install fdisk if it's not available already). It should be obvious, but for this tutorial root access is required to execute pretty much all the commands we'll be using. Start fdisk like so:

sudo fdisk /dev/sdX

Replace X with the index of your disk (try lsblk - no sudo required - to list your disks). fdisk works a bit like a shell. You enter letters (or short sequences) followed by hitting enter to give it commands. Enter the following sequence of commands:

• g: Create new GPT partition table
• n: Create new partition (allow it to fill the disk)
• t: Change partition type
• L: List all known partition types
• 31: Change to a Linux LVM partition
• p: Preview final partition setup
• w: Write changes to disk and exit

Some commands need additional information - fdisk will prompt you here.

With our disk partitioned, we now need to get LVM organised. LVM has 3 different key concepts:

• Physical Volumes: The physical disk partitions it should use as a storage area (e.g. we created an appropriate partition type above)
• Volume Groups: Groups of 1 or more Physical Volumes
• Logical Volumes: The volumes that you use, format, and mount - they are stored in Volume Groups.

To go with these, there are also 3 different classes of commands that LVM exposes: pv* commands for Physical Volumes, vg* for Volume Groups, and lv* for Logical Volumes.

With respect to Physical Volumes, these are physical partitions on disk. For legacy MSDOS partition tables, these must have a partition type of 8e. For newer GPT partition tables (such as the one we created above), these need the partition id 31 (Linux LVM) - as described above.

Next, we create a new volume group that holds our physical volume:

sudo vgcreate vg-pool-name /dev/sdXY

...replacing /dev/sdXY with the partition you want to add (again, lsblk is helpful here). Don't forget to change the name of the Volume Group to something more descriptive than vg-pool-name - though keeping the vg prefix is recommended.
If you like, you can display the current Volume Groups with the following command:

sudo vgdisplay

Then, create a new logical volume that uses all of the space in the new volume group:

sudo lvcreate -l 100%FREE -n lv-rocket-blueprints vg-pool-name

Again, replace vg-pool-name with the name of your Volume Group, and lv-rocket-blueprints with the desired name of your new logical volume. tldr (for which I review pull requests) has a nice page on lvcreate. If you have a tldr client installed, simply do this to see it:

tldr lvcreate

With our logical volume created, we can now format it. I'm going to format it as ext4, but you can format it as anything you like.

sudo mkfs.ext4 /dev/vg-pool-name/lv-rocket-blueprints

As before, replace vg-pool-name and lv-rocket-blueprints with your Volume Group and Logical Volume names respectively. Finally, mount the newly formatted partition:

sudo mkdir /mnt/rocket-blueprints
sudo mount /dev/vg-pool-name/lv-rocket-blueprints /mnt/rocket-blueprints

You can mount it anywhere - though I'd recommend mounting it to somewhere in /mnt.

### Auto-mounting LVM logical volumes

A common thing many (myself included) want to do is automatically mount an LVM partition on boot. This is fairly trivial to do with /etc/fstab.

To start, find the logical volume's id with the following command:

sudo blkid

It should be present as UUID="THE_UUID_HERE". Pick out only the UUID of the logical volume you want to automount here. As a side note, using the UUID is generally a better idea than the name, because the name of the partition (whether it's an LVM partition or a physical /dev/sdXY partition) might change, while the UUID always stays the same.

Before continuing, ensure that the partition is unmounted:

sudo umount /mnt/rocket-blueprints

Now, edit /etc/fstab (e.g. with sudo nano /etc/fstab), and append something like the following to the bottom of the file:

UUID=THE_UUID_YOU_FOUND_HERE	/mnt/rocket-blueprints	ext4	defaults,noauto	0	2

Replace THE_UUID_YOU_FOUND_HERE with the UUID you located with blkid, and /mnt/rocket-blueprints with the path to where you want to mount it. If an empty directory doesn't already exist at the target mount point, don't forget to create it (e.g. with sudo mkdir /mnt/rocket-blueprints).

Save and close /etc/fstab, and then try mounting the partition using the /etc/fstab definition:

sudo mount /mnt/rocket-blueprints

If it works, edit /etc/fstab again and replace noauto with auto to automatically mount it on boot.

That's everything you need to know to get up and running with LVM. I've included my sources below - in particular check out the howtogeek.com tutorial, as it's not only very detailed, but it also has a cheat sheet containing most of the different LVM commands that are available.

Found this useful? Still having issues? Got a suggestion? Comment below!

### Sources and further reading

## Making an auto-updated downmuxed copy of my music

I like to buy and own music. That way, if the service goes down, I still get to keep both my music and the rights thereto that I've paid for. To this end, I maintain an offline collection of music tracks that I've purchased digitally. Recently, it's been growing quite large (~15GiB at the moment) - which is quite a bit of disk space. While this doesn't matter too much on my laptop, on my phone it's quite a different story. For this reason, I wanted to keep a downmuxed copy of my music collection on my Raspberry Pi 3B+ file server that I can sync to my phone.
Said Raspberry Pi already has ffmpeg installed, so I decided to write a script to automate the process. In this blog post, I'm going to walk you through the script itself and what it does - and how you can use it too.

I've decided on a standard downmuxed format of 256kbps MP3. You can choose anything you like - you just need to tweak the appropriate lines in the script. First, let's outline what we want it to do:

1. Convert anything that isn't an mp3 to 256kbps mp3 (e.g. ogg, flac)
2. Downmux mp3 files that are at a bitrate higher than 256kbps
3. Leave mp3s that are at a bitrate lower than (or equal to) 256kbps alone
4. Convert and optimise album art to 256x256
5. Copy any unknown files as-is
6. If the file exists in the target directory already, don't re-convert it again
7. Max out the system resources when downmuxing to get it done as fast as possible

With this in mind, let's start outlining a script:

#!/usr/bin/env bash

input="${DIR_INPUT:-/absolute/path/to/Music}"
output="${DIR_OUTPUT:-/absolute/path/to/Music-Portable}"

export input;
export output;

temp_dir="$(mktemp --tmpdir -d "portable-music-copy-XXXXXXX")";

on_exit() {
    rm -rf "${temp_dir}";
}
trap on_exit EXIT;

# Library functions go here

# $1	filename
process_file() {
    filename="${1}";
    extension="${filename##*.}";

    # Process file here
}

export temp_dir;
export -f process_file;

cd "${input}" || { echo "Error: Failed to cd to input directory"; exit 1; };
find -type f -print0 | nice -n20 xargs -P "$(nproc)" -0 -n1 -I{} bash -c 'process_file "{}"';

# Cleanup here....

Very cool. At the top of the script, we define the input and output directories we're going to work on. We use the ${VARIABLE_NAME:-default_value} syntax to allow for changing the input and output directories on the fly with the DIR_INPUT and DIR_OUTPUT environment variables.

Next, we create a temporary directory, and define an exit trap to ensure it gets deleted when the script exits (regardless of whether the exit is clean or not).

Then, we define the main driver function that will process a single file. This is called by xargs a little further down - which takes the file list in from a find call. The cd is important there, because we want the file paths from find to be relative to the input directory for easier mangling later. The actual process_file call is wrapped in bash -c '', because being a bash function it can't be called by xargs directly - so we have to export -f it and wrap it as shown.

Next, we need to write some functions to handle converting different file types. First, let's write a simple copy function:

# $1	Source
# $2	Target
do_copy() {
    source="${1}";
    target="${2}";

    echo -n "cp ";
    cp "${source}" "${target}";
}

All it does is call cp, but it's nice to abstract like this so that if we wanted to add extra features (e.g. uploading via sftp or something) later, it's not as much of a bother.

We also need to downmux audio files and convert them to mp3. Let's write a function for that too:

# $1	Source
# $2	Target
do_downmux() {
    source="${1}";
    target="${2}";

    set +e;
    ffmpeg -hide_banner -loglevel warning -nostats -i "${source}" -vn -ar 44100 -b:a 256k -f mp3 "${target}";
    exit_code="${?}";
    if [[ "${exit_code}" -ne 0 ]] && [[ -f "${target}" ]]; then
        rm "${target}";
    fi
    return "${exit_code}";
}


It's got the same argument signature as do_copy, but it downmuxes instead of copying directly. The line that does the magic is the ffmpeg call. It looks complicated, but it's actually pretty logical. Let's break down all those arguments:

| Argument | Purpose |
|----------|---------|
| -hide_banner | Hides the really rather wordy banner at the top when ffmpeg starts up |
| -loglevel warning | Hides everything but warning messages to avoid too much unreadable output when converting many tracks at once |
| -nostats | As above |
| -i "${source}" | Specifies the input file |
| -vn | Strips any video tracks found |
| -ar 44100 | Force the sampling rate to 44.1KHz, just in case it's sampled higher |
| -b:a 256k | Sets the output bitrate to 256kbps (change this bit if you like) |
| -f mp3 | Output as mp3 |
| "${target}" | Write the output to the target location |

That's not so bad, right? After calling it, we also need to capture the exit code. If it's not 0, then ffmpeg encountered some kind of issue. If so, we delete any output files it creates and return the same exit code - which we handle elsewhere.

Finally, we need a function to optimise images. For this I'm using optipng and jpegoptim to handle optimising JPEGs and PNGs respectively, and ImageMagick for the resizing operation.


# $1	Source
compress_image() {
    source="${1}";

    temp_file_png="$(mktemp --tmpdir="${temp_dir}" XXXXXXX.png)";
    temp_file_jpeg="$(mktemp --tmpdir="${temp_dir}" XXXXXXX.jpeg)";

    convert "${source}" -resize 256x256\> "${temp_file_jpeg}" >&2 &
    convert "${source}" -resize 256x256\> "${temp_file_png}" >&2 &
    wait

    jpegoptim --quiet --all-progressive --preserve "${temp_file_jpeg}" >&2 &
    optipng -quiet -fix -preserve "${temp_file_png}" >&2 &
    wait

    read -r size_png _ < <(wc --bytes "${temp_file_png}");
    read -r size_jpeg _ < <(wc --bytes "${temp_file_jpeg}");
    if [[ "${size_png}" -gt "${size_jpeg}" ]]; then
        # JPEG is smaller
        rm -rf "${temp_file_png}";
        echo "${temp_file_jpeg}";
    else
        # PNG is smaller
        rm -rf "${temp_file_jpeg}";
        echo "${temp_file_png}";
    fi
}

Unlike the previous functions, this one only takes a source file in. It converts it using the temporary directory we created earlier, and echoes the filename of the smallest format found.

It's done in 2 stages. First, the source file is resized to 256x256 (maintaining aspect ratio, and avoiding upscaling smaller images) and written as both a JPEG and a PNG.

Then, jpegoptim and optipng are called on the resulting files. Once done, the filesizes are compared and the filepath to the smallest of the 2 is echoed.

With these in place, we can now write the glue that binds them to the xargs call by filling out process_file. Before we do though, we need to tweak the export statements from earlier to export our library functions we've written - otherwise process_file won't be able to access them since it's wrapped in bash -c '' and xargs. Here's the full list of export directives (directly below the end of process_file):

export temp_dir;
export -f process_file;
export -f compress_image;
export -f do_downmux;
export -f do_copy;
# $1	filename
process_file() {
    filename="${1}";
    extension="${filename##*.}";

    orig_destination="${output}/${filename}";
    destination="${orig_destination}";

    echo -n "[file] ${filename}: ";

    do_downmux=false;
    # Downmux, but only if the bitrate is above 256k
    if [[ "${extension}" == "flac" ]] || [[ "${extension}" == "ogg" ]] || [[ "${extension}" == "mp3" ]]; then
        probejson="$(ffprobe -hide_banner -v quiet -show_format -print_format json "${filename}")";
        is_above_256k="$(echo "${probejson}" | jq --raw-output '(.format.bit_rate | tonumber) > 256000')";
        exit_code="${?}";
        if [[ "${exit_code}" -ne 0 ]]; then
            echo -n "ffprobe failed; falling back on ";
            do_downmux=false;
        elif [[ "${is_above_256k}" == "true" ]]; then
            do_downmux=true;
        fi
    fi

    if [[ "${do_downmux}" == "true" ]]; then
        echo -n "downmuxing/";
        destination="${orig_destination%.*}.mp3";
    fi

    # ....
}

We use 2 variables to keep track of the destination location here, because we may or may not successfully manage to convert any given input file to a different format with a different file extension. We also use ffprobe (part of ffmpeg) and jq (a JSON query and manipulation tool) on audio files to detect the bitrate of input files, so that we can avoid remuxing files with a bitrate lower than 256kbps. Once we've determined that, we rewrite the destination filename to include the extension .mp3.

Next, we need to deal with the images. We do this in a preprocessing step that comes next:

case "${extension}" in
    png|jpg|jpeg|JPG|JPEG )
        compressed_image="$(compress_image "${filename}")";
        compressed_extension="${compressed_image##*.}";
        destination="${orig_destination%.*}.${compressed_extension}";
        ;;
esac

If the file is an image, we run it through the image optimiser. Then we look at the file extension of the optimised image, and alter the destination filename accordingly.

if [[ -f "${destination}" ]] || [[ -f "${orig_destination}" ]]; then
    echo "exists in destination; skipping";
    return 0;
fi

destination_dir="$(dirname "${destination}")";
if [[ ! -d "${destination_dir}" ]]; then
    mkdir -p "${destination_dir}";
fi

Next, we look to see if there's a file in the destination already. If so, then we skip out and don't continue processing the file. If not, we make sure that the parent directory exists to avoid issues later.

case "${extension}" in
    flac|mp3|ogg )
        # Use ffmpeg, but only if necessary
        if [[ "${do_downmux}" == "false" ]]; then
            do_copy "${filename}" "${orig_destination}";
        else
            echo -n "ffmpeg ";
            do_downmux "${filename}" "${destination}";
            exit_code="$?";
            if [[ "${exit_code}" -ne 0 ]]; then
                echo "failed, exit code ${exit_code}: falling back on ";
                do_copy "${filename}" "${orig_destination}";
            fi
        fi
        ;;

    png|jpg|jpeg|JPG|JPEG )
        mv "${compressed_image}" "${destination}";
        ;;

    * )
        do_copy "${filename}" "${destination}";
        ;;
esac
echo "done";

Finally, we get to the main case statement that handles the different files. If it's an audio file, we run it through do_downmux (which we implemented earlier) - but only if it would benefit us. If it's an image, we move the converted image from the temporary directory that was optimised earlier, and if we can't tell what it is, then we just copy it over directly.

That's process_file completed. Now all we're missing are a few clean-up tasks that make it more cron friendly:

echo "[ ${SECONDS} ] Setting permissions";
chown -R root:root "${output}";
chmod -R 0644 "${output}";
chmod -R ugo+X "${output}";

echo "[ ${SECONDS} ] Portable music copy update complete";

This goes at the end of the file, and it resets the permissions on the output directory to avoid issues. This ensures that everyone can read it, but only root can write to it - as any modifications should be made to the original version, and not the portable copy.

That completes this script. By understanding how it works, hopefully you'll be able to apply it to your own specific circumstances.

For example, you could call it via cron. Edit your crontab:

sudo crontab -e

...and paste in something like this:

5 4 * * *   /absolute/path/to/script.sh

This won't work if your device isn't turned on at the time, however. In that case, there is alternative. Simply drop the script (without an extension) into /etc/cron.daily or /etc/cron.weekly and mark it executable, and anacron will run your job every day or week respectively.
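
For example (the target filename is arbitrary - just note that run-parts skips files containing a dot in their name, hence dropping the .sh extension):

sudo cp /absolute/path/to/script.sh /etc/cron.daily/portable-music-copy
sudo chmod +x /etc/cron.daily/portable-music-copy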

Anyway, here's the complete script: