Starbeamrainbowlabs

About

Hello!

I am a computer science student who is doing a PhD at the University of Hull. I started out teaching myself about various web technologies, and then I managed to get a place at University, where I am now. I've previously done a degree (BSc Computer Science) and a Masters (MSc Computer Science with Security and Distributed Computing) at the University of Hull. I've done a year in industry too, which I found to be particuarly helpful in learning about the workplace and the world.

I currently know C# + Monogame / XNA (+ WPF), HTML5, CSS3, Javascript (ES6 + Node.js), PHP, C / C++ (mainly for Arduino), and a bit of Python. Oh yeah, and I can use XSLT too.

I love to experiment and learn about new things on a regular basis. You can find some of the things that I've done in the labs and code sections of this website, or on GitHub. My current projects are Pepperminty Wiki, an entire wiki engine in a single file (the source code is spread across multiple files - don't worry!), and Nibriboard (a multi-user real-time infinite whiteboard), although the latter is in its very early stages.

I can also be found in a number of other different places around the web. I've compiled a list of the places that I can remember below.

I can be contacted at the email address webmaster at starbeamrainbowlabs dot com. Suggestions, bug reports and constructive criticism are always welcome.

For those looking for my GPG key, you can find it here. My key id is C2F7843F9ADF9FEE264ACB9CC1C6C0BB001E1725, and is uploaded to the public keyserver network, so you can download it with GPG like so: gpg --recv-keys C2F7843F9ADF9FEE264ACB9CC1C6C0BB001E1725

Blog

Blog Roll | Article Atom Feed | An image of a white letter, representing an email.Mailing List


Latest Post

Making an auto-updated downmuxed copy of my music

I like to buy and own music. That way, if the service goes down, I still get to keep both my music and the rights thereto that I've paid for.

To this end, I maintain an offline collection of music tracks that I've purchased digitally. Recently, it's been growing quite large (~15GiB at the moment) - which is quite a bit of disk space. While this doesn't matter too much on my laptop, on my phone it's quite a different story.

For this reason, I wanted to keep a downmuxed copy of my music collection on my Raspberry Pi 3B+ file server that I can sync to my phone. Said Raspberry Pi already has ffmpeg installed, so I decided to write a script to automate the process. In this blog post, I'm going to walk you through the script itself and what it does - and how you can use it too.

I've decided on a standard downmuxed format of 256kbps MP3. You can choose anything you like - you just need to tweak the appropriate lines in the script.

First, let's outline what we want it to do:

  1. Convert anything that isn't an mp3 to 256kbps mp3 (e.g. ogg, flac)
  2. Downmux mp3 files that are at a bitrate higher than 256kbps
  3. Leave mp3s that are at a bitrate lower than (or equal to) 256kbps alone
  4. Convert and optimise album art to 256x256
  5. Copy any unknown files as-is
  6. If the file exists in the target directory already, don't re-convert it again
  7. Max out the system resources when downmuxing to get it done as fast as possible

With this in mind, let's start outlining a script:

#!/usr/bin/env bash

input="${DIR_INPUT:-/absolute/path/to/Music}"
output="${DIR_OUTPUT:-/absolute/path/to/Music-Portable}"

export input;
export output;

temp_dir="$(mktemp --tmpdir -d "portable-music-copy-XXXXXXX")";

on_exit() {
    rm -rf "${temp_dir}";
}
trap on_exit EXIT;

# Library functions go here

# $1    filename
process_file() {
    filename="${1}";
    extension="${filename##*.}";

    # Process file here
}

export temp_dir;
export -f process_file;

cd "${input}" || { echo "Error: Failed to cd to input directory"; exit 1; };
find  -type f -print0 | nice -n20 xargs -P "$(nproc)" -0 -n1 -I{} bash -c 'process_file "{}"';

# Cleanup here....

Very cool. At the top of the script, we define the input and output directories we're going to work on. We use the ${VARIABLE_NAME:-default_value} syntax to allow for changing the input and output directories on the fly with the DIR_INPUT and DIR_OUTPUT environment variables.

Next, we create a temporary directory, and define an exit trap to ensure it gets deleted when the script exits (regardless of whether the exit is clean or not).

Then, we define the main driver function that will process a single file. This is called by xargs a little further down - which takes the file list in from a find call. The cd is important there, because we want the file paths from find to be relative to the input directory for easier mangling later. The actual process_file call is wrapped in bash -c '', because being a bash function it can't be called by xargs directly - so we have to export -f it and wrap it as shown.

Next, we need to write some functions to handle converting different file types. First, let's write a simple copy function:

# $1    Source
# $2    Target
do_copy() {
    source="${1}";
    target="${2}";

    echo -n "cp ";
    cp "${source}" "${target}";
}

All it does is call cp, but it's nice to abstract like this so that if we wanted to add extra features (e.g. uploading via sftp or something) later, it's not as much of a bother.

We also need to downmux audio files and convert them to mp3. Let's write a function for that too:


# $1    Source
# $2    Target
do_downmux() {
    source="${1}";
    target="${2}";

    set +e;
    ffmpeg -hide_banner -loglevel warning -nostats -i "${source}" -vn -ar 44100 -b:a 256k -f mp3 "${target}";
    exit_code="${?}";
    if [[ "${exit_code}" -ne 0 ]] && [[ -f "${target}" ]]; then
        rm "${target}";
    fi
    return "${exit_code}";
}

It's got the same arguments signature as do_copy, but it downmuxes instead of copying directly. The line that does the magic is highlighted. It looks complicated, but it's actually pretty logical. Let's break down all those arguments:

Argument Purpose
-hide_banner Hides the really rather wordy banner at the top when ffmpeg starts up
-loglevel warning Hides everything but warning messages to avoid too much unreadable output when converting many tracks at once
-nostats As above
-i "${source}" Specifies the input file
-vn Strips any video tracks found
-ar 44100 Force the sampling rate to 44.1KHz, just in case it's sampled higher
-b:a 256k Sets the output bitrate to 256kbps (change this bit if you like)
-f mp3 Output as mp3
"${target}" Write the output to the target location

That's not so bad, right? After calling it, we also need to capture the exit code. If it's not 0, then ffmpeg encountered some kind of issue. If so, we delete any output files it creates and return the same exit code - which we handle elsewhere.

Finally, we need a function to optimise images. For this I'm using optipng and jpegoptim to handle optimising JPEGs and PNGs respectively, and ImageMagick for the resizing operation.


# $1    Source
compress_image() {
    source="${1}";

    temp_file_png="$(mktemp --tmpdir="${temp_dir}" XXXXXXX.png)";
    temp_file_jpeg="$(mktemp --tmpdir="${temp_dir}" XXXXXXX.jpeg)";

    convert "${source}" -resize 256x256\> "${temp_file_jpeg}" >&2 &
    convert "${source}" -resize 256x256\> "${temp_file_png}" >&2 &
    wait

    jpegoptim --quiet --all-progressive --preserve "${temp_file_jpeg}" >&2 &
    optipng -quiet -fix -preserve "${temp_file_png}" >&2 &
    wait

    read -r size_png _ < <(wc --bytes "${temp_file_png}");
    read -r size_jpeg _ < <(wc --bytes "${temp_file_jpeg}");
    if [[ "${size_png}" -gt "${size_jpeg}" ]]; then
        # JPEG is smaller
        rm -rf "${temp_file_png}";
        echo "${temp_file_jpeg}";
    else
        # PNG is smaller
        rm -rf "${temp_file_jpeg}";
        echo "${temp_file_png}";
    fi
}

Unlike the previous functions, this one only takes a source file in. It converts it using that temporary directory we created earlier, and echos the filename of the smallest format found.

It's done in 2 stages. First, the source file is resized to 256x256 (maintaining aspect ratio, and avoiding upscaling smaller images) and written as both a JPEG and a PNG.

Then, jpegoptim and optipng are called on the resulting files. Once done, the filesizes are compared and the filepath to the smallest of the 2 is echoed.

With these in place, we can now write the glue that binds them to the xargs call by filling out process_file. Before we do though, we need to tweak the export statements from earlier to export our library functions we've written - otherwise process_file won't be able to access them since it's wrapped in bash -c '' and xargs. Here's the full list of export directives (directly below the end of process_file):

export temp_dir;
export -f process_file;
export -f compress_image;
export -f do_downmux;
export -f do_copy;
# $1    filename
process_file() {
    filename="${1}";
    extension="${filename##*.}";

    orig_destination="${output}/${filename}";
    destination="${orig_destination}";

    echo -n "[file] ${filename}: ";

    do_downmux=false;
    # Downmux, but only the bitrate is above 256k
    if [[ "${extension}" == "flac" ]] || [[ "${extension}" == "ogg" ]] || [[ "${extension}" == "mp3" ]]; then
        probejson="$(ffprobe -hide_banner -v quiet -show_format -print_format json "${filename}")";
        is_above_256k="$(echo "${probejson}" | jq --raw-output '(.format.bit_rate | tonumber) > 256000')";
        exit_code="${?}";
        if [[ "${exit_code}" -ne 0 ]]; then
            echo -n "ffprobe failed; falling back on ";
            do_downmux=false;
        elif [[ "${is_above_256k}" == "true" ]]; then
            do_downmux=true;
        fi
    fi

    if [[ "${do_downmux}" == "true" ]]; then
        echo -n "downmuxing/";
        destination="${orig_destination%.*}.mp3";
    fi

    # ....
}

We use 2 variables to keep track of the destination location here, because we may or may not successfully manage to convert any given input file to a different format with a different file extension.

We also use ffprobe (part of ffmpeg) and jq (a JSON query and manipulation tool) on audio files to detect the bitrate of input files so that we can avoid remuxing files with a bitrate lower than 256kbps. Once we're determined that, we rewrite the destination filename to include the extension .mp3.

Next, we need to deal with the images. We do this in a preprocessing step that comes next:

case "${extension}" in 
    png|jpg|jpeg|JPG|JPEG )
        compressed_image="$(compress_image "${filename}")";
        compressed_extension="${compressed_image##*.}";
        destination="${orig_destination%.*}.${compressed_extension}";
        ;;
esac

If the file is an image, we run it through the image optimiser. Then we look at the file extension of the optimised image, and alter the destination filename accordingly.

if [[ -f "${destination}" ]] || [[ -f "${orig_destination}" ]]; then
    echo "exists in destination; skipping";
    return 0;
fi

destination_dir="$(dirname "${destination}")";
if [[ ! -d "${destination_dir}" ]]; then
    mkdir -p "${destination_dir}";
fi

Next, we look to see if there's a file in the destination already. If so, then we skip out and don't continue processing the file. If not, we make sure that the parent directory exists to avoid issues later.

case "${extension}" in
    flac|mp3|ogg )
        # Use ffmpeg, but only if necessary
        if [[ "${do_downmux}" == "false" ]]; then
            do_copy "${filename}" "${orig_destination}";
        else
            echo -n "ffmpeg ";
            do_downmux "${filename}" "${destination}";
            exit_code="$?";
            if [[ "${exit_code}" -ne 0 ]]; then
                echo "failed, exit code ${exit_code}: falling back on ";
                do_copy "${filename}" "${orig_destination}";
            fi
        fi
        ;;

    png|jpg|jpeg|JPG|JPEG )
        mv "${compressed_image}" "${destination}";
        ;;

    * )
        do_copy "${filename}" "${destination}";
        ;;
esac
echo "done";

Finally, we get to the main case statement that handles the different files. If it's an audio file, we run it through do_downmux (which we implemented earlier) - but only if it would benefit us. If it's an image, we move the converted image from the temporary directory that was optimised earlier, and if we can't tell what it is, then we just copy it over directly.

That's process_file completed. Now all we're missing are a few clean-up tasks that make it more cron friendly:

echo "[ ${SECONDS} ] Setting permissions";
chown -R root:root "${output}";
chmod -R 0644 "${output}";
chmod -R ugo+X "${output}";

echo "[ ${SECONDS} ] Portable music copy update complete";

This goes at the end of the file, and it reset the permissions on the output directory to avoid issues. This ensures that everyone can read it, but only root can write to it - as any modifications should be made it to the original version, and not the portable copy.

That completes this script. By understanding how it works, hopefully you'll be able to apply it to your own specific circumstances.

For example, you could call it via cron. Edit your crontab:

sudo crontab -e

...and paste in something like this:

5 4 * * *   /absolute/path/to/script.sh

This won't work if your device isn't turned on at the time, however. In that case, there is alternative. Simply drop the script (without an extension) into /etc/cron.daily or /etc/cron.weekly and mark it executable, and anacron will run your job every day or week respectively.

Anyway, here's the complete script:

Sources and Further Reading


By on

Labs

Code

Tools

I find useful tools on the internet occasionally. I will list them here.

Art by Mythdael