Starbeamrainbowlabs

Stardust
Blog


Archive


Mailing List Articles Atom Feed Comments Atom Feed Twitter Reddit Facebook

Tag Cloud

3d 3d printing account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blog bookmarklet booting bug hunting c sharp c++ challenge chrome os code codepen coding conundrums coding conundrums evolved command line compilers compiling compression css dailyprogrammer data analysis debugging demystification distributed computing documentation downtime electronics email embedded systems encryption es6 features ethics event experiment external first impressions future game github github gist gitlab graphics hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs learning library linux lora low level lua maintenance manjaro network networking nibriboard node.js operating systems own your code performance photos php pixelbot portable privacy problem solving programming problems project projects prolog protocol protocols pseudo 3d python reddit redis reference releases resource review rust searching secrets security series list server software sorting source code control statistics storage svg talks technical terminal textures thoughts three thing game three.js tool tutorial twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 xmpp xslt

Own your code, part 4: Laminar CI

In the last post, I talked at a high level about the infrastructure behind my continuous integration and deployment system. In this post, I'm going to dive into the details of the Laminar CI job is the engine that drives the whole system.

Laminar CI is based on a concept of jobs. The docs explain it quite well, but in short each job is a file in the jobs folder with the file extension run and a shebang. In my case, I'm using Bash - and I'll continue to do so at regular intervals throughout this series.

Unlike most other setups, the Laminar CI job that we'll be writing here won't actually do any of the actual CI tasks itself - it will simply act as a proxy script to setup & manage the execution of the actual build system - which, in this case, will be the lantern build engine, an engine I wrote to aid me with automating repetitive tasks when working on my University ACWs (Assessed CourseWork).

Every job has it's own workspace, which acts as a common area to store and cache various files across all the runs of that job. Each run of a job also has it's very own private area too - which will be useful later on.

The first step in this proxy script is to extract the parameters of the run that we're supposed to be doing. For me, I store this in a number of environment variables, which are set when queuing the job run from the git post-receive (or web) hook:

Variable Example Description
GIT_REPO_NAME git-starbeamrainbowlabs-com-sbrl-rhinoreminds The safe name of the repository that we're running against, with potentially troublesome characters removed.
GIT_REF_NAME refs/heads/master Basically the branch that we're working on. Useful for logging purposes.
GIT_REPO_URL git@git.starbeamrainbowlabs.com:sbrl/rhinoreminds.git The URL of the repository that we're running against.
GIT_COMMIT_REF e23b2e0.... The exact commit to check out and build.
GIT_AUTHOR The friendly name of the author that pushed the commit. Useful for logging purposes.

Before we do anything else, we need to make sure that these variables are defined:

set -e; # Don't allow errors

# Check that all the right variables are present
if [ -z "${GIT_REPO_NAME}" ]; then echo -e "Error: The environment variable GIT_REPO_NAME isn't set." >&2; exit 1; fi
if [ -z "${GIT_REF_NAME}" ]; then echo -e "Error: The environment variable GIT_REF_NAME isn't set." >&2; exit 1; fi
if [ -z "${GIT_REPO_URL}" ]; then echo -e "Error: The environment variable GIT_REPO_URL isn't set." >&2; exit 1; fi
if [ -z "${GIT_COMMIT_REF}" ]; then echo -e "Error: The environment variable GIT_COMMIT_REF isn't set." >&2; exit 1; fi
if [ -z "${GIT_AUTHOR}" ]; then echo -e "Error: The environment variable GIT_AUTHOR isn't set." >&2; exit 1; fi

There are a bunch of other variables that I'm omitting here, since they are dynamically determined by from the build variables. I extract many of these additional variables using regular expressions. For example:

GIT_REF_TYPE="$(regex_match "${GIT_REF_NAME}" 'refs/([a-z]+)')";

GIT_REF_TYPE is the bit after the refs/ and before the actual branch or tag name. It basically tells us whether we're building against a branch or a tag. That regex_match function is a utility function that I found in the pure bash bible - which is an excellent resource on various tips and tricks to do common tasks without spawning subprocesses - and therefore obtaining superior performance and lower resource usage. Here it is:

# @source https://github.com/dylanaraps/pure-bash-bible#use-regex-on-a-string
# Usage: regex "string" "regex"
regex_match() {
    [[ $1 =~ $2 ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

Very cool. For completeness, here are the remainder of the secondary environment variables. Many of them aren't actually used directly - instead they are used indirectly by other scripts and lantern build engine tasks that we call from the main Laminar CI job.

if [[ "${GIT_REF_TYPE}" == "tags" ]]; then
    GIT_TAG_NAME="$(regex_match "${GIT_REF_NAME}" 'refs/tags/(.*)$')";
fi

# NOTE: These only work with SSH urls.
GIT_REPO_OWNER="$(echo "${GIT_REPO_URL}" | grep -Po '(?<=:)[^/]+(?=/)')";
GIT_REPO_NAME_SHORT="$(echo "${GIT_REPO_URL}" | grep -Po '(?<=/)[^/]+(?=\.git$)')";
GIT_SERVER_DOMAIN="$(echo "${GIT_REPO_URL}" | grep -Po '(?<=@)[^/]+(?=:)')";

GIT_TAG_NAME is the name of the tag that we're building against - but only if we've been passed a tag as the GIT_REF_TYPE.

The GIT_SERVER_DOMAIN is important for sending the status reports to the right place. Gitea supports a status API that we can hook into to report on how we're doing. You can see it in action here on my RhinoReminds repository. Those green ticks are the build status that was reported by the Laminar CI job that we're writing in this post. Unfortunately you won't be able to click on it to see the actual build output, as that is currently protected behind a username and password, since the Laminar CI web interface exposes all the git project I've currently got setup on it - including a number of private ones that I can't share.

Anyway, with all our environment variables in order, it's time to do something with them. Before we do though, we should tell Gitea that we're starting the build process:

send-status-gitea "${GIT_COMMIT_REF}" "pending" "Executing build....";

I haven't yet implemented support for sending notifications to GitHub, but it's on my todo list. In theory it's pretty easy to do - this is why I've got that GIT_SERVER_DOMAIN variable above in anticipation of this.

That send-status-gitea function there is another helper script I've written that does what you'd expect - it sends a status message to Gitea. It does this by using the environment variables we deduced earlier (that are also exported - though I didn't include that in the abovecode snippet) and curl.

There's still a bunch of stuff to get through in this post, so I'm going to omit the source of that script from this post for brevity. I've got no particular issue with releasing it though - if you're interested, contact me using the details on my homepage.

Next, we need to set an exit trap. This is a function that will run when the Bash process exits - regardless of whether this was because we finished our work successfully, or otherwise. This can be very useful to make absolutely sure that your script cleans up after itself. In our case, we're only going to be using it to report the build status back to Gitea:

# Runs on exit, no matter what
cleanup() {
    original_exit_code="$?";

    status="success";
    description="Build ${RUN} succeeded in $(human-duration "${SECONDS}").";
    if [[ "${original_exit_code}" -ne "0" ]]; then
        status="failed";
        description="Build failed with exit code ${original_exit_code} after $(human-duration "${SECONDS}")";
    fi

    send-status-gitea "${GIT_COMMIT_REF}" "${status}" "${description}";
}

trap cleanup EXIT;

Very cool. The RUN variable there is provided by Laminar CI, and SECONDS is a bash built-in that tells us the number of seconds that the current Bash process has been running for. human-duration is yet another helper script because I like nice readable durations in my status messages - not something unreadable like Build 3 failed in 345 seconds. It's also somewhat verbose - I adapted it from this StackExchange answer.

With that all out of the way, the next item on the list is to work out what job name we're running under. I've chosen git-repo for the name of the master 'virtual' job - that is to say the one whose entire purpose is to queue the actual job. That's pretty easy, since Laminar gives us an environment variable:

if [ "${JOB}" == "git-repo" ]; then
    # ...
fi

If the job name is git-repo, then we need to queue the actual job name. Since I don't want to have to manually alter the system every time I'm setting up a new repo on my CI system, I've automated the process with symbolic links. The main git-repo job creates a symbolic link to itself in the name of the repository that it's supposed to be running against, and then queues a new job to run itself under the different job name. This segment takes place nested in the above if statement:

# If the job file doesn't exist, create it
# We create a symlink here because this is a 'smart' job - whose
# behaviour changes dynamically based on the job name.
if [ ! -e "${LAMINAR_HOME}/cfg/jobs/${repo_job_name}.run" ]; then
    pushd "${LAMINAR_HOME}/cfg/jobs";
    ln -s "git-repo.run" "${repo_job_name}.run";
    popd
fi

Once we're sure that the symbolic link is in place, we can queue the virtual copy:

# Queue our new hologram
LAMINAR_REASON="git push by ${GIT_AUTHOR} to ${GIT_REF_NAME}" laminarc queue "${repo_job_name}" GIT_REPO_NAME="${GIT_REPO_NAME}" GIT_REF_NAME="${GIT_REF_NAME}" GIT_REPO_URL="${GIT_REPO_URL}" GIT_COMMIT_REF="${GIT_COMMIT_REF}" GIT_AUTHOR="${GIT_AUTHOR}";
# If we got to here, we queued the hologram successfully
# Clear the trap, because we know that the trap for the hologram will fire
# This avoids sending a 2nd status to Gitea, linking the user to the wrong place
trap - EXIT;

exit 0;

This also ensures that if we make any changes to the main job file, all the copies will get updated automatically too. After all, they are only pointers to the actual job on disk.

Notice that we also clear the trap there before exiting - that's important, since we're queuing a copy of ourselves, we don't want to report the completed status before we've actually finished.

At this point, we can now look at what happens if the job name isn't git-repo. In this case, we need to do a few things:

  1. Clone the git repository in question to the shared workspace (if it hasn't been done already)
  2. Fetch new commits on the shared repository copy
  3. Check out the right commit
  4. Copy it to the run-specific directory
  5. Execute the build script

Additionally, we need to ensure that points #1 to #4 are not done by multiple jobs that are running at the same time, since that would probably confuse things and induce weird and undesirable behaviour. This might happen if we push multiple commits at once, for example - since the git post-receive hook (which I'll be talking about in a future post) queues 1 run per commit.

We can make sure of this by using flock. It's actually a feature provided by the Linux Kernel, which allows a single process to obtain exclusive access to a resource on disk. Since each Laminar job has it's own workspace as described above, we can abuse this by doing an flock on the workspace directory. This will ensure that only 1 run per job is accessing the workspace area at once:

# Acquire a lock for this repo
exec 9<"${WORKSPACE}";
flock --exclusive 9;

echo "[${SECONDS}] Lock acquired";

Nice. Next, we need to clone the repository into the shared workspace if we haven't already:

cd "${WORKSPACE}";

# If we haven't already, clone the repository
git_directory="$(echo "${GIT_REPO_URL}" | grep -oP '(?<=/)(.+)(?=.git$)')";
if [ ! -d "${git_directory}" ]; then
    echo "[${SECONDS}] Cloning repository";
    git clone "${GIT_REPO_URL}";
fi
cd "${git_directory}";

Then, we need to fetch any new commits:

# Pull down any updates that are available
echo "[${SECONDS}] Downloading commits";
git fetch origin;

....and check out the one we're supposed to be building:

# Checkout the commit we're interested in testing
echo "[${SECONDS}] Checking out ${GIT_COMMIT_REF}";
git checkout "${GIT_COMMIT_REF}";

Then, we need to copy the repo to the run-specific directory. This is important, since the run might create new files - and we don't want multiple runs running in the same directory at the same time.


echo "[${SECONDS}] Linking source to run directory";
# Hard-link the repo content to the run directory
# This is important because then we can allow multiple runs of the same repo at the same time without using extra disk space
# -r    Recursive mode
# -a    Preserve permissions
# -l    Hardlink instead of copy
cp -ral ./ "${run_directory}";
# Don't forget the .git directory, .gitattributes, .gitmodules, .gitignore, etc.
# This is required for submodules and other functionality, but likely won't be edited - hence we can hardlink here (I think).
# NOTE: If we see weirdness with multiple runs at a time, then we'll need to do something about this.
cp -ral ./.git* "${run_directory}/.git";

I'm using hard linking here for efficiency - I'm banking on the fact that the build script I call isn't going to modify any existing files. Thinking about it, I should do a git reset --hard there just in case - though then I'd have all sorts of nasty issues with timing problems.

So far, I haven't had any issues. If I do, then I'll just disable the hard linking and copy instead. This entire script assumes a trusted environment - i.e. it trusts that the code being executed is not malicious. To this end, it's only suitable for personal projects and the like.

For it to be useful in untrusted environments, it would need to avoid hard linking and execute the build script inside a container - e.g. using LXD or Docker.

Moving on, we next need to release that flock and return to the run-specific directory:

# Go back to the job-specific run directory
cd "${run_directory}";

# Release the lock
exec 9>&- # Close file descriptor 9 and release lock

echo "[${SECONDS}] Lock released";

At this point, we're all set up to run the build script. We need to find it first though. I've currently got 2 standards I'm using across my repositories: build and build.sh. This is easy to automate:

build_script="./build";
if [ ! -x "${build_script}" ]; then build_script="./build.sh"; fi
# FUTURE: Add Makefile support here?
if [ ! -x "${build_script}" ]; then
    echo "[${SECONDS}] Error: Couldn't find the build script, or it wasn't marked as executable." >&2;
    exit 1;
fi

Now that we know where it is, we can execute it. Before we do though, as a little extra I like to run shellcheck over it - since we assume that it's a shell script too (though it might call something that isn't a shell script):

echo "----------------------------------------------------------------";
echo "------------------ Shellcheck of build script ------------------";
set +e; # Allow shellcheck errors - we just warn about them
shellcheck "${build_script}";
set -e;
echo "----------------------------------------------------------------";

I can highly recommend shellcheck - it finds a number of potential issues in both style and syntax that might cause your shell scripts to behave in unexpected ways. I've learnt a bunch about shell scripting and really improved my skills from using it on a regular basis.

Finally, we can now actually execute the build script:

echo "[${SECONDS}] Executing '${build_script} ci'";

nice -n10 ${build_script} ci

I pass the argument ci here, since the lantern build engine takes task names as arguments on the command line. If it's not a lantern script, then it can be interpreted as a helpful hint as to the environment that it's running in.

I also nice it to push it into the background, since I actually have my Laminar CI server running on a Raspberry Pi and it's resources are rather limited. I found oddly that I'd lose other essential services (e.g. SSH) if I didn't do this for some reason - since build tasks are usually quite computationally expensive.

That completes the build script. Of course, when the above finishes executing the trap that we set earlier will trigger and the build status reported. I'll include the full script at the bottom of this post.

This was a long post! We've taken a deep dive into the engine that powers my build system. In the next few posts, I'd like to talk about the git post-receive hook I've been mentioning that triggers this job. I'd also like to talk formally about the lantern build engine - what it is, where it came from, and how it works.

Found this interesting? Spotted a mistake? Got a suggestion? Confused about something? Comment below!

#!/usr/bin/env bash
set -e; # Don't allow errors

# Check that all the right variables are present
if [ -z "${GIT_REPO_NAME}" ]; then echo -e "Error: The environment variable GIT_REPO_NAME isn't set." >&2; exit 1; fi
if [ -z "${GIT_REF_NAME}" ]; then echo -e "Error: The environment variable GIT_REF_NAME isn't set." >&2; exit 1; fi
if [ -z "${GIT_REPO_URL}" ]; then echo -e "Error: The environment variable GIT_REPO_URL isn't set." >&2; exit 1; fi
if [ -z "${GIT_COMMIT_REF}" ]; then echo -e "Error: The environment variable GIT_COMMIT_REF isn't set." >&2; exit 1; fi
if [ -z "${GIT_AUTHOR}" ]; then echo -e "Error: The environment variable GIT_AUTHOR isn't set." >&2; exit 1; fi

# It's checked directly anyway
# shellcheck disable=SC1091
source source_regex_match.sh;

GIT_REF_TYPE="$(regex_match "${GIT_REF_NAME}" 'refs/([a-z]+)')";

if [[ "${GIT_REF_TYPE}" == "tags" ]]; then
    GIT_TAG_NAME="$(regex_match "${GIT_REF_NAME}" 'refs/tags/(.*)$')";
fi

# NOTE: These only work with SSH urls.
GIT_REPO_OWNER="$(echo "${GIT_REPO_URL}" | grep -Po '(?<=:)[^/]+(?=/)')";
GIT_REPO_NAME_SHORT="$(echo "${GIT_REPO_URL}" | grep -Po '(?<=/)[^/]+(?=\.git$)')";
GIT_SERVER_DOMAIN="$(echo "${GIT_REPO_URL}" | grep -Po '(?<=@)[^/]+(?=:)')";

export GIT_REPO_OWNER GIT_REPO_NAME_SHORT GIT_SERVER_DOMAIN GIT_REF_TYPE GIT_TAG_NAME;

###############################################################################

# Example URL: git@git.starbeamrainbowlabs.com:sbrl/rhinoreminds.git
# Environment variables:
#   GIT_REPO_NAME           git-starbeamrainbowlabs-com-sbrl-rhinoreminds
#   GIT_REF_NAME            refs/heads/master, refs/tags/v0.1.1-build7
#   GIT_REF_TYPE            heads, tags
#       Determined dynamically from GIT_REF_NAME.
#   GIT_TAG_NAME            v0.1.1-build7
#       Determined dynamically from GIT_REF_NAME, only set if GIT_REF_TYPE == "tags".
#   GIT_REPO_URL            git@git.starbeamrainbowlabs.com:sbrl/rhinoreminds.git
#   GIT_COMMIT_REF          e23b2e0f3c0b9f48effebca24db48d9a3f028a61
#   GIT_AUTHOR              bob
# Generated:
#   GIT_SERVER_DOMAIN       git.starbeamrainbowlabs.com
#   GIT_REPO_OWNER          sbrl
#   GIT_REPO_NAME_SHORT     rhinoreminds
#   GIT_RUN_SOURCE          github
#       Not always set. If not set then assume git.starbeamrainbowlabs.com


send-status-gitea "${GIT_COMMIT_REF}" "pending" "Executing build....";

# Runs on exit, no matter what
cleanup() {
    original_exit_code="$?";

    status="success";
    description="Build ${RUN} succeeded in $(human-duration "${SECONDS}").";
    if [[ "${original_exit_code}" -ne "0" ]]; then
        status="failed";
        description="Build failed with exit code ${original_exit_code} after $(human-duration "${SECONDS}")";
    fi

    send-status-gitea "${GIT_COMMIT_REF}" "${status}" "${description}";
}

trap cleanup EXIT;

###############################################################################


repo_job_name="$(echo "${GIT_REPO_NAME}" | tr '/' '--')";
if [ "${JOB}" == "git-repo" ]; then
    # If the job file doesn't exist, create it
    # We create a symlink here because this is a 'smart' job - whose
    # behaviour changes dynamically based on the job name.
    if [ ! -e "${LAMINAR_HOME}/cfg/jobs/${repo_job_name}.run" ]; then
        pushd "${LAMINAR_HOME}/cfg/jobs";
        ln -s "git-repo.run" "${repo_job_name}.run";
        popd
    fi

    # Queue our new hologram
    LAMINAR_REASON="git push by ${GIT_AUTHOR} to ${GIT_REF_NAME}" laminarc queue "${repo_job_name}" GIT_REPO_NAME="${GIT_REPO_NAME}" GIT_REF_NAME="${GIT_REF_NAME}" GIT_REPO_URL="${GIT_REPO_URL}" GIT_COMMIT_REF="${GIT_COMMIT_REF}" GIT_AUTHOR="${GIT_AUTHOR}";
    # If we got to here, we queued the hologram successfully
    # Clear the trap, because we know that the trap for the hologram will fire
    # This avoids sending a 2nd status to Gitea, linking the user to the wrong place
    trap - EXIT;

    exit 0;
fi

# We're running in hologram mode!

# Remember the run directory - we'll need it later
run_directory="$(pwd)";

# Important directories:
# $WORKSPACE        Shared between all runs of a job
# $run_directory    The initial directory a run lands in. Empty and run-specific.
# $ARCHIVE          Also run-speicfic, but the contents is persisted after the run ends


# Acquire a lock for this repo
#laminarc lock "${JOB}-workspace";
exec 9<"${WORKSPACE}";
flock --exclusive 9;
###############################################################################
# No need to allow errors here, because the lock will automagically be released 
# if the process crashes, as that'll close the file description anyway :P
echo "[${SECONDS}] Lock acquired";

cd "${WORKSPACE}";

# If we haven't already, clone the repository
git_directory="$(echo "${GIT_REPO_URL}" | grep -oP '(?<=/)(.+)(?=.git$)')";
if [ ! -d "${git_directory}" ]; then
    echo "[${SECONDS}] Cloning repository";
    git clone "${GIT_REPO_URL}";
fi
cd "${git_directory}";


# Pull down any updates that are available
echo "[${SECONDS}] Downloading commits";
git fetch origin;
# Checkout the commit we're interested in testing
echo "[${SECONDS}] Checking out ${GIT_COMMIT_REF}";
git checkout "${GIT_COMMIT_REF}";

echo "[${SECONDS}] Linking source to run directory";
# Hard-link the repo content to the run directory
# This is important because then we can allow multiple runs of the same repo at the same time without using extra disk space
# -r    Recursive mode
# -a    Preserve permissions
# -l    Hardlink instead of copy
cp -ral ./ "${run_directory}";
# Don't forget the .git directory, .gitattributes, .gitmodules, .gitignore, etc.
# This is required for submodules and other functionality, but likely won't be edited - hence we can hardlink here (I think).
# NOTE: If we see weirdness with multiple runs at a time, then we'll need to do something about this.
cp -ral ./.git* "${run_directory}/.git";
echo "[${SECONDS}] done";

# Go back to the job-specific run directory
cd "${run_directory}";

###############################################################################
# Release the lock
exec 9>&- # Close file descriptor 9 and release lock
#laminarc release "${JOB}-workspace";


echo "[${SECONDS}] Lock released";

echo "[${SECONDS}] Finding build script";

build_script="./build";
if [ ! -x "${build_script}" ]; then build_script="./build.sh"; fi
# FUTURE: Add Makefile support here?
if [ ! -x "${build_script}" ]; then
    echo "[${SECONDS}] Error: Couldn't find the build script, or it wasn't marked as executable." >&2;
    exit 1;
fi


echo "[${SECONDS}] Executing '${build_script} ci'";

echo "----------------------------------------------------------------";
echo "------------------ Shellcheck of build script ------------------";
set +e; # Allow shellcheck errors - we just warn about them
shellcheck "${build_script}";
set -e;
echo "----------------------------------------------------------------";


nice -n10 ${build_script} ci

Own your code, part 3: Shell scripting infrastructure

In the last post, I told the curious tale of my unreliable webhook. In the post before that, I talked about my Gitea-powered git server and how I set it up. In this one, we're going to back up a bit and look at setting up Laminar CI.

Laminar CI is a continuous integration program that takes a decidedly different approach to the one you see in solutions like GitLab CI and Travis. It takes a much more minimal approach, instead preferring to provide you with the tools you need and letting you get on with setting it up however you like and integrating it with whatever you like.

I recommend looking at its website and user manual to get a feel for how it works. In short, it lets you do things like this:

laminarc queue build-code
laminarc show-jobs

It is, of course, entirely command-line based. In order to integrate it with other services, webhooks are needed. In my case, I've used webhook for this purpose. Before we get into that though, we should outline how the system we build should work.

For me, I'm not happy with filling a folder with job scripts. I want my CI system to have the following properties:

  • I want to have all the configuration files and scripts under version control
  • I want to keep project-specific CI scripts in their appropriate repositories
  • I don't want to have to alter the CI configuration every time I start a new project.
  • The CI system should be stable - it shouldn't fall over if multiple jobs for the same repository are running at the same time.

Bold claims. Achieving this was actually quite complicated, and demanded a pretty sophistic infrastructure that's comprised of multiple independent shell scripts. It's best explained with a diagram:

There are 3 different machines at play here.

  1. The local computer, where we write our code
  2. The git server, where the code is stored
  3. The CI Server, which runs continuous integration tasks

Somehow, we need to notify the CI server that it needs to do something when new commits end up at the git server. We can achieve this with a git post-receive hook, which is basically a shell script (yep, we'll be seeing a lot of those) that can perform some logic on the server just after a push is complete, but the client pushing them hasn't disconnected yet.

In this post-receive hook we need to trigger the webhook that notifies the CI server that there are new commits for it to test. GitHub makes this easy, as it provides a webhook system where you can configure a webhook via a GUI - but it doesn't let you set the webhook script directly as far as I know.

Alternatively, should we run into issues with the webhook and we have control over the git server, we can trigger the CI build directly by writing a post-receive git hook directly utilising SSH port forwarding. This is what I did in the end for my personal git server, though as I noted in part 2 I did end up working around the webhook issues so that I could have it work with GitHub too.

For the webhook to work, we'll need a receiving script that will parse the JSON body of the webhook itself, and queue the laminar job.

In order for the laminar job to work without modification when we add a new project, it will have to come in 2 parts. The first will have a generic 'virtual' or 'smart' job, which should create a symbolic link to itself under the name of the repository that we want to run CI tasks for.

When called by Laminar under a repository-specific name, we want to run the CI tasks - but only on a copy of the main repository. Additionally, we don't want to re-clone the repository each time - this is slow and wastes bandwidth.

Finally, we need a unified standard for defining CI tasks in our repositories for which we want to enable continuous integration.

This we can achieve with the use of the lantern build engine, which I'll talk about (and its history!) in a future post.

Putting all this together has been quite the the undertaking - hence this series of blog posts! For now, since this blog post somehow seems to be getting rather long already and we've laid down the foundations of quite the complicated system, I think I'll leave this post here for now.

In the next post, we'll look at building the core of the system: The main laminar CI job that will organise the execution of project-specific CI tasks.

Own your code, Part 2: The curious case of the unreliable webhook

In the last post, I talked about how to setup your own Git server with Gitea. In this one, I'm going to take bit of a different tack - and talk about one of the really annoying problems I ran into when setting up my continuous integration server, Laminar CI.

Since I wanted to run the continuous integration server on a different machine to the Gitea server itself, I needed a way for the Gitea server to talk to the CI server. The natural choice here is, of course, a Webhook-based system.

After installing and configuring Webhook on the CI server, I set to work writing a webhook receiver shell script (more on this in a future post!). Unfortunately, it turned out that that Gitea didn't like sending to my CI server very much:

A ton of failed attempts at sending a webhook to the CI server

Whether it succeeded or not was random. If I hit the "Test Delivery" button enough times, it would eventually go through. My first thought was to bring up the Gitea server logs to see if it would give any additional information. It claimed that there was an i/o timeout communicating with the CI server:

Delivery: Post https://ci.bobsrockets.com/hooks/laminar-config-check: read tcp 5.196.73.75:54504->x.y.z.w:443: i/o timeout

Interesting, but not particularly helpful. If that's the case, then I should be able to get the same error with curl on the Gitea server, right?

curl https://ci.bobsrockets.com/hooks/testhook

.....wrong. It worked flawlessly. Every time.

Not to be beaten by such an annoying issue, I moved on to my next suspicion. Since my CI server is unfortunately behind NAT, I checked the NAT rules on the router in front of it to ensure that it was being exposed correctly.

Unfortunately, I couldn't find anything wrong here either! By this point, it was starting to get really rather odd. As a sanity check, I decided to check the server logs on the CI server, since I'm running Webhook behind Nginx (as a reverse-proxy):

5.196.73.75 - - [04/Dec/2018:20:48:05 +0000] "POST /hooks/laminar-config-check HTTP/1.1" 408 0 "-" "GiteaServer"

Now that's weird. Nginx has recorded a HTTP 408 error. Looking is up reveals that it's a Request Timeout error, which has the following definition:

The server did not receive a complete request message within the time that it was prepared to wait.

Wait what? Sounds to me like there's an argument going on between the 2 servers here - in which each server is claiming that the other didn't send a complete request or response.

At this point, I blamed this on a faulty HTTP implementation in Gitea, and opened an issue.

As a workaround, I ended up configuring Laminar to use a Unix socket on disk (as opposed to an abstract socket), forwarding it over SSH, and using a git hook to interact with it instead (more on how I managed this in a future post. There's a ton of shell scripting that I need to talk about first).

This isn't the end of this tail though! A month or two after I opened the issue, I wound up in the situation whereby I wanted to connect a GitHub repository to my CI server. Since I don't have shell access on github.com, I had to use the webhook.

When I did though, I got a nasty shock: The webhook deliveries exhibited the exact same random failures as I saw with the Gitea webhook. If I'd verified the Webhook server and cleared Gitea's HTTP implementation's name, then what else could be causing the problem?

At this point, I can only begin to speculate what the issue is. Personally, I suspect that it's a bug in the port-forwarding logic of my router, whereby it drops the first packet from a new IP address while it sets up a new NAT session to forward the packets to the CI server or something - so subsequent requests will go through fine, so long as they are sent within the NAT session timeout and from the same IP. If you've got a better idea, please comment below!

Of course, I really wanted to get the GitHub repository connected to my CI server, and if the only way I could do this was with a webhook, it was time for some request-wrangling.

My solution: A PHP proxy script running on the same server as the Gitea server (since it has a PHP-enabled web server set up already). If said script eats the request and emits a 202 Accepted immediately, then it can continue trying to get a hold of the webhook on the CI server 'till the cows come home - and GitHub will never know! Genius.

PHP-FPM (the fastcgi process manager; great alongside Nginx) makes this possible with the fastcgi_finish_request() method, which both flushes the buffer and ends the request to the client, but doesn't kill the PHP script - allowing for further processing to take place without the client having to wait.

Extreme caution must be taken with this approach however, as it can easily lead to a situation where the all the PHP-FPM processes are busy waiting on replies from the CI server, leaving no room for other requests to be fulfilled and a big messy pile-up in the queue forming behind them.

Warnings aside, here's what I came up with:

<?php

$settings = [
    "target_url" => "https://ci.bobsrockets.com/hooks/laminar-git-repo",
    "response_message" => "Processing laminar job proxy request.",
    "retries" => 3,
    "attempt_timeout" => 2 // in seconds, for a single attempt
];

$headers = "host: ci.starbeamrainbowlabs.com\r\n";
foreach(getallheaders() as $key => $value) {
    if(strtolower($key) == "host") continue;
    $headers .= "$key: $value\r\n";
}
$headers .= "\r\n";

$request_content = file_get_contents("php://input");

// --------------------------------------------

http_response_code(202);
header("content-type: text/plain");
header("content-length: " . strlen($settings["response_message"]));
echo($settings["response_message"]);

fastcgi_finish_request();

// --------------------------------------------

function log_message($msg) {
    file_put_contents("ci-requests.log", $msg, FILE_APPEND);
}

for($i = 0; $i < $settings["retries"]; $i++) {
    $start = microtime(true);

    $context = stream_context_create([
        "http" => [
            "header" => $headers,
            "method" => "POST",
            "content" => $request_content,
            "timeout" => $settings["attempt_timeout"]
        ]
    ]);

    $result = file_get_contents($settings["target_url"], false, $context);

    if($result !== false) {
        log_message("[" . date("r") . "] Queued laminar job in " . (microtime(true) - $start_time)*1000 . "ms");
        break;
    }


    log_message("[" . date("r") . "] Failed to laminar job after " . (microtime(true) - $start_time)*1000 . "ms.");
}

I've named it autowrangler.php. A few things of note here:

  • php://input is a special virtual file that's mapped internally by PHP to the client's request. By eating it with file_get_contents(), we can get the entire request body that the client has sent to us, so that we can forward it on to the CI server.
  • getallheaders() lets us get a hold of all the headers sent to us by the client for later forwarding
  • I use log_message() to keep a log of the successes and failures in a log file. So far I've got a ~32% failure rate, but never more than 1 failure in a row - giving some credit to my earlier theory I talked about above.

This ends the tale of the recalcitrant and unreliable webhook. Hopefully you've found this an interesting read. In future posts, I want to look at how I configured Webhook, the inner workings of the git hook I mentioned above, and the collection of shell scripts I've cooked to that make my CI server tick in a way that makes it easy to add new projects quickly.

Found this interesting? Run into this issue yourself? Found a better solution workaround? Comment below!

Own your Code, Part 1: Git Hosting - How did we get here?

Somewhat recently, I posted about how I fixed a nasty problem with an lftp upload. I mentioned that I'd been setting up continuous deployment for an application that I've been writing.

There's actually quite a bit of a story behind how I got to that point, so I thought I'd post about it here. Starting with code hosting, I'm going to show how I setup my own private git server, followed by Laminar (which, I might add, is not for everyone. It's actually quite involved), and finally I'll take a look at continuous deployment.

The intention is to do so in a manner that enables you to do something similar for yourself too (If you have any questions along the way, comment below!).

Of course, this is far too much to stuff into a single blog post - so I'll be splitting it up into a little bit of a mini-series.

Personally, I use git for practically all the code I write, so it makes sense for me to use services such as GitLab and GitHub for hosting these in a public place so that others can find them.

This is all very well, but I do find that I've acquired a number of private projects (say, for University work) that I can't / don't want to open-source. In addition, I'd feel a lot better if I had a backup mirror of the important code repositories I host on 3rd party sites - just in case.

This is where hosting one's own git server comes into play. I've actually blogged about this before, but since then I've moved from Go Git Service to Gitea, a fork of Gogs though a (rather painful; also this) migration.

This post will be more of a commentary on how I went about it, whilst giving some direction on how to do it for yourself. Every server is very different, which makes giving concrete instructions challenging. In addition, I ended up with a seriously non-standard install procedure - which I can't recommend! I need to get around to straightening a few things out at some point.....

So without further hesitation, let's setup Gitea as our Git server! To do so, we'll need an Nginx web server setup already. If you haven't, try following this guide and then come back here.

DNS

Next, you'll need to point a new subdomain at your server that's going to be hosting your Git server. If you've already got a domain name pointed at it (e.g. with A / AAAA records), I can recommend using a CNAME record that points at this pre-existing domain name.

For example, if I have a pair of records for control.bobsrockets.com:

A       control.bobsrockets.com.    1.2.3.4
AAAA    control.bobsrockets.com.    2001::1234:5678

...I could create a symlink like this:

CNAME   git.bobsrockets.com         control.bobsrockets.com.

(Note: For the curious, this isn't actually official DNS record syntax. It's just pseudo-code I invented on-the-fly)

Installation

With that in place, the next order of business is actually installing Gitea. This is relatively simple, but a bit of a pain - because native packages (e.g. sudo apt install ....) aren't a thing yet.

Instead, you download a release binary from the releases page. Once done, we can do some setup to get all our ducks in a row. When setting it up myself, I ended up with a rather weird configuration - as I actually started with a Go Git Service instance before Gitea was a thing (and ended up going through a rather painful) - so you should follow their guide and have a 'normal' installation :P

Once done, you should have Gitea installed and the right directory structure setup.

A note here is that if you're like me and you have SSH running on a non-standard port, you've got 2 choices. Firstly, you can alter the SSH_PORT directive in the configuration file (which should be called app.ini) to match that of your SSH server.

If you decide that you want it to run it's own inbuilt SSH server on port 22 (or any port below 1024), what the guide doesn't tell you is that you need to explicitly give the gitea binary permission to listen on a privileged port. This is done like so:

setcap 'cap_net_bind_service=+ep' gitea

Note that every time you update Gitea, you'll have to re-run that command - so it's probably a good idea to store it in a shell script that you can re-execute at will.

At this point it might also be worth looking through the config file (app.ini I mentioned earlier). There's a great cheat sheet that details the settings that can be customised - some may be essential to configuring Gitea correctly for your environment and use-case.

Updates

Updates to Gitea are, of course, important. GitHub provides an Atom Feed that you can use to keep up-to-date with the latest releases.

Later on this series, we'll take a look at how we can automate the process by taking advantage of cron, Laminar CI, and fpm - amongst other tools. I haven't actually done this yet as of the time of typing and we've got a looong way to go until we get to that point - so it's a fair ways off.

Service please!

We've got Gitea installed and we've considered updates, so the natural next step is to configure it as a system service.

I've actually blogged about this process before, so if you're interested in the details, I recommend going and reading that article.

This is the service file I use:

[Unit]
Description=Gitea
After=syslog.target
After=rsyslog.service
After=network.target
#After=mysqld.service
#After=postgresql.service
#After=memcached.service
#After=redis.service

[Service]
# Modify these two values and uncomment them if you have
# repos with lots of files and get an HTTP error 500 because
# of that
###
#LimitMEMLOCK=infinity
#LimitNOFILE=65535
Type=simple
User=git
Group=git
WorkingDirectory=/srv/git/gitea
ExecStart=/srv/git/gitea/gitea web
Restart=always
Environment=USER=git HOME=/srv/git

[Install]
WantedBy=multi-user.target

I believe I took it from here when I migrated from Gogs to Gitea. Save this as /etc/systemd/system/gitea.service, and then do this:

sudo systemctl daemon-reload
sudo systemctl start gitea.service

This should start Gitea as a system service.

Wiring it up

The next step now that we've got Gitea running is to reverse-proxy it with Nginx that we set up earlier.

Create a new file at /etc/nginx/conf.d/2-git.conf, and paste in something like this (not forgetting to customise it to your own use-case):

server {
    listen  80;
    listen  [::]:80;

    server_name git.starbeamrainbowlabs.com;
    return 301 https://$host$request_uri;
}

upstream gitea {
    server  [::1]:3000;
    keepalive 4; # Keep 4 connections open as a cache
}   

server {
    listen  443 ssl http2;
    listen  [::]:443 ssl http2;

    server_name git.starbeamrainbowlabs.com;
    ssl_certificate     /etc/letsencrypt/live/git.starbeamrainbowlabs.com-0001/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/git.starbeamrainbowlabs.com-0001/privkey.pem;

    add_header strict-transport-security    "max-age=31536000;";
    add_header access-control-allow-origin  https://nextcloud.starbeamrainbowlabs.com   always;
    add_header content-security-policy      "frame-ancestors http://*.starbeamrainbowlabs.com";

    #index  index.html index.php;
    #root   /srv/www;

    location / {
        proxy_pass          http://gitea;

        #proxy_set_header   x-proxy-server      nginx;
        #proxy_set_header   host                $host;
        #proxy_set_header   x-originating-ip    $remote_addr;
        #proxy_set_header   x-forwarded-for     $remote_addr;

        proxy_hide_header   X-Frame-Options;
    }

    location ~ /.well-known {
        root    /srv/letsencrypt;
    }

    #include /etc/nginx/snippets/letsencrypt.conf;

    #location = / {
    #   proxy_pass          http://127.0.0.1:3000;
    #   proxy_set_header    x-proxy-server      nginx;
    #   proxy_set_header    host                $host;
    #   proxy_set_header    x-originating-ip    $remote_addr;
    #   proxy_set_header    x-forwarded-for     $remote_addr;
    #}

    #location = /favicon.ico {
    #   alias /srv/www/favicon.ico;
    #}
}

You may have to comment out the listen 443 blocks and put in a listen 80 temporarily whilst configuring letsencrypt.

Then, reload Nginx: sudo systemctl reload nginx

Conclusion

Phew! We've looked at installing and setting up Gitea behind Nginx, and using a systemd service to automate the management of Gitea.

I've also talked a bit about how I set my own Gitea instance up and why.

In future posts, I'm going to talk about Continuous Integration, and how I setup Laminar CI. I'll also talk about alternatives for those who want something that comes with a few more batteries included.... :P

Found this interesting? Got stuck and need help? Spotted a mistake? Comment below!

Art by Mythdael