Starbeamrainbowlabs

Stardust
Blog


Archive

Mailing List Articles Atom Feed Comments Atom Feed Twitter Reddit Facebook

Tag Cloud

3d account algorithms announcement architecture archives arduino artificial intelligence artix assembly async audio bash batch blog bookmarklet booting c sharp c++ challenge chrome os code codepen coding conundrums coding conundrums evolved command line compilers compiling css dailyprogrammer debugging demystification distributed computing downtime electronics email embedded systems encryption es6 features event experiment external first impressions future game github github gist graphics hardware hardware meetup holiday html html5 html5 canvas infrastructure interfaces internet io.js jabber javascript js bin labs learning library linux low level lua maintenance manjaro network networking node.js operating systems performance photos php pixelbot portable privacy programming problems project projects prolog protocol protocols pseudo 3d python reddit reference release releases resource review rust secrets security series list server software sorting source code control statistics svg technical terminal textures three thing game three.js tool tutorial tutorials twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 xmpp xslt

TeleConsole Client is available on NuGet!

Some cool-looking white circuits on a blue background from the NuGet website. (Above: Some cool-looking circuits that feature on the NuGet website)

Hey! After a large amount of research, I've finally figured out how to create a simple NuGet package. Since I ended up using TeleConsole in a recent ACW and it doesn't have any dependencies (making packaging easier!), I decided to use it to test the system.

I think it's turned out rather well actually - you can find it on NuGet here.

Since it's been such a complicated process, rather than talking about TeleConsole itself, I'd like to give a sort-of tutorial on how I did it instead (if you'd like to read more about TeleConsole, I posted about it when I first released it here).

To start with, you'll need to install NuGet. Once done, the next step is to create a .nuspec file for your project. It goes in the same directory as the .csproj file for the project you want to publish on NuGet. I haven't yet figured out how to reference another project in the same solution and have it work with NuGet, but I should imagine it's mostly automatic. Here's the .nuspec file for TeleConsole:

<?xml version="1.0"?>
<package>
  <metadata>
    <id>TeleConsoleClient</id>
    <version>0.3</version>
    <title>$title$</title>
    <authors>Starbeamrainbowlabs</authors>
    <owners>Starbeamrainbowlabs</owners>
    <licenseUrl>https://github.com/sbrl/TeleConsole/blob/master/LICENSE</licenseUrl>
    <projectUrl>https://github.com/sbrl/TeleConsole/</projectUrl>
    <iconUrl>https://github.com/sbrl/TeleConsole/blob/master/logo.png?raw=true</iconUrl>
    <requireLicenseAcceptance>false</requireLicenseAcceptance>
    <description>$description$</description>
    <releaseNotes>Initial nuget release.</releaseNotes>
    <copyright>Copyright 2017</copyright>
    <tags>Debugging Networking Console Remote</tags>
  </metadata>
</package>

As you can see, it's actually a fairly simple format - based on XML of course, since C♯ seems to love it for some reason. The bits in $ signs are special - they are references to the corresponding values in the .csproj file. You can do it for <version> too - but I was experiencing issues with it not picking this up correctly, so I'm specifying it manually. More information about this can be found on the Microsoft website - links are available at the bottom of this post.

With that done, the next step is to package it into a `.nupkg file 0 which is basically just a renamed .zip file with a specific structure. The nuget command-line application has the ability to do this for us - but doesn't work on Linux without an extra argument (thanks to @ArtOfSettling on GitHub for discovering this!):

nuget pack -MsbuildPath /usr/lib/mono/msbuild/15.0/bin/

...Windows users don't need to include the -MsbuildPath /usr/lib/mono/msbuild/15.0/bin/ bit. This command will output a .nupkg file, which you can then upload to nuget.org here, once you're signed in.

Sources and Further Reading

Getting an updated version of MonoDevelop on Ubuntu

The old and new versions of MonoDevelop overlaid on top of one another.

Thanks to the AUR (The Arch User Repository), an updated version of MonoDevelop is never very far away on Manjaro, Artix, Arch Linux and friends. Unfortunately, such delights can't be found so easily in the Ubuntu and Debian ecosystem. On a check inspection of the official repositories, the latest version available is 5.10, while the AUR is offering 7.1.2!

In addition, the newer versions offered via this 'flatpak' thing are sandboxed, so you can't combined multiple technologies like Node.JS and the amazing npm in your C♯ applications with it.

Thankfully, all is not lost. A clever GitHub user by the name of cra0zy has put together a script that will install the latest version of MonoDevelop for us - without all the hassle of sandboxes or manually compiling from source. Here's a link to this installer:

cra0zy/monodevelop-run-installer

This post is mainly for my own reference, but I'd love to know if it helped you out too. If it did, leave a comment below!

GalleryShare - Share a folder on your computer with a friend

The front page of GalleryShare

Just yesterday, I was browsing my repositories on both my personal git server (git.starbeamrainbowlabs.com) and GitHub, and I stumbled across a program I wrote a while ago and then completely forgot about. It lets you share a directory of files and pictures via http. The picture above is from the wallpapers folder on my laptop here!

On further inspection, I discovered that it didn't require too much work to tidy it up for a release, so I spent an hour or two tidying up a few things, and here is version 0.1! My, it's been far too long since I've blogged about a release of something on here....

If you want to share things yourself, you can download the latest version over here.

In the future, I might add an optional graphical interface to make it even easier for people to use :D

It's actually quite simple. It's powered by the System.Net.HttpServer class (so Windows users will either need to install mono or give it administrative privileges, which is a real shame) since I originally wrote it before I put the GlidingSquirrel together, though it does have it's own routing system of my own devising.

The pages it serves themselves are actually plain XML files, which are rendered with XSLT by the user's browser. This keeps the content that GalleryShare has to dynamically generate simple, and has the added benefit that it can be generated with C&csharp;'s System.Xml.XmlWriter class. It's practically a browser-side templating system, which also has the added benefit of providing an XML-based API for others to consume.

Thumbnails are generated with C♯'s inbuilt System.Drawing image handling functions - I did initially want to use Magick.NET (C♯ bindings for the awesome ImageMagick library) has the System.Drawing classes appear to be a bit funny about the images they'll accept, but Linux support doesn't seem to have landed just yet.

Are you interested in a more in-depth look at how GalleryShare renders thumbnails, or outputs XML? Perhaps the XSLT has caught your eye. Let me know in the comments below!

Further Reading

/r/dailyprogrammer hard challenge #322: Static HTTP 1.0 server

Recently I happened to stumble across the dailyprogrammer subreddit's latest challenge. It was for a static HTTP 1.0 server, and while I built something similar for my networking ACW, I thought I'd give this one a go to create an extendable http server that I can use in other projects. If you want to follow along, you can find the challenge here!

My language of choice, as you might have guessed, was C♯ (I know that C♯ has a HttpServer class inbuilt already, but to listen on 0.0.0.0 on Windows it requires administrative privileges).

It ended up going rather well, actually. In a little less than 24 hours after reading the post, I had myself a working solution, and I thought I'd share here how I built it. Let's start with a class diagram:

A class diagram for the gliding squirrel. (Above: A class diagram for the GlidingSquirrel. Is this diagram better than the last one I drew?)

I'm only showing properties on here, as I'll be showing you the methods attached to each class later. It's a pretty simple design, actually - HttpServer deals with all the core HTTP and networking logic, FileHttpServer handles the file system calls (and can be swapped out for your own class), and HttpRequest, HttpResponse, HttpMethod, HttpResponseCode all store the data parsed out from the raw request coming in, and the data we're about to send back out again.

With a general idea as to how it's put together, lets dive into how it actually works. HttpServer would probably be a good place to start:

public abstract class HttpServer
{
    public static readonly string Version = "0.1-alpha";

    public readonly IPAddress BindAddress;
    public readonly int Port;

    public string BindEndpoint { /* ... */ }

    protected TcpListener server;

    private Mime mimeLookup = new Mime();
    public Dictionary<string, string> MimeTypeOverrides = new Dictionary<string, string>() {
        [".html"] = "text/html"
    };

    public HttpServer(IPAddress inBindAddress, int inPort)
    { /* ... */ }
    public HttpServer(int inPort) : this(IPAddress.IPv6Any, inPort)
    {
    }

    public async Task Start() { /* ... */ }

    public string LookupMimeType(string filePath) { /* ... */ }

    protected async void HandleClientThreadRoot(object transferredClient) { /* ... */ }

    public async Task HandleClient(TcpClient client) { /* ... */ }

    protected abstract Task setup();

    public abstract Task HandleRequest(HttpRequest request, HttpResponse response);
}

(Full version)

It's heavily abbreviated because there's actually quite a bit of code to get through here, but you get the general idea. The start method is the main loop that accepts the TcpClients, and calls HandleClientThreadRoot for each client it accepts. I decided to use the inbuilt ThreadPool class to do the threading for me here:

TcpClient nextClient = await server.AcceptTcpClientAsync();
ThreadPool.QueueUserWorkItem(new WaitCallback(HandleClientThreadRoot), nextClient);

C♯ handles all the thread spawning and killing for me internally this way, which is rather nice. Next, HandleClientThreadRoot sets up a net to catch any errors that are thrown by the next stage (as we're now in a new thread, which can make debugging a nightmare otherwise), and then calls the main HandleClient:

try
{
    await HandleClient(client);
}
catch(Exception error)
{
    Console.WriteLine(error);
}
finally
{
    client.Close();
}

No matter what happens, the client's connection will always get closed. HandleClient is where the magic start to happen. It attaches a StreamReader and a StreamWriter to the client:

StreamReader source = new StreamReader(client.GetStream());
StreamWriter destination = new StreamWriter(client.GetStream()) { AutoFlush = true };

...and calls a static method on HttpRequest to read in and decode the request:

HttpRequest request = await HttpRequest.FromStream(source);
request.ClientAddress = client.Client.RemoteEndPoint as IPEndPoint;

More on that later. With the request decoded, HandleClient hands off the request to the abstract method HandleRequest - but not before setting up a secondary safety net first:

try
{
    await HandleRequest(request, response);
}
catch(Exception error)
{
    response.ResponseCode = new HttpResponseCode(503, "Server Error Occurred");
    await response.SetBody(
        $"An error ocurred whilst serving your request to '{request.Url}'. Details:\n\n" +
        $"{error.ToString()}"
    );
}

This secondary safety net means that we can send a meaningful error message back to the requesting client in the case that the abstract request handler throws an exception for some reason. In the future, I'll probably make this customisable - after all, you don't always want to let the client know exactly what crashed inside the server's internals!

The FileHttpServer class that handles the file system logic is quite simple, actually. The magic is in it's implementation of the abstract HandleRequest method that the HttpServer itself exposes:

public override async Task HandleRequest(HttpRequest request, HttpResponse response)
{
    if(request.Url.Contains(".."))
    {
        response.ResponseCode = HttpResponseCode.BadRequest;
        await response.SetBody("Error the requested path contains dangerous characters.");
        return;
    }

    string filePath = getFilePathFromRequestUrl(request.Url);
    if(!File.Exists(filePath))
    {
        response.ResponseCode = HttpResponseCode.NotFound;
        await response.SetBody($"Error: The file path '{request.Url}' could not be found.\n");
        return;
    }

    FileInfo requestFileStat = null;
    try {
        requestFileStat = new FileInfo(filePath);
    }
    catch(UnauthorizedAccessException error) {
        response.ResponseCode = HttpResponseCode.Forbidden;
        await response.SetBody(
            "Unfortunately, the server was unable to access the file requested.\n" + 
            "Details:\n\n" + 
            error.ToString() + 
            "\n"
        );
        return;
    }

    response.Headers.Add("content-type", LookupMimeType(filePath));
    response.Headers.Add("content-length", requestFileStat.Length.ToString());

    if(request.Method == HttpMethod.GET)
    {
        response.Body = new StreamReader(filePath);
    }
}

With all the helper methods and properties on HttpResponse, it's much shorter than it would otherwise be! Let's go through it step by step.

if(request.Url.Contains(".."))

This first step is a quick check for anything obvious that could be used against the server to break out of the web root. There are probably other dangerous things you can do(or try to do, anyway!) to a web server to attempt to trick it into returning arbitrary files, but I can't think of any of the top of my head that aren't covered further down. If you can, let me know in the comments!

string filePath = getFilePathFromRequestUrl(request.Url);

Next, we translate the raw path received in the request into a path to a file on disk. Let's take a look inside that method:

protected string getFilePathFromRequestUrl(string requestUrl)
{
    return $"{WebRoot}{requestUrl}";
}

It's rather simplistic, I know. I can't help but feel that there's something I missed here.... Let me know if you can think of anything. (If you're interested about the dollar syntax there - it's called an interpolated string, and is new in C♯ 6! Fancy name, I know. Check it out!)

if(!File.Exists(filePath))
{
    response.ResponseCode = HttpResponseCode.NotFound;
    await response.SetBody($"Error: The file path '{request.Url}' could not be found.\n");
    return;
}

Another obvious check. Can't have the server crashing every time it runs into a 404! A somewhat interesting note here: File.Exists only checks to see if there's a file that exists under the specified path. To check for the existence of a directory, you have to use Directory.Exists - which would make directory listing rather easy to implement. I might actually try that later - with an option to turn it off, of course.

FileInfo requestFileStat = null;
try {
    requestFileStat = new FileInfo(filePath);
}
catch(UnauthorizedAccessException error) {
    response.ResponseCode = HttpResponseCode.Forbidden;
    await response.SetBody(
        "Unfortunately, the server was unable to access the file requested.\n" + 
        "Details:\n\n" + 
        error.ToString() + 
        "\n"
    );
    return;
}

Ok, on to something that might be a bit more unfamiliar. The FileInfo class can be used to get, unsurprisingly, information about a file. You can get all sorts of statistics about a file or directory with it, such as the last modified time, whether it's read-only from the perspective of the current user, etc. We're only interested in the size of the file though for the next few lines:

response.Headers.Add("content-type", LookupMimeType(filePath));
response.Headers.Add("content-length", requestFileStat.Length.ToString());

These headers are important, as you might expect. Browsers to tend to like to know the type of content they are receiving - and especially it's size.

if(request.Method == HttpMethod.GET)
{
    response.Body = new StreamReader(filePath);
}

Lastly, we send the file's contents back to the user in the response - but only if it's a GET request. This rather neatly takes care of HEAD requests - but might cause issues elsewhere. I'll probably end up changing it if it does become an issue.

Anyway, now that we've covered everything right up to sending the response back to the client, let's end our tour with a look at the request parsing system. It's a bit backwards, but it does seem to work in an odd sort of way! It all starts in HttpRequest.FromStream.

public static async Task<HttpRequest> FromStream(StreamReader source)
{
    HttpRequest request = new HttpRequest();

    // Parse the first line
    string firstLine = await source.ReadLineAsync();
    var firstLineData = ParseFirstLine(firstLine);

    request.HttpVersion = firstLineData.httpVersion;
    request.Method = firstLineData.requestMethod;
    request.Url = firstLineData.requestPath;

    // Extract the headers
    List<string> rawHeaders = new List<string>();
    string nextLine;
    while((nextLine = source.ReadLine()).Length > 0)
        rawHeaders.Add(nextLine);

    request.Headers = ParseHeaders(rawHeaders);

    // Store the source stream as the request body now that we've extracts the headers
    request.Body = source;

    return request;
}

It looks deceptively simple at first glance. To start with, I read in the first line, extract everything useful from it, and attach them to a new request object. Then, I read in all the headers I can find, parse those too, and attach them to the request object we're building.

Finally, I attach the StreamReader to the request itself, as it's now pointing at the body of the request from the user. I haven't actually tested this, as I don't actually use it anywhere just yet, but it's a nice reminder just in case I do end up needing it :-)

Now, let's take a look at the cream on the cake - the method that parses the first line of the incoming request. I'm quite pleased with this actually, as it's my first time using a brand new feature of C♯:

public static (float httpVersion, HttpMethod requestMethod, string requestPath) ParseFirstLine(string firstLine)
{
    List<string> lineParts = new List<string>(firstLine.Split(' '));

    float httpVersion = float.Parse(lineParts.Last().Split('/')[1]);
    HttpMethod httpMethod = MethodFromString(lineParts.First());

    lineParts.RemoveAt(0); lineParts.RemoveAt(lineParts.Count - 1);
    string requestUrl = lineParts.Aggregate((string one, string two) => $"{one} {two}");

    return (
        httpVersion,
        httpMethod,
        requestUrl
    );
}

Monodevelop, my C♯ IDE, appears to go absolutely nuts over this with red squiggly lines everywhere, but it still compiles just fine :D

As I was writing this, a thought popped into my head that a tuple would be perfect here. After reading somewhere a month or two ago about a new tuple syntax that's coming to C♯ I thought I'd get awesomely distracted and take a look before continuing, and what I found was really cool. In C♯ 7 (the latest and quite possibly greatest version of C♯ to come yet!), there's a new feature called value tuples, which let's you dynamically declare tuples like I have above. They're already fully supported by the C♯ compiler, so you can use them today! Just try to ignore your editor if it gets as confused as mine did... :P

If you're interested in learning more about them, I'll leave a few links at the bottom of this post. Anyway, back to the GlidingSquirrel! Other than the new value tuples in the above, there's not much going on, actually. A few linq calls take care of the heavy lifting quite nicely.

And finally, here's my header parsing method.

public static Dictionary<string, string> ParseHeaders(List<string> rawHeaders)
{
    Dictionary<string, string> result = new Dictionary<string, string>();

    foreach(string header in rawHeaders)
    {
        string[] parts = header.Split(':');
        KeyValuePair<string, string> nextHeader = new KeyValuePair<string, string>(
            parts[0].Trim().ToLower(),
            parts[1].Trim()
        );
        if(result.ContainsKey(nextHeader.Key))
            result[nextHeader.Key] = $"{result[nextHeader.Key]},{nextHeader.Value}";
        else
            result[nextHeader.Key] = nextHeader.Value;
    }

    return result;
}

While I have attempted to build in support for multiple definitions of the same header according to the spec, I haven't actually encountered a time when it's actually been needed. Again, this is one of those things I've built in now for later - as I do intend on updating this and adding more features later - and perhaps even work it into another secret project I might post about soon.

Lastly, I'll leave you with a link to the repository I'm storing the code for the GlidingSquirrel, and a few links for your enjoyment:

GlidingSquirrel

Sources and Further Reading

Weekend Challenge: Detecting and Decoding Morse Code in an Audio File

A banner full of morse code. The depicted audio processing program in the background is ocenaudio. Recently I received a message in morse code from a family member using this site. It said that the sender had hidden the message, so I was presented with 2 options: I could sit and decode the message by listening to it over and over again, or write a program to do it for me.

Naturally, as a computer science student and enthusiast, I chose the second option. My first problem: Capture a recording of the target morse code. This was easy - the audio-recorder package in the ubuntu repositories solved that one easily, as it has an option to record the audio output of my laptop.

Second problem: Figure out how to read the recording in and extract the samples in C♯. This one wasn't so easy. Amidst issues with flatpak and Monodevelop (flatpak is terrible!), I eventually found the NAudio (Codeplex, GitHub, NuGet) package to do the job. After some digging, I discovered that NAudio is actually really powerful! It's got some pretty advanced functions for handling audio that I'll have to explore at a later date.

Anyway, with a plan of action I set to work. - decided to work in reverse, so the first thing I needed was a chart that converted morse code into the latin alphabet. Wikipedia to the rescue:

A chart showing how to convert morse code into the latin alphabet.

(Source, Direct link, My Mirror)

With a handy-dandy conversion chart, it was relatively simple to create a class to handle the conversion from dots and dashes to the latin alphabet automatically:

using System;
using System.Collections.Generic;

namespace SBRL.Algorithms.MorseCodeTranslator
{
    /// <summary>
    /// A simple class to translate a morse code string into a normal string.
    /// </summary>
    /// <license>Mozilla Public License version 2.0</license>
    /// <origin></origin>
    /// <author>Starbeamrainbowlabs (https://starbeamrainbowlabs.com/)</author>
    /// <changelog>
    /// v0.1 - 26th May 2017:
    ///      - Creation! 😁
    /// </changelog>
    public static class MorseDecoder
    {
        /// <summary>
        /// The morse code lookup table. Use the methods in this class is possible,
        /// rather than accessing this lookup table directly!
        /// </summary>
        public static Dictionary<string, char> morseCodeLookup = new Dictionary<string, char>()
        {
            [".-"] = 'a',
            ["-..."] = 'b',
            ["-.-."] = 'c',
            ["-.."] = 'd',
            ["."] = 'e',
            ["..-."] = 'f',
            ["--."] = 'g',
            ["...."] = 'h',
            [".."] = 'i',
            [".---"] = 'j',
            ["-.-"] = 'k',
            [".-.."] = 'l',
            ["--"] = 'm',
            ["-."] = 'n',
            ["---"] = 'o',
            [".--."] = 'p',
            ["--.-"] = 'q',
            [".-."] = 'r',
            ["..."] = 's',
            ["-"] = 't',
            ["..-"] = 'u',
            ["...-"] = 'v',
            [".--"] = 'w',
            ["-..-"] = 'x',
            ["-.--"] = 'y',
            ["--.."] = 'z',
            [".----"] = '1',
            ["..---"] = '2',
            ["...--"] = '3',
            ["....-"] = '4',
            ["....."] = '5',
            ["-...."] = '6',
            ["--..."] = '7',
            ["---.."] = '8',
            ["----."] = '9',
            ["-----"] = '0',
        };

        /// <summary>
        /// Translates a single letter from morse code.
        /// </summary>
        /// <param name="morseSource">The morse code to translate.</param>
        /// <returns>The translated letter.</returns>
        public static char TranslateLetter(string morseSource)
        {
            return morseCodeLookup[morseSource.Trim()];
        }

        /// <summary>
        /// Translates a string of space-separated morse code strings from morse code.
        /// </summary>
        /// <param name="morseSource">The morse code to translate.</param>
        /// <returns>The translated word.</returns>
        public static string TranslateWord(string morseSource)
        {
            string result = string.Empty;

            string[] morseLetters = morseSource.Split(" ".ToCharArray());

            foreach(string morseLetter in morseLetters)
                result += TranslateLetter(morseLetter);

            return result;
        }

        /// <summary>
        /// Translates a list of morse-encoded words.
        /// </summary>
        /// <param name="morseSources">The morse-encoded words to decipher.</param>
        /// <returns>The decoded text.</returns>
        public static string TranslateText(IEnumerable<string> morseSources)
        {
            string result = string.Empty;
            foreach(string morseSource in morseSources)
                result += $"{TranslateWord(morseSource)} ";
            return result.Trim();
        }
    }
}

That was easy! The next challenge to tackle was considerably more challenging though: Read in the audio file and analyse the samples. I came up with that I think is a rather ingenious design. It's best explained with a diagram:

The algorithm I created to decode an morse code signal from an audio file.

  1. Read the raw samples into a buffer. If there isn't enough space to hold it all at once, then we handle it in chunks.
  2. Move a sliding-window along the raw buffer, with a width of 100 samples and sliding along 25 samples at a time. Extracts the maximum value from the window each time and places it in the windowed buffer.
  3. Analyse the windowed buffer and extract context-free tokens that mark the start or end of a tone.
  4. Convert the context-free tokens into ones that hold the starting point and length of the tones.
  5. Analyse the contextual tokens to extract the morse code as a string
  6. Decipher the morse code string

It's a pretty complicated problem when you first think about it, but breaking it down into steps as I did in the above diagram really helps in figuring out how you're going to tackle it. I, however, ended up drawing the diagram after Id finished writing the program.... I appear to find it easy to break things down in my head - it's only when it gets too big to remember all at once or if I'm working with someone else that I draw diagrams :P

Having drawn up an algorithm and 6 steps I needed to follow to create the program, I spent a happy afternoon writing some C♯. While the remainder of the algorithm is not too long (only ~202 lines), it's a bit too long to explain bit by bit here. I have uploaded the full program to a repository on my personal git server, which you can find here: sbrl/AudioMorseDecoder.

If you're confused about any part of it, ask away in the comments below! Binaries available on request.

I'll leave you with a pair of challenging messages of my own to decode. Try not to use my decoder - write your own!

Message A (easy), Message B (hard) (hard message generated with cwwav)

Let's build a weighted random number generator!

Ever wondered how random loot in a dungeon is generated? Or how the rooms in a procedurally generated castle might be picked? Perhaps you need to skew the number of times an apple is picked by your game engine over a banana. If you've considered any of these things, then you want a weighted random number generator. In this post, I'll be showing you how I built one, and how you can build one too.

If you're interested in trying to build one for yourself first though, then look away now! Come back when you're done (or stuck) to see my solution.

To start with, let's consider what a weighted random number generator actually is. Let's say we've got 3 rewards for a treasure chest: a cool-looking shield, a health potion, and a fancy ring. We want to give the player 1 of the 3 when they option the chest, making sure that the health potion is more common than the others. We can represent that as a ratio: $3 : 4 : 3$.

The introduction image that explains my weighted random number generator.

(Above: The ratio between the different items. See below for the explanation of the math!).

In order to pick one of the 3 items using the ratio, we need to normalise the ratio so that it's between $0$ and $1$. That's rather easy, as far as maths goes: All we have to do is convert each part of the ratio into a fraction, and that into a decimal. Let's calculate the denominator of the fraction first. That's easy-peasy too - we just add up all the parts of the ratio, as we want to represent each part as a fraction of a whole: $3 + 4 + 3 = 10$. With our denominator sorted, we can convert each part into a fraction:

$$ \frac{3}{10} + \frac{4}{10} + \frac{3}{10} = 1 $$

Fractions are nice, but it's be better to have that as a decimal:

$$ 0.3 + 0.4 + 0.3 = 10 $$

That's much better. Now, with the initial theory out of the way, let's start writing a class for it.

using System;
using System.Collections.Generic;
using System.Linq;

namespace SBRL.Algorithms
{
    public class WeightedRandom<ItemType>
    {
        protected Random rand = new Random();

        protected Dictionary<double, ItemType> weights = new Dictionary<double, ItemType>();

        /// <summary>
        /// Creates a new weighted random number generator.
        /// </summary>
        /// <param name="items">The dictionary of weights and their corresponding items.</param>
        public WeightedRandom(IDictionary<double, ItemType> items)
        {
            if(items.Count == 0)
                throw new ArgumentException("Error: The items dictionary provided is empty!");

            double totalWeight = items.Keys.Aggregate((double a, double b) => a + b);
            foreach(KeyValuePair<double, ItemType> itemData in items)
                weights.Add(itemData.Key / totalWeight, itemData.Value);
        }
    }
}

I've created a template class here, to allow the caller to provide us with any type of item (so long as they are all the same). That's what the <ItemType> bit is on the end of the class name - it's the same syntax behind the List class:

List<TreasureReward> rewards = new List<TreasureReward>() {
    TreasureReward.FromFile("./treasure/coolsword.txt"),
    TreasureReward.FromFile("./treasure/healthpotion.txt"),
    TreasureReward.FromFile("./treasure/fancyring.txt"),
};

Next, let's go through that constructor bit by bit. First, we make sure that we actually have some weights in the first place:

if(items.Count == 0)
    throw new ArgumentException("Error: The items dictionary provided is empty!");

Then, it's more Linq to the rescue in calculating the total of the weights we've been provided with:

double totalWeight = items.Keys.Aggregate((double a, double b) => a + b);

Finally, we loop over each of the items in the provided dictionary, dividing them by the sum of the weights and adding them to our internal dictionary of normalised weights.

foreach(KeyValuePair<double, ItemType> itemData in items)
    weights.Add(itemData.Key / totalWeight, itemData.Value);

Now that we've got our items loaded and the weights normalised, we can start picking things from our dictionary. For this part, I devised a sort of 'sliding window' algorithm to work out which item to pick. It's best explained through a series of whiteboard images:

Basically, I have 2 variables: lower and higher. When I loop over each of the weights, I do the following things:

  1. Add the current normalised weight to higher
  2. Check if the target is between lower and higher a. If it is, then return the current item b. If not, then keep going
  3. Bring lower up to the same value as higher
  4. Loop around again until we find the weight in which the target lies.

With that in mind, here's the code I cooked up:

/// <summary>
/// Picks a new random item from the list provided at initialisation, based
/// on the weights assigned to them.
/// </summary>
/// <returns>A random item, picked according to the assigned weights.</returns>
public ItemType Next()
{
    double target = rand.NextDouble();

    double lower = 0;
    double higher = 0;
    foreach(KeyValuePair<double, ItemType> weightData in weights)
    {
        higher += weightData.Key;
        if(target >= lower && target <= higher)
            return weightData.Value;
        lower += weightData.Key;
    }

    throw new Exception($"Error: Unable to find the weight that matches {target}");
}

That pretty much completes the class. While it seems daunting at first, it's actually quite easy once you get your head around it. Personally, I find whiteboards very useful in that regard! Here's the completed class:

(License: MPL-2.0)

Found this interesting? Got stuck? Have a suggestion for another cool algorithm I could implement? Comment below!

Markov Chains Part 2: Unweighted Chains

Hello and welcome to the second part of this mini-series about markov chains. In the last part, I explained what an n-gram was, and how I went about generating them.

In this part, I'll get to the meat of the subject: The markov chain itself. To start with (to simplify matters) I'll be looking at unweighted markov chains.

A markov chain, in essence, takes the n-grams we generated last time, and picks one to start with. It then takes the all but the first character of the n-gram it chose, and finds all the n-grams in it's library that begin with that sequence of characters. After drawing up a list of suitable n-grams, it picks one at random, and tacks the last character in the n-gram it chose onto the end of the first n-gram.

Then, it starts the whole process all over again with the 2nd n-gram it chose, and then the 3rd, and so on until it either a) hits a brick wall and can't find any suitable n-grams to use next, or b) reaches the desired length of word it was asked to generate.

An unweighted markov chain, as I call it, does not take the frequency of the source n-grams in the original text into account - it just picks the next n-gram from the list randomly.

With explanations and introductions out of the way, let's get down to some code! Since the markov chain is slightly more complicated, I decided to write a class for it. Let's start with one of those, then:

using System;
using System.Collections.Generic;
using System.Linq;

namespace SBRL.Algorithms.MarkovGrams
{
    /// <summary>
    /// An unweighted character-based markov chain.
    /// </summary>
    public class UnweightedMarkovChain
    {

    }
}

I've also added a few using statements for later. Our new class is looking a bit bare. how about some methods to liven it up a bit?

/// <summary>
/// Creates a new character-based markov chain.
/// </summary>
/// <param name="inNgrams">The ngrams to populate the new markov chain with.</param>
public UnweightedMarkovChain(IEnumerable<string> inNgrams)
{

}

/// <summary>
/// Returns a random ngram that's currently loaded into this UnweightedMarkovChain.
/// </summary>
/// <returns>A random ngram from this UnweightMarkovChain's cache of ngrams.</returns>
public string RandomNgram()
{

}

/// <summary>
/// Generates a new random string from the currently stored ngrams.
/// </summary>
/// <param name="length">
/// The length of ngram to generate.
/// Note that this is a target, not a fixed value - e.g. passing 2 when the n-gram order is 3 will
/// result in a string of length 3. Also, depending on the current ngrams this markov chain contains,
/// it may end up being cut short. 
/// </param>
/// <returns>A new random string.</returns>
public string Generate(int length)
{

}

That's much better. Let's keep going - this time with some member variables:

/// <summary>
/// The random number generator
/// </summary>
Random rand = new Random();

/// <summary>
/// The ngrams that this markov chain currently contains.
/// </summary>
List<string> ngrams;

We'll need that random number generator later! As for the List<string>, we'll be using that to store our n-grams - but you probably figured that one out for yourself :P

The class isn't looking completely bare anymore, but we can still do something about those methods. Let's start with that constructor:

public UnweightedMarkovChain(IEnumerable<string> inNgrams)
{
    ngrams = new List<string>(inNgrams);
}

Easy peasy! It just turns the IEnumerable<string> into a List<string> and stores it. Let's do another one:

public string RandomNgram()
{
    return ngrams[rand.Next(0, ngrams.Count)];
}

We're on a roll here! This is another fairly simple method - it just picks a random n-gram from the dictionary. We'll need this for our 3rd, and most important, method, Generate(). This one's a bit more complicated, so let's take it in a few stages. Firstly, we need an n-gram to start the whole thing off. We also need to return it at the end of the method.

string result = RandomNgram();

return result;

While we're at it, we'll also need a variable to keep track of the last n-gram in the chain, so we can find an appropriate match to come next.

string lastNgram = result;

Then we'll need a loop to keep adding n-grams to the chain. Since we're not entirely sure how long we'll be looping for (and we've got fairly complicated stop conditions, as far as that kind of thing goes), I decided to use a while loop here.

while(result.Length < length)
{

}

That's the first of our 2 stop conditions in place, too! We want to stop when the word we're working on reaches it's desired length. Now, we can write the bit that works out which n-gram should come next! This bit goes inside the while loop we created above (as you might suspect). First, let's fetch a list of n-grams that would actually make sense coming next.

// The substring that the next ngram in the chain needs to start with
string nextStartsWith = lastNgram.Substring(1);
// Get a list of possible n-grams we could choose from next
List<string> nextNgrams = ngrams.FindAll(gram => gram.StartsWith(nextStartsWith));

With a bit of Linq (Language-INtrgrated Query), that isn't too tough :-) If you haven't seen linq before, then I'd highly recommend you check it out! It makes sorting and searching datasets much easier. The above is quite simple - I just filter our list of n-grams through a function that extracts all the ones that start with the appropriate letter.

It's at this point that we can insert the second of our two stopping conditions. If there aren't any possible n-grams to pick from, then we can't continue.

// If there aren't any choices left, we can't exactly keep adding to the new string any more :-(
if(nextNgrams.Count == 0)
    break;

With our list of possible n-grams, we're now in a position to pick one at random to add to the word. It's LINQ to the rescue again:

// Pick a random n-gram from the list
string nextNgram = nextNgrams.ElementAt(rand.Next(0, nextNgrams.Count));

This is another simple one - it just extract the element in the list at a random location in the list. In hindsight I could have used the array operator syntax here ([]), but it doesn't really matter :-)

Now that we've picked the next n-gram, we can add it to the word we're building:

// Add the last character from the n-gram to the string we're building
result += nextNgram[nextNgram.Length - 1];

and that's the markov chain practically done! Oh, we mustn't forget to update the lastNgram variable (I forgot this when building it :P):

lastNgram = nextNgram;

And that wraps up our unweighted markov chain. Here's the whole class in full:

using System;
using System.Collections.Generic;
using System.Linq;

namespace SBRL.Algorithms.MarkovGrams
{
    /// <summary>
    /// An unweighted character-based markov chain.
    /// </summary>
    public class UnweightedMarkovChain
    {
        /// <summary>
        /// The random number generator
        /// </summary>
        Random rand = new Random();

        /// <summary>
        /// The ngrams that this markov chain currently contains.
        /// </summary>
        List<string> ngrams;

        /// <summary>
        /// Creates a new character-based markov chain.
        /// </summary>
        /// <param name="inNgrams">The ngrams to populate the new markov chain with.</param>
        public UnweightedMarkovChain(IEnumerable<string> inNgrams)
        {
            ngrams = new List<string>(inNgrams);
        }

        /// <summary>
        /// Returns a random ngram that's currently loaded into this UnweightedMarkovChain.
        /// </summary>
        /// <returns>A random ngram from this UnweightMarkovChain's cache of ngrams.</returns>
        public string RandomNgram()
        {
            return ngrams[rand.Next(0, ngrams.Count)];
        }

        /// <summary>
        /// Generates a new random string from the currently stored ngrams.
        /// </summary>
        /// <param name="length">
        /// The length of ngram to generate.
        /// Note that this is a target, not a fixed value - e.g. passing 2 when the n-gram order is 3 will
        /// result in a string of length 3. Also, depending on the current ngrams this markov chain contains,
        /// it may end up being cut short. 
        /// </param>
        /// <returns>A new random string.</returns>
        public string Generate(int length)
        {
            string result = RandomNgram();
            string lastNgram = result;
            while(result.Length < length)
            {
                // The substring that the next ngram in the chain needs to start with
                string nextStartsWith = lastNgram.Substring(1);
                // Get a list of possible n-grams we could choose from next
                List<string> nextNgrams = ngrams.FindAll(gram => gram.StartsWith(nextStartsWith));
                // If there aren't any choices left, we can't exactly keep adding to the new string any more :-(
                if(nextNgrams.Count == 0)
                    break;
                // Pick a random n-gram from the list
                string nextNgram = nextNgrams.ElementAt(rand.Next(0, nextNgrams.Count));
                // Add the last character from the n-gram to the string we're building
                result += nextNgram[nextNgram.Length - 1];
                lastNgram = nextNgram;
            }

            return result;
        }
    }
}

I've released the full code for my markov generator (with a complete command line interface!) on my personal git server. The repository can be found here: sbrl/MarkovGrams. To finish this post off, I'll leave you with a few more words that I've generated using it :D

1 2 3 4 5
mecuc uipes jeraq acrin nnvit
blerbopt drsacoqu yphortag roirrcai elurucon
pnsemophiqub omuayplisshi udaisponctec mocaltepraua rcyptheticys
eoigemmmpntartrc rattismemaxthotr hoaxtancurextudu rrgtryseumaqutrc hrpiniglucurutaj

Markov Chains Part 1: N-Grams

After wanting to create a markov chain to generate random words for ages, I've recently had the time to actually write one :D Since I had a lot of fun writing it, I thought I'd share it here.

A markov chain, in simple terms, is an algorithm to take a bunch of input, and generate a virtually unlimited amount of output in the style of the input. If I put my 166 strong wordlist of sciencey words through a markov chain, I get a bunch of words like this:

a b c d
raccession bstrolaneu aticl lonicretiv
mpliquadri tagnetecta subinverti catorp
ssignatten attrotemic surspertiv tecommultr
ndui coiseceivi horinversp icreflerat
landargeog eograuxila omplecessu ginverceng
evertionde chartianua spliqui ydritangt
grajecubst ngintagorp ombintrepe mbithretec
trounicabl ombitagnai ccensorbit holialinai
cessurspec dui mperaneuma yptintivid
ectru llatividet imaccellat siondl
tru coo treptinver gnatiartia
nictrivide pneumagori entansplan uatellonic

Obviously, some of the above aren't particularly readable, but the majority are ok (I could do with a longer input wordlist, I think).

To create our very own markov chain that can output words like the above, we need 2 parts: An n-gram generator, to take in the word list and convert it into a form that we can feed into the second part - the markov chain itself. In this post, I'm going to just look at the n-gram generator - I'll cover the markov chain itself in the second part of this mini-series.

An n-gram is best explained by example. Take the word refractive, for example. Let's split it up into chunks:

ref
efr
fra
rac
act
cti
tiv
ive

See what I've done? I've taken the original word and split it into chunks of 3, but I've only moved along the word by 1 character at a time, so some characters have been duplicated. These are n-grams of order 3. The order, in the case of an n-gram, is the number of characters per chunk. We could use any order we like:

refra
efrac
fract
racti
activ
ctive

The order of the above is 5. If you're wondering how this could possibly be useful - don't worry: All will be explained in due time :-) For now though, writing all these n-grams out manually is rather annoying and tedious. Let's write some code!

Generating n-grams from a single word like we did above is actually pretty simple. Here's what I came up with:

/// <summary>
/// Generates a unique list of n-grams from the given string.
/// </summary>
/// <param name="str">The string to n-gram-ise.</param>
/// <param name="order">The order of n-gram to generate.</param>
/// <returns>A unique list of n-grams found in the specified string.</returns>
public static IEnumerable<string> GenerateFlat(string str, int order)
{
    List<string> results = new List<string>();
    for(int i = 0; i < str.Length - order; i++)
    {
        results.Add(str.Substring(i, order));
    }
    return results.Distinct();
}

I'm using C♯ here, but you can use whatever language you like. Basically, I enter a loop and crawl along the word, adding the n-grams I find to a list, which I then de-duplicate and return.

Generating n-grams for just one word is nice, but we need to process a whole bunch of words. Thankfully, that's easy to automate too with a sneaky overload:

/// <summary>
/// Generates a unique list of n-grams that the given list of words.
/// </summary>
/// <param name="words">The words to turn into n-grams.</param>
/// <param name="order">The order of n-gram to generate..</param>
/// <returns>A unique list of n-grams found in the given list of words.</returns>
public static IEnumerable<string> GenerateFlat(IEnumerable<string> words, int order)
{
    List<string> results = new List<string>();
    foreach(string word in words)
    {
        results.AddRange(GenerateFlat(word, order));
    }
    return results.Distinct();
}

All the above does is take a list of words, run them all through the n-gram generation method we wrote above and return the de-duplicated results. Here's a few that it generated from the same wordlist I used above in order 3:

1 2 3 4 5 6 7 8
hor sig ign gna str tre ren ngt
sol old lde oli sor sou oun tel
lla sub ubs bst tem emp mpe atu
tur err ert thr hre dim ime men
nsi ack cki kin raj aje jec tor
ans nsa sat nsf sfe nsl sla slu
luc uce nsm smi nsp are nsu tan

Next time, I'll show you my (unweighted) markov chain I've written that uses the n-grams generated by these methods.

Sources and Further Reading

PixelBot Part 2: Devices need protocols, apparently

A selection of the technologies I'm using to put together my PixelHub server.

So there I was. I'd just got home, turned on my laptop, opened the Arduino IDE and Monodevelop, and then.... nothing. I knew I wanted my PixelBot to talk to the PixelHub I'd started writing, but I was confused as to how I could make it happen.

In this kind of situation, I realised that although I knew what I wanted them to do, I hadn't figured out the how. As it happens, when you're trying to get one (or more, in this case) different devices to talk to each other, there's something rather useful that helps them all to speak the same language: a protocol.

A protocol is a specification that defines the language that different devices use to talk to each other and exchange messages. Defining one before you start writing a networked program is probably a good idea - I find particularly helpful to write a specification for the protocol that the program(s) I'm writing, especially if their function(s) is/are complicated.

To this end, I've ended up spending a considerable amount of time drawing up the PixelHub Protocol - a specification document that defines how my PixelHub server is going to talk to a swarm of PixelBots. It might seem strange at first, but I decided on a (mostly) binary protocol.

Upon closer inspection though, (I hope) it makes a lot of sense. Since the Arduino is programmed using C++ as it has a limited amount of memory, it doesn't have any of the standard string manipulation function that you're used to in C♯. Since C++ is undoubtedly the harder of the 2 to write, I decided to make it easier to write the C++ rather than the C&sharp. Messages on the Arduino side are come in as a byte[] array, so (in theory) it should be easy to pick out certain known parts of the array and cast them into various different fundamental types.

With the specification written, the next step in my PixelBot journey is to actually implement it, which I'll be posting about in the next entry in this series!

Easier TCP Networking in C♯

I see all sorts of C♯ networking tutorials out there telling you that you have to use byte arrays and buffers and all sorts of other complicated things if you ever want to talk to another machine over the network. Frankly, it's all rather confusing.

Thankfully though, it doesn't have to stay this way. I've learnt a different way of doing TCP networking in C♯ at University (thanks Brian!), and I realised the other day I've never actually written a blog post about it (that I can remember, anyway!). If you know how to read and write files and understand some basic networking concepts (IP addresses, ports, what TCP and UDP are, etc.), you'll have no problems understanding this.

Server

The easiest way to explain it is to demonstrate. Let's build a quick server / client program where the server says hello to the client. Here's the server code:


// Server.cs
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;
using System.IO;

public class Server
{
    public readonly int Port;
    public Server(int inPort)
    {
        Port = inPort; string s;
    }

    public async Task Start()
    {
        TcpListener server = new TcpListener(IPAddress.Any, Port);
        server.Start();
        while (true) {
            TcpClient nextClient = await server.AcceptTcpClientAsync();
            StreamReader incoming = new StreamReader(nextClient.GetStream());
            StreamWriter outgoing = new StreamWriter(nextClient.GetStream()) { AutoFlush = true };

            string name = (await incoming.ReadLineAsync()).Trim();
            await outgoing.WriteLineAsync($"Hello, {name}!");

            Console.WriteLine("Said hello to {0}", name);

            nextClient.Close();
        }
    }
}

// Use it like this in your Main() method:
Server server = new Server(6666);
server.Start().Wait();

Technically speaking, that asynchronous code ought to be running in a separate thread - I've omitted it to make it slightly simpler :-) Let's break this down. The important bit is in the Start() method - the rest is just sugar around it to make it run if you want to copy and paste it. First, we create & start a TcpListener:

TcpListener server = new TcpListener(IPAddress.Any, Port);
server.Start();

Once done, we enter a loop, and wait for the next client:

TcpClient nextClient = await server.AcceptTcpClientAsync();

Now that we have a client to talk to, we attach a StreamReader and a StreamWriter with a special option set on it to allow us to talk to the remote client with ease. The option set on the StreamWriter is AutoFlush, and it basically tells it to flush it's internal buffer every time we write to it - that way things we write to it always hit the TcpClient underneath. Depending on your setup the TcpClient does some internal buffering & optimisations anyway, so we don't need the second layer of buffering here:

StreamReader incoming = new StreamReader(nextClient.GetStream());
StreamWriter outgoing = new StreamWriter(nextClient.GetStream()) { AutoFlush = true };

With that, the rest should be fairly simple to understand:

string name = (await incoming.ReadLineAsync()).Trim();
await outgoing.WriteLineAsync($"Hello, {name}!");

Console.WriteLine("Said hello to {0}", name);

nextClient.Close();

First, we grab the first line that the client sends us, and trim of any whitespace that's lurking around. Then, we send back a friendly hello message to client, before logging what we've done to the console and closing the connection.

Client

Now that you've seen the server code, the client code should be fairly self explanatory. The important lines are highlighted:


using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;
using System.IO;

public class Client
{
    public readonly string Hostname;
    public readonly int Port;

    public Client(string inHostname, int inPort)
    {
        Hostname = inHostname; uint a;
        Port = inPort;
    }

    public async Task GetHello(string name) {
        TcpClient client = new TcpClient();
        client.Connect(Hostname, Port);

        StreamReader incoming = new StreamReader(client.GetStream());
        StreamWriter outgoing = new StreamWriter(client.GetStream()) { AutoFlush = true };

        await outgoing.WriteLineAsync(name);

        return (await incoming.ReadLineAsync()).Trim();
    }
}

// Use it like this in your Main() method:
Client client = new Client("localhost", 6666);
Console.Write("Enter your name: ");
Console.WriteLine("The server said: {0}", client.GetHello(Console.ReadLine()).Result);

First, we create a new client and connect it to the server. Next, we connect the StreamReader and StreamWriter instances to the TcpClient, and then we send the name to the server. Finally, we read the response the server sent us and return it. Easy!

Here's some example outputs:

Client:


./NetworkingDemo-Server.exe
Said hello to Bill

Server:


./NetworkingDemo-Client.exe
Enter your name: Bill
The server said: Hello, Bill!

The above code should work on Mac, Windows, and Linux. Granted, it's not the most efficient way of doing things, but it should be fine for most general purposes. Personally, I think the trade-off between performance and readability/ease of understanding of code is totally worth it.

If you prefer, I've got an archive of the above code I wrote for this blog post - complete with binaries. You can find it here: NetworkingDemo.7z.

Art by Mythdael