Starbeamrainbowlabs

Stardust
Blog


Archive

Mailing List Articles Atom Feed Comments Atom Feed Twitter

Tag Cloud

3d account algorithms announcement archives arduino artificial intelligence assembly async audio bash batch blog bookmarklet booting c sharp c++ challenge chrome os code codepen coding conundrums coding conundrums evolved command line compiling css dailyprogrammer debugging demystification distributed computing downtime embedded systems encryption es6 features event experiment external first impressions future game github github gist graphics hardware hardware meetup holiday html html5 html5 canvas interfaces internet io.js jabber javascript js bin labs learning library linux low level lua maintenance network networking node.js operating systems performance photos php pixelbot portable privacy programming problems project projects prolog protocol protocols pseudo 3d python reddit reference release releases resource review rust secrets security series list server servers software sorting source code control statistics svg technical terminal textures three thing game three.js tool tutorial tutorials twitter ubuntu university update updates upgrade version control visual web website windows windows 10 xmpp

Let's build a weighted random number generator!

Ever wondered how random loot in a dungeon is generated? Or how the rooms in a procedurally generated castle might be picked? Perhaps you need to skew the number of times an apple is picked by your game engine over a banana. If you've considered any of these things, then you want a weighted random number generator. In this post, I'll be showing you how I built one, and how you can build one too.

If you're interested in trying to build one for yourself first though, then look away now! Come back when you're done (or stuck) to see my solution.

To start with, let's consider what a weighted random number generator actually is. Let's say we've got 3 rewards for a treasure chest: a cool-looking shield, a health potion, and a fancy ring. We want to give the player 1 of the 3 when they option the chest, making sure that the health potion is more common than the others. We can represent that as a ratio: $3 : 4 : 3$.

The introduction image that explains my weighted random number generator.

(Above: The ratio between the different items. See below for the explanation of the math!).

In order to pick one of the 3 items using the ratio, we need to normalise the ratio so that it's between $0$ and $1$. That's rather easy, as far as maths goes: All we have to do is convert each part of the ratio into a fraction, and that into a decimal. Let's calculate the denominator of the fraction first. That's easy-peasy too - we just add up all the parts of the ratio, as we want to represent each part as a fraction of a whole: $3 + 4 + 3 = 10$. With our denominator sorted, we can convert each part into a fraction:

$$ \frac{3}{10} + \frac{4}{10} + \frac{3}{10} = 1 $$

Fractions are nice, but it's be better to have that as a decimal:

$$ 0.3 + 0.4 + 0.3 = 10 $$

That's much better. Now, with the initial theory out of the way, let's start writing a class for it.

using System;
using System.Collections.Generic;
using System.Linq;

namespace SBRL.Algorithms
{
    public class WeightedRandom<ItemType>
    {
        protected Random rand = new Random();

        protected Dictionary<double, ItemType> weights = new Dictionary<double, ItemType>();

        /// <summary>
        /// Creates a new weighted random number generator.
        /// </summary>
        /// <param name="items">The dictionary of weights and their corresponding items.</param>
        public WeightedRandom(IDictionary<double, ItemType> items)
        {
            if(items.Count == 0)
                throw new ArgumentException("Error: The items dictionary provided is empty!");

            double totalWeight = items.Keys.Aggregate((double a, double b) => a + b);
            foreach(KeyValuePair<double, ItemType> itemData in items)
                weights.Add(itemData.Key / totalWeight, itemData.Value);
        }
    }
}

I've created a template class here, to allow the caller to provide us with any type of item (so long as they are all the same). That's what the <ItemType> bit is on the end of the class name - it's the same syntax behind the List class:

List<TreasureReward> rewards = new List<TreasureReward>() {
    TreasureReward.FromFile("./treasure/coolsword.txt"),
    TreasureReward.FromFile("./treasure/healthpotion.txt"),
    TreasureReward.FromFile("./treasure/fancyring.txt"),
};

Next, let's go through that constructor bit by bit. First, we make sure that we actually have some weights in the first place:

if(items.Count == 0)
    throw new ArgumentException("Error: The items dictionary provided is empty!");

Then, it's more Linq to the rescue in calculating the total of the weights we've been provided with:

double totalWeight = items.Keys.Aggregate((double a, double b) => a + b);

Finally, we loop over each of the items in the provided dictionary, dividing them by the sum of the weights and adding them to our internal dictionary of normalised weights.

foreach(KeyValuePair<double, ItemType> itemData in items)
    weights.Add(itemData.Key / totalWeight, itemData.Value);

Now that we've got our items loaded and the weights normalised, we can start picking things from our dictionary. For this part, I devised a sort of 'sliding window' algorithm to work out which item to pick. It's best explained through a series of whiteboard images:

Basically, I have 2 variables: lower and higher. When I loop over each of the weights, I do the following things:

  1. Add the current normalised weight to higher
  2. Check if the target is between lower and higher a. If it is, then return the current item b. If not, then keep going
  3. Bring lower up to the same value as higher
  4. Loop around again until we find the weight in which the target lies.

With that in mind, here's the code I cooked up:

/// <summary>
/// Picks a new random item from the list provided at initialisation, based
/// on the weights assigned to them.
/// </summary>
/// <returns>A random item, picked according to the assigned weights.</returns>
public ItemType Next()
{
    double target = rand.NextDouble();

    double lower = 0;
    double higher = 0;
    foreach(KeyValuePair<double, ItemType> weightData in weights)
    {
        higher += weightData.Key;
        if(target >= lower && target <= higher)
            return weightData.Value;
        lower += weightData.Key;
    }

    throw new Exception($"Error: Unable to find the weight that matches {target}");
}

That pretty much completes the class. While it seems daunting at first, it's actually quite easy once you get your head around it. Personally, I find whiteboards very useful in that regard! Here's the completed class:

(License: MPL-2.0)

Found this interesting? Got stuck? Have a suggestion for another cool algorithm I could implement? Comment below!

Markov Chains Part 2: Unweighted Chains

Hello and welcome to the second part of this mini-series about markov chains. In the last part, I explained what an n-gram was, and how I went about generating them.

In this part, I'll get to the meat of the subject: The markov chain itself. To start with (to simplify matters) I'll be looking at unweighted markov chains.

A markov chain, in essence, takes the n-grams we generated last time, and picks one to start with. It then takes the all but the first character of the n-gram it chose, and finds all the n-grams in it's library that begin with that sequence of characters. After drawing up a list of suitable n-grams, it picks one at random, and tacks the last character in the n-gram it chose onto the end of the first n-gram.

Then, it starts the whole process all over again with the 2nd n-gram it chose, and then the 3rd, and so on until it either a) hits a brick wall and can't find any suitable n-grams to use next, or b) reaches the desired length of word it was asked to generate.

An unweighted markov chain, as I call it, does not take the frequency of the source n-grams in the original text into account - it just picks the next n-gram from the list randomly.

With explanations and introductions out of the way, let's get down to some code! Since the markov chain is slightly more complicated, I decided to write a class for it. Let's start with one of those, then:

using System;
using System.Collections.Generic;
using System.Linq;

namespace SBRL.Algorithms.MarkovGrams
{
    /// <summary>
    /// An unweighted character-based markov chain.
    /// </summary>
    public class UnweightedMarkovChain
    {

    }
}

I've also added a few using statements for later. Our new class is looking a bit bare. how about some methods to liven it up a bit?

/// <summary>
/// Creates a new character-based markov chain.
/// </summary>
/// <param name="inNgrams">The ngrams to populate the new markov chain with.</param>
public UnweightedMarkovChain(IEnumerable<string> inNgrams)
{

}

/// <summary>
/// Returns a random ngram that's currently loaded into this UnweightedMarkovChain.
/// </summary>
/// <returns>A random ngram from this UnweightMarkovChain's cache of ngrams.</returns>
public string RandomNgram()
{

}

/// <summary>
/// Generates a new random string from the currently stored ngrams.
/// </summary>
/// <param name="length">
/// The length of ngram to generate.
/// Note that this is a target, not a fixed value - e.g. passing 2 when the n-gram order is 3 will
/// result in a string of length 3. Also, depending on the current ngrams this markov chain contains,
/// it may end up being cut short. 
/// </param>
/// <returns>A new random string.</returns>
public string Generate(int length)
{

}

That's much better. Let's keep going - this time with some member variables:

/// <summary>
/// The random number generator
/// </summary>
Random rand = new Random();

/// <summary>
/// The ngrams that this markov chain currently contains.
/// </summary>
List<string> ngrams;

We'll need that random number generator later! As for the List<string>, we'll be using that to store our n-grams - but you probably figured that one out for yourself :P

The class isn't looking completely bare anymore, but we can still do something about those methods. Let's start with that constructor:

public UnweightedMarkovChain(IEnumerable<string> inNgrams)
{
    ngrams = new List<string>(inNgrams);
}

Easy peasy! It just turns the IEnumerable<string> into a List<string> and stores it. Let's do another one:

public string RandomNgram()
{
    return ngrams[rand.Next(0, ngrams.Count)];
}

We're on a roll here! This is another fairly simple method - it just picks a random n-gram from the dictionary. We'll need this for our 3rd, and most important, method, Generate(). This one's a bit more complicated, so let's take it in a few stages. Firstly, we need an n-gram to start the whole thing off. We also need to return it at the end of the method.

string result = RandomNgram();

return result;

While we're at it, we'll also need a variable to keep track of the last n-gram in the chain, so we can find an appropriate match to come next.

string lastNgram = result;

Then we'll need a loop to keep adding n-grams to the chain. Since we're not entirely sure how long we'll be looping for (and we've got fairly complicated stop conditions, as far as that kind of thing goes), I decided to use a while loop here.

while(result.Length < length)
{

}

That's the first of our 2 stop conditions in place, too! We want to stop when the word we're working on reaches it's desired length. Now, we can write the bit that works out which n-gram should come next! This bit goes inside the while loop we created above (as you might suspect). First, let's fetch a list of n-grams that would actually make sense coming next.

// The substring that the next ngram in the chain needs to start with
string nextStartsWith = lastNgram.Substring(1);
// Get a list of possible n-grams we could choose from next
List<string> nextNgrams = ngrams.FindAll(gram => gram.StartsWith(nextStartsWith));

With a bit of Linq (Language-INtrgrated Query), that isn't too tough :-) If you haven't seen linq before, then I'd highly recommend you check it out! It makes sorting and searching datasets much easier. The above is quite simple - I just filter our list of n-grams through a function that extracts all the ones that start with the appropriate letter.

It's at this point that we can insert the second of our two stopping conditions. If there aren't any possible n-grams to pick from, then we can't continue.

// If there aren't any choices left, we can't exactly keep adding to the new string any more :-(
if(nextNgrams.Count == 0)
    break;

With our list of possible n-grams, we're now in a position to pick one at random to add to the word. It's LINQ to the rescue again:

// Pick a random n-gram from the list
string nextNgram = nextNgrams.ElementAt(rand.Next(0, nextNgrams.Count));

This is another simple one - it just extract the element in the list at a random location in the list. In hindsight I could have used the array operator syntax here ([]), but it doesn't really matter :-)

Now that we've picked the next n-gram, we can add it to the word we're building:

// Add the last character from the n-gram to the string we're building
result += nextNgram[nextNgram.Length - 1];

and that's the markov chain practically done! Oh, we mustn't forget to update the lastNgram variable (I forgot this when building it :P):

lastNgram = nextNgram;

And that wraps up our unweighted markov chain. Here's the whole class in full:

using System;
using System.Collections.Generic;
using System.Linq;

namespace SBRL.Algorithms.MarkovGrams
{
    /// <summary>
    /// An unweighted character-based markov chain.
    /// </summary>
    public class UnweightedMarkovChain
    {
        /// <summary>
        /// The random number generator
        /// </summary>
        Random rand = new Random();

        /// <summary>
        /// The ngrams that this markov chain currently contains.
        /// </summary>
        List<string> ngrams;

        /// <summary>
        /// Creates a new character-based markov chain.
        /// </summary>
        /// <param name="inNgrams">The ngrams to populate the new markov chain with.</param>
        public UnweightedMarkovChain(IEnumerable<string> inNgrams)
        {
            ngrams = new List<string>(inNgrams);
        }

        /// <summary>
        /// Returns a random ngram that's currently loaded into this UnweightedMarkovChain.
        /// </summary>
        /// <returns>A random ngram from this UnweightMarkovChain's cache of ngrams.</returns>
        public string RandomNgram()
        {
            return ngrams[rand.Next(0, ngrams.Count)];
        }

        /// <summary>
        /// Generates a new random string from the currently stored ngrams.
        /// </summary>
        /// <param name="length">
        /// The length of ngram to generate.
        /// Note that this is a target, not a fixed value - e.g. passing 2 when the n-gram order is 3 will
        /// result in a string of length 3. Also, depending on the current ngrams this markov chain contains,
        /// it may end up being cut short. 
        /// </param>
        /// <returns>A new random string.</returns>
        public string Generate(int length)
        {
            string result = RandomNgram();
            string lastNgram = result;
            while(result.Length < length)
            {
                // The substring that the next ngram in the chain needs to start with
                string nextStartsWith = lastNgram.Substring(1);
                // Get a list of possible n-grams we could choose from next
                List<string> nextNgrams = ngrams.FindAll(gram => gram.StartsWith(nextStartsWith));
                // If there aren't any choices left, we can't exactly keep adding to the new string any more :-(
                if(nextNgrams.Count == 0)
                    break;
                // Pick a random n-gram from the list
                string nextNgram = nextNgrams.ElementAt(rand.Next(0, nextNgrams.Count));
                // Add the last character from the n-gram to the string we're building
                result += nextNgram[nextNgram.Length - 1];
                lastNgram = nextNgram;
            }

            return result;
        }
    }
}

I've released the full code for my markov generator (with a complete command line interface!) on my personal git server. The repository can be found here: sbrl/MarkovGrams. To finish this post off, I'll leave you with a few more words that I've generated using it :D

1 2 3 4 5
mecuc uipes jeraq acrin nnvit
blerbopt drsacoqu yphortag roirrcai elurucon
pnsemophiqub omuayplisshi udaisponctec mocaltepraua rcyptheticys
eoigemmmpntartrc rattismemaxthotr hoaxtancurextudu rrgtryseumaqutrc hrpiniglucurutaj

Markov Chains Part 1: N-Grams

After wanting to create a markov chain to generate random words for ages, I've recently had the time to actually write one :D Since I had a lot of fun writing it, I thought I'd share it here.

A markov chain, in simple terms, is an algorithm to take a bunch of input, and generate a virtually unlimited amount of output in the style of the input. If I put my 166 strong wordlist of sciencey words through a markov chain, I get a bunch of words like this:

a b c d
raccession bstrolaneu aticl lonicretiv
mpliquadri tagnetecta subinverti catorp
ssignatten attrotemic surspertiv tecommultr
ndui coiseceivi horinversp icreflerat
landargeog eograuxila omplecessu ginverceng
evertionde chartianua spliqui ydritangt
grajecubst ngintagorp ombintrepe mbithretec
trounicabl ombitagnai ccensorbit holialinai
cessurspec dui mperaneuma yptintivid
ectru llatividet imaccellat siondl
tru coo treptinver gnatiartia
nictrivide pneumagori entansplan uatellonic

Obviously, some of the above aren't particularly readable, but the majority are ok (I could do with a longer input wordlist, I think).

To create our very own markov chain that can output words like the above, we need 2 parts: An n-gram generator, to take in the word list and convert it into a form that we can feed into the second part - the markov chain itself. In this post, I'm going to just look at the n-gram generator - I'll cover the markov chain itself in the second part of this mini-series.

An n-gram is best explained by example. Take the word refractive, for example. Let's split it up into chunks:

ref
efr
fra
rac
act
cti
tiv
ive

See what I've done? I've taken the original word and split it into chunks of 3, but I've only moved along the word by 1 character at a time, so some characters have been duplicated. These are n-grams of order 3. The order, in the case of an n-gram, is the number of characters per chunk. We could use any order we like:

refra
efrac
fract
racti
activ
ctive

The order of the above is 5. If you're wondering how this could possibly be useful - don't worry: All will be explained in due time :-) For now though, writing all these n-grams out manually is rather annoying and tedious. Let's write some code!

Generating n-grams from a single word like we did above is actually pretty simple. Here's what I came up with:

/// <summary>
/// Generates a unique list of n-grams from the given string.
/// </summary>
/// <param name="str">The string to n-gram-ise.</param>
/// <param name="order">The order of n-gram to generate.</param>
/// <returns>A unique list of n-grams found in the specified string.</returns>
public static IEnumerable<string> GenerateFlat(string str, int order)
{
    List<string> results = new List<string>();
    for(int i = 0; i < str.Length - order; i++)
    {
        results.Add(str.Substring(i, order));
    }
    return results.Distinct();
}

I'm using C♯ here, but you can use whatever language you like. Basically, I enter a loop and crawl along the word, adding the n-grams I find to a list, which I then de-duplicate and return.

Generating n-grams for just one word is nice, but we need to process a whole bunch of words. Thankfully, that's easy to automate too with a sneaky overload:

/// <summary>
/// Generates a unique list of n-grams that the given list of words.
/// </summary>
/// <param name="words">The words to turn into n-grams.</param>
/// <param name="order">The order of n-gram to generate..</param>
/// <returns>A unique list of n-grams found in the given list of words.</returns>
public static IEnumerable<string> GenerateFlat(IEnumerable<string> words, int order)
{
    List<string> results = new List<string>();
    foreach(string word in words)
    {
        results.AddRange(GenerateFlat(word, order));
    }
    return results.Distinct();
}

All the above does is take a list of words, run them all through the n-gram generation method we wrote above and return the de-duplicated results. Here's a few that it generated from the same wordlist I used above in order 3:

1 2 3 4 5 6 7 8
hor sig ign gna str tre ren ngt
sol old lde oli sor sou oun tel
lla sub ubs bst tem emp mpe atu
tur err ert thr hre dim ime men
nsi ack cki kin raj aje jec tor
ans nsa sat nsf sfe nsl sla slu
luc uce nsm smi nsp are nsu tan

Next time, I'll show you my (unweighted) markov chain I've written that uses the n-grams generated by these methods.

Sources and Further Reading

PixelBot Part 2: Devices need protocols, apparently

A selection of the technologies I'm using to put together my PixelHub server.

So there I was. I'd just got home, turned on my laptop, opened the Arduino IDE and Monodevelop, and then.... nothing. I knew I wanted my PixelBot to talk to the PixelHub I'd started writing, but I was confused as to how I could make it happen.

In this kind of situation, I realised that although I knew what I wanted them to do, I hadn't figured out the how. As it happens, when you're trying to get one (or more, in this case) different devices to talk to each other, there's something rather useful that helps them all to speak the same language: a protocol.

A protocol is a specification that defines the language that different devices use to talk to each other and exchange messages. Defining one before you start writing a networked program is probably a good idea - I find particularly helpful to write a specification for the protocol that the program(s) I'm writing, especially if their function(s) is/are complicated.

To this end, I've ended up spending a considerable amount of time drawing up the PixelHub Protocol - a specification document that defines how my PixelHub server is going to talk to a swarm of PixelBots. It might seem strange at first, but I decided on a (mostly) binary protocol.

Upon closer inspection though, (I hope) it makes a lot of sense. Since the Arduino is programmed using C++ as it has a limited amount of memory, it doesn't have any of the standard string manipulation function that you're used to in C♯. Since C++ is undoubtedly the harder of the 2 to write, I decided to make it easier to write the C++ rather than the C&sharp. Messages on the Arduino side are come in as a byte[] array, so (in theory) it should be easy to pick out certain known parts of the array and cast them into various different fundamental types.

With the specification written, the next step in my PixelBot journey is to actually implement it, which I'll be posting about in the next entry in this series!

Easier TCP Networking in C♯

I see all sorts of C♯ networking tutorials out there telling you that you have to use byte arrays and buffers and all sorts of other complicated things if you ever want to talk to another machine over the network. Frankly, it's all rather confusing.

Thankfully though, it doesn't have to stay this way. I've learnt a different way of doing TCP networking in C♯ at University (thanks Brian!), and I realised the other day I've never actually written a blog post about it (that I can remember, anyway!). If you know how to read and write files and understand some basic networking concepts (IP addresses, ports, what TCP and UDP are, etc.), you'll have no problems understanding this.

Server

The easiest way to explain it is to demonstrate. Let's build a quick server / client program where the server says hello to the client. Here's the server code:


// Server.cs
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;
using System.IO;

public class Server
{
    public readonly int Port;
    public Server(int inPort)
    {
        Port = inPort; string s;
    }

    public async Task Start()
    {
        TcpListener server = new TcpListener(IPAddress.Any, Port);
        server.Start();
        while (true) {
            TcpClient nextClient = await server.AcceptTcpClientAsync();
            StreamReader incoming = new StreamReader(nextClient.GetStream());
            StreamWriter outgoing = new StreamWriter(nextClient.GetStream()) { AutoFlush = true };

            string name = (await incoming.ReadLineAsync()).Trim();
            await outgoing.WriteLineAsync($"Hello, {name}!");

            Console.WriteLine("Said hello to {0}", name);

            nextClient.Close();
        }
    }
}

// Use it like this in your Main() method:
Server server = new Server(6666);
server.Start().Wait();

Technically speaking, that asynchronous code ought to be running in a separate thread - I've omitted it to make it slightly simpler :-) Let's break this down. The important bit is in the Start() method - the rest is just sugar around it to make it run if you want to copy and paste it. First, we create & start a TcpListener:

TcpListener server = new TcpListener(IPAddress.Any, Port);
server.Start();

Once done, we enter a loop, and wait for the next client:

TcpClient nextClient = await server.AcceptTcpClientAsync();

Now that we have a client to talk to, we attach a StreamReader and a StreamWriter with a special option set on it to allow us to talk to the remote client with ease. The option set on the StreamWriter is AutoFlush, and it basically tells it to flush it's internal buffer every time we write to it - that way things we write to it always hit the TcpClient underneath. Depending on your setup the TcpClient does some internal buffering & optimisations anyway, so we don't need the second layer of buffering here:

StreamReader incoming = new StreamReader(nextClient.GetStream());
StreamWriter outgoing = new StreamWriter(nextClient.GetStream()) { AutoFlush = true };

With that, the rest should be fairly simple to understand:

string name = (await incoming.ReadLineAsync()).Trim();
await outgoing.WriteLineAsync($"Hello, {name}!");

Console.WriteLine("Said hello to {0}", name);

nextClient.Close();

First, we grab the first line that the client sends us, and trim of any whitespace that's lurking around. Then, we send back a friendly hello message to client, before logging what we've done to the console and closing the connection.

Client

Now that you've seen the server code, the client code should be fairly self explanatory. The important lines are highlighted:


using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;
using System.IO;

public class Client
{
    public readonly string Hostname;
    public readonly int Port;

    public Client(string inHostname, int inPort)
    {
        Hostname = inHostname; uint a;
        Port = inPort;
    }

    public async Task GetHello(string name) {
        TcpClient client = new TcpClient();
        client.Connect(Hostname, Port);

        StreamReader incoming = new StreamReader(client.GetStream());
        StreamWriter outgoing = new StreamWriter(client.GetStream()) { AutoFlush = true };

        await outgoing.WriteLineAsync(name);

        return (await incoming.ReadLineAsync()).Trim();
    }
}

// Use it like this in your Main() method:
Client client = new Client("localhost", 6666);
Console.Write("Enter your name: ");
Console.WriteLine("The server said: {0}", client.GetHello(Console.ReadLine()).Result);

First, we create a new client and connect it to the server. Next, we connect the StreamReader and StreamWriter instances to the TcpClient, and then we send the name to the server. Finally, we read the response the server sent us and return it. Easy!

Here's some example outputs:

Client:


./NetworkingDemo-Server.exe
Said hello to Bill

Server:


./NetworkingDemo-Client.exe
Enter your name: Bill
The server said: Hello, Bill!

The above code should work on Mac, Windows, and Linux. Granted, it's not the most efficient way of doing things, but it should be fine for most general purposes. Personally, I think the trade-off between performance and readability/ease of understanding of code is totally worth it.

If you prefer, I've got an archive of the above code I wrote for this blog post - complete with binaries. You can find it here: NetworkingDemo.7z.

Understanding your compiler: C#

Sorry for the (very) late post! I fell rather ill on the day before I was going to write the next post, and haven't been well enough to write it until now! Hopefully more cool posts will be on their way soon :-)

A nice picture of Durham Cathedral. (Above: A nice picture of Durham Cathedral. Taken by @euruicimages.)

How many times have you just finished adding a new feature to your latest and greatest program, and hit the "Run" button in Visual Studio or Monodevelop? Have you ever wondered what happens under the hood? Have you ever encountered a strange build error, and just googled the error message in the hope of finding a solution?

In this post, I'm going take you on a journey to understand the build process for a typical C♯ program that happens every time you press Ctrl + Shift + B (or F8 in Monodevelop). I also hope to show you why I think it's important to understand what happens behind the scenes.

The hierarchy of the c♯ build system.

To start any journey, we need a map. I've created a simple diagram for our journey. We'll dive into the world of the project file further down. Lastly, we'll end our journey at the individual source code files that actually make up your project.

Before that though, we need to make a quick stop in your sln file. The sln file (or Solution File) is the place where everything starts. Normally, you only get one solution file for each project you create. It's the file that keeps track of the name of your project, its id, and where all the projects are that your solution references (they're usually in appropriately named subfolders). Note that it doesn't keep references to the other projects that you reference (that's done at the project file level).

Solution files sometimes automatically detected too (xbuild does this for sure). If you're in a command line or terminal, you can build an entire solution with just one command, without having to open Visual Studio or Monodevelop:

cd /path/to/awesome/project
# Windows
msbuild AwesomeSpaceProject.sln
# Linux / Mono
xbuild

With that out of the way, we can talk about project files. Project files define which source code files should be built, and how. You can even add custom triggers that run at any time before, during, or after the build process here (this is how I embedded commit hashes in a C♯ binary)!

Though the syntax is quite simple, much of it is, sadly, un or underdocumented. Thankfully most of it is quite intuitive, and a little experimentation goes a long way:

<Project>
    ....
    <ItemGroup>
        <Reference Include="System" />
        <Reference Include="System.Drawing" />
        ....
    </ItemGroup>
    <ItemGroup>
        <Compile Include="Program.cs" />
        <Compile Include="GameObjects/Spaceship.cs" />
        <Compile Include="GameObjects/Enemy.cs" />
        ....
    </ItemGroup>
    <ItemGroup>
        <EmbeddedResource Include="Resources\Spritesheet.png" />
        <EmbeddedResource Include="Resources\CoolSpace.ttf" />
    </ItemGroup>
    ....
</Project>

The above is a (simplified) example if a project file. It might be called something like CoolSpace.csproj. It references a few core assemblies, specifies a few C♯ files for compiling, and embeds a resource or two.

Of course, these files are generated for you automatically, but it's always helpful to know how to works so not only can you fix it more easily when it goes wrong, but you can also extend it to do extra things that you can't through the user interface, like use wildcards (careful! Too many wildcards can slow down your build) to specify a range of files that you wan to embed without having to go around them all one by one.

Next, let's talk about references. References allow you to pull in code from elsewhere, like a core system library, another project (that's how you link 2 projects in (or even out of!) a solution together), a library of sorts (like Nuget), or another random DLL or EXE (yes, you can reference other executable files) file lying around. With references, yo can spread your code around loads of files and pull in libraries from all over the galaxy and have the build process follow all of your references around and tie everything up into a nice neat package for you.

The next piece of the puzzle is the builder. The software that actually builds your code. On Windows with Visual Studio, it's called msbuild, and is the original build tool, created by Microsoft, that sets the standard. On Linux, there's a different (but very similar) tool called xbuild. xbuild implements the standard that msbuild sets, allowing solutions written on Windows with Visual Studio to be compiled 9 times out of 10 on Linux (and probably Mac too, though I don't have one to check - let me know in the comments if you have one!) without any changes.

The final stone in the bridge, so to speak, is your code itself. The builder (whether it be xbuild or msbuild) preprocesses each included file into an object file, which are then all linked together into the final executable binary, which itself contains common intermediate language (or CIL) which is executed by your operating system (or mono on Linux / Mac). While CIL is a different topic for a separate time, it's still important, so I'm mentioning it here.

If you've made it this far, congratulations! I hope that it made sense. If not (or even if it did!), please leave a comment down below and I'll try to help you out :-)

Developing and Running C# Programs on Linux

Recently I was asked about running C# on Linux, and I remembered that I haven't actually written a blog post on it! This is that blog post I never wrote: A beginner's guide on how to develop, compile, and run C# programs on Linux. Here I assume a debian-based system (specifically Ubuntu 16.04), but it can be just as easily adapted to work with other flavours.

To start off, you'll need to add the mono repository to your system's apt repository list. Taken from the official mono website, here are the commands to do that:

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 3FA7E0328081BFF6A14DA29AA6A19B38D3D831EF
echo "deb http://download.mono-project.com/repo/debian wheezy main" | sudo tee /etc/apt/sources.list.d/mono-xamarin.list
sudo apt update

The more experienced may notice that although I'm installing on a, Ubuntu 16.04 system, I'm still specifying wheezy when installing mono. This shouldn't make too much of a difference - currently mono only supports wheezy upon install.

Next, it's time to actually install mono itself. This is easy:

sudo apt install mono-mcs

The above ought to install the mono runtime and compiler. If you experience issues down the line, simply install the mono-complete package instead.With this installed, you should now be able to launch compiled programs written C♯ by double-clicking them in your file manager - without any modifications. C♯ programs compiled on Windows can be run on Linux, and vice versa (this is because C♯ actually compiled into something called the Common Intermediate Language, or CIL).

If you're on a server, then you can run your programs by prefixing the executable name with mono, like this:

mono program-of-awesomeness.exe

If your program crashes, however, the output is not very helpful at all. Thankfully though there's a way to remedy that. First, make sure your program is compiled in debug mode, and then add the --debug flag when running your program:

mono --debug another-awesome-program.exe

This is all very well, but what about compiling? That's relatively easy as well. mcs is the linux version of csc on Windows, and behaves almost identically with some minor syntactical changes (read up about it by typing mcs --help or man mcs). For Visual Studio solutions, there's xbuild. Xbuild is a new-ish build too for Linux that is capable of compiling almost any Visual Studio solution file without any modifications (though there are some undocumented difficulties that you might run into as an advanced user).

To use it, first install the mono-xbuild package (sudo apt install mono-xbuild), and then, in a terminal, cd into the directory that contains the solution you want to compile, and then type xbuild and hit enter. That's it!

All this work in the terminal is cool for running C♯ programs on GUI-less boxes and servers, but it's no way to develop a larger application. Thankfully, there's a solution to that too! Monodevelop is the best C♯ IDE out there at the moment - it's like Visual Studio for LInux. It's easy to install, too - simply install the monodevelop package (sudo apt install monodevelop).

Monodevelop.

(Above: Monodevelop running on my Linux laptop. The project open here is my sprite packing tool.)

The package in the official repositories should be good enough for general use (though it's probably out of date). For the latest version with all the latest features though, you'll have to compile it from source.

Sadly this is not a trivial process. To do it you need to be comfortable with the terminal and know your way reasonably well around a Linux system. If you still want to go ahead anyway, start by downloading the latest release and follow the instructions. You'll probably find it keeps complaining about things not existing - usually a quick apt search {thing} reveals which package you need to install in order to get it to work. If you have trouble, post a comment below and I'll try to help you out.

Even without compiling monodevelop from source, it's still a pretty good IDE. It lets you create Visual-Studio-compatible solution files, and compile your code on the fly at the touch of a button.

That just about covers the basics of running C♯ on Linux. If there's anything I've missed or you'd like to ask about, post a comment below!

Picking the right interface for multicast communications

A network cabinet

At the recent hardware meetup, I was faced with an interesting problem: I was trying to communicate with my Wemos over multicast UDP in order to get it to automatically discover my PixelHub Server, but the multicast pings were going out over the wrong interface - I had both an ethernet cable plugged in and a WiFi hotspot running on my integrated wireless card.

The solution to this is not as simple you might think - you have to not only pick the right interface, but also the right version of the IP protocol. You also have to have some way to picking the correct interface in the first place.

Let's say you have a big beefy PC with a wireless card and 2 ethernet ports that are (for some magical reason) all in use at the same time, and you want to communicate with another device over your wireless card and not either of your ethernet ports.

I developed this on my linux laptop, but it should work just fine on other OSes.

To start, it's probably a good idea to list all of our network interfaces:

using System.Net.NetworkInformation;

// ...

NetworkInterface[] nics = NetworkInterface.GetAllNetworkInterfaces();
foreach (NetworkInterface nic in nics)
{
    Console.WriteLine("Id: {0} - Description: {1}", nic.Id, nic.Description);
}

This (on my machine at least!) outputs something like this:

eth0 - eth0
lo - lo
wlan0 - wlan0

Your machine will probably output something different. Next, since you can't normally address this list of network interfaces directly by name, we need to write a method to do it for us:

public static NetworkInterface GetNetworkIndexByName4(string targetInterfaceName)
{
    NetworkInterface[] nics = NetworkInterface.GetAllNetworkInterfaces();
    foreach (NetworkInterface nic in nics)
    {
        if (nic.Id == targetInterfaceName)
            return nic;
    }
    throw new Exception($"Error: Can't find network interface with the name {targetInterfaceName}.");
}

Pretty simple, right? We're not out the woods yet though - next we need to tell our UdpClient to talk on a specific network interface. Speaking of which, let's set up that UdpClient so that we can use it to do stuff with multicast:

using System.Net;

// ...

UdpClient client = new UdpClient(5050);
client.JoinMulticastGroup(IPAddress.Parse("239.62.148.30"));

With that out of the way, we can now deal with telling the UcpClient which network interface it should be talking on. This is actually quite tricky, since the UdpClient doesn't take a NetworkInterface directly. Let's define another helper method:

public static int GetIPv4Index(this NetworkInterface nic)
{
    IPInterfaceProperties ipProps = nic.GetIPProperties();
    IPv4InterfaceProperties ip4Props = ipProps.GetIPv4Properties();
    return ip4Props.Index;
}

The above extension method gets the index of the IPv4 interface of the network interface. Since at the moment we are in the middle of a (frustratingly slow) transition from IPv4 and IPv6, each network interface must have both an IPv4 interface, for talking to other IPv4 hosts, and an IPv6 interface for talking to IPv6 hosts. In this example I'm using IPv4, since the Wemos I want to talk to doesn't support IPv6 :-(

Now that we have a way to get the index of a network interface, we need to translate it into something that the UdpClient understands:

int interfaceIndex = (int)IPAddress.HostToNetworkOrder(NetTools.GetNetworkIndexByName4(NetworkInterfaceName));

That complicated! Thankfully, we don't need to pick it apart completely - it just works :-)

Now that we have the interface index in the right format, all we have to do is tell the UdpClient about it. Again, this is also slightly overcomplicated:

beacon.Client.SetSocketOption(
    SocketOptionLevel.IP,
    SocketOptionName.MulticastInterface,
    interfaceIndex
);

Make sure that you put this call before you join the multicast group. With that done, your UdpClient should finally be talking on the right interface!

Whew! That was quite the rabbit hole (I sent my regards to the rabbit :P). If you have any issues with getting it to work, I'm happy to help - just post a comment down below.

A 24 port patch panel.

Sources

  • Stackoverflow answers: 1, 2
  • Openclipart images: 1, 2

Chaikin Curves in C#: An alternative curve generation algorithm

A chaikin curve, increasing in order each frame.

A little while ago I was curious to know if there were any other ways to generate a smooth curve other than with a Bezier Curve. Turns out the answer is yes, and it comes in the form of a Chaikin Curve, which was invented in 1974 by a lecturer in America by the name of George Chaikin. A few days (and a lot of debugging) later, I found myself with a Chaikin curve generator written in pure C♯ (I seem to have this fascination with implementing algorithms :P), so I thought I'd share it here.

Before I do though, I should briefly explain how Chaikin's algorithm actually works. It's actually quite simple. If you have a list of control points, and you were to draw a line through them all, you'd get this:

Some lines before the chaikin algorithm has been run.

The magic of the algorithm happens when you interpolate between your control points. If you build a new list of points that contains points that are ¼ and ¾ along each of the lines between the current control points and draw a line though them instead, then the line suddenly gets a lot smoother. This process can be repeated multiple times to further refine the curve, as is evidenced in the animation above.

This page is very helpful in understanding the algorithm if you're having trouble getting your head around it.

My implementation makes use of the PointF class in the System.Drawing namespace, and also has the ability to generate an SVG version of any generated curve, so that it can be inspected and debugged.

You can find my implementation here: Chaikin Generator - comments and improvements are welcome!

Instructions on how to use to use it are available in the README, and the class is fully documented with Intellisense comments, so it should feel fairly intuitive to use. I've tried to use patterns that are present in the rest of the .NET framework too, so you can probably even guess how to use it correctly.

Additionally, I 'm going to try put it up as a Nuget package, but currently I can't get Nuget to pack it currently on linux (when I do, you can expect a tutorial on here!)

The lost post: Embedding commit hashes in C♯ binaries

I seem to remember that I've started to write this post no less than 3 times, and I've managed to lose the source each and every time. If you're reading this it means that I've managed to complete it this time :D

Imagine you've got a confused client on the phone, asking why the newest feature X in your program doesn't work. You ask them whether they have the latest version of your program installed... and they say that they don't know. Version numbers and changelogs to the rescue! Except.... that last release was rather rushed and you forgot to finish updating the changelog. This is just one scenario in which embedding the latest commit hash into your program is useful.

You could embed the short hash (the first 7 characters) into the version string, for example v3.6.1-375ae31. Then you could compare it against the revision history to see exactly what your codebase looked like when your client's version was built.

For a while now I've wanted to do just this, and now I've finally figured out how to do it cross-platform. This post documents how I went about doing it, and how you can do it too.

The basic principle of the idea is to run a command that will output the latest commit hash to a file before the build starts, and then embed that file into the resulting binary. To achieve this, we need to go about it in 2 parts. Firstly, we need to fiddle with the project file to add an optional pre-build event. Open the project file (MyProject.csproj) in your favourite text editor (but preferably not your favourite IDE such as Visual Studio or MonoDevelop) and add this to the bottom, just before the closing </Project>:

<Target Name="BeforeBuild" BeforeTargets="Build">
    <Exec Command="git rev-parse HEAD &gt;git-hash.txt" WorkingDirectory="$(ProjectDir)" IgnoreExitCode="true" />
</Target>

If you don't use Git, then change git rev-parse HEAD &gt;git-hash.txt in the above to the equivalent command for your version control system. For SVN, this stackoverflow question looks like it'll do the job for Windows - for Linux you should go here.

Once done, the next step is to add the generated file as an embedded resource. We can't do this with the GUI easily here since the file in question hasn't been generated yet! Add the following to the bottom of the csproj file, again just before the </Project>:

<ItemGroup>
    <EmbeddedResource Include="git-hash.txt" />
</ItemGroup>

Remember to change the git-hash.txt to whatever you changed it to above.

Next, save it and reopen the solution in your IDE. The final step is to actually utilise the commit hash (or revision number in SVN) in your program. Since it's just an embedded file, you can simply find it with a bit of reflection, and read it in with a StreamReader. I've written a good tutorial on how to do that over on my Embedding files in C♯ binaries post.

Make sure that your program is prepared to handle junk instead of a commit hash - you can't predict the contents of the embedded file if Git (or SVN) isn't installed on the machine used to build your project. If you want to require that the commit hash (or revision number) is actually present, just remove the IgnoreExitCode="true" from the first snippet above.

Sources

Art by Mythdael