Starbeamrainbowlabs

About

Hello!

I am a computer science student who is in their third year at Hull University. I started out teaching myself about various web technologies, and then I managed to get a place at University, where I am now. I've done a year in industry too, which I found to be particularly helpful in learning about the workplace and the world.

I currently know C# + Monogame / XNA (+ WPF), HTML5, CSS3, Javascript (ES6 + Node.js), PHP, C / C++ (mainly for Arduino), and a bit of Python. Oh yeah, and I can use XSLT too.

I love to experiment and learn about new things on a regular basis. You can find some of the things that I've done in the labs and code sections of this website, or on GitHub. My current projects are Pepperminty Wiki, an entire wiki engine in a single file (the source code is spread across multiple files - don't worry!), and Nibriboard (a multi-user real-time infinite whiteboard), although the latter is in its very early stages.

I can also be found in a number of other different places around the web. I've compiled a list of the places that I can remember below.

I can be contacted at the email address webmaster at starbeamrainbowlabs dot com. Suggestions, bug reports and constructive criticism are always welcome.

Blog



Latest Post

Easy AI with Microsoft.Recognizers.Text

I recently discovered that there's an XMPP client library (NuGet) for .NET that I overlooked a few months ago, and so I promptly investigated the building of a bot!

The actual bot itself needs some polishing before I post about it here, but in writing said bot I stumbled across a perfectly brilliant library - released by Microsoft of all companies - that can be used to automatically extract common data-types from a natural-language sentence.

While said library is the underpinnings of the Azure Bot Framework, it's actually free and open-source. To that end, I decided to experiment with it - and ended up writing this blog post.

Data types include (but are not limited to) dates and times (and ranges thereof), numbers, percentages, monetary amounts, email addresses, phone numbers, simple binary choices, and more!

While it also lands you with a terrific number of DLL dependencies in your build output folder, the result is totally worth it! How about pulling a DateTime from this:

in 5 minutes

or this:

the first Monday of January

or even this:

next Monday at half past six

Pretty cool, right? You can even pull multiple things out of the same sentence. For example, from the following:

The host 1.2.3.4 has been down 3 times over the last month - the last of which was from 5pm and lasted 30 minutes

It can extract an IP address (1.2.3.4), a number (3), and a few dates and times (last month, 5pm, 30 minutes).

I've written a test program that shows it in action. Here's a demo of it working:

(Can't see the asciicast above? View it on asciinema.org)

The source code is, of course, available on my personal Git server: Demos/TextRecogniserDemo

If you can't check out the repo, here's the basic gist. First, install the Microsoft.Recognizers.Text package(s) for the types of data that you'd like to recognise. Then, to recognise a date or time, do this:

List<ModelResult> result = DateTimeRecognizer.RecognizeDateTime(nextLine, Culture.English);
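The same pattern should extend to the multi-value example sentence from earlier. Here's a rough sketch, assuming the Microsoft.Recognizers.Text.Number, Microsoft.Recognizers.Text.DateTime and Microsoft.Recognizers.Text.Sequence NuGet packages are all installed. The class name MultiRecogniserSketch is made up for illustration, and this isn't the code from the demo repo:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Recognizers.Text;
using Microsoft.Recognizers.Text.DateTime;
using Microsoft.Recognizers.Text.Number;
using Microsoft.Recognizers.Text.Sequence;

// Hypothetical demo class - not part of the library or the demo repo.
public static class MultiRecogniserSketch
{
    public static void Main()
    {
        string source = "The host 1.2.3.4 has been down 3 times over the last month"
            + " - the last of which was from 5pm and lasted 30 minutes";

        // Each recogniser scans the same sentence independently,
        // so we simply pool whatever each one finds.
        List<ModelResult> results = new List<ModelResult>();
        results.AddRange(SequenceRecognizer.RecognizeIpAddress(source, Culture.English));
        results.AddRange(NumberRecognizer.RecognizeNumber(source, Culture.English));
        results.AddRange(DateTimeRecognizer.RecognizeDateTime(source, Culture.English));

        // Text is the matched substring, TypeName says what kind of thing
        // was matched, and Start / End are its character offsets.
        foreach (ModelResult result in results)
            Console.WriteLine($"{result.TypeName}: \"{result.Text}\" (chars {result.Start}-{result.End})");
    }
}
```

Since the recognisers don't know about each other, the number recogniser may well also match digits that belong to the IP address or the times, so expect some overlapping results that need filtering by their offsets.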

The awkward bit is unwinding the ModelResult to get at the actual data. The matched text is in ModelResult.Text, but the parsed data is stored in the ModelResult.Resolution property, and that's a SortedDictionary<string, object>. The interesting key inside that is values, but depending on the data type you're recognising, what it holds can be a list too! The best way I've found to decipher the data types is to print the value of ModelResult.Resolution as a string to the console:

Console.WriteLine(result[0].Resolution.ToString());

The .NET runtime will helpfully convert this into something like this:

System.Collections.Generic.SortedDictionary`2[System.String,System.Object]

Very helpful. Then we can continue to drill down:

Console.WriteLine(result[0].Resolution["values"]);

This produces this:

System.Collections.Generic.List`1[System.Collections.Generic.Dictionary`2[System.String,System.String]]

Quite a mouthful, right? By cross-referencing this against a JSON dump of the same object (thanks, Newtonsoft.Json!), we can figure out how to drill the rest of the way. I ended up writing myself a pair of little utility methods for dates and times:

public static DateTime RecogniseDateTime(string source, out string rawString)
{
    List<ModelResult> aiResults = DateTimeRecognizer.RecognizeDateTime(source, Culture.English);
    if (aiResults.Count == 0)
        throw new Exception("Error: Couldn't recognise any dates or times in that source string.");

    /* Example contents of the below dictionary:
        [0]: {[timex, 2018-11-11T06:15]}
        [1]: {[type, datetime]}
        [2]: {[value, 2018-11-11 06:15:00]}
    */

    rawString = aiResults[0].Text;
    Dictionary<string, string> aiResult = unwindResult(aiResults[0]);
    string type = aiResult["type"];
    if (!(new string[] { "datetime", "date", "time", "datetimerange", "daterange", "timerange" }).Contains(type))
        throw new Exception($"Error: An invalid type of {type} was encountered (a date or time type was expected).");

    string result = Regex.IsMatch(type, @"range$") ? aiResult["start"] : aiResult["value"];
    return DateTime.Parse(result);
}

private static Dictionary<string, string> unwindResult(ModelResult modelResult)
{
    return (modelResult.Resolution["values"] as List<Dictionary<string, string>>)[0];
}

Of course, it depends on your use-case as to precisely how you unwind it, but the above should be a good starting point.
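If you'd rather not throw exceptions when the resolution isn't the shape you expect, a more defensive variant might look something like the sketch below. It uses only the base class library and a hand-built dictionary that mimics the example contents shown in the comment above, so ResolutionSketch, TryUnwind and TryGetDateTime are all made-up names for illustration, not part of Microsoft.Recognizers.Text:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper class - the names here are illustrative only.
public static class ResolutionSketch
{
    // Pulls the first entry out of resolution["values"], returning false
    // (rather than throwing) if the key is missing or has a different shape.
    public static bool TryUnwind(IDictionary<string, object> resolution, out Dictionary<string, string> first)
    {
        first = null;
        if (resolution == null || !resolution.TryGetValue("values", out object raw))
            return false;
        if (!(raw is List<Dictionary<string, string>> values) || values.Count == 0)
            return false;
        first = values[0];
        return true;
    }

    // Ranges keep their beginning under "start"; everything else uses "value".
    public static bool TryGetDateTime(IDictionary<string, object> resolution, out DateTime result)
    {
        result = default(DateTime);
        if (!TryUnwind(resolution, out Dictionary<string, string> entry))
            return false;
        if (!entry.TryGetValue("type", out string type))
            return false;
        string key = type.EndsWith("range", StringComparison.Ordinal) ? "start" : "value";
        return entry.TryGetValue(key, out string text)
            && DateTime.TryParse(text, CultureInfo.InvariantCulture, DateTimeStyles.None, out result);
    }

    public static void Main()
    {
        // Hand-built stand-in for ModelResult.Resolution, mirroring the
        // example contents shown earlier (timex / type / value).
        var sample = new SortedDictionary<string, object>
        {
            ["values"] = new List<Dictionary<string, string>>
            {
                new Dictionary<string, string>
                {
                    ["timex"] = "2018-11-11T06:15",
                    ["type"] = "datetime",
                    ["value"] = "2018-11-11 06:15:00"
                }
            }
        };

        if (TryGetDateTime(sample, out DateTime when))
            Console.WriteLine(when.ToString("s")); // prints 2018-11-11T06:15:00
        else
            Console.WriteLine("Couldn't find a date or time in there.");
    }
}
```

Because ModelResult.Resolution is a SortedDictionary<string, object>, it should be possible to pass it straight in as the IDictionary<string, object> argument.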

Once I've polished the bot a bit, I might post about it on here.

Found this interesting? Run into an issue? Got a neat use for it? Comment below!



Labs

Code

Tools

I find useful tools on the internet occasionally. I will list them here.

Art by Mythdael