Stardust | Starbeamrainbowlabs

On the value of the open source community

Open source is a wonderful thing. With over 200 million repositories on GitHub alone and many many more of SourceHut, GitLab, and thousands of personal git server instances (like mine!) across the globe, there's no question that open source powers the world - look no further than NASA's Curiosity rover!

On a more personal level, open source means a lot to me too, and I wanted to talk a bit about that here. I think the oldest open source project I both started and am continuing to work on and improve would have to be Pepperminty Wiki. At 1.8K commits and the first commit way back in November 2014 (7 years 7 months ago, wow), I've probably poured thousands of hours into it and many other projects over the years.

While at the time it was just because it was something cool I wanted to work on, since then it's come to mean far more to me, and has helped me to develop very useful skills without me even realising it, and I can thoroughly recommend if you have some time to spare and you're beginning a journey into programming / computer science, it's definitely worth your time.

Documenting things. I really can't stress this one enough. Through working on open source projects, I've learnt the power of good documentation. You can write the best program in the world, but if it isn't documented well then nobody will know how to use it! The best test for documentation is when someone comes along and tries your program out without you present, as then in following your documentation they test it ensuring it's updated and accurate.

Writing good documentation is both an iterative process and takes practice but open source is a great place to do so. You don't even have to have written the program yourself - you can even help out another project and improve theirs. Chances are you've read the documentation for a free and open source program already - and if you've ever had a thought of how you'd improve it, don't hesitate to get involved!

Working (remotely) in a team. Speaking of, it's very common to collaborate in open source with people over the Internet - probably in different timezones to you - on everything from tracking down bugs to writing code to reviewing contributions. Doing so effectively can also be a learned skill, but one definitely worth having.

Having a portfolio for your CV. Personally this isn't something I really think about myself, but if you're looking to get a job of some description, then having an open source portfolio of work can definitely work in your favour, and demonstrate your skills to your potential new employer.

Reviewing contributions, resolving disputes. What you review and how often you do so depends greatly on the kind of project you're working on. For me, this is not something I do often on my personal projects, but I do it all the time in tldr-pages.

By reviewing and checking for issues and things other people might have missed, the quality of a project can be improved. It's a fine balance though between getting contributions merged and requesting improvements. On the one hand, suggesting improvements can be a good thing as previously described, but on the other it can cause unnecessary delays and frustrate everyone involved. As with all things on this list, it takes practice and continual adjustments to find the right balance.

Helping others. One of the things I love most about open source is how I can help other people out. tldr-pages has 39.1K stars as of the time of typing, and is a hugely useful resource that lots of people use daily. I've had comments from people thanking me for my work on Pepperminty Wiki and other projects that I've created and open-sourced. Knowing that I'm helping others out is very motivating for me to continue contributing :-)

Conclusion

All these things are reasons why I'm proud to say that I'm a part of the open source community as a maintainer of multiple open source projects (both personal and tldr-pages). I'm especially grateful to everyone at tldr-pages (especially @waldyrious) for everything they taught me, and the chance I've been given to help out with the project.

While it hasn't been easy at times (helping to maintain a popular project like tldr-pages takes time and can be tedious in places), it's certainly something I'll be continuing to do and can thoroughly recommend to anyone who has the time to do so.

C# & .NET Terminology Demystified: A Glossary

After my last glossary post on LoRa, I thought I'd write another one of C♯ and .NET, as (in typical Microsoft fashion it would seem), they're seems to be a lot of jargon floating around whose meaning is not always obvious.

If you're new to C♯ and the .NET ecosystems, I wouldn't recommend tackling all of this at once - especially the bottom ~3 definitions - with those in particular there's a lot to get your head around.

C♯

C♯ is an object-oriented programming language that was invented by Microsoft. It's cross-platform, and is usually written in an IDE (Integrated Development Environment), which has a deeper understanding of the code you write than a regular text editor. IDEs include Visual Studio (for Windows) and MonoDevelop (for everyone else).

Solution

A Solution (sometimes referred to as a Visual Studio Solution) is the top-level definition of a project, contained in a file ending in .sln. Each solution may contain one or more Project Files (not to be confused with the project you're working on itself), each of which gets compiled into a single binary. Each project may have its own dependencies too: whether they be a core standard library, another project, or a NuGet package.

Project

A project contains your code, and sits 1 level down from a solution file. Normally, a solution file will sit in the root directory of your repository, and the projects will each have their own sub-folders.

While each project has a single output file (be that a .dll class library or a standalone .exe executable), a project may have multiple dependencies - leading to many files in the build output folder.

The build process and dependency definitions for a project are defined in the .csproj file. This file is written in XML, and can be edited to perform advanced build steps, should you need to do something that the GUI of your IDE doesn't support. I've blogged about the structuring of this file before (see here, and also a bit more here), should you find yourself curious.

CIL

Known as Common Intermediate Language, CIL is the binary format that C♯ (also Visual Basic and F♯ code) code gets compiled into. From here, the .NET runtime (on Windows) or Mono (on macOS, Linux, etc.) can execute it to run the compiled project.

MSBuild

The build system for Solutions and Projects. It reads a .sln or .csproj (there are others for different languages, but I won't list them here) file and executes the defined build instructions.

.NET Framework

The .NET Framework is the standard library of C♯ it provides practically everything you'll need to perform most common tasks. It does not provide a framework for constructing GUIs and Graphical Interfaces. You can browse the API reference over at the official .NET API Browser.

WPF

The Windows Presentation Foundation is a Windows-only GUI framework. Powered by XAML (eXtensible Application Markup Language) definitions of what the GUI should look like, it provides everything you need to create a native-looking GUI on Windows.

It does not work on macOS and Linux. To create a cross-platform program that works on all 3 operating systems, you'll need to use an alternative GUI framework, such as XWT or Gtk# (also: Glade). A more complete list of cross-platform frameworks can be found here. It's worth noting that Windows Forms, although a tempting option, aren't as flexible as the other options listed here.

C♯ 7

The 7th version of the C♯ language specification. This includes the syntax of the language, but not the .NET Framework itself.

.NET Standard

A specification of the .NET Framework, but not the C♯ Language. As of the time of typing, the latest version is 2.0, although version 1.6 is commonly used too. The intention here is the improve cross-platform portability of .NET programs by defining a specification for a subset of the full .NET Framework standard library that all platforms will always be able to use. This includes Android and iOS through the use of Xamarin.

Note that all .NET Standard projects are class libraries. In order to create an executable, you'll have to add an additional Project to your Solution that references your .NET Standard class library.

ASP.NET

A web framework for .NET-based programming languages (in our case C♯). Allows you to write C♯ code to handle HTTP (and now WebSockets) requests in a similar manner to PHP, but different in that your code still needs compiling. Compiled code is then managed by a web server IIS web server (on Windows).

With the release of .NET Core, ASP.NET is now obsolete.

.NET Core

Coming in 2 versions so far (1.0 and 2.0), .NET Core is the replacement for ASP.NET (though this is not its exclusive purpose). As far as I understand it, .NET Core is a modular runtime that allows programs targeting it to run multiple platforms. Such programs can either be ASP.NET Core, or a Universal Windows Platform application for the Windows Store.

This question and answer appears to have the best explanation I've found so far. In particular, the displayed diagram is very helpful:

A diagram showing the structure of the .NET ecosystem, including .NET Core. See the links in the sources and further reading below for more information.

....along with the pair of official "Introducing" blog posts that I've included in the Sources and Further Reading section below.

Conclusion

We've looked at some of the confusing terminology in the .NET ecosystems, and examined each of them in turn. We started by defining and untangling the process by which your C♯ code is compiled and run, and then moved on to the different variants and specifications related to the .NET Framework and C♯.

As always, this is a starting point - not an ending point! I'd recommend doing some additional reading and experimentation to figure out all the details.

Found this helpful? Still confused? Spotted a mistake? Comment below!

Sources and Further Reading

Demystifying Inverted Indexes

(The magnifying glass in the above banner came from openclipart)

After writing the post that will be released after this one, I realised that I made a critical assumption that everyone knew what an inverted index was. Upon looking for an appropriate tutorial online, I couldn't find one that was close enough to what I did in Pepperminty Wiki, so I decided to write my own.

First, some context. What's Pepperminty Wiki? Well, it's a complete wiki engine in a single file of PHP. The source files are obviously not a single file, but it builds into a single file - making it easy to drop into any PHP-enabled web server.

One of its features is a full-text search engine. A personal wiki of mine has ~75k words spread across ~550 pages, and it manages to search them all in just ~450ms! It does this with the aid of an inverted index - which I'll be explaining in this post.

First though, we need some data to index. How about the descriptions of some video games?

Kerbal Space Program

In KSP, you must build a space-worthy craft, capable of flying its crew out into space, without killing them. At your disposal is a collection of parts, which must be assembled to create a functional ship. Each part has its own function and will affect the way a ship flies (or doesn't). So strap yourself in, and get ready to try some Rocket Science!

Cross Code

Meet Lea as she logs into an MMO of the distant future. Follow her steps as she discovers a vast world, meets other players and overcomes all the challenges of the game.

Fort Meow

Fort Meow is a physics game by Upper Class Walrus about a girl, an old book and a house full of cats! Meow.

Factory Balls

Factory balls is the brand new bonte game I already announced yesterday. Factory balls takes part in the game design competition over at jayisgames. The goal of the design competition was to create a 'ball physics'-themed game. I hope you enjoy it!

Very cool, this should provide us with plenty of data to experiment with. Firstly, let's consider indexing. Take the Factory Balls description. We can split it up into tokens like this:

T	o	k	e	n	s	V	V
factory	balls	is	the	brand	new	bonte	game
i	already	announced	yesterday	factory	balls	takes
part	in	the	game	design	competition	over
at	jayisgames	the	goal	of	the	design
competition	was	to	create	a	ball	physics
themed	game	i	hope	you	enjoy	it

Notice how we've removed punctuation here, and made everything lowercase. This is important for the next step, as we want to make sure we consider Factory and factory to be the same word - otherwise when querying the index we'd have to remember to get the casing correct.

With our tokens sorted, we can now count them to create our index. It's like a sort of tally chart I guess, except we'll be including the offset in the text of every token in the list. We'll also be removing some of the most common words in the list that aren't very interesting - these are known as stop words. Here's an index generated from that Factory Balls text above:

Token	Frequency	Offsets
factory	2	0, 12
balls	2	1, 13
brand	1	4
new	1	5
bonte	1	6
game	3	7, 18, 37
i	2	8, 38
announced	1	10
yesterday	1	11
takes	1	14
design	2	19, 28
competition	2	20, 29
jayisgames	1	23
goal	1	25
create	1	32
ball	1	34
physics	1	35
themed	1	36
hope	1	39
enjoy	1	41

Very cool. Now we can generate an index for each page's content. The next step is to turn this into an inverted index. Basically, the difference between the normal index and a inverted index is that an entry in an inverted index contains not just the offsets for a single page, but all the pages that contain that token. For example, the Cross-Code example above also contains the token game, so the inverted index entry for game would contain a list of offsets for both the Factory Balls and Cross-Code pages.

Including the names of every page under every different token in the inverted index would be both inefficient computationally and cause the index to grow rather large, so we should assign each page a unique numerical id. Let's do that now:

Id	Page Name
1	Kerbal Space Program
2	Cross Code
3	Fort Meow
4	Factory Balls

There - much better. In Pepperminty Wiki, this is handled by the ids class, which has a pair of public methods: getid($pagename) and getpagename($id). If an id can't be found for a page name, then a new id is created and added to the list (Pepperminty Wiki calls this the id index) transparently. Similarly, if a page name can't be found for an id, then null should be returned.

Now that we've got ids for our pages, let's look at generating that inverted index entry for game we talked about above. Here it is:

Term: game

Id	Frequency	Offsets
2	1	31
3	1	5
4	3	5, 12, 23

Note how there isn't an entry for page id 1, as the Kerbal Space Program page doesn't contain the token game.

This, in essence, is the basics of inverted indexes. A full inverted index will contain an entry for every token that's found in at least 1 source document - though the approach used here is far from the only way of doing it (I'm sure there are much more advanced ways of doing it for larger datasets, but this came to mind from reading a few web articles and is fairly straight-forward and easy to understand).

Can you write a program that generates a full inverted index like I did in the example above? Try testing it on the test game descriptions at the start of this post.

You may also have noticed that the offsets used here are of the tokens in the list. If you wanted to generate contexts (like Duck Duck Go or Google do just below the title of a result), you'd need to use the character offsets from the source document instead. Can you extend your program to support querying the inverted index, generating contexts based on the inverted index too?

Liked this post? Got your own thoughts on the subject? Having trouble with the challenges at the end? Comment below!

An epic journey awaits: The hows and whys of DNS (and why DNS privacy is important)

The fancy 1.1.1.1 logo! Read on to find out more. (Above: The logo of Cloudflare's new announcement. Read on to find out more! Sourced from here.)

Hello! I hope everyone had a nice restful Easter. Cloudflare made an exciting announcement recently (more on that later), which inspired me to sit down and write about a vital, but invisible, part of the internet we know today.

It's called DNS (Domain Name System), and I'd like to take you on a journey - showing you what DNS is, how it works, how it can be exploited, and what we can do about it. After all, privacy is important! How does relate to DNS you ask? Well, I'll show you - but we're getting a little ahead of ourselves. Let's introduce DNS first. I'll explain what it is, how it works, and why we need it.

Enter Stage Left

DNS is, in many ways, the backbone of the modern internet. While it isn't directly responsible for delivering billions of packets across the internet every day like the Internet Protocol is, its role is still vitally important. DNS is responsible for translating domain names, such as starbeamrainbowlabs.com, bobsrockets.com, or billsboosters.net into an IP address that your device can connect to in order to do whatever else it needs to do.

It does this by sending a UDP datagram (comparison with TCP) to a DNS server ask it for a specific type of response - usually the IP address associated with a specific domain name. The following query types are most common:

A - Returns the IPv4 address(es) associated with the specified domain name
AAAA - Same as A, but returns IPv6 addresses instead
CNAME - Acts as an alias to another domain name. Usually immediately followed by either an A or AAAA record in the DNS server's response to save time (a DNS server can return multiple items in a single response)
MX - A bit like a CNAME, but returns a prioritised list of domains that handle email for the specified domain.
TXT - Contains an arbitrary text string. Usually used for easter eggs or for domain ownership verification by various analytics services (e.g. Google Analytics, Bing Webmaster Tools, etc.)
NS - Specifies which DNS servers can be queried about the domain.
SOA - Specifies what the primary DNS server is that holds the authoritative copy of the DNS records for the specified domain.

Let's try it out

With that in mind, lets try some queries.

(Can't see the above asciicast? Try viewing it over on asciinema.org, or entering the below commands into a computer with DiG)

dig starbeamrainbowlabs.com
dig bbc.co.uk AAAA
dig cloudflare.com MX
dig contact.starbeamrainbowlabs.com TXT
dig github.com TXT

DiG is a command-line DNS client for Linux-like operating systems (if you don't have it already, try sudo apt install dnsutils, or equivalent for your distribution. If you're on Windows without access to a Linux-like machine, try following along with nslookup.). In the above asciicast I make a variety of queries for demonstrative purposes. Note the QUESTION SECTION and ANSWER SECTION bits - they tell us what the query was for, and what the response to that query was. For example, here's an extract from the question and answer sections respectively from the bbc.co.uk lookup in the asciicast:

;bbc.co.uk.         IN  AAAA

bbc.co.uk.      300 IN  AAAA    2a04:4e42:600::81

The bit in the question section is quite straightforward - it's asking for an AAAA record for bbc.co.uk.. The answer section is a bit more complicated. From left to right:

bbc.co.uk. - The domain name the response is for.
300 - The time-to-live. In other words, the number of seconds that the response can be cached for.
IN - a legacy component. Stands for INternet - more information here
AAAA - The type of response record.
2a04:4e42:600::81 - The IPv6 address that the domain name corresponds to.

Am I being spied on?

DNS works rather well, most of the time. The problems start to occur when you start thinking about privacy. With more websites than ever now serving their websites over https, the data that we transfer between these websites and our devices is now much more secure - and can't be intercepted, analysed, and modified in transit.

DNS, however, is not currently encrypted - which poses a rather serious problem. Anyone able to get a hold of your devices network traffic - such as another device of your network in promiscuous mode, your ISP, or literally anyone in between you and your DNS server - can spy on the DNS lookups your device is doing, and even poison your DNS cache - sending you to an attacker's website when you typed in a legitimate domain name!

DNS Cache Poisoning in Action

(Above: A DNS timing cache poisoning in action. The attacker responds with a spoofed UDP datagram before the original server has a chance to reply!)

Thankfully, after 35 years of DNS, the internet has some solutions to some of these problems. First up: DNSSEC. Often misunderstood, the protocol tries to prevent man-in-the-middle and timing attacks (such as the one shown in the diagram above) by cryptographically verifying the DNS records returned to the client. Though it's actually 20 years old already, it's still overly-complicated - and subsequently hasn't been rolled out by an awful lot of people. It's also rather weighty - requiring the transfer of crytographical keys and other associated information.

Preventing cache poisoning is one thing, but it would be nice to prevent nosy onlookers from peering at the DNS queries we're making - and here's where it gets complicated. As of early 2018, there are currently no less than 3 competing standards to provide proper client-server connection encryption:

DNS-over-HTTPS - Basically a protocol for sending DNS requests via a standard HTTPS web server. As you can imagine, this can be rather weighty.
DNS-over-TLS - As the name implies - DNS queries over a raw TLS connection - which is, in short, a HTTPS connection without the HTTP bit. Now supported natively in Android.
DNSCurve - An augmentation to the existing DNS protocol that adds encryption by way of elliptical curves. The supposed official website appears to be a bit biased and inaccurate, so I'm linking to the Wikipedia article here.

A bit of mess, isn't it? Furthermore, many applications don't yet have support for some (or any) of these protocols. In that regard, it's currently a waiting game. Still, it's interesting to compare the different approaches taken here. Most of these protocols carry significantly more weight that plain-old DNS - with DNS-over-HTTPS being the most weighty, and DNSCurve being the lightest I should imagine.

I find it especially curious that DNS-over-HTTPS is as popular as it is. Surely it's a bit flawed if you've got to look up the domain name of the HTTPS server that you need to contact in order to do a 'secure' lookup? A safe is only as strong as it's weakest point, after all....

But wait, there's more!

Encrypted and verified responses are all very well, but it's no good of the owner of the DNS server themselves are logging all the queries you send to them! Google's 8.8.8.8 service logs a percentage of queries made permanently to disk, and OpenDNS don't appear to have very many details on their website about what data they collect and what they don't!

Furthermore, some DNS servers (especially those controlled by ISPs) tend to have some domain names censored due to agreements with their country's government - preventing you from 'accessing' a website by stopping your device from figuring out where on the internet to talk to.

Clearly, these are serious issues - and the solutions boil down to trust. Who do you trust to send your DNS queries to? If you don't trust any of the aforementioned providers (Google Public DNS or OpenDNS), then you could always run a DNS resolver yourself.

How does it work if you run it yourself? Well, basically instead of your device sending queries to a remote DNS server, they send it to your personal DNS server instead. Your personal DNS server then performs a recursive resolve. Basically, this means that it traverses the requested domain name from right-to-left, analysing and resolving each part in turn. For example, gateway.discord.gg. would be resolved like so:

com.
discord.
gateway.

For each successive part of the domain name, the DNS server asks the next one in the the chain that other DNS servers hold the authoritative records about that domain name (using SOA / NS records), and then repeats the cycle with the servers provided in the response.

Quite quickly you can see that there's an issue here - how does it know where to start? That's where root servers come in. They contain the authoritative information on the Internet's top-level domains. These servers can be queried to figure out which servers hold information about the various country codes (or other codes). The servers that these root servers point to can then be queried to ask who holds information about the various domain names you and I are used to typing in our address bars, such as seanssatellites.io, or billsboosters.edu for instance.

A simpler alternative

This brings me to the announcement by Cloudflare I mentioned at the beginning of this post. By now, you can probably guess what it is: they've set up a new public DNS server! Apparently, they did a deal with [APNIC]() to let them study the garbage traffic that ends up at 1.1.1.1 in exchange for running a DNS server on it.

Either way, I think it's a brilliant thing for the Internet at large to have another public DNS network to choose from. Especially considering how privacy-conscious they appear to have being in setting it up: They never store client IP addresses, and they delete the anonymised logs after 24 hours. Assuming what they've said is true, I think that it's rather great. For my own personal reference, here are the IP addresses of Cloudflare's new service:

1.1.1.1
1.0.0.1
2606:4700:4700::1111
2606:4700:4700::1001

Conclusion

That brings us to the end of our journey through DNS. We've seen what DNS is and how it works. We've also seen how it can be attacked, and what is being done about it. Lastly, we've taken a look at how running your own recursive resolver works, and looked at Cloudflare's new service.

If you'd like to continue on and explore DNS further, I've left some links below.

Found this informative? Still confused about something? Comment below!

Sources and Further Reading

Transform your javascript with Browserify

Tired of battling endless <script> tags in your html files? Fed up with messing with a dozen libraries cluttering up the place? Can't see the wood from the trees? Try browserify (+ rollupify + wzrd)! It's amazing! It's awesome! It tidies up your code for you, so you don't have to (perhaps not :P)!

Seriously though, I've just been playing around with browserify, and it's awesome. It's that missing thing I've been trying to find for a long time. But what does it actually do, you ask?

Well, perhaps it's best to use an example. Consider these (relatively) harmless javascript files:

// SillySay.js
"use strict";

function sillySay(sentence) {
    // Split the sentence up into words
    var words = splitWords(sentence);

    // Loop over all the words in the above array and display them one by one
    for(let i in words) {
        alert(words[i]);
    }
}

// WordSplitter.js
"use strict";
function splitWords(sentence) {
    // Split the sentence on whitespace and return the resulting array
    return sentence.split(/\s+/g);
}

To use our (perfectly ridiculous) example code, we not only have to include SillySay.js, but WordSplitter.js (this could be a library you use for example) as well:

<!DOCTYPE html>
<html>
    <head>
        <meta charset='utf-8' />
        <title>Silly Say Demo</title>
    </head>
    <body>
        <p>Silly Say Demo</p>
        <p>By <a href="https://starbeamrainbowlabs.com/">Starbeamrainbowlabs</a></p>

        <!---------------->
        <script src="WordSplitter.js"></script>
        <script src="SillySay.js" charset="utf-8"></script>
        <script>
            window.addEventListener("load", function(event) {
                sillySay("This is a test");
            });
        </script>

        <style>
            html, body { font-size: 100%; }
            body
            {
                font-family: sans-serif;
            }
        </style>
    </head>
</html>

That's looking a bit messy, but imagine what it'd be like if you added another few libraries? Or a new feature in a separate file? See the problem? Browserify solves just this issue. It analyses the dependencies of the entry point to your app, and bundles up all your code into a single file, nice and neat. You can add extra transforms (like plugins), too, to do extra things like automatically insert your app's version, or include other data files automatically, or transpile other languages to javascript automagically (full list here).

Sounds cool yet? Let me give you a quick tutorial on how I set up Browserify, with Rollupify and Wzrd.

Firstly, we need to set things up. If you don't have Node.js installed, do that now. You'll also get npm - Node's (perfectly awesome!) package manager. Next, let's create a quick project and paste in the code above. I've recorded an asciicast (as you may have seen a few times before here) of me going through the process:

(Can't see the asciicast above? Try viewing it here

If you'd like to take a look at the final result, as written in the asciicast above, you can find it over here. Questions and comments are welcome below :-)

Achievement Get: Upgrade Server - A writeup of my experiences

An open and a closed box.

The upgrade is complete! I've managed to move practically everything over to the new server, apart from a few automated cleanup tasks here and there. There are still a few issues floating around, but they shouldn't affect this website, and I should have them cleared up soon. Although the migration went smoother than I expected, I did encounter some issues and learnt a few things that I thought I'd share here.

The first thing I found was that starting a todo list isn't a rather good idea. It sounds simple, but it's actually really useful. I found that I had a lot of small tasks I needed to complete, and I kept thinking of more things that needed doing at regular intervals. If I didn't write them down I'd never get anything done because I'd never be able to remember what needed doing first!

It also helps to do your research before you move. Make sure that you're properly reaquainted with all the software running on the system that you're going to migrate from, and that you're familiar with all the ins and outs of your particular situation. If you aren't, then you risk stumbling across some rather nasty, complicated, and time consuming problems mid-migration.

It also helps to do as much of the migration as possible without taking the old system offline. Install the software, Move the configuration files. Set up the firewall. Set up your new monitoring tools. This allows you to minimise the downtime that you have to subject your users to, which is always a good thing.

Lastly, testing is incredibly important. test everything. Make sure that every little feature after you migrate. You'd be surprised at how many issues can crop up after migration.

Getting started with arduino

An arduino and a simple circuit.

Since I've been playing around with the Arduino a bit recently (thank you Rob!) I thought I'd write a quick post on how you can get started with the arudino and it's many variants.

If you're into creating electronics and creating circuits, then the arduino is for you. The arduino is a small electronic board that you might find some variant thereof in your thermostat at home, or Rob's thing-o-matic for example. You'll probably find something akin to an arduino in most embedded systems.

To get started, you'll need to buy an Arduino Uno (you can probably find it cheaper elsewhere). Some LEDs, resistors, and jumper cables wouldn't hurt either.

Once you've got all that, you can start to have some fun. To compile and send programs to your new arudino, you'll need to download and install the Arduino IDE from the official arduino website (direct link for debian / ubuntu users). Once installed, connect your arduino to your computer using the supplied cable and open the IDE.

The menu options that need changing in the IDE

Next, we need to set the IDE up to send correctly compiled programs to our new board. Firstly, we need to tell the IDE what kind of board we have. Go to Tools->Board and select Arduino Uno. We also need to tell the IDE which programmer to use. Go to Tools->Programmer and select AVRISP mkII. Finally, we need to tell the IDE which serial port the arduino is connected on. Go to Tools->Serial Port and select the last item in the list. If the next steps don't work, try selecting a different option in this list until it works.

With that out of the way, we can start to test out our arduino! Arduinos are programmed using a variant of C, which is similar to GSGL. To get started quickly, let's send some example code to our arduino to start with. In the file menu, go to Examples->01. Basics and select Blink.

Selecting the example code in the file menu.

A new window will pop up containing the example code. To compile and send the code to your arduino, click the second button in from the left, with the right facing arrow on it. This will send the code to your arduino. Once it's done, you should see a flashing light on your arduino board!

The Arduino IDE interface.

The other buttons are also useful. Here's an explanation:

Verify - Compiles and checks your code for syntax errors, but doesn't write it to the arduino.
Upload - Compiles your code and sends it to your arduino.
New - Creates a new document. This clears your existing tab! Use the down arrow below the 6 in the picture and select New Tab instead.
Open - Opens an existing document. Again, this clears your existing tab.
Save - This should be obvious.
Opens the serial monitor. The serial monitor is like a very basic console which allows you to see what your arduino is saying and lets you send messages to it.

That just about covers my very basic getting started tutorial for the arduino. If you've got any questions or comments, please leave them down below.

Sources and Further Reading

The official Arduino website
Basic arduino tutorials - Learn more about the arduino board and what everything does.
Arduino language reference - Useful when writing code.
08249 Labs - A comprehensive set of tutorials that teach everything that you'll need to get started (Hull University students only).
Simple Arduino Serial Communication by Arduino Basics
Arduino Programming Cheat Sheet by Mark Liffiton

Learning Prolog: Semester 2 Lab 12

You may have noticed that I've skipped a week in this series. Don't panic - I've done it on purpose. The task for lab 11 felt very close to the assessed coursework (ACW) for the Artificial Intelligence module that I'm learning Prolog in. I've decided to play it safe and hold off on posting about it for now. I might post later - it all depends on what the ACW is.

Anyway, the main task for this lab was to finish the work we started last week, but there was an additional challenge, too. It was based on the following challenge:

SEND + MORE = MONEY

Each letter stands for a digit, and the idea is to write a Prolog program that work out what each of the letters stands for. This problem is actually quite simple, and the particular way that you'd go about solving this has a special name.

In the beginning, I only knew one way to writing programs: Functional programming. Next, University introduced me to the wonderful world of Object Oriented Programming. In my second year, my AI module has been bending my mind with Symbolic Programming. Now, I have learnt of a fourth programming paradigm - Constraint based programming.

The solution for the above was provided (and explained!) in the lecture slides for this week:


% From http://clip.dia.fi.upm.es/~vocal/public_info/seminar_notes/node13.html
smm :-
    % Let X be the set of variables which I am going to assign
    X = [S,E,N,D,M,O,R,Y],
    % ...and Digits is the unique set of digits to assign
    Digits = [0,1,2,3,4,5,6,7,8,9], /* assign recursively goes through the letter variables and assigns them a digit */        
    assign_digits(X, Digits),        
    M > 0,        S > 0,    
    % multiply to corresponding 10 to the n
    1000*S + 100*E + 10*N + D + 
    1000*M + 100*O + 10*R + E    =:=        
    10000*M + 1000*O + 100*N + 10*E + Y, 
    /*
    This is then the constraint in raw Prolog. =:= checks for equality in the expression.
    */
    write(X).

myselect(X, [X|R], R). % Assign to first and hand back the rest
myselect(X, [Y|Xs], [Y|Ys]):- myselect(X, Xs, Ys). /* ...otherwise on backtracking free up the old binding Y and go get another value of X from the other Xs */

assign_digits([], _List).
assign_digits([D|Ds], List):-
    myselect(D, List, NewList),
    assign_digits(Ds, NewList).

This code works by creating an array of unknowns, an array of digit values that the array of variables can take, and then assigning each of the variables a value one by one. The expression on lines 10-12 is then tested, and if it doesn't match (which it probably won't), Prolog backtracks and tries to reassign one of the variables it assigned before. It repeats this process until it finds the solution.

Brute force isn't very efficient or clever, but it does the job quickly enough:


smm.
[9, 5, 6, 7, 1, 0, 8, 2].

Very nice. But that wasn't actually the challenge to for the lab. The real challenge was to adapt the above code in order to solve a slightly different puzzle:

DONALD + GERALD = ROBERT

Thankfully, all this new problem takes is a slight adjustment to the smm/1 rule that the above solution came with:

dgr :-
    % Let X be the set of variables which I am going to assign
    X = [D, O, N, A, L, G, E, R, B, T],
    % ...and Digits is the unique set of digits to assign
    Digits = [0,1,2,3,4,5,6,7,8,9],
    % Assign recursively goes through the letter variables and assigns them a digit       
    assign_digits(X, Digits),        
    D > 0,        G > 0,    
    % Multiply to corresponding  10 to the n
    100000*D + 10000*O + 1000*N + 100*A + 10*L + D + 
           100000*G + 10000*E + 1000*R + 100*A + 10*L + D    =:=        
    100000*R + 10000*O + 1000*B + 100*E + 10*R + T,
    /*
    This is then the constraint in raw Prolog. =:= checks for equality in the expression.
    */  
    write(X).

All I did was alter the unknown variables to match the new problem. Here's my result:


smm.
[5, 2, 6, 4, 8, 1, 9, 7, 3, 0].

This may well be the last post (Edit:) in this series until the end of the ACW. See you later!

Learning Prolog: Semester 2 Lab 10 - Eliza

The lab numbering has gone a little bit strange this semester it seems (there was a lab 10 last semester), so I've added the semester number to the post title to avoid confusion.

This week's lab was all about Eliza. In case you don't know who (or what) an Eliza is, Eliza is an early attempt at building a chatbot. It isn't very clever, and relies on keywords to understand what you are saying. It's still used today though - mainly in NPCs in games, as they only have to deal with a very limited situation.

Anyway, let's begin building our very own Eliza. First of all, we need a prompt loop, to repeatedly ask the user for more input:

eliza :-
    write('Hello! My name is eliza.'), nl,
    eliza_loop.

eliza_loop :-
    write('Eliza > '),
    read(Input),
    write('You said '), write(Input), nl,
    eliza_loop.

Here I define eliza/0, which welcomes the user and then jumps into eliza_loop/0, which is the main input loop. The main loop simply writes a prompt, asks for input, echoes the input back, and goes back around all over again. Here's some example output:

eliza.
Hello! My name is eliza.
Eliza > test.
You said test
Eliza > cheese.
You said cheese
Eliza > [1,2,1,2,'testing...'].
You said [1,2,1,2,testing...]
Eliza >

This isn't very useful yet - it just echoes back what we say! Let's make it more complicated:

eliza :-
    write('Hello! My name is eliza.'), nl,
    eliza_loop.

eliza_loop :-
    write('Eliza > '),
    read(Input), respond(Input).

respond(Input) :-
    member(Term, Input),
    member(Term, [ quit, exit, leave ]),
    write('Goodbye!').
respond([my,name,is,Name | _ ]) :-
    write('Hello, '), write(Name), write('! Pleased to meet you.'), nl,
    eliza_loop.

In the above I've changed the main loop up a bit to call respond/1 instead of blindly looping back around. This lets me write rules like those starting on lines #8 and #12 that do something depending on the user's input. We can get our eliza to say goodbye and exit by not looping back around if the user mentions the words "quit", "exit", or "leave", or say hello when someone tells us their name and loop back around again.

There are also two different ways of detecting keywords in the user's input: a double member/2, or a preset list in the rule definition. Examples:

respond([my,name,is,Name | _ ]) :-
    write('You started the input with "my name is".').
respond(Input) :-
    member(Term, Input),
    member(Term, [ quit, exit, leave ]),
    write('Detected the word '), write(Term), write(' in the input.').

While detecting the items in a list is fairly straightforward, the double member isn't as obvious. It works by taking advantage of Prolog's backtracking. It provides a choice point for each word in the phrase, and then it loops over each of the keywords that we are searching for. If it finds that the current word isn't equal to any of the keywords that it is searching for, it will fail, backtrack, pick the next word, and try again. Another way of looking at it is that Prolog is trying all possible combinations of the input words and the keywords until it finds a match.

If you are following this post as a guide, at this point you can (and should!) add your own custom phrases to practice the techniques described here. Once you have done that, come back here.

Next up, we will add some 'fallback' phrases to our eliza to use when it doesn't understand what has been said. At the moment, it just fails with an impolite false., which isn't any good at all! To do this, we need a list of fallback phrases:

list_of_excuses(['I see.', 'Very interesting.', 'Tell me more.', 'Fascinating.']).

Next we need to get our eliza to loop through our list of fallback phrases in order. We can do this by making the fact that contains the list of fallback phrase dynamic, and then updating it every time we use one. To make something dynamic, you use the dynamic/1 predicate to tell Prolog that fact is dynamic and not static (this is because Prolog applies some clever optimisations to static things that don't work for dynamic things):

:- dynamic list_of_excuses/1

You can use this to tell Prolog that anything in your Prolog program is dynamic. Just change list_of_excuses to the name of the fact or rule, and change the 1 to the arity of the fact or rule that you want to make dynamic.

Next, we need to add a 'catch all' rule at the bottom of our respond/1 definition:

respond([ _ ]) :-
    retract(list_of_excuses([ Next | Rest ])),
    append(Rest, [ Next ], NewExcuseList),
    asserta(list_of_excuses(NewExcuseList)),
    write(Next), nl,
    eliza_loop.

Several things are happening here. firstly, we retract the list of fallback phrases from Prolog's knowledge database and split it up into the first item in the list, and the rest of the items in the list.

Next we create a new list that contains the phrase that we at the front at the end, and then we put this new list back into Prolog's knowledge database.

Lastly, we output the fallback phrase that was at the beginning of the list (but is now at the end) to the user as our reply, before looping back around again with exliza_loop/1.

With that, we our eliza can respond to phrase containing certain keywords, exit when asked, and reply with a fallback phrase if it doesn't understand what was said. Below I have put a small questions and answers type section containing a few of the problems that I had whilst writing this. I've also included the complete Prolog source code that this post is based on.

Common Problems

I had several problems whilst writing this program. I've included a few of them below.

My eliza fails when it doesn't understand something instead of outputting an error.

I found that I didn't get an error message if I included :- after the retract statement. Removing it made it output the error below which I was then able to solve.

I get this error when writing the fallback phrase bit:
ERROR: No permission to modify static_procedure Name/Arity
I'm certain that I put the dynamic name/arity at the top of my prolog source code, but it doesn't work.

I got this because I forgot the :- before the word dynamic. If you are getting this error, this isthe reason why.

Source Code

:- dynamic list_of_excuses/1.

list_of_excuses(['I see.', 'Very interesting.', 'Tell me more.', 'Fascinating.']).

eliza :-
    write('Hello! My name is eliza.'), nl,
    eliza_loop.

eliza_loop :-
    write('Eliza > '),
    read(Input), respond(Input).

respond(Input) :-
    member(Term, Input),
    member(Term, [ quit, exit, leave ]),
    write('Goodbye!').

respond([my,name,is,Name | _ ]) :-
    write('Hello, '), write(Name), write('! Pleased to meet you.'), nl,
    eliza_loop.

respond([my,Thing,is,called,Name | _ ]) :-
    write(Name), write(' is a nice name for a '), write(Thing), write('.'), nl,
    eliza_loop.
respond(Input) :-
    member(Animal, Input),
    member(Animal, [ cat, dog, fish, hamster, gerbil, snake, tortoise ]),
    write('You just mentioned your '), write(Animal), write('. Tell me more about your '), write(Animal), nl,
    eliza_loop.

respond(Input) :-
    member(Term, Input),
    member(Term, [ hate, dislike ]),
    member(Term2, Input),
    member(Term2, [ you ]),
    write(':('), nl,
    eliza_loop.

respond([ _ ]) :-
    retract(list_of_excuses([ Next | Rest ])),
    append(Rest, [ Next ], NewExcuseList),
    asserta(list_of_excuses(NewExcuseList)),
    write(Next), nl,
    eliza_loop.

(Try it on SWISH)

Learning Prolog: Warm up for semester 2 with the cinema

Learning Prolog is back, because apparently I am totally nuts and like to write long blog posts about the hardest programming language to use in existence. To kick off semester 2, we were given an old challenge: Write a cinema program that decides whether somebody can watch a given film or not. I remember this from my very first programming lab with Rob Miles - the first ever task that we were given was to build this exact program in C♯. Because Prolog is such a nasty and difficult language to learn, it has taken until now to learn all the concepts that you would need to write such a program.

In this post, I hope to guide you through the process of writing such a program in Prolog. To start with, here's an overview of what we will be building:

A Prolog program that will state whether or not a customer can see a movie depending on their age, the movie they want to see, and whether they have an adult with them.

We will need a few films to test this with. Here are the films that I was given in the lab:

Film Name	Rating
How to Lose Friends and Alienate People	15
Death Race	15
Space Chimps	U
The Chaser	18
Get Smart	12A

The first job here is to convert the above table into Prolog:

film('How to Lose Friends and Alienate People', 15).
film('Death Race', 15).
film('Space Chimps', 'U').
film('The Chaser', 18).
film('Get Smart', '12A').

Now we need some way to output the above so that the customer can see which films we are currently showing. Here's what I came up with:

all_films(FilmList) :-
    findall([ Film, Rating ], film(Film, Rating), FilmList).

list_films :-
    all_films(FilmList),
    list_films(FilmList, 0).
list_films([], _).
list_films([ [ FilmName, FilmRating ] | Rest ], Counter) :-
    write(Counter), write(' - '), write(FilmName), write(' - '), write(FilmRating), nl,
    NCounter is Counter + 1,
    list_films(Rest, NCounter).

As you should have come to expect with Prolog, the solution here is recursive. Firstly I define a rule to fetch me all the films with a quick findall/3. Then I define list_films/1 which calls all_films/1 to get all the films and passes them over to list_films/2, which actually does the work. I also have a counter here to keep track of the index of each film. This is important because we want to get the user to enter the number of the film that they want to watch.

Here's some example output of the above:

?- list_films.
0 - How to Lose Friends and Alienate People - 15
1 - Death Race - 15
2 - Space Chimps - U
3 - The Chaser - 18
4 - Get Smart - 12A
true.

This works by fetching all the films and loading them each into a list that contains their name and rating. These lists are then packaged into one big list, which is then spun through and outputted one by one, whilst incrementing the counter variable.

The next part of this problem is asking the user which film they want to see. Surprisingly, this isn't too tough.

ask_film(FilmNumber) :-
    write('Enter the number of the film that you would like to see: '),
    read(FilmNumber),
    true.

cinema_loop :-
    repeat,
    write('Welcome to our multiplex.'), nl,
    write('We are presently showing:'), nl,
    list_films,
    ask_film(FilmNumber),
    write('You chose option '), write(FilmNumber), write('.').

Here I define another rule to ask the user which film they want to see, and then define the main cinema loop. The main loop displays a welcome message, lists all the films that we are presently showing using the code we devised above, asks the user which film they want ot watch, and tells them which number they selected.

We are getting there, but telling the user which option they selected is a bit pointless, if you ask me. Let's upgrade it to tell us which film we selected and it's rating.

film_rating(FilmNumber, FilmRating) :-
    all_films(FilmList),
    nthelement(FilmNumber, FilmList, [ FilmName, FilmRating ]),
    write('You want to see '), write(FilmName), write(', which is a '), write(FilmRating), nl.

cinema_loop :-
    repeat,
    write('Welcome to our multiplex.'), nl,
    write('We are presently showing:'), nl,
    list_films,
    ask_film(FilmNumber),
    film_rating(FilmNumber, FilmRating),
    check_age(FilmRating),
    write('Enjoy the film!'), nl, true.

My code starts to get a little bit messy here (sorry!), but I've taken the cinema_loop/1 that we started above and upgraded to call film_rating/2. This new rule actually has a rather misleading name now that I think about it, but it's purpose essentially is to take a given film number and to return it's rating, whilst telling the user which film they selected.

film_rating/2 calls another rule that you might find familiar - the nth_element rule that I wrote in semester 1's last lab. GO and take a look at my other post if you are confused.

Now that we have that connector in place, we can start thinking about ratings. The rating system isn't as simple as it sounds when you start to think about it:

Rating	Meaning
U	Universal, suitable for all ages
PG	Parental Guidance, suitable for all ages at parents discretion
12	No one younger than 12 can see this film
12A	No one younger than 12 can see this film unless accompanied by an adult
15	No one younger than 15 can see this film
18	No one younger than 18 can see this film

Let's brack this down into chunks. In theory, we can do this with a single rule with 2 arities one for those rating that don't require the age of the customer, and those that do. We can write one part of the rule for each rating. Let's start with U:

check_age(FilmRating) :-
    FilmRating = 'U'.

Nice and simple, right? It succeeds if the film rating is 'U'. Let's move on. We can lump PG and 12A together if the customer is under 12 years of age:

check_age(FilmRating, Age) :-
    member(FilmRating, [ 'PG', '12A' ]),
    Age < 12,
    ask_adult(Adult),
    Adult = yes.

The above simple makes sure that the rating is either 'PG' or '12A', then it make sure that the customer is under 12, then it makes sure that the customer has an adult with them usng the rule below. If all of these comditions are met, then it succeeds.

ask_adult(Adult) :-
    write('Are you accompanied by an adult?'),
    read(Answer),
    member(Answer, [y, yes, yep]),
    Adult = yes,
    true.

If the user is 12 or over and wants to see a PG or a 12A, then we should let them through without asking them about an adult.

check_age(FilmRating, Age) :-
    member(FilmRating, [ 'PG', '12A' ]),
    Age >= 12.

Next we need to deal with the regular 12, 15, and 18 ratings. There are easy:

check_age(FilmRating, Age) :-
    not(FilmRating = 'PG'), not(FilmRating = '12A'),
    Age >= FilmRating.

We then need to connect the check_age/1 with the check_age/2. This is also fairly simple:

ask_age(Age) :-
    write('What is your age? '),
    read(Age).

check_age(FilmRating) :-
    ask_age(Age),
    check_age(FilmRating, Age).

If all else fails, then the customer mustn't be allowed to see the film. We should add another quick rule to tell the customer that they can't see the film before we are done:

check_age(_, _) :-
    write('Sorry, you can\'t see that film - try picking another one.'), nl,
    fail.

That completes the cinema program. Below you can find the source code for the whole thing that I based this post on. There are some unnecessary true.s floating around, but they are just relics from when I was debugging it and couldn't figure out what the problem was.

% Current films
film('How to Lose Friends and Alienate People', 15).
film('Death Race', 15).
film('Space Chimps', 'U').
film('The Chaser', 18).
film('Get Smart', '12A').

% Main loop
cinema_loop :-
    repeat,
    write('Welcome to our multiplex.'), nl,
    write('We are presently showing:'), nl,
    list_films,
    ask_film(FilmNumber),
    film_rating(FilmNumber, FilmRating),
    check_age(FilmRating),
    write('Enjoy the film!'), nl, true.

% Get all films
all_films(FilmList) :-
    findall([ Film, Rating ], film(Film, Rating), FilmList).

% Get the film with a given number
film_rating(FilmNumber, FilmRating) :-
    all_films(FilmList),
    nthelement(FilmNumber, FilmList, [ FilmName, FilmRating ]),
    write('You want to see '), write(FilmName), write(', which is a '), write(FilmRating), nl.

% Get the nth element
% From https://starbeamrainbowlabs.com/blog/article.php?article=posts%2F134-learning-prolog-lab-10.html
nthelement(0, [ Head | _ ], Head).
nthelement(N, [ _ | Tail ], Result) :-
    N > 0,
    NRecurse is N - 1,
    nthelement(NRecurse, Tail, Result).

% Check the user's age against the given film.
check_age(FilmRating) :-
    FilmRating = 'U'.
check_age(FilmRating) :-
    ask_age(Age),
    check_age(FilmRating, Age).
check_age(FilmRating, Age) :-
    member(FilmRating, [ 'PG', '12A' ]),
    Age < 12,
    ask_adult(Adult),
    Adult = yes, true.
check_age(FilmRating, Age) :-
    member(FilmRating, [ 'PG', '12A' ]),
    Age >= 12.
check_age(FilmRating, Age) :-
    not(FilmRating = 'PG'), not(FilmRating = '12A'),
    Age >= FilmRating.
check_age(_, _) :-
    write('Sorry, you can\'t see that film - try picking another one.'), nl,
    fail.

% Gets the user's film preference
ask_film(FilmNumber) :-
    write('Enter the number of the film that you would like to see: '),
    read(FilmNumber),
    true.

% Gets the user's age
ask_age(Age) :-
    write('What is your age? '),
    read(Age), true.

ask_adult(Adult) :-
    write('Are you accompanied by an adult?'),
    read(Answer),
    member(Answer, [y, yes, yep]),
    Adult = yes,
    true.

% Lists all films
list_films :-
    all_films(FilmList),
    list_films(FilmList, 0).
list_films([], _).
list_films([ [ FilmName, FilmRating ] | Rest ], Counter) :-
    write(Counter), write(' - '), write(FilmName), write(' - '), write(FilmRating), nl,
    NCounter is Counter + 1,
    list_films(Rest, NCounter).

Stardust Blog

Tag Cloud

On the value of the open source community

Conclusion

C# & .NET Terminology Demystified: A Glossary

C♯

Solution

Project

CIL

MSBuild

.NET Framework

WPF

C♯ 7

.NET Standard

ASP.NET

.NET Core

Conclusion

Sources and Further Reading

Demystifying Inverted Indexes

An epic journey awaits: The hows and whys of DNS (and why DNS privacy is important)

Enter Stage Left

Let's try it out

Am I being spied on?

But wait, there's more!

A simpler alternative

Conclusion

Sources and Further Reading

Transform your javascript with Browserify

Achievement Get: Upgrade Server - A writeup of my experiences

Getting started with arduino

Sources and Further Reading

Learning Prolog: Semester 2 Lab 12

Learning Prolog: Semester 2 Lab 10 - Eliza

Common Problems

Source Code

Learning Prolog: Warm up for semester 2 with the cinema

Stardust
Blog