
Summer Project Part 5: When is a function not a function?

Another post! Looks like I'm on a roll in this series :P

In the last post, I looked at the box I designed that was ready for 3D printing. That process has now been completed, and I'm now in possession of an (almost) luminous orange and pink box that could almost glow in the dark.......

I also looked at the libraries that I'll be using and how to manage the (rather limited) amount of memory available in the AVR microprocessor.

Since last time, I've somehow managed to shave a further 6% program space off (though I'm not sure how I've done it), so most recently I've been implementing 2 additional features:

  • An additional layer of AES encryption, to prevent The Things Network from having access to the decrypted data
  • GPS delta checking (as I'm calling it), to avoid sending multiple messages when the device hasn't moved

After all was said and done, I'm now at 97% program space and 47% global variable RAM usage.

To implement the additional AES encryption layer, I abused LMiC's IDEETRON AES-128 (ECB mode) implementation, which is stored in src/aes/ideetron/AES-128_V10.cpp.

It's worth noting here that if you're doing crypto yourself, it's seriously not recommended that you use ECB mode. Please don't. The only reason that I used it here is because I already had an implementation to hand that was being compiled into my program, I didn't have the program space to add another one, and my messages all start with a random 32-bit unsigned integer that will provide a measure of protection against collision attacks and other nastiness.

Specifically, it's the method with this signature:

void lmic_aes_encrypt(unsigned char *Data, unsigned char *Key);

Since this is an internal LMiC function defined in a .cpp source file with no obvious header file twin, I needed to declare the prototype in my source code as above - as the function is only resolved when the linker strings the object files together (see this page for more information about the C++ compilation process. While it's written for regular Linux executable binaries, it still applies here, since the Arduino toolchain spits out a very similar binary that's uploaded to the microprocessor via a programmer).

However, once I'd sorted out all the typing issues, I slammed into this error:

/tmp/ccOLIbBm.ltrans0.ltrans.o: In function `transmit_send':
sketch/transmission.cpp:89: undefined reference to `lmic_aes_encrypt(unsigned char*, unsigned char*)'
collect2: error: ld returned 1 exit status

Very strange. What's going on here? I declared that method via a prototype, didn't I?

Of course, it's not quite that simple. The thing is, the file I mentioned above isn't the first place that a prototype for that method is defined in LMiC. It's actually in other.c, line 35 as a C function. Since C and C++ (for all their similarities) are decidedly different, apparently to call a C function in C++ code you need to declare the function prototype as extern "C", like this:

extern "C" void lmic_aes_encrypt(unsigned char *Data, unsigned char *Key);

This cleaned the error right up. Turns out that even if a function body is defined in C++, what matters is where the original prototype is declared.
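
For completeness, here's a minimal sketch of how the borrowed primitive can then be applied to a whole message - assuming (as is usual for an AES-128 ECB block routine) that lmic_aes_encrypt encrypts a single 16-byte block in place, and that the message buffer has already been padded to a multiple of 16 bytes. encrypt_message and app_key are illustrative names, not the actual firmware:

extern "C" void lmic_aes_encrypt(unsigned char *Data, unsigned char *Key);

// app_key: a 16-byte AES-128 key shared with the application server (illustrative name)
void encrypt_message(unsigned char *buffer, size_t length, unsigned char *app_key) {
    // ECB encrypts each 16-byte block independently, in place
    for(size_t offset = 0; offset < length; offset += 16)
        lmic_aes_encrypt(buffer + offset, app_key);
}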

I'm hoping to release the source code, but I need to have a discussion with my supervisor about that at the end of the project.

Found this interesting? Come across some equally nasty bugs? Comment below!

Summer Project Part 4: Threading the needle and compacting it down

In the last part, I put the circuit for the IoT device together and designed a box for said circuit to be housed inside of.

In this post, I'm going to talk a little bit about 3D printing, but I'm mostly going to discuss the software aspect of the firmware I'm writing for the Arduino Uno that's going to be controlling the whole operation out in the field.

Since last time, I've completed the design for the housing and sent it off to my University for 3D printing. They had some great suggestions for improving the design like making the walls slightly thicker (moving from 2mm to 4mm), and including an extra lip on the lid to keep it from shifting around. Here are some pictures:

(Left: The housing itself. Right: The lid. On the opposite side (not shown), the screw holes are indented.)

At the same time as handling sending the housing off to be 3D printed, I've also been busily iterating on the software that the Arduino will be running - and this is what I'd like to spend the majority of this post talking about.

I've been taking an iterative approach to writing it - adding a library, interfacing with it and getting it to do what I want on its own, then integrating it into the main program.... and then compacting the whole thing down so that it'll fit inside the Arduino Uno. The thing is, the Uno is powered by an ATmega328P (datasheet), which has 32K of program space and just 2K of RAM. Not much. At all.

The codebase I've built for the Uno is based on the following libraries:

  • LMiC (the matthijskooijman fork) - the (rather heavy and needlessly complicated) LoRaWAN implementation
  • Entropy for generating random numbers as explained in part 2
  • TinyGPS, for decoding NMEA messages from the NEO-6M
  • SdFat, for interfacing with microSD cards over SPI

Memory Management

Packing the whole program into a 32K + 2K box is not exactly an easy challenge, I discovered. I chose to first deal with the RAM issue. This was greatly aided by the FreeMemory library, which tells you how much RAM you've got left at a given point in the execution of your program. While it's a bit outdated, it's still a useful tool. It works a bit like this:

#include <MemoryFree.h>

void setup() {
    Serial.begin(115200);
    Serial.println(freeMemory(), DEC);
    char test[] = "Bobs Rockets";
    Serial.println(freeMemory(), DEC); // Should be lower than the above call
}

void loop() {
    // Nothing here
}

It's worth taking a moment to revise the way stacks and heaps work - and the differences between how they work in the Arduino environment and on your desktop. This is going to get rather complicated quite quickly - so I'd advise reading this stack overflow answer first before continuing.

First, let's look at the locations in RAM for different types of allocation:

  • Things on the stack
  • Things on the heap
  • Global variables

Unlike on the device you're reading this on, the Arduino does not support multiple processes - and therefore the entirety of the RAM available is allocated to your program.

Since I wasn't sure about precisely how the Arduino does it (it's processor architecture-specific), I wrote a simple test program to tell me:

#include <Arduino.h>

struct Test {
    uint32_t a;
    char b;
};

Test global_var;

void setup() {
    Serial.begin(115200);

    Test stack;
    Test* heap = new Test();

    Serial.print(F("Stack location: "));
    Serial.println((uint32_t)(&stack), DEC);

    Serial.print(F("Heap location: "));
    Serial.println((uint32_t)heap, DEC);

    Serial.print(F("Global location: "));
    Serial.println((uint32_t)&global_var, DEC);
}

void loop() {
    // Nothing here
}

This prints the following for me:

Stack location: 2295
Heap location: 461
Global location: 284

From this we can deduce that global variables are located at the beginning of the RAM space, heap allocations go on top of globals, and the stack grows down starting from the end of RAM space. It's best explained with a diagram:
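
Roughly speaking, a text sketch based on the numbers printed above (rather than the original image):

low addresses                                                      high addresses
[ globals ] [ heap --> grows upwards ]   ....free....   [ <-- stack grows downwards ]
  (~284)      (~461)                                               (~2295)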

Now for the differences. On a normal machine running an operating system, there's an extra layer of abstraction between where things are actually located in RAM and where the operating system tells you they are located. This is known as virtual memory address translation (see also virtual memory, virtual address space).

It's a system whereby the operating system maintains a series of tables that map physical RAM to a virtual address space that the running processes actually use. Usually each process running on a system will have its own table (but this doesn't mean that it will have its own physical memory - see also shared memory, but this is a topic for another time). When a process accesses an area of memory with a virtual address, the operating system will transparently translate the address using the table to the actual location in RAM (or elsewhere) that the process wants to access.

This is important (and not only for security), because under normal operation a process will probably allocate and deallocate a bunch of different lumps of memory at different times. With a virtual address space, the operating system can defragment the physical RAM space in the background and move stuff around without disturbing currently running processes. Keeping the free memory contiguous speeds up future allocations, and ensures that if a process asks for a large block of contiguous memory the operating system will be able to allocate it without issue.

As I mentioned before though, the Arduino doesn't have a virtual memory system - partly because it doesn't support multiple processes (it would need an operating system for that). The side-effect here is that it doesn't defragment the physical RAM. Since C/C++ isn't a managed language, we don't get heap compaction either, like in .NET environments such as Mono.

All this leads us to an environment in which heap allocation needs to be done very carefully, in order to avoid fragmenting the heap and having it collide with the stack. If an object somewhere in the middle of the heap is deallocated, the heap will not shrink until everything to the right of it (i.e. everything allocated after it that's still in use) is also deallocated. This post has a good explanation of the problem too.
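
As an illustration of the problem (not code from the actual firmware), consider what happens to the heap here:

void fragmentation_example() {
    char* a = new char[10];
    char* b = new char[10];
    char* c = new char[10];

    delete[] b; // Frees 10 bytes in the middle of the heap...

    // ...but the heap can't shrink, because c is still allocated above the hole.
    // A subsequent allocation larger than the hole can't reuse it either,
    // so the total heap size only grows from here.
    char* d = new char[20];

    delete[] a; delete[] c; delete[] d;
}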

Other things we need to consider are keeping global variables to a minimum, and trying to keep most things on the stack if we can help it (though this may slow the program down if it's copying things between stack frames all the time).

To this end, we need to choose the libraries we use with care - because they can easily break these guidelines that we've set for ourselves. For example, the inbuilt SD library is out, because it uses a global variable that eats over 50% of our available RAM - and there's no way (that I can see, at least) to reclaim that RAM once we're finished with it.

This is why I chose SdFat instead, because it's at least a little better at allowing us to reclaim most of the RAM it used once we're finished with it by letting the instance fall out of scope (though in my testing I never managed to reclaim all of the RAM it used afterwards).
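
The general idea looks a bit like this (a rough sketch rather than the actual firmware - check SdFat's own examples for the precise calls; sd_chip_select here is a stand-in for whatever pin you've wired up):

void log_data_point() {
    {
        SdFat sd;
        if(sd.begin(sd_chip_select)) {
            // ...open the file and append the data point here...
        }
    } // sd falls out of scope here, so most of the RAM it grabbed can be reclaimed
}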

Alternatives like µ-Fat do exist and are even lighter, but they have restrictions - such as no appending to files - which would make the whole thing much more complicated, since we'd have to pre-allocate the space for the file (which would get rather messy).

The other major tactic for saving RAM is the F() trick. Consider the following sketch:

#include <Arduino.h>

void setup() {
    Serial.begin(115200);
    Serial.println("Bills boosters controller, version 1");
}

void loop() {
    // Nothing here
}

That Serial.println() call looks innocent enough. What's not obvious here is that the string literal is actually copied to RAM before being passed to Serial.println() - using up a huge amount of our precious working memory. Wrapping it in the F() macro forces it to stay in your program's storage space:

Serial.println(F("Bills boosters controller, version 1"));

Saving storage

With the RAM issue mostly dealt with, I then had to deal with the thorny issue of program space. Unfortunately, this is not as easy as saving RAM because we can't just 'unload' something when it's not needed.

My approach to reducing program storage space was twofold:

  • Picking lightweight alternatives to libraries I needed
  • Messing with flags of said libraries to avoid compiling parts of libraries I don't need

It is for these reasons that I ultimately went with TinyGPS instead of TinyGPS++, as it saved 1% or so of the program storage space.

It's also for this reason that I've disabled as much of LMiC as possible:

#define DISABLE_JOIN
#define DISABLE_PING
#define DISABLE_BEACONS
#define DISABLE_MCMD_DCAP_REQ
#define DISABLE_MCMD_DN2P_SET

This disables OTAA, Class B functionality (which I don't need anyway), receiving messages, the duty cycle cap system (which I'm not sure works between reboots), and a bunch of other stuff that I'd probably find rather useful.

In the future, I'll probably dedicate an entire microcontroller to handling LoRaWAN functionality - so that I can still use the features I've had to disable here.

Even doing all this, I still had to trim down my Serial.println() calls and remove any non-essential code to bring it under the 32K limit. As of the time of typing, I've got just 26 bytes to spare!

Next time, after tuning the TPL5110 down to the right value, we're probably going to switch gears and look at the server side of things - and how I'm going to be storing the data I receive from the Arduino-based device I've built.

Found this interesting? Got a suggestion? Comment below!

Summer Project Part 3: Putting it together

In the first post in this series, I outlined my plans for my MSc summer project and what I'm going to be doing. In the second post, I talked about random number generation for the data collection.

In this post, I'm going to give a general progress update - which will mostly centre around the Internet of Things device I'm building to collect the signal strength data.

Since the last post, I've got nearly all the parts I need for the project, except the TPL5111 power manager and 4 rechargeable AA batteries (which should be easy to come by - I'm sure I've got some lying around somewhere).

I've also wired the thing up, with a cable standing in for the TPL5111.

The IoT device all wired up. It basically consists of an Arduino Uno with a red Dragino LoRa shield on top, with a pair of small breadboards containing the peripherals and black power management boards respectively.

The power management board there technically doesn't need a breadboard, but it makes mounting it in the box easier.

I still need to splice the connector onto the battery box I had lying around with some soldering and electrical tape - I'll do that later this week.

The wiring there is kind of messy, but I've tested each device individually and they all appear to work as intended. Here's a clearer diagram of what's going on that I drew up in Fritzing (sudo apt install fritzing for Linux users):

Speaking of mounting things in the box, I've discovered OpenSCAD thanks to help from a friend and have been busily working away at designing a box to put everything in that can be 3D printed:

I've just got the lid to do next (which I'm going to do after writing this blog post), and then I'm going to get it printed.

With this all done, it's time to start working on the transport for the messages - namely using LMiC to connect to the network and send the GPS location to the application server (which is also unfinished).

The lovely people at the hardware meetup have lent me a full 8-channel LoRaWAN gateway that's connected to The Things Network for my project, which will make this process a lot easier.

Next time, I'll likely talk about 3D printing and how I've been 'threading the needle', so to speak.

Easy Paper Referencing and Research

For my MSc (Master of Science) degree that I'm currently working towards, I'm having to do an increasing amount of research and 'official' referencing in my University's referencing style.

Unlike when I first started at University, however, by now I've developed a number of strategies to aid me in doing this referencing properly with minimal effort - and actually finding papers in the first place. To this end, I wanted to write a quick post on what I do for my own reference - and hopefully someone else will find it useful too (comment below!).

Although I really like using Markdown for writing blog posts and other documents, when I've got to do proper referencing I often end up using LaTeX instead. I first used LaTeX for my interim report in my final year of my undergraduate degree, and while I'm still looking for a decent renderer (I think I'm using TeX Live, and it's a pain), it's great for referencing because of BibTeX. With the template provided by my University, I can enter something like this:

@Misc{Techopedia2017,
    author = {{Techopedia Inc.}},
    year = {2017},
    title = {What is Hamming Distance? - Definition from Techopedia},
    howpublished = {Available online: \url{https://www.techopedia.com/definition/19723/hamming-distance} [Accessed 04/04/2019]}
}

...and it will automagically convert it into the right style according to the template my University has given me. Then I just reference it (\citep{Techopedia2017}), and I'm away!
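
If your editor doesn't automate the build for you, the classic manual BibTeX cycle looks something like this (assuming the main file is called report.tex - the repeated pdflatex runs are needed so that all the cross-references get resolved):

pdflatex report.tex
bibtex report
pdflatex report.tex
pdflatex report.tex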

For actually finding papers, I've been finding Google Scholar useful. It even has a button you can press that generates the BibTeX definition as shown above for you to paste into your references file!

Finally, I've just recently discovered Microsoft Academic. While I haven't used it too much yet, it seems to be a great alternative to Google Scholar. It's got a cool AI-based semantic search engine, and an interesting summary view of how a given paper relates to other papers to help you find more papers on a topic. It shows a paper's references, the papers that cite that paper, and papers that it determines to be related - giving you plenty to choose from.

Using a combination of these approaches, I've been able to focus effectively on finding and referencing papers, without getting bogged down too much in the semantics of how I should actually do the referencing itself.

Found this useful? Got another tip? Comment below!

Delivering Linux 101

Achievement get: Deliver workshop!

At the beginning of my time here at University, I never thought I'd be planning and leading the delivery of an entire workshop on the basics of Linux. Assessed coursework presentations have nothing on this!

Overall, I think it went rather well, actually. About a dozen people attended in total, and most people seemed to manage to get near the end of the tasks I had prepared:

  1. Installing Ubuntu
  2. Installing Mono
  3. Investigating Monodevelop

I think next time I want to better prepare for the gap when installing the operating system, as it took much longer than I expected. Perhaps choosing the "minimal" installation instead of the "normal" installation would help here?

Preparing some slides on things like the folder structure and layout, and re-ordering the slides about package management, would all help.

If I can't cut down on the installation time, pre-installed virtual machines would also work - but I'd like to keep the OS installation step if possible, to show that installing Ubuntu on their own machines is an easy process.

Moving forwards, I've already received a bunch of feedback on what future sessions could contain:

  1. Setting up remote access
    • This would be SSH, which is already installed & pre-setup on a server installation of Ubuntu
  2. Gaming
    • I'm unsure precisely what's meant by this. Is it installation of various games? Or maybe it's configuration of various platforms such as Steam? Perhaps someone could elaborate on it?
  3. Server installation & maintenance
    • Installation is largely similar to a desktop
    • I'd want to measure how long it takes to install, because much of the work with a server is the post-install tasks
    • Perhaps looking into a pre-installed server might be beneficial here, but security would be a slight concern

I think for anything more advanced, I'll probably go with a lab sheet-style setup instead, so that people can work at their own pace - especially since something like server configuration has many different steps to it.

I'd certainly want a goal to work towards for such a session. I've had some ideas already:

  • Setting up a web server
    • Installing Nginx
    • Writing and understanding configuration files
    • Possibly some FastCGI? PHP / Python? Probably not, what with everything else
  • Setting up a server to host a custom application
    • Writing systemd service files
    • Setting up log rotation

Common to both of these ideas would be:

  • Basic terminal skills
  • Uploading / downloading files
  • Basic hardening

I'm pretty sure I'll be doing another one of these sessions, although I'm unsure as to whether there's the demand for a repeat of this one.

If you've got any thoughts, let me know in the comments below!

Thanks also to @MoirkoB and everyone else who provided both time and resources to enable this to go ahead. Without them, I'm sure it wouldn't have happened.

If you'd like to view the slide deck I used, you can do so here:

Linux 101 Slide Deck

If you missed it, but would like to be notified of future sessions, then fill out this Google Form:

Linux 101 Overflow

Compilers, VMs, and JIT: Spot the difference

It's about time for another demystification post, I think :P This time, I'm going to talk about Compilers, Virtual Machines (VMs), and Just-In-Time Compilation (JIT) - and the way that they are both related and yet different.

Compilers

To start with, a compiler is a program that converts a program written in 1 language into another language (usually one of a lower level). For example, gcc compiles C++ into native machine code.

A compiler usually has both a backend and a frontend. The frontend is responsible for the lexing and parsing of the source programming language. It's the part of the compiler that generates compiler warnings and errors. The frontend outputs an abstract syntax tree (or parse tree) that represents the source input program.

The backend then takes this abstract syntax tree, walks it, and generates code in the target output language. Such code generators are generally recursive in nature. Depending on the compiler toolchain, this may or may not be the last step in the build process.

gcc, for example, generates code to object files - which are then strung together by a linking program later in the build process.
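
As a rough illustration of what that separation looks like on the command line (the file names are made up):

g++ -c main.cpp -o main.o        # frontend + backend: one source file -> one object file
g++ -c utils.cpp -o utils.o
g++ main.o utils.o -o program    # linker: object files strung together into the final executable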

Additional detail on the structure of a compiler (and how to build one yourself!) is beyond the scope of this article, but if you're interested I recommend reading my earlier Compilers 101 post.

Virtual Machines

A virtual machine comes in several flavours. Common to all types is the ability to execute instructions - through software - as if they were being executed on a 'real' hardware implementation of the programming language in question.

Examples here include:

  • KVM - The Linux virtual machine engine. Leverages hardware extensions to support various assembly languages that are implemented in real hardware - everything from Intel x86 Assembly to various ARM instruction sets. Virtual Machine Manager is a good GUI for it.
  • VirtualBox - Oracle's offering. Somewhat easier to use than the above - and cross-platform too!
  • .NET / Mono - The .NET runtime is actually a virtual machine in its own right. It executes Common Intermediate Language in a managed environment.

It's also important to note the difference between a virtual machine and an emulator. An emulator is very similar to a virtual machine - except that it doesn't actually implement a language or instruction set itself. Instead, it 'polyfills' hardware (or environment) features that don't exist in the target runtime environment. Great examples here are WINE and Parallels, which allow programs written for Microsoft Windows to be run on other platforms without modification.

Just-In-Time Compilation

JIT is sort of a combination of the above. The .NET runtime (officially known as the Common Language Runtime, or CLR) is an example of this - in that it compiles the CIL in the source assembly to native code just before execution.

Utilising such a mechanism does result in an additional delay during startup, but usually pays for itself in the vast performance improvements that can be made over an interpreter - as a JITting VM can automatically optimise code as it's executing to increase performance in real-time.

This is different to an interpreter, which reads a source program line-by-line - parsing and executing it as it goes. If it needs to go back to another part of the program, it will usually have to re-parse the code before it can execute it.

Conclusion

In this post, we've taken a brief look at compilers, VMs, and JIT. We've looked at the differences between them - and how they are different from their siblings (emulators and interpreters). In many ways, the line between the 3 can become somewhat blurred (hello, .NET!) - so learning the characteristics of each is helpful for disentangling different components of a program.

If there's anything I've missed - please let me know! If you're still confused on 1 or more points, I'm happy to expand on them if you comment below :-)

Found this useful? Spotted a mistake? Comment below!

Disassembling .NET Assemblies with Mono

As part of the Component-Based Architectures module on my University course, I've been looking at what makes the .NET ecosystem tick, and how .NET assemblies (i.e. .NET .exe / .dll files) are put together. In the process, we looked at disassembling .NET assemblies into the text form of the Common Intermediate Language (CIL) that they contain. The instructions on how to do this were Windows-specific though - so I thought I'd post about the process on Linux and other platforms here.

Our tool of choice will be Mono - but before we get to that we'll need something to disassemble. Here's a good candidate for the role:

using System;

namespace SBRL.Demo.Disassembly {
    static class Program {
        public static void Main(string[] args) {
            int a = int.Parse(Console.ReadLine()), b = 10;
            Console.WriteLine(
                "{0} + {1} = {2}",
                a, b,
                a + b
            );
        }
    }
}

Excellent. Let's compile it:

csc Program.cs

This should create a new Program.exe file in the current directory. Before we get to disassembling it, it's worth mentioning how the compilation and execution process works in .NET. It's best explained with the aid of a diagram:

Left-to-right flowchart: Multiple source languages get compiled into Common Intermediate Language, which is then executed by an execution environment.

As is depicted in the diagram above, source code in multiple languages gets compiled (not necessarily with the same compiler, of course) into Common Intermediate Language, or CIL. This CIL is then executed in an Execution Environment - which is usually a virtual machine (nope, not as in VirtualBox and KVM - it's not a separate operating system as such, rather a layer of abstraction), which may (or may not) decide to compile the CIL down into native code through a process called JIT (Just-In-Time compilation).

It's also worth mentioning here that the CIL code generated by the compiler is in binary form, as this takes up less space and is (much) faster for the computer to operate on. After all, CIL is designed to be efficient for a computer to understand - not people!

We can make it more readable by disassembling it into its textual equivalent. Doing so with Mono is actually quite simple:

monodis Program.exe >Program.il

Here I redirect the output to a file called Program.il for convenience, as my editor has a plugin for syntax-highlighting CIL. For those reading without access to Mono, here's what I got when disassembling the above program:

.assembly extern mscorlib
{
  .ver 4:0:0:0
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 ) // .z\V.4..
}
.assembly 'Program'
{
  .custom instance void class [mscorlib]System.Runtime.CompilerServices.CompilationRelaxationsAttribute::'.ctor'(int32) =  (01 00 08 00 00 00 00 00 ) // ........

  .custom instance void class [mscorlib]System.Runtime.CompilerServices.RuntimeCompatibilityAttribute::'.ctor'() =  (
        01 00 01 00 54 02 16 57 72 61 70 4E 6F 6E 45 78   // ....T..WrapNonEx
        63 65 70 74 69 6F 6E 54 68 72 6F 77 73 01       ) // ceptionThrows.

  .custom instance void class [mscorlib]System.Diagnostics.DebuggableAttribute::'.ctor'(valuetype [mscorlib]System.Diagnostics.DebuggableAttribute/DebuggingModes) =  (01 00 07 01 00 00 00 00 ) // ........

  .hash algorithm 0x00008004
  .ver  0:0:0:0
}
.module Program.exe // GUID = {D6162DAD-AD98-45B3-814F-C646C6DD7998}

.namespace SBRL.Demo.Disassembly
{
  .class private auto ansi beforefieldinit Program
    extends [mscorlib]System.Object
  {

    // method line 1
    .method public static hidebysig 
           default void Main (string[] args)  cil managed 
    {
        // Method begins at RVA 0x2050
    .entrypoint
    // Code size 47 (0x2f)
    .maxstack 5
    .locals init (
        int32   V_0,
        int32   V_1)
    IL_0000:  nop 
    IL_0001:  call string class [mscorlib]System.Console::ReadLine()
    IL_0006:  call int32 int32::Parse(string)
    IL_000b:  stloc.0 
    IL_000c:  ldc.i4.s 0x0a
    IL_000e:  stloc.1 
    IL_000f:  ldstr "{0} + {1} = {2}"
    IL_0014:  ldloc.0 
    IL_0015:  box [mscorlib]System.Int32
    IL_001a:  ldloc.1 
    IL_001b:  box [mscorlib]System.Int32
    IL_0020:  ldloc.0 
    IL_0021:  ldloc.1 
    IL_0022:  add 
    IL_0023:  box [mscorlib]System.Int32
    IL_0028:  call void class [mscorlib]System.Console::WriteLine(string, object, object, object)
    IL_002d:  nop 
    IL_002e:  ret 
    } // end of method Program::Main

    // method line 2
    .method public hidebysig specialname rtspecialname 
           instance default void '.ctor' ()  cil managed 
    {
        // Method begins at RVA 0x208b
    // Code size 8 (0x8)
    .maxstack 8
    IL_0000:  ldarg.0 
    IL_0001:  call instance void object::'.ctor'()
    IL_0006:  nop 
    IL_0007:  ret 
    } // end of method Program::.ctor

  } // end of class SBRL.Demo.Disassembly.Program
}

Very interesting. There are a few things of note here:

  • The metadata at the top of the CIL tells the execution environment a bunch of useful things about the assembly, such as the version number, the classes contained within (and their signatures), and a bunch of other random attributes.
  • An extra .ctor method has been generated for us automatically. It's the class' constructor, and it automagically calls the base constructor of the object class, since all classes are descended from object.
  • The ints a and b are boxed before being passed to Console.WriteLine. Exactly what this does and why is quite complicated, and best explained by this Stackoverflow answer.
  • We can deduce that CIL is a stack-based language from the add instruction, as it has no arguments.

I'd recommend that you explore this on your own with your own test programs. Try changing things and see what happens!

  • Try making the Program class static
  • Try refactoring the int.Parse(Console.ReadLine()) into its own method. How is the value returned?

This isn't all, though. We can also recompile the CIL back into an assembly with the ilasm tool:

ilasm Program.il

This makes for some additional fun experiments:

  • See if you can find where b's value is defined, and change it
  • What happens if you alter the Console.WriteLine() format string so that it becomes invalid?
  • Can you get ilasm to reassemble an executable into a .dll library file?

Found this interesting? Discovered something cool? Comment below!


Converting my timetable to ical with Node.JS and Nightmare

A photo of a nice beach in a small bay, taken from a hill off to the side. A small-leaved tree in the foreground frames the bottom-left, with white-crested waves breaking over the beach in the background, before the land rises steeply on the way inland.

(Source: Taken by me!)

My University timetable is a nightmare. I either have to use a terrible custom app for my phone, or an awkwardly-built website that feels like it's at least 10 years old!

Thankfully, it's not all doom and gloom. For a number of years now, I've been maintaining a Node.JS-based converter script that automatically pulls said timetable down from the JSON backend of the app - thanks to a friend who reverse-engineered said app. It then exports it as a .ical file that I can upload to my server & subscribe to in my pre-existing calendar.

Unfortunately, said backend changed quite dramatically recently, and broke my script. The only alternative was the annoying timetable website, which really doesn't like being scraped.

Where there's a will, there's a way though. Not to be deterred, I gave it a nightmare of my own: a scraper written with Nightmare.JS - a Node.JS library that acts, essentially, as a scriptable web-browser!

While the library has some kinks (especially with .wait("selector")), it worked well enough for me to implement a scraper that pulled down my timetable in HTML form, which I then proceeded to parse with cheerio.
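
The general pattern looks something like this (a rough sketch, not the actual scraper - the URL and selector here are made up):

const Nightmare = require("nightmare");
const cheerio = require("cheerio");

new Nightmare({ show: false })
    .goto("https://timetable.example.com/login")
    // ...type credentials and click through to the timetable page here...
    .wait("#timetable")
    .evaluate(() => document.body.innerHTML)
    .end()
    .then((html) => {
        const $ = cheerio.load(html);
        // ...pull the lectures out of the HTML and build the ical file...
    });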

The code is open-source (find it here!) - and as of this week I've updated it to work with the new update to the timetabling system this semester. A further update will be needed in early December time, which I'll also be pushing to the repository.

The README of the repository should contain adequate instructions for getting it running yourself, but if not, please open an issue!

Note that I am not responsible for anything that happens as a result of using this script! I would strongly recommend setting up the secure storage of your password if you intend to automate it. I've just written this to solve a problem in order to ensure that I can actually get to my lectures on time - and not an hour late or on the wrong week because I've misread the timetable (again)!

In the future, I'd like to experiment with other scriptable web-browser frameworks to compare them with my experiences with NightmareJS.

Found this interesting? Found a better way to do this? Comment below!

Achievement Get: Complete Degree!

Check out the strawberry I found in my greenhouse!

Hey! I've just realised that this was my 300th post. Wow! Thanks for all the support so far. Here's to the next 300 :D

I've just finished my degree at University this week. I'm still waiting on results, but I thought I'd make a post about it documenting my thoughts so far before I forget (also, happy 300th post! Wow, have I reached that already?). Note that this doesn't mean the end of this blog - far from it! I'll be doing a masters this next academic year.

It's been a great journey to have the chance to go on. I feel like I've improved in so many different ways having gone to University. Make no mistake: University isn't for everyone (if you're considering it, make sure you do your research about all your options!) - but I've found that it's been right for me.

I'm glad I took Rob Miles' suggestion of starting a blog - it has been a great investment of my time. For one, I've been able to document the things that I've been learning so that I can come back to them later, and read a more personalised guide to the thing I blogged about. I've also learnt a ton about Linux server management too - as I manage the server that this blog runs on entirely through the terminal (sorry, hackers who want to get into my non-existent management web interface - I know you're out there - you leave pawprints all over my server logs!). All very valuable experiences - I highly suggest that you start one too (you won't regret it. I promise!).

I've also found that my eyes have been opened in more ways than one whilst doing my degree - both to new ways of approaching problems, and new ways of solving them - and many other things that would take too long to list here. I've blogged about some of my favourite modules in this regard before - particularly Virtual Reality and Languages and Compilers.

Thanks to all the amazing people I've met along the way, I've ended up in a much better place than when I started.

Distributing work with Node.js

A graph of the data I generated by writing the scripts I talk about in this post. (Above: A pair of graphs generated with gnuplot from the data I crunched with the scripts I talk about in this blog post. Anti-aliased version - easier to pick details out [928.1 KiB])

I really like Node.js. For those not in the know, it's basically Javascript for servers - and it's brilliant at networking. Like really really good. Like C♯-beating good. Anyway, last week I had a 2-layer neural network for which I wanted to simulate all the different combinations of 1-64 nodes in both layers, as I wanted to generate a 3-dimensional surface graph of the error.

Since my neural network (which is also written in Node.js :P) has a command-line interface, I wrote a simple shell script to drive it in parallel, and set it going on a Raspberry Pi I have acting as a file server (it doesn't do much else most of the time). After doing some calculations, I determined that it would finish at 6:40am Thursday..... next week!

Of course, taking so long is no good at all if you need it done Thursday this week - so I set about writing a script that would parallelise it over the network. In the end I didn't actually include the data generated in my report for which I had the Thursday deadline, but it was a cool challenge nonetheless!

Server

To start with, I created a server script that would allocate work items, called nodecount-surface-server.js. The first job was to set things up and create a quick settings object and a work item generator:

#!/usr/bin/env node
// ^----| Shebang to make executing it on Linux easier

const http = require("http"); // We'll need this later

const settings = {
    port: 32000,
    min: 1,
    max: 64,
};
settings.start = [settings.min, settings.min];

function* work_items() {
    for(let a = settings.start[0]; a < settings.max; a++) {
        for(let b = settings.start[1]; b < settings.max; b++) {
            yield [a, b];
        }
    }
}

That function* is a generator. C♯ has them too - and they let a function return more than one item in an orderly fashion. In my case, it returns arrays of numbers which I use as the topology for my neural networks:

[1, 1]
[1, 2]
[1, 3]
[1, 4]
....

Next, I wrote the server itself. Since it was just a temporary script that was running on my local network, I didn't implement too many security measures - please bear this in mind if using or adapting it yourself!


function calculate_progress(work_item) {
    let i = (work_item[0]-1)*64 + (work_item[1]-1), max = settings.max * settings.max;
    return `${i} / ${max} ${(i/max*100).toFixed(2)}%`;
}

var work_generator = work_items();

const server = http.createServer((request, response) => {
    switch(request.method) {
        case "GET":
            let next = work_generator.next();
            let next_item = next.value;
            if(next.done)
                break;
            response.write(next_item.join("\t"));
            console.error(`[allocation] [${calculate_progress(next_item)}] ${next_item}`);
            break;
        case "POST":
            var body = "";
            request.on("data", (data) => body += data);
            request.on("end", () => {
                console.log(body);
                console.error(`[complete] ${body}`);
            })
            break;
    }
    response.end();
});
server.on("clientError", (error, socket) => {
    socket.end("HTTP/1.1 400 Bad Request");
});
server.listen(settings.port, () => { console.error(`Listening on ${settings.port}`); });

Basically, the server accepts 2 types of requests:

  • GET requests, which ask for work
  • POST requests, which respond with the results of a work item

In my case, I send out work items like this:

11  24

...and will be receiving work results like this:

11  24  0.2497276811644629

This means that I don't even need to keep track of which work item I'm receiving a result for! If I did though, I'd probably have some kind of ID-based system with a list of allocated work items which I could refer back to - and periodically iterate over to identify any items that got lost somewhere, so I can add them to a reallocation queue.
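
A rough sketch of what that might look like on the server (illustrative only - it isn't part of the script above):

const allocated = new Map(); // id -> { item, allocated_at }
const reallocation_queue = [];
let next_id = 0;

function allocate(work_item) {
    let id = next_id++;
    allocated.set(id, { item: work_item, allocated_at: Date.now() });
    return id;
}
function complete(id) {
    allocated.delete(id);
}
function requeue_stale(max_age_ms) {
    // Periodically push anything that's been out too long back onto the reallocation queue
    for(let [id, entry] of allocated)
        if(Date.now() - entry.allocated_at > max_age_ms) {
            reallocation_queue.push(entry.item);
            allocated.delete(id);
        }
}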

With that, the server was complete. It outputs the completed work item results to the standard output, and progress information to the standard error. This allows me to invoke it like this:

node ./nodecount-surface-server.js >results.tsv

Worker

Very cool. A server isn't much good without an army of workers ready and waiting to tear through the work items it's serving at breakneck speed though - and that's where the worker comes in. I started writing it in much the same way I did the server:

#!/usr/bin/env node
// ^----| Another shebang, just like the server

const http = require("http"); // We'll need this to talk to the server later
const child_process = require("child_process"); // This is used to spawn the neural network subprocess

const settings = {
    server: { host: "172.16.230.58", port: 32000 },
    worker_command: "./network.js --epochs 1000 --learning-rate 0.2 --topology {topology} <datasets/acw-2-set-10.txt 2>/dev/null"
};

That worker_command there in the settings object is the command I used to execute the neural network, with a placeholder {topology} which we find-and-replace just before execution. Due to obvious reasons (no plagiarism, thanks!) I can't release that script itself, but it's not necessary for understanding how the distributed work item system I've written works. It could just as well be any other command you like!

Next up is the work item executor itself. Since it obviously takes time to execute a work item (why else would I go to such lengths to process as many of them at once as possible :P), I take a callback as the 2nd argument (it's just like a delegate or Action in C♯):


function execute_item(data, callback) {
    let command = settings.worker_command.replace("{topology}", data.join(","));
    console.log(`[execute] ${command}`);
    let network_process = child_process.exec(command, (error, stdout, stderr) =>  {
        console.log(`[done] ${stdout.trim()}`);
        let result = stdout.trim().split(/\t|,/g);
        let payload = `${result[0]}\t${result[1]}\t${result[5]}`;

        let request = http.request({
            hostname: settings.server.host,
            port: settings.server.port,
            path: "/",
            method: "POST",
            headers: {
                "content-length": payload.length
            }
        }, (response) => {
            console.log(`[submitted] ${payload}`);
            callback();
        });
        request.write(payload);
        request.end();
    });
}

In the above I substitute in the work item array as a comma-separated list, execute the command as a subprocess, report the result back to the server, and then call the callback. To report the result back I use the http module built-in to Node.JS, but if I were to tidy this up I would probably use an npm package like got instead, as it simplifies the code a lot and provides more features / better error handling / etc.

A work item executor is no good without any work to do, so that's what I tackled next. I wrote another function that fetches work items from the server and executes them - wrapping the whole thing in a Promise to make looping it easier later:


function do_work() {
    return new Promise(function(resolve, reject) {
        let request = http.request({
            hostname: settings.server.host,
            port: settings.server.port,
            path: "/",
            method: "GET"
        }, (response) => {
            var body = "";
            response.on("data", (chunk) => body += chunk);
            response.on("end", () => {
                if(body.trim().length == 0) {
                    console.error(`No work item received. We're done!`);
                    process.exit();
                }
                let work_item = body.split(/\s+/).map((item) => parseInt(item.trim()));
                console.log(`[work item] ${work_item}`);
                execute_item(work_item, resolve);
            });
        });
        request.end();
    });
}

Awesome! It's really coming together. Doing just one work item isn't good enough though, so I took it to the next level:

function* do_lots_of_work() {
    while(true) {
        yield do_work();
    }
}

// From https://starbeamrainbowlabs.com/blog/article.php?article=posts/087-Advanced-Generators.html
function run_generator(g) {
    var it = g(), ret;

    (function iterate() {
        ret = it.next();
        ret.value.then(iterate);
    })();
}

run_generator(do_lots_of_work);

Much better. That completed the worker script - so all that remained was to set it going on as many machines as I could get my hands on, sit back, and watch it go :D

I did have some trouble with crashes at the end because there was no work left for them to do, but it didn't take (much) fiddling to figure out where the problem(s) lay.

Each instance of the worker script can max out a single core of a machine, so multiple instances of the worker script are needed per machine in order to fully utilise a single machine's resources. If I ever need to do this again, I'll probably make use of the built-in cluster module to simplify it, such that I only need to start a single instance of the worker script per machine instead of 1 for each core.
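
Something along these lines would probably do the trick (an untested sketch reusing the functions from the worker script above):

const cluster = require("cluster");
const os = require("os");

if(cluster.isMaster) {
    // Fork one worker process per CPU core
    for(let i = 0; i < os.cpus().length; i++)
        cluster.fork();
} else {
    // Each forked process runs the existing work loop
    run_generator(do_lots_of_work);
}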

Come to think of it, it would have looked really cool if I'd done it at University and employed a whole row of machines in a deserted lab doing the crunching - especially since it was for my report....

Liked this post? Got an improvement? Comment below!
