Starbeamrainbowlabs

Stardust
Blog



Change the way you think: Languages and Compilers in Review

Change the way you think.

I haven't seen it used recently, but when I first arrived at Hull University to start my degree, that's the phrase I saw on a number of posters about the place. I've been thinking about it a lot over the course of my degree - and I've found that it has rung true more than once. My understanding of programming has been transformed 3 times that I can count, and before I get to the Languages and Compilers review that this post is supposed to be about, I'd like to talk a little bit about that first.

The first time my understanding transformed was when I arrived in my first year. Up until that point, I'd been entirely self-taught - learning languages such as _GML_ and later Javascript. All of a sudden, I was introduced to the concept of _Object-Oriented Programming_, and suddenly I understood that by representing things as classes and objects, it was possible to build larger programs without having them fall apart because they were a nightmare to maintain.

Then, when I was finishing up my year in industry, my understanding was transformed again. I realised the value of the experiences I'd had while I was out on my year in industry - they have not only shown me mechanisms by which a project can be effectively managed, but they have also given me a bit more of an idea what I'd like to do when I finish my degree.

Finally, I think that Languages and Compilers is the third time I've changed the way I think about programming. It's transformed my understanding of how programming languages are built, how their compilers and interpreters work, and why things are the way they are. It's also shown me that there's no best programming language - there's only the best one for the task at hand.

With this in hand, I have the tools I need to pick up and understand a new language much more easily than I could before, by comparing its features to the ones that I already know about. I've found a totally new way of looking at programming languages: looking at them not on their own, but at how their lexical style and paradigms compare to those employed by other languages.

If you're considering whether a degree in Computer Science is worth it, I'd say that if you're serious about programming for a living, then the answer is a resounding yes.

Have your own thoughts to add about (your?) CS degree? Have a question? Post a comment below!

Nasty Example SPL Programs

I've written a few example test programs to debug various edge cases whilst writing a compiler for my coursework at University, and I thought I'd post them here to help others out. As this coursework is ongoing I'm not going to say anything else, but I hope that these help someone else out :-)

Anyone posting coursework compiler code here will have it deleted (but additional example SPL programs are welcome). In addition, I make no guarantees that these programs are even valid SPL that should compile.

Nasty A

NastyA:
DECLARATIONS
    a OF TYPE INTEGER;
CODE
    READ(a);
    WRITE(a);
    WRITE(45, 'c', 5.68249)
ENDP NastyA.

Nasty B

NastyB:

DECLARATIONS

    i, toggle, step, limit OF TYPE INTEGER;

CODE

    1 -> toggle;
    1 -> step;
    10 -> limit;
    FOR i IS 1 BY step TO limit DO
        IF toggle = 1 THEN
            -1 -> step;
            -1 -> toggle;
            -10 -> limit
        ENDIF;
        WRITE(i);
        NEWLINE;
        WRITE(toggle);
        NEWLINE;
        WRITE(step);
        NEWLINE;
        NEWLINE
    ENDFOR;
    NEWLINE

ENDP NastyB.

Nasty C

NastyC:

DECLARATIONS
    a,b,c OF TYPE INTEGER;
    d,e,f,g OF TYPE REAL;
CODE
    5 + 8 -> a;
    4.5 + 6.8 -> d;
    WRITE(a);
    NEWLINE;
    WRITE(d);
    NEWLINE;
    WRITE((8 - 4));
    NEWLINE
ENDP NastyC.

Compilers 101: Build your own flex + bison compiler in a few easy(?) steps

So you want to build your own compiler? Great! Don't know where to start? This guide should help! At University, we're building our own compiler for a custom programming language invented by our lecturer with a pair of GNU tools by the name of flex and bison - which I've blogged about before. Since that post, I've learnt a ton about how the whole process works, so I thought I'd write up a more coherent blog post on the subject :P

A diagram explaining the process of building a compiler. Explained below.

Stage 1: Planning

The whole process starts with railroad diagrams (also known as syntax diagrams) of the language you want to write a compiler for. Having an accurate set of railroad diagrams is essential to understanding precisely how the language is put together, which is rather useful for the next step.

That next step is converting the railroad diagrams into plain BNF (Backus-Naur Form). Unfortunately, bison doesn't support EBNF-like notation at the current time, so only plain-old BNF will do.

Stage 2: Lexing

With your railroad diagrams converted into BNF, you can start writing code! The first chunk of code that needs writing is the lexer. Lexing is what flex is good at - and involves converting the input source code into lexemes - discrete sequences of characters that match a particular pattern and can be assigned a particular category name, turning it into a token. Perhaps an example would help. Consider the following:

void do_awesome_stuff(int a, string b) {
    /* Code here */
}

The above can be turned into a sequence of tokens, not unlike the following (ignoring whitespace tokens, of course):

TYPE: void
IDENTIFIER: do_awesome_stuff
OPEN_BRACKET: (
TYPE: int
IDENTIFIER: a
COMMA: ,
TYPE: string
IDENTIFIER: b
CLOSE_BRACKET: )
OPEN_BRACE: {
COMMENT: /* Code here */
CLOSE_BRACE: }

See? We can identify 8 token types in the source string: TYPE, IDENTIFIER, COMMA, OPEN_BRACKET, CLOSE_BRACKET, OPEN_BRACE, COMMENT, and CLOSE_BRACE. These types and the rules to match them can be found by analysing a combination of the railroad diagrams and the BNF you created earlier.
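
To make the idea a bit more concrete, here's a rough hand-written sketch in plain C of the kind of work a lexer does for the example above. It isn't what flex generates (flex builds a much faster table-driven matcher from your regular expressions), and the names TokenType, is_type_name and next_token are all made up for this illustration, as is the assumption that only void, int and string count as types:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token categories matching the list above. */
typedef enum {
    TOK_EOF, TOK_TYPE, TOK_IDENTIFIER, TOK_COMMA,
    TOK_OPEN_BRACKET, TOK_CLOSE_BRACKET,
    TOK_OPEN_BRACE, TOK_CLOSE_BRACE, TOK_COMMENT, TOK_UNKNOWN
} TokenType;

/* Treat a small fixed set of words as type names (an assumption for this demo). */
static int is_type_name(const char *word)
{
    return strcmp(word, "void") == 0 || strcmp(word, "int") == 0
        || strcmp(word, "string") == 0;
}

/*
 * Reads one token starting at *cursor, copies its text into lexeme
 * (truncated to fit cap bytes), and advances *cursor past it.
 */
TokenType next_token(const char **cursor, char *lexeme, size_t cap)
{
    const char *p = *cursor;
    const char *start;
    size_t len;
    TokenType type;

    while (isspace((unsigned char)*p)) p++;        /* skip whitespace */
    start = p;

    if (*p == '\0') {
        type = TOK_EOF;
    } else if (p[0] == '/' && p[1] == '*') {
        const char *end = strstr(p + 2, "*/");
        p = end ? end + 2 : p + strlen(p);         /* unterminated: take the rest */
        type = TOK_COMMENT;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_') p++;
        type = TOK_IDENTIFIER;                     /* may become TOK_TYPE below */
    } else {
        switch (*p++) {
            case ',': type = TOK_COMMA; break;
            case '(': type = TOK_OPEN_BRACKET; break;
            case ')': type = TOK_CLOSE_BRACKET; break;
            case '{': type = TOK_OPEN_BRACE; break;
            case '}': type = TOK_CLOSE_BRACE; break;
            default:  type = TOK_UNKNOWN; break;
        }
    }

    len = (size_t)(p - start);
    if (cap > 0) {
        if (len >= cap) len = cap - 1;
        memcpy(lexeme, start, len);
        lexeme[len] = '\0';
        if (type == TOK_IDENTIFIER && is_type_name(lexeme))
            type = TOK_TYPE;
    }

    *cursor = p;
    return type;
}
```

Calling next_token() in a loop until it returns TOK_EOF walks through the source one token at a time, which is roughly the contract that the yylex() function generated by flex offers to bison.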

Stage 3: Parser the first

With a lexer in hand, we can now look at writing the parser. This is done in two stages: writing the parser itself, and then upgrading said parser to generate a parse tree.

Let's talk about the parser first. The parser can be largely created simply by running a few regular-expression find and replace rules on your BNF, actually. From there, it's just a case of adding the header and the footer to complete the document.

Let's take a look at some example BNF:

<instructions> ::= START <lines> END

<lines> ::= <lines> <line>
    | <line>

<line> ::= <command>

<command> ::= <cmd_name> <number>

<cmd_name> ::= FD
    | BK
    | LT
    | RT

<number> ::= <number> <digit>
    | <digit>

<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

The above matches something like the following:

START
FD 100
RT 180
FD 125
LT 90
BK 50
END

Very interesting (a virtual cookie is available for anyone who gets the reference as to what this grammar represents!). Let's look at converting that into something bison can understand:

instructions : START lines END
    ;

lines : lines line
    | line
    ;

line : command
    ;

command : cmd_name number
    ;

cmd_name : FD
    | BK
    | LT
    | RT
    ;

number : number digit
    | digit
    ;

digit : 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

That's looking much better already! Simply by using the regular expression substitutions:

  1. <([a-z_]+)> -> $1
  2. ::= -> :
  3. \n\n -> \n\t;\n\n

....we can get most of the way there to something that bison can understand. Next, we need to refactor it a bit to tell it which tokens are coming from the lexer (which I'll leave up to you to write as an exercise), so it doesn't get them confused with the rules - which are defined in the BNF(-like) rules.

Let's get rid of the rule for number and digit first, since we can do those in the lexer quite easily. Next, let's add a %token definition to the top to tell it which are coming from the lexer. It's good practice to define everything that comes from the lexer in uppercase, and everything that's a rule that exists only in bison in lowercase:

%start instructions
%token FD BK LT RT NUMBER START END

We've also defined a start symbol - the one which when bison reaches it, it knows that it's completed the parsing process, as bison is a bottom-up parser.

Lastly, we need to reference the lexer itself. Thankfully that's easy too by appending to your new bison file:

%%

#include "lex.yy.c"

Very nice. Don't forget about the new line at the end of the file - flex and bison will complain if it isn't present! Here's the completed bison file:

%start instructions
%token FD BK LT RT NUMBER START END

%%

instructions : START lines END
    ;

lines : lines line
    | line
    ;

line : command
    ;

command : cmd_name NUMBER
    ;

cmd_name : FD
    | BK
    | LT
    | RT
    ;

%%

#include "lex.yy.c"

With a brand-new bison file completed, there's just one component of the parser left - a plain-old C file that calls it. Let's create one of those quickly:


#include <stdio.h>

int yyparse(void);

int main(void)
{       
    # if YYDEBUG == 1
    extern int yydebug;
    yydebug = 1;
    #endif

    return yyparse();
}

void yyerror(char *error_message)
{
    fprintf(stderr, "Error: %s\nExiting\n", error_message);
}

The lines between #if and #endif enable a special debugging mode built into bison if the compile-time symbol YYDEBUG is defined (the -DYYDEBUG below) and bison is run with the -t flag. Here's the sequence of commands needed to compile this:

flex lexer.l
bison -tv parser.y
gcc -Wall -Wextra -g parser.tab.c main.c -lfl -ly -DYYDEBUG -D_XOPEN_SOURCE=700

The gcc command is probably a bit long-winded, but it does several useful things for us:

  • Shows additional warnings just in case we've made a mistake that might be an issue later (-Wall -Wextra)
  • Include additional debugging information in the output file to allow debugging with gdb (the GNU Debugger) if necessary (-g)
  • Fix strange errors on some systems (-D_XOPEN_SOURCE=700)

If you're on a Windows system, you may need to remove the -ly - which appears to be required on the Linux machines I use - it tells gcc that we'll be referencing the bison library.

Stage 4: Parser again

Congratulations on getting this far! You've now got a lexer and a parser - so it's time to put them to use. This is done by utilising the parser to build a parse tree - a tree of nodes that represent the input. Here's an example tree:

An example parse tree.

As you can see, each high-level node references one or more lower-level nodes, and the structure of the tree represents the first 2 lines of the example input above. The nodes in yellow are lexical tokens that come directly from flex - these are called terminals, or leaf nodes. The ones in purple come from the bison rules (which we derived from the BNF we wrote at the beginning of this post) - and are non-terminals, or tree nodes.

With this in mind, let's introduce another feature or two of bison. Firstly, let's take a look at revising that %token declaration we created above:

%token FD BK LT RT START END
%token<val_num> NUMBER

The important bit here is the <val_num>. Here we tell bison that a value should be attached to the token NUMBER - and that it will be of type int. After telling bison that it should expect a value, we need to give it a place to put it. Let's write some more code to go just below the %token declarations:

%union {
    int val_num;
}

There we go! Excellent - we've got a place to put it. Now we just need to alter the lexer to convert the token value to an int and put it there. That's not too tough, thankfully - but if you're having trouble with it, here's a hint:

{number}        { yylval.val_num = atoi(yytext); return(NUMBER); }

Now we have it passing numbers correctly, let's look briefly at generating that parse tree. I'm not going to give the game away - just a few helpful hints as to what you need to do here - otherwise it's not as fun :P

Generating the parse tree can be considered both the most challenging part of the experience (especially if you don't know what you're doing) and the easiest to deal with at the same time. Knowing your stuff and your end goal before you start makes the whole process a lot easier.

The first major step is to create a struct that can represent a type of node in your parse tree. It might be useful to store several properties here - such as the node type (An enum will come in handy here), a spot for the value of a lexical token (or a reference to it in a symbol table if you have one), and references to child nodes in the parse tree.

The second major step of the process is to create a utility method that creates a new node of the tree on the heap, and then revise the bison file to get each rule to create new nodes on the tree in such a way that it creates a parse tree when it reaches the start node (or top node of the tree - which, in the case of the above, is instructions). For the purposes of this post, I'll be using a method with this prototype:

TreeNode create_node(int item, int node_type, TreeNode left, TreeNode right);

Your tree node struct and subsequent creation method may vary. With this in hand, we can revise the bison rules we created above to create these nodes we've been talking about. Here's a quick pointer on how to revise the rule for command above:

command : cmd_name NUMBER   { $$ = create_node($2, NODE_COMMAND, $1, NULL); }
    ;

This might look a bit strange, but let's break it down. The bit in curly braces is some (almost) plain C code that creates the node and returns a pointer to it to bison. The $$ is the return value for that node - which, I might add reminds me of something I forgot above. We need to tell bison about our new tree node data type and which rules should return it:

%type<val_tnode> instructions lines line command cmd_name

/* And in %union { ... } ..... */
TreeNode val_tnode;

This is almost the same as the %token<val_num> we did before, but this time we're defining the type of the value that a rule returns - not the type of a token's value. With that little interlude out of the way, let's return to the code above. $1 and $2 refer to the first and second items in the rule definition respectively - and hold the type that we defined above in the %token and %type directives. Since bison is a bottom-up parser, by the time this code executes all its child nodes have (hopefully) been created - and we just have to tie them all up together with a new node. In the case of my little example above, $1 is of type TreeNode, and $2 is of type int (that is if I didn't make any mistakes further up!).
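
If you'd like something to compare your own attempt against, here's one minimal way the node type and create_node() could look. Treat it as a sketch rather than the answer: the prototype is the one from this post, but the struct layout, the NodeType values, the free_tree() helper and the decision to bail out when malloc fails are all assumptions of mine, and your coursework will almost certainly need more than two children and a richer value field.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical node kinds - a real compiler would have one per grammar rule. */
enum NodeType {
    NODE_INSTRUCTIONS, NODE_LINES, NODE_LINE, NODE_COMMAND, NODE_CMD_NAME
};

/* TreeNode is a pointer type, so it fits the create_node() prototype above. */
typedef struct TreeNodeData *TreeNode;

struct TreeNodeData {
    int item;        /* token value, e.g. the NUMBER attached to a command */
    int node_type;   /* one of enum NodeType */
    TreeNode left;   /* child nodes; NULL when absent */
    TreeNode right;
};

/* Allocates a new node on the heap and wires up its children. */
TreeNode create_node(int item, int node_type, TreeNode left, TreeNode right)
{
    TreeNode node = malloc(sizeof *node);
    if (node == NULL) {
        fprintf(stderr, "Error: out of memory while building the parse tree\n");
        exit(EXIT_FAILURE);
    }
    node->item = item;
    node->node_type = node_type;
    node->left = left;
    node->right = right;
    return node;
}

/* Frees a node and everything below it (children first). */
void free_tree(TreeNode node)
{
    if (node == NULL) return;
    free_tree(node->left);
    free_tree(node->right);
    free(node);
}
```

Because bison reduces rules bottom-up, the children passed in as left and right already exist by the time create_node() is called for their parent.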

Stage 5: Blasting off to code generation and beyond

Phew! That's a lot of work. If you've read this far, thank you and well done! It's been a long journey for both you the reader and me the writer, but you're almost done.

While conceptually simple when broken down, the whole process actually gets rather involved - especially when writing the BNF and the parser (the latter of which can be a particular pain due to shift/reduce and reduce/reduce errors), and the amount of code and head scratching you've got to do to get to this point is enormous. My best advice is to take it slow and don't rush - you'll only cause more problems for yourself if you try and jump the gun. Make sure that each stage works as you intend before you continue - back-pedalling to fix an issue can be particularly annoying as it can be bothersome to work out which stage the bug is actually in.

The last step of the whole process is to actually do something with the parse tree we've worked so hard to create. Thankfully, that's not too difficult - as we can put some additional code in the { } block of the starting symbol to call methods that will do things like perform some optimisation, print the tree to the console - or generate some sweet code. While the actual generation of code is beyond the scope of this article, I may end up posting about some optimisation techniques you can use on a parse tree after I've finished fiddling with float handling, symbol tables, and initial code generation in my ACW (Assessed Course Work).
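
As a small taste of what "doing something" with the tree might look like, here's a sketch of a recursive walker that prints one node per line, indented by depth. It assumes a two-child node struct along the lines described earlier in this post (the field names are my own invention), and print_tree() is a hypothetical helper, not something bison provides:

```c
#include <stdio.h>

/* Assumed node layout: a value, a node type, and up to two children. */
typedef struct TreeNodeData *TreeNode;

struct TreeNodeData {
    int item;
    int node_type;
    TreeNode left;
    TreeNode right;
};

/*
 * Prints the tree rooted at node to out, parents before children,
 * indenting two spaces per level. Returns the number of nodes printed.
 */
int print_tree(FILE *out, TreeNode node, int depth)
{
    int count = 1;

    if (node == NULL) return 0;

    fprintf(out, "%*snode_type=%d item=%d\n",
            depth * 2, "", node->node_type, node->item);

    /* Separate statements keep the left subtree printed before the right. */
    count += print_tree(out, node->left, depth + 1);
    count += print_tree(out, node->right, depth + 1);
    return count;
}
```

A code generator has much the same shape: instead of printing a description of each node, it would emit the target-language code for that node type as it visits it.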

Found this useful? Found a bug in this post? Got a suggestion? Comment below! Since I don't have any real analytics on this blog besides the server logs, I've no idea if you've read it really unless you comment :P

Flexible Bison: Compiler Theory

One of the modules I've picked to do in my first semester of my third year at university is Languages and their Compilers. Naturally, this entails building a compiler to compile a program that's written in a source language (spec provided, thankfully! :D) into plain old ANSI C.

The tools we're going to be using for this and the steps involved in actually compiling something into another language are somewhat complicated, and I'm having a bit of difficulty getting my head around the different steps a compiler goes through and how these steps relate to the tools we're going to be using. This blog post is my attempt to make sense of what I've learnt so far.

Firstly, let me introduce the tools I'll be using: GNU flex and GNU bison. Apparently they have a much shallower learning curve than other tools out there. At first, this doesn't appear to be the case - but the more I think about it the more I realise that this is true.

Flex, as far as I can tell, is a regular-expression based scanning tokeniser. In other words, it breaks down an input string into a series of tokens. It has a method that, when called, finds and returns the next token from the source string.

Bison uses tokenised output from flex to construct a parse tree. This parse tree is then optimised with redundant nodes removed, loops optimised, and other such tweaks. Finally, this optimised tree is then used to generate the output code.

With the cast introduced, I can get to the stages of a compiler:

  1. Lexical Analysis - Tokenisation
  2. Syntactical Analysis - Conversion of the token stream into a parse tree
  3. Semantic Analysis - Correction of the tree - e.g. automatic type conversion
  4. Intermediate Code Generation - Sometimes the compiler outputs sets of 3 values in a list of tuples. This was needed in older computers that couldn't hold all the steps of a compiler in memory at once! In my case, I'll be outputting the parse tree generated in step 3 I guess - but not to disk, as today we can have all the passes of the compiler in memory at the same time :D
  5. Optimisation - Redundant parts of the parse tree are removed etc. - loops are a particular focus
  6. Code Generation - The output code in the target language is generated here - whether that be in C (very common), Assembly, or another language.

This seems somewhat familiar. The Lexical Analysis phase seems to be rather similar to what flex is designed for, and the Syntactical Analysis stage appears to be what bison does. As for the other stages, I'm not really sure. I'm guessing that it'll become clear later as we build this compiler in stages - but I'm suspecting that we'll be writing them in plain C - unless I've missed something about bison :P
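
To get a feel for what the optimisation stage (step 5) might involve in plain C, here's a sketch of one classic tweak: constant folding, where an expression such as 5 + 8 is replaced by 13 at compile time. Everything here (the Expr struct, make_expr and fold_constants) is a made-up illustration rather than anything from the module, and it ignores details such as integer overflow:

```c
#include <stdlib.h>

typedef enum { EXPR_CONST, EXPR_VAR, EXPR_ADD, EXPR_SUB } ExprKind;

typedef struct Expr {
    ExprKind kind;
    int value;            /* only meaningful when kind == EXPR_CONST */
    struct Expr *left;    /* operands for EXPR_ADD / EXPR_SUB */
    struct Expr *right;
} Expr;

/* Allocates an expression node; aborts if memory runs out. */
Expr *make_expr(ExprKind kind, int value, Expr *left, Expr *right)
{
    Expr *e = malloc(sizeof *e);
    if (e == NULL) abort();
    e->kind = kind;
    e->value = value;
    e->left = left;
    e->right = right;
    return e;
}

/*
 * Folds constant subtrees bottom-up: an ADD or SUB whose operands are
 * both constants is rewritten in place as a single constant node.
 */
Expr *fold_constants(Expr *e)
{
    if (e == NULL || e->kind == EXPR_CONST || e->kind == EXPR_VAR)
        return e;

    e->left = fold_constants(e->left);
    e->right = fold_constants(e->right);

    if (e->left != NULL && e->right != NULL
        && e->left->kind == EXPR_CONST && e->right->kind == EXPR_CONST) {
        int result = (e->kind == EXPR_ADD)
            ? e->left->value + e->right->value
            : e->left->value - e->right->value;
        free(e->left);
        free(e->right);
        e->kind = EXPR_CONST;
        e->value = result;
        e->left = NULL;
        e->right = NULL;
    }
    return e;
}
```

Anything involving a variable is left alone, since its value isn't known until the program runs.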

If you've made it this far, thanks for reading! If this feels somewhat disorganised - then it probably is - after all, this is mainly to get it all straight in my own head :P

If you've got any questions, please ask away in the comments below :-)

Understanding your compiler: C#

Sorry for the (very) late post! I fell rather ill on the day before I was going to write the next post, and haven't been well enough to write it until now! Hopefully more cool posts will be on their way soon :-)

(Above: A nice picture of Durham Cathedral. Taken by @euruicimages.)

How many times have you just finished adding a new feature to your latest and greatest program, and hit the "Run" button in Visual Studio or Monodevelop? Have you ever wondered what happens under the hood? Have you ever encountered a strange build error, and just googled the error message in the hope of finding a solution?

In this post, I'm going to take you on a journey to understand the build process for a typical C♯ program that happens every time you press Ctrl + Shift + B (or F8 in Monodevelop). I also hope to show you why I think it's important to understand what happens behind the scenes.

The hierarchy of the c♯ build system.

To start any journey, we need a map. I've created a simple diagram for our journey. We'll dive into the world of the project file further down. Lastly, we'll end our journey at the individual source code files that actually make up your project.

Before that though, we need to make a quick stop in your sln file. The sln file (or Solution File) is the place where everything starts. Normally, you only get one solution file for each project you create. It's the file that keeps track of the name of your project, its id, and where all the projects are that your solution references (they're usually in appropriately named subfolders). Note that it doesn't keep references to the other projects that you reference (that's done at the project file level).

Solution files are sometimes automatically detected too (xbuild does this for sure). If you're in a command line or terminal, you can build an entire solution with just one command, without having to open Visual Studio or Monodevelop:

cd /path/to/awesome/project
# Windows
msbuild AwesomeSpaceProject.sln
# Linux / Mono
xbuild

With that out of the way, we can talk about project files. Project files define which source code files should be built, and how. You can even add custom triggers that run at any time before, during, or after the build process here (this is how I embedded commit hashes in a C♯ binary)!

Though the syntax is quite simple, much of it is, sadly, un or underdocumented. Thankfully most of it is quite intuitive, and a little experimentation goes a long way:

<Project>
    ....
    <ItemGroup>
        <Reference Include="System" />
        <Reference Include="System.Drawing" />
        ....
    </ItemGroup>
    <ItemGroup>
        <Compile Include="Program.cs" />
        <Compile Include="GameObjects/Spaceship.cs" />
        <Compile Include="GameObjects/Enemy.cs" />
        ....
    </ItemGroup>
    <ItemGroup>
        <EmbeddedResource Include="Resources\Spritesheet.png" />
        <EmbeddedResource Include="Resources\CoolSpace.ttf" />
    </ItemGroup>
    ....
</Project>

The above is a (simplified) example of a project file. It might be called something like CoolSpace.csproj. It references a few core assemblies, specifies a few C♯ files for compiling, and embeds a resource or two.

Of course, these files are generated for you automatically, but it's always helpful to know how it works, so that not only can you fix it more easily when it goes wrong, but you can also extend it to do extra things that you can't through the user interface, like use wildcards (careful! Too many wildcards can slow down your build) to specify a range of files that you want to embed without having to go around them all one by one.

Next, let's talk about references. References allow you to pull in code from elsewhere, like a core system library, another project (that's how you link 2 projects in (or even out of!) a solution together), a library of sorts (like Nuget), or another random DLL or EXE file (yes, you can reference other executable files) lying around. With references, you can spread your code around loads of files and pull in libraries from all over the galaxy and have the build process follow all of your references around and tie everything up into a nice neat package for you.

The next piece of the puzzle is the builder. The software that actually builds your code. On Windows with Visual Studio, it's called msbuild, and is the original build tool, created by Microsoft, that sets the standard. On Linux, there's a different (but very similar) tool called xbuild. xbuild implements the standard that msbuild sets, allowing solutions written on Windows with Visual Studio to be compiled 9 times out of 10 on Linux (and probably Mac too, though I don't have one to check - let me know in the comments if you have one!) without any changes.

The final stone in the bridge, so to speak, is your code itself. The builder (whether it be xbuild or msbuild) hands your source files to the C♯ compiler, which compiles them together into the final binary (an assembly). That binary contains common intermediate language (or CIL), which is executed by the .NET runtime (or mono on Linux / Mac) rather than directly by your operating system. While CIL is a different topic for a separate time, it's still important, so I'm mentioning it here.

If you've made it this far, congratulations! I hope that it made sense. If not (or even if it did!), please leave a comment down below and I'll try to help you out :-)

Transform your javascript with Browserify

Tired of battling endless <script> tags in your html files? Fed up with messing with a dozen libraries cluttering up the place? Can't see the wood for the trees? Try browserify (+ rollupify + wzrd)! It's amazing! It's awesome! It tidies up your code for you, so you don't have to (perhaps not :P)!

Seriously though, I've just been playing around with browserify, and it's awesome. It's that missing thing I've been trying to find for a long time. But what does it actually do, you ask?

Well, perhaps it's best to use an example. Consider these (relatively) harmless javascript files:

// SillySay.js
"use strict";

function sillySay(sentence) {
    // Split the sentence up into words
    var words = splitWords(sentence);

    // Loop over all the words in the above array and display them one by one
    for(let i in words) {
        alert(words[i]);
    }
}
// WordSplitter.js
"use strict";
function splitWords(sentence) {
    // Split the sentence on whitespace and return the resulting array
    return sentence.split(/\s+/g);
}

To use our (perfectly ridiculous) example code, we not only have to include SillySay.js, but WordSplitter.js (this could be a library you use for example) as well:

<!DOCTYPE html>
<html>
    <head>
        <meta charset='utf-8' />
        <title>Silly Say Demo</title>
    </head>
    <body>
        <p>Silly Say Demo</p>
        <p>By <a href="https://starbeamrainbowlabs.com/">Starbeamrainbowlabs</a></p>

        <!---------------->
        <script src="WordSplitter.js"></script>
        <script src="SillySay.js" charset="utf-8"></script>
        <script>
            window.addEventListener("load", function(event) {
                sillySay("This is a test");
            });
        </script>

        <style>
            html, body { font-size: 100%; }
            body
            {
                font-family: sans-serif;
            }
        </style>
    </body>
</html>

That's looking a bit messy, but imagine what it'd be like if you added another few libraries? Or a new feature in a separate file? See the problem? Browserify solves just this issue. It analyses the dependencies of the entry point to your app, and bundles up all your code into a single file, nice and neat. You can add extra transforms (like plugins), too, to do extra things like automatically insert your app's version, or include other data files automatically, or transpile other languages to javascript automagically (full list here).

Sounds cool yet? Let me give you a quick tutorial on how I set up Browserify, with Rollupify and Wzrd.

Firstly, we need to set things up. If you don't have Node.js installed, do that now. You'll also get npm - Node's (perfectly awesome!) package manager. Next, let's create a quick project and paste in the code above. I've recorded an asciicast (as you may have seen a few times before here) of me going through the process:

(Can't see the asciicast above? Try viewing it here)

If you'd like to take a look at the final result, as written in the asciicast above, you can find it over here. Questions and comments are welcome below :-)

The lost post: Embedding commit hashes in C♯ binaries

I seem to remember that I've started to write this post no less than 3 times, and I've managed to lose the source each and every time. If you're reading this it means that I've managed to complete it this time :D

Imagine you've got a confused client on the phone, asking why the newest feature X in your program doesn't work. You ask them whether they have the latest version of your program installed... and they say that they don't know. Version numbers and changelogs to the rescue! Except.... that last release was rather rushed and you forgot to finish updating the changelog. This is just one scenario in which embedding the latest commit hash into your program is useful.

You could embed the short hash (the first 7 characters) into the version string, for example v3.6.1-375ae31. Then you could compare it against the revision history to see exactly what your codebase looked like when your client's version was built.

For a while now I've wanted to do just this, and now I've finally figured out how to do it cross-platform. This post documents how I went about doing it, and how you can do it too.

The basic principle of the idea is to run a command that will output the latest commit hash to a file before the build starts, and then embed that file into the resulting binary. To achieve this, we need to go about it in 2 parts. Firstly, we need to fiddle with the project file to add an optional pre-build event. Open the project file (MyProject.csproj) in your favourite text editor (but preferably not your favourite IDE such as Visual Studio or MonoDevelop) and add this to the bottom, just before the closing </Project>:

<Target Name="BeforeBuild" BeforeTargets="Build">
    <Exec Command="git rev-parse HEAD &gt;git-hash.txt" WorkingDirectory="$(ProjectDir)" IgnoreExitCode="true" />
</Target>

If you don't use Git, then change git rev-parse HEAD &gt;git-hash.txt in the above to the equivalent command for your version control system. For SVN, this stackoverflow question looks like it'll do the job for Windows - for Linux you should go here.

Once done, the next step is to add the generated file as an embedded resource. We can't do this with the GUI easily here since the file in question hasn't been generated yet! Add the following to the bottom of the csproj file, again just before the </Project>:

<ItemGroup>
    <EmbeddedResource Include="git-hash.txt" />
</ItemGroup>

Remember to change the git-hash.txt to whatever you changed it to above.

Next, save it and reopen the solution in your IDE. The final step is to actually utilise the commit hash (or revision number in SVN) in your program. Since it's just an embedded file, you can simply find it with a bit of reflection, and read it in with a StreamReader. I've written a good tutorial on how to do that over on my Embedding files in C♯ binaries post.

Make sure that your program is prepared to handle junk instead of a commit hash - you can't predict the contents of the embedded file if Git (or SVN) isn't installed on the machine used to build your project. If you want to require that the commit hash (or revision number) is actually present, just remove the IgnoreExitCode="true" from the first snippet above.
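
As a rough illustration of the last two paragraphs, here's a minimal sketch of reading the embedded file and guarding against junk. The BuildInfo class and its method names are made up for this example. It also assumes that the resource name ends in git-hash.txt and that Git printed a full 40-character SHA-1 hash, so adjust both if your setup differs.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Text.RegularExpressions;

// Hypothetical helper - the class and method names are illustrative only.
public static class BuildInfo
{
    // Reads the embedded git-hash.txt, or returns null if it isn't there.
    public static string ReadRawHash()
    {
        Assembly assembly = Assembly.GetExecutingAssembly();
        foreach (string name in assembly.GetManifestResourceNames())
        {
            // Resource names are normally prefixed with the project's
            // default namespace, so match on the end of the name.
            if (!name.EndsWith("git-hash.txt", StringComparison.Ordinal))
                continue;

            using (Stream stream = assembly.GetManifestResourceStream(name))
            using (StreamReader reader = new StreamReader(stream))
                return reader.ReadToEnd();
        }
        return null;
    }

    // Returns the first 7 characters of the hash, or null if the text
    // isn't a 40-character hexadecimal hash (assumption: SHA-1).
    public static string ShortHash(string raw)
    {
        if (raw == null)
            return null;
        string trimmed = raw.Trim();
        return Regex.IsMatch(trimmed, "^[0-9a-fA-F]{40}$")
            ? trimmed.Substring(0, 7)
            : null;
    }

    // Appends the short hash to the version, or leaves the version
    // alone if the embedded file contained junk (or nothing at all).
    public static string VersionString(string version, string raw)
    {
        string hash = ShortHash(raw);
        return hash == null ? version : version + "-" + hash;
    }
}
```

Calling BuildInfo.VersionString("v3.6.1", BuildInfo.ReadRawHash()) should then give something in the style of v3.6.1-375ae31 when the hash was captured, and plain v3.6.1 when it wasn't.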

Use C♯ 6.0 today in Visual Studio 2013

A colourful microchip banner. (Banner from hdw.eweb4.com)

In case you haven't heard, C♯ 6.0 is here now and it's awesome (here's a cheat sheet from programmingwithmosh.com showing the most notable new features). Unfortunately, you must be using either Visual Studio 2015 or above or MonoDevelop in order to take advantage of it... until now: Microsoft have released their C♯ 6.0 compiler, Roslyn, as a NuGet package.

If you don't know what a NuGet package is, NuGet is a modular system that allows you to pull in and use various different libraries and tools automatically. There's a central registry over at nuget.org, which people (like you!) can upload their packages to and other people can download them from. This looks like a good tutorial for Windows users. MonoDevelop users need to install this addin, but it should be installed already.

All you have to do is install the Microsoft.Net.Compilers NuGet package in order to use C♯ 6.0 in Visual Studio 2013. That's it! Unfortunately, this breaks the build process on platforms other than Windows, as the Microsoft.Net.Compilers package is Windows only. The solution is fairly simple, however. Once you've installed the above NuGet package, open your ".csproj" file in your favourite plain text editor (such as Notepad or gedit), and find the line that looks like this (it should be near the bottom):

<Import Project="..\packages\Microsoft.Net.Compilers.1.3.2\build\Microsoft.Net.Compilers.props" Condition="Exists('..\packages\Microsoft.Net.Compilers.1.3.2\build\Microsoft.Net.Compilers.props')" />

And add AND '$(OS)' == 'Windows_NT' to the end of the Condition attribute like this:

  <Import Project="..\packages\Microsoft.Net.Compilers.1.3.2\build\Microsoft.Net.Compilers.props" Condition="Exists('..\packages\Microsoft.Net.Compilers.1.3.2\build\Microsoft.Net.Compilers.props') AND '$(OS)' == 'Windows_NT'" />

The above adds a condition that prevents the compiler in the NuGet package you installed from being used on platforms other than Windows. This doesn't mean that you can't use C♯ 6.0 on other platforms - Mono (the Linux C♯ compiler) already supports C♯ 6.0 natively, so it doesn't need to be replaced. It's just the C♯ compiler bundled with Visual Studio 2013 and below that's no good.
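
If you want a quick way to check that the new compiler is actually being used, a small file that leans on a few C♯ 6.0 features should do it - an older compiler will refuse to build it. This is only a sketch, and the Greeter class is invented purely for the test:

```csharp
using System;

// Hypothetical smoke test - every member below uses a C# 6.0 feature.
public class Greeter
{
    // Auto-property initialiser
    public string Name { get; set; } = "world";

    // Expression-bodied member and string interpolation
    public string Greet() => $"Hello, {Name}!";

    // Null-conditional operator and nameof
    public static string Describe(Greeter greeter)
    {
        string greeting = greeter?.Greet() ?? "nobody";
        return $"{nameof(Greeter)}: {greeting}";
    }
}
```

Greeter.Describe(new Greeter()) returns "Greeter: Hello, world!", and Greeter.Describe(null) returns "Greeter: nobody".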

Splitting your C♯ Code into Multiple Files 2: DLLs

I have found out about compiling and linking to DLLs. I think this is called Dynamic Linking but again, I could be wrong.

We will be using the example files from last time:

filea.cs:

using System;

class ClassA
{
    public static void Main()
    {
        Console.WriteLine("This is a test from file A");
        Someplace.ClassB.PrintHello();
    }
}

fileb.cs:

using System;

namespace Someplace
{
    public class ClassB
    {
        public static void PrintHello()
        {
            Console.WriteLine("Another hello from file B!");
        }
    }
}

This is a 2 step process. First we need to compile the DLL, then we need to compile the main exe and link to the DLL.

To compile the DLL, you type something like this:

csc /target:library fileb.cs

The important bit here is /target:library. This tells the C♯ compiler to compile your code to a DLL and not an exe.

To compile the main exe, you need to type something like this:

csc /reference:fileb.dll filea.cs

This tells the C♯ compiler to compile the code in filea.cs into an exe, and link it to fileb.dll.

Taken from this MSDN page.

Splitting your C♯ Code into Multiple Files

I have just started to work out how to split my C♯ code into multiple files, and thought that I would share it with you. This post will be about what I believe to be static linking, but I could be wrong. Anyway, it is actually quite simple:

Here are the contents of filea.cs:

using System;

class ClassA
{
    public static void Main()
    {
        Console.WriteLine("This is a test from file A");
        Someplace.ClassB.PrintHello();
    }
}

and here are the contents of fileb.cs:

using System;

namespace Someplace
{
    class ClassB
    {
        public static void PrintHello()
        {
            Console.WriteLine("Another hello from file B!");
        }
    }
}

Then when you compile, you should do something like this:

csc filea.cs fileb.cs

This will tell the C Sharp compiler to grab both filea.cs and fileb.cs, and to output a single filea.exe.

Next I will try to figure out how to create a .dll file and include that - then I can build my own libraries.

Art by Mythdael