Archive


## Profiling PHP with XDebug

(This post is a fork of a draft version of a tutorial / guide originally written whilst at my internship.)

Since I've been looking into xdebug's profiling function recently, I've just been tasked with writing up a guide on how to set it up and use it, from start to finish - and I thought I'd share it here.

While I've written about xdebug before in my An easier way to debug PHP post, I didn't end up covering the profiling function - I had difficulty getting it to work properly. I've managed to get it working now - this post documents how I did it. While this is written for a standard Debian server, the instructions can easily be applied to other servers.

For the uninitiated, xdebug is an extension to PHP that aids in the debugging of PHP code. It consists of 2 parts: The php extension on the server, and a client built into your editor. With these 2 parts, you can create breakpoints, step through code and more - though these functions are not the focus of this post.

To start off, you need to install xdebug. SSH into your web server with a sudo-capable account (or just use root, though that's bad practice!), and run the following command:

sudo apt install php-xdebug


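Windows users will need to download it from here and put it in their PHP extension directory. Users of other Linux distributions (and Windows users) may need to enable xdebug in their php.ini file manually. Xdebug is loaded as a Zend extension, so the directive to use is zend_extension rather than extension: on Linux this is typically zend_extension=xdebug.so, and on Windows it's zend_extension= followed by the name of the DLL you downloaded.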

Once done, xdebug should be loaded and working correctly. You can verify this by looking at the php information page. To see this page, put the following in a php file and request it in your browser:

<?php
phpinfo();
?>

If it's been enabled correctly, you should see something like this somewhere on the resulting page:
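If you have shell access, a quick command-line check is also possible. The following is a minimal sketch: has_xdebug is a helper name I've made up for illustration, and it assumes the php binary is on your PATH. Bear in mind that the command-line PHP may load a different php.ini from the one your web server uses, so treat this as a hint rather than proof.

```shell
# Succeeds if the module list on stdin contains a line reading "xdebug".
# (has_xdebug is an illustrative helper name, not part of PHP or xdebug.)
has_xdebug() {
    grep -qi '^xdebug$'
}

# Only run the check if a php binary is actually available.
if command -v php >/dev/null 2>&1; then
    if php -m | has_xdebug; then
        echo "xdebug is loaded (command-line PHP)"
    else
        echo "xdebug is not loaded (command-line PHP)"
    fi
fi
```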

With xdebug set up, we can now begin configuring it. Xdebug gets configured in php.ini, PHP's main configuration file. Under Virtualmin each user has their own php.ini because PHP is loaded via CGI, and it's usually located at ~/etc/php.ini. To find it on your system, check the php information page as described above - there should be a row with the name "Loaded Configuration File":

Once you've located your php.ini file, open it in your favourite editor (or type sensible-editor php.ini if you want to edit over SSH), and put something like this at the bottom:

[xdebug]
xdebug.remote_enable=1
xdebug.remote_connect_back=1
xdebug.remote_port=9000
xdebug.remote_handler=dbgp
xdebug.remote_mode=req
xdebug.remote_autostart=true

xdebug.profiler_enable=false
xdebug.profiler_enable_trigger=true
xdebug.profiler_enable_trigger_value=ZaoEtlWj50cWbBOCcbtlba04Fj
xdebug.profiler_output_dir=/tmp
xdebug.profiler_output_name=php.profile.%p-%u

Obviously, you'll want to customise the above. The xdebug.profiler_enable_trigger_value directive defines a secret key we'll use later to turn profiling on. If nothing else, make sure you change this! Profiling slows everything down a lot, and could easily bring your whole server down if this secret key falls into the wrong hands (that said, simply having xdebug loaded in the first place slows things down too, even if you're not using it - so you may want to set up a separate server for development work that has xdebug installed if you haven't already). If you're not sure on what to set it to, here's a bit of bash I used to generate my random password:

dd if=/dev/urandom bs=8 count=4 status=none | base64 |  tr -d '=' | tr '+/' '-_'


The xdebug.profiler_output_dir directive lets you change the folder that xdebug saves the profiling output files to - make sure that the folder you specify here is writable by the user that PHP is executing as. If you've got a lot of profiling to do, you may want to consider changing the output filename, since xdebug uses a rather unhelpful filename by default. The property you want to change here is xdebug.profiler_output_name - and it supports a number of special % substitutions, which are documented here. I can recommend something like phpprofile.%t-%u.%p-%H.%R.cachegrind - it includes a timestamp and the request uri for identification purposes, while still sorting chronologically. Remember that xdebug will overwrite the output file if you don't include something that differentiates it from request to request!

With the configuration done, we can now move on to actually profiling something :D This is actually quite simple. Simply add the XDEBUG_PROFILE GET (or POST!) parameter to the url that you want to test in your browser. Here are some examples:

https://localhost/carrots/moon-iter.php?XDEBUG_PROFILE=ZaoEtlWj50cWbBOCcbtlba04Fj
https://development.galacticaubergine.de/register?vegetable=yes&mode=plus&XDEBUG_PROFILE=ZaoEtlWj50cWbBOCcbtlba04Fj

Adding this parameter to a request will cause xdebug to profile that request, and spit out a cachegrind file according to the settings we configured above. This file can then be analysed in your favourite editor, or, if it doesn't have support, an external program like qcachegrind (Windows) or kcachegrind (Everyone else).
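When several requests have been profiled, it helps to find the newest output file quickly. Here's a rough sketch of how that could be done; latest_profile is a helper name of my own invention, and the /tmp directory and php.profile. prefix are simply the values from the example php.ini above, so adjust them to match your settings.

```shell
# Print the most recently modified profiler output file in a directory.
# $1: output directory (defaults to /tmp, as in the example php.ini)
# $2: filename prefix (defaults to "php.profile.", as in the example php.ini)
latest_profile() {
    local dir="${1:-/tmp}"
    local prefix="${2:-php.profile.}"
    # ls -t sorts newest first; head keeps only the first entry.
    # Errors are silenced so that no match simply prints nothing.
    ls -t "$dir"/"$prefix"* 2>/dev/null | head -n 1
}

latest_profile /tmp php.profile.
```

The path it prints can then be opened in qcachegrind or kcachegrind.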

If you need to profile just a single AJAX request or similar, most browsers' developer tools let you copy a request as a curl or wget command (Firefox also has an 'edit and resend' option), allowing you to resend the request with the XDEBUG_PROFILE GET parameter added.
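If you're resending a copied request from a terminal, the only fiddly bit is whether to join the parameter on with ? or &. The sketch below handles that; with_profile_trigger is a made-up helper name, and it makes the simplifying assumption that the url has no #fragment on the end.

```shell
# Append the XDEBUG_PROFILE trigger to a url, using ? or & as appropriate.
# $1: the url to profile, $2: the secret key from php.ini
# (Assumes the url carries no #fragment.)
with_profile_trigger() {
    local url="$1"
    local key="$2"
    case "$url" in
        *\?*) printf '%s&XDEBUG_PROFILE=%s\n' "$url" "$key" ;;
        *)    printf '%s?XDEBUG_PROFILE=%s\n' "$url" "$key" ;;
    esac
}

# Print the url that would be requested:
with_profile_trigger 'https://localhost/carrots/moon-iter.php' 'insert_secret_key_here'

# To actually send the request (not run here):
# curl -s -o /dev/null "$(with_profile_trigger 'https://localhost/carrots/moon-iter.php' 'insert_secret_key_here')"
```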

If you need to profile everything - including all subrequests (only those that pass through PHP, of course) - then you can set the XDEBUG_PROFILE parameter as a cookie instead, and it will cause profiling to be enabled for everything on the domain you set it on. Here's a bookmarklet that sets the cookie:

javascript:(function(){document.cookie='XDEBUG_PROFILE='+'insert_secret_key_here'+';expires=Mon, 05 Jul 2100 00:00:00 GMT;path=/;';})();

(Source)

Replace insert_secret_key_here with the secret key you created for the xdebug.profiler_enable_trigger_value property in your php.ini file above, create a new bookmark in your browser, paste it in (making sure that your browser doesn't auto-remove the javascript: at the beginning), and then click on it when you want to enable profiling.

## How to update your linux kernel version on a KimSufi server

(Or why PHP throws random errors in the latest update)

Hello again!

Since I had a bit of a hard time trying to find some clear information on the subject, I'm writing this blog post so that it might help others. Basically, yesterday night I updated the packages on my server (the one that runs this website!). There was a PHP update, but I didn't think much of it.

This morning, I tried to access my ownCloud instance, only to discover that it was throwing random errors and refusing to load. I'm running PHP version 7.0.16-1+deb.sury.org~xenial+2. It was spewing errors like this one:

PHP message: PHP Fatal error:  Uncaught Exception: Could not gather sufficient random data in /srv/owncloud/lib/private/Security/SecureRandom.php:80
Stack trace:
#0 /srv/owncloud/lib/private/Security/SecureRandom.php(80): random_int(0, 63)
#1 /srv/owncloud/lib/private/AppFramework/Http/Request.php(484): OC\Security\SecureRandom->generate(20)
#2 /srv/owncloud/lib/private/Log/Owncloud.php(90): OC\AppFramework\Http\Request->getId()
#3 [internal function]: OC\Log\Owncloud::write('PHP', 'Uncaught Except...', 3)
#4 /srv/owncloud/lib/private/Log.php(298): call_user_func(Array, 'PHP', 'Uncaught Except...', 3)
#5 /srv/owncloud/lib/private/Log.php(156): OC\Log->log(3, 'Uncaught Except...', Array)
#6 /srv/owncloud/lib/private/Log/ErrorHandler.php(67): OC\Log->critical('Uncaught Except...', Array)
#7 [internal function]: OC\Log\ErrorHandler::onShutdown()
#8 {main}
thrown in /srv/owncloud/lib/private/Security/SecureRandom.php on line 80" while reading response header from upstream, client: x.y.z.w, server: ownc


That's odd. After googling around a bit, I found this page on the Arch Linux bug tracker. I'm not using arch (Ubuntu 16.04.2 LTS actually), but it turned out that this comment shed some much-needed light on the problem.

Basically, PHP have changed the way they ask the Linux Kernel for random bytes. They now use the getrandom() kernel function instead of /dev/urandom as they did before. The trouble is that getrandom() was introduced in linux 3.17, and I was running OVH's custom 3.14.32-xxxx-grs-ipv6-64 kernel.
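If you want to check whether your own server is affected, comparing the running kernel's version against 3.17 is enough. This is just a sketch: kernel_at_least is a helper name I've invented, and it relies on the -V (version sort) option of GNU sort.

```shell
# Succeeds if kernel release $1 is at least version $2 (default: 3.17).
# Anything after the first "-" (e.g. "-xxxx-grs-ipv6-64") is ignored.
kernel_at_least() {
    local have="${1%%-*}"
    local want="${2:-3.17}"
    # sort -V orders version strings; if the wanted version sorts first
    # (or is identical), the given kernel is new enough.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

if kernel_at_least "$(uname -r)"; then
    echo "Kernel is 3.17 or newer: getrandom() should be available"
else
    echo "Kernel predates 3.17: getrandom() is missing"
fi
```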

Thankfully, after a bit more digging, I found this article. It suggests installing the kernel you want and moving one of the grub config file generators to another directory, but I found that simply changing the permissions did the trick.

Basically, I did the following:


apt update
apt install linux-image-generic
chmod -x /etc/grub.d/06_OVHkernel
update-grub
reboot


Basically, the above first refreshes the package lists with apt update. Then it installs the linux-image-generic package. linux-image-generic is the pseudo-package that always depends on the latest stable kernel image available.

Next, I remove execute privileges on the file /etc/grub.d/06_OVHkernel. This is the file that gives the default installed OVH kernel priority over any other installed kernels, so it's important to exclude it from the grub configuration process (update-grub only runs the scripts in /etc/grub.d that are executable).

Lastly, I update my grub configuration with update-grub and then reboot. You need to make sure that you update your grub configuration file, since if you don't it'll still use the old OVH kernel!
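Before rebooting, it may be worth double-checking which generator scripts grub will actually run. Here's a small sketch that lists them; list_grub_generators is a hypothetical helper name, and /etc/grub.d is assumed to be where your distribution keeps the scripts.

```shell
# List the scripts in a grub.d directory, marking each one as enabled
# (executable) or disabled (not executable).
# $1: directory to inspect (defaults to /etc/grub.d)
list_grub_generators() {
    local dir="${1:-/etc/grub.d}"
    local f
    for f in "$dir"/*; do
        # Skip anything that isn't a regular file (or an unmatched glob)
        [ -f "$f" ] || continue
        if [ -x "$f" ]; then
            printf '%-8s %s\n' enabled "${f##*/}"
        else
            printf '%-8s %s\n' disabled "${f##*/}"
        fi
    done
}

list_grub_generators
```

If 06_OVHkernel still shows up as enabled, the chmod didn't take effect.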

With that all done, I'm now running 4.4.0-62-generic according to uname -a. If you follow these steps yourself, make sure you have a backup! While I am happy to try and help you out in the comments below, I'm not responsible for any consequences that may arise as a result of following this guide :-)

## An easier way to debug PHP

Recently at my internship I've been writing quite a bit of PHP. The language itself is OK (I mean it does the job), but it's beginning to feel like a relic of a bygone era - especially when it comes to debugging. Up until recently I've been stuck with using echo() and var_dump() calls all over the place in order to figure out what's going on in my code - that's the equivalent of debugging your C♯ ACW with Console.WriteLine() O.o

Thankfully, whilst looking for an alternative, I found xdebug. Xdebug is like visual studio's debugging tools for C♯ (albeit a more primitive form). It allows you to add breakpoints and step through your PHP code one line at a time - inspecting the contents of variables in both the local and global scope as you go. It improves the standard error messages generated by PHP, too - adding stack traces and colour to the output in order to make it much more readable.

Best of all, I found a plugin for my primary web development editor atom. It's got some simple (ish) instructions on how to set up xdebug too - it didn't take me long to figure out how to put it to use.

I'll assume you've got PHP and Nginx already installed and configured, but this tutorial looks good (just skip over the MySQL section) if you haven't yet got it installed. This should work for other web servers and configurations too, but make sure you know where your php.ini lives.

XDebug consists of 2 components: The PHP extension for the server, and the client that's built into your editor. Firstly, you need to install the server extension. I've recorded an asciicast (terminal recording) to demonstrate the process:

(Above: An asciinema recording demonstrating how to install xdebug. Can't see it? Try viewing it on asciinema.org.)

If you're having trouble, make sure that your server can talk directly to your local development machine. If you're sitting behind any routers or firewalls, make sure they're configured to allow traffic through on port 9000 and to forward it on to your machine.

## Capturing and sending error reports by email in C♯

A month or two ago I put together a simple automatic updater and showed you how to do the same. This time I'm going to walk you through how to write your own error reporting system. It will be able to catch any errors through a try...catch block, ask the user whether they want to send an error report when an exception is thrown, and send the report to your email inbox.

Just like last time, this post is a starting point, not an ending point. It has a significant flaw and can be easily extended to, say, ask the user to write a short paragraph detailing what they were doing at the time of the crash, or add a proper gui, for example.

Please note that this tutorial requires that you have a server of some description to use to send the error reports to. If you want to get the system to send you an email too, you'll need a working mail server. Thankfully DigitalOcean provide free credit if you have the GitHub Student pack. This tutorial assumes that your mail server (or at least a relay one) is running on the same machine as your web server. While setting one up correctly can be a challenge, Lee Hutchinson over at Ars Technica has a great tutorial that's easy to follow.

To start with, we will need a suitable test program to work with whilst building this thing. Here's a good one riddled with holes that should throw more than a few exceptions:

using System;
using System.IO;
using System.Net;
using System.Text;

public class Program
{
public static readonly string Name = "Dividing program";
public static readonly string Version = "0.1";

public static string ProgramId
{
get { return string.Format("{0}/{1}", Name, Version); }
}

public static int Main(string[] args)
{
float a = 0, b = 0, c = 0;

Console.WriteLine(ProgramId);
Console.WriteLine("This program divides one number by another.");

Console.Write("Enter number 1: ");
a = float.Parse(Console.ReadLine());
Console.Write("Enter number 2: ");
b = float.Parse(Console.ReadLine());

c = a / b;
Console.WriteLine("Number 1 divided by number 2 is {0}.", c);

return 0;
}
}


There are a few redundant using statements at the top there - we will get to utilizing them later on.

First things first - we need to capture all exceptions and build an error report:

try
{
// The program's own code goes here
}
catch(Exception error)
{
Console.Write("Collecting data - ");
MemoryStream dataStream  = new MemoryStream();
StreamWriter dataIn = new StreamWriter(dataStream);
dataIn.WriteLine("***** Error Report *****");
dataIn.WriteLine(error.ToString());
dataIn.WriteLine();
dataIn.WriteLine("*** Details ***");
dataIn.WriteLine("a: {0}", a);
dataIn.WriteLine("b: {0}", b);
dataIn.WriteLine("c: {0}", c);
dataIn.Flush();

dataStream.Seek(0, SeekOrigin.Begin);
Console.WriteLine("done");
}

If you were doing this for real, it might be a good idea to move all of your application logic into its own class and have a call like application.Run() instead of placing your code directly inside the try{ } block. Anyway, the above will catch the exception, and build a simple error report. I'm including the values of a few variables I created too. You might want to set up your own mechanism for storing state data so that the error reporting system can access it, like a special static class or something.

Now that we have created an error report, we need to send it to the server for processing. Before we do this, though, we ought to ask the user if this is ok with them (it is their computer in all likelihood after all!). This is easy:

Console.WriteLine("An error has occurred!");
Console.Write("Would you like to report it? [Y/n] ");

bool sendReport = false;
while(true)
{
// Wait for the user to press a key
ConsoleKey key = Console.ReadKey().Key;
if (key == ConsoleKey.Y) {
sendReport = true;
break;
}
else if (key == ConsoleKey.N)
break;
}
Console.WriteLine();

if(!sendReport)
{
Console.WriteLine("No report has been sent.");
Console.WriteLine("Press any key to exit.");
Console.ReadKey(true);
return 1;
}

Since this program uses the console, I'm continuing that trend here. You will need to create your own GUI if you aren't creating a console app.

Now that's taken care of, we can go ahead and send the report to the server. Here's how I've done it:

Console.Write("Sending report - ");
HttpWebRequest reportSender = WebRequest.CreateHttp("https://starbeamrainbowlabs.com/reportSender.php");
reportSender.Method = "POST";
reportSender.ContentType = "text/plain";
reportSender.UserAgent = ProgramId;
Stream requestStream = reportSender.GetRequestStream();
// Push the error report we built earlier down the request stream
dataStream.CopyTo(requestStream);
requestStream.Close();

WebResponse reportResponse = reportSender.GetResponse();
Console.WriteLine("done");
Console.WriteLine("Server response: {0}", ((HttpWebResponse)reportResponse).StatusDescription);
Console.WriteLine("Press any key to exit.");
Console.ReadKey(true);
return 1;

That may look unfamiliar and complicated, so let's walk through it one step at a time.

To start with, I create a new HTTP web request and point it at an address on my server. You will use a slightly different address, but the basic principle is the same. As for what resides at that address - we will take a look at that later on.

Next I set the request method to be POST so that I can send some data to the server, and set a few headers to help the server out in understanding our request. Then I prepare the error report for transport and push it down the web request's request stream.

After that I get the response from the server and tell the user that we have finished sending the error report to the server.

That pretty much completes the client side code. Here's the whole thing from start to finish:

using System;
using System.IO;
using System.Net;
using System.Text;

public class Program
{
public static readonly string Name = "Dividing program";
public static readonly string Version = "0.1";

public static string ProgramId
{
get { return string.Format("{0}/{1}", Name, Version); }
}

public static int Main(string[] args)
{
float a = 0, b = 0, c = 0;

try
{
Console.WriteLine(ProgramId);
Console.WriteLine("This program divides one number by another.");

Console.Write("Enter number 1: ");
a = float.Parse(Console.ReadLine());
Console.Write("Enter number 2: ");
b = float.Parse(Console.ReadLine());

c = a / b;
Console.WriteLine("Number 1 divided by number 2 is {0}.", c);
}
catch(Exception error)
{
Console.WriteLine("An error has occurred!");
Console.Write("Would you like to report it? [Y/n] ");

bool sendReport = false;
while(true)
{
// Wait for the user to press a key
ConsoleKey key = Console.ReadKey().Key;
if (key == ConsoleKey.Y) {
sendReport = true;
break;
}
else if (key == ConsoleKey.N)
break;
}
Console.WriteLine();

if(!sendReport)
{
Console.WriteLine("No report has been sent.");
Console.WriteLine("Press any key to exit.");
Console.ReadKey(true);
return 1;
}

Console.Write("Collecting data - ");
MemoryStream dataStream  = new MemoryStream();
StreamWriter dataIn = new StreamWriter(dataStream);
dataIn.WriteLine("***** Error Report *****");
dataIn.WriteLine(error.ToString());
dataIn.WriteLine();
dataIn.WriteLine("*** Details ***");
dataIn.WriteLine("a: {0}", a);
dataIn.WriteLine("b: {0}", b);
dataIn.WriteLine("c: {0}", c);
dataIn.Flush();

dataStream.Seek(0, SeekOrigin.Begin);
Console.WriteLine("done");

Console.Write("Sending report - ");
HttpWebRequest reportSender = WebRequest.CreateHttp("https://starbeamrainbowlabs.com/reportSender.php");
reportSender.Method = "POST";
reportSender.ContentType = "text/plain";
reportSender.UserAgent = ProgramId;
Stream requestStream = reportSender.GetRequestStream();
// Push the error report we built earlier down the request stream
dataStream.CopyTo(requestStream);
requestStream.Close();

WebResponse reportResponse = reportSender.GetResponse();
Console.WriteLine("done");
Console.WriteLine("Server response: {0}", ((HttpWebResponse)reportResponse).StatusDescription);
Console.WriteLine("Press any key to exit.");
Console.ReadKey(true);
return 1;
}

return 0;
}
}


(Pastebin, Raw)

Next up is the server side code. Since I'm familiar with it and it can be found on all popular web servers, I'm going to be using PHP here. You could write this in ASP.NET, too, but I'm not familiar with it, nor do I have the appropriate environment set up at the time of posting (though I certainly plan on looking into it).

The server code can be split up into 3 sections: the settings, receiving and extending the error report, and sending the error report on in an email. Part one is quite straightforward:

<?php
/// Settings ///
$settings = new stdClass();
$settings->fromAddress = "postasaurus@starbeamrainbowlabs.com";
$settings->toAddress = "bugs@starbeamrainbowlabs.com";

The above simply creates a new object and stores a few settings in it. I like to put settings at the top of small scripts like this because it both makes it easy to reconfigure them and allows for expansion later.

Next we need to receive the error report from the client:

// Get the error report from the client
$errorReport = file_get_contents("php://input");

PHP on a web server is smarter than you'd think and collects some useful information about the connected client, so we can collect a few interesting statistics and tag them onto the end of the error report like this:

// Add some extra information to it
$errorReport .= "\n*** Server Information ***\n";
$errorReport .= "Date / time reported: " . date("r") . "\n";
$errorReport .= "Reporting ip: " . $_SERVER['REMOTE_ADDR'] . "\n";
if(isset($_SERVER["HTTP_X_FORWARDED_FOR"]))
{
    $errorReport .= "The error report was forwarded through a proxy.\n";
    $errorReport .= "The proxy says that it forwarded the request from this address: " . $_SERVER['HTTP_X_FORWARDED_FOR'] . "\n\n";
}
if(isset($_SERVER["HTTP_USER_AGENT"]))
{
    $errorReport .= "The reporting client identifies themselves as: " . $_SERVER["HTTP_USER_AGENT"] . ".\n";
}

I'm adding the date and time here too just because the client could potentially fake it (they could fake everything, but that's a story for another time). I'm also collecting the client's user agent string too. This is being set in the client code above to the name and version of the program running. This information could be useful if you attach multiple programs to the same error reporting script. You could modify the client code to include the current .NET version, too by utilising Environment.Version.

Lastly, since the report has gotten this far, we really should do something with it. I decided I wanted to send it to myself in an email, but you could just as easily store it in a file using something like file_put_contents("bug_reports.txt", $errorReport, FILE_APPEND);. Here's the code I came up with:

$emailHeaders = [
    "From: $settings->fromAddress",
    "Content-Type: text/plain",
    "X-Mailer: PHP/" . phpversion()
];

$subject = "Error Report";
if(isset($_SERVER["HTTP_USER_AGENT"]))
    $subject .= " from " . $_SERVER["HTTP_USER_AGENT"];

mail($settings->toAddress, $subject, $errorReport, implode("\r\n", $emailHeaders), "-t");

?>
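Since the script only expects a plain-text POST body, it can be exercised without the C♯ client at all. The following is a minimal sketch using curl: send_test_report is a helper name I've made up, and the example.com address in the comment is a placeholder for wherever you end up uploading the script.

```shell
# Send a plain-text report to the error reporting script with curl.
# $1: url of the PHP script (hypothetical - use your own)
# $2: path to a file containing the report text
# $3: user agent to identify as (defaults to a made-up test name)
send_test_report() {
    local url="$1"
    local report_file="$2"
    local agent="${3:-Test client/0.0}"
    curl --silent --show-error \
        --request POST \
        --header 'Content-Type: text/plain' \
        --user-agent "$agent" \
        --data-binary "@$report_file" \
        "$url"
}

# Example invocation (commented out so that nothing is sent by accident):
# printf 'Test report\n' > /tmp/report.txt
# send_test_report 'https://example.com/reportSender.php' /tmp/report.txt 'Dividing program/0.1'
```

The user agent passed here is what ends up in the email subject, just as it does with the client's ProgramId.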


That completes the server side code. Here's the completed script:

<?php
/// Settings ///
$settings = new stdClass();
$settings->fromAddress = "postasaurus@starbeamrainbowlabs.com";
$settings->toAddress = "bugs@starbeamrainbowlabs.com";

// Get the error report from the client
$errorReport = file_get_contents("php://input");

// Add some extra information to it
$errorReport .= "\n*** Server Information ***\n";
$errorReport .= "Date / time reported: " . date("r") . "\n";
$errorReport .= "Reporting ip: " . $_SERVER['REMOTE_ADDR'] . "\n";
if(isset($_SERVER["HTTP_X_FORWARDED_FOR"]))
{
    $errorReport .= "The error report was forwarded through a proxy.\n";
    $errorReport .= "The proxy says that it forwarded the request from this address: " . $_SERVER['HTTP_X_FORWARDED_FOR'] . "\n\n";
}
if(isset($_SERVER["HTTP_USER_AGENT"]))
{
    $errorReport .= "The reporting client identifies themselves as: " . $_SERVER["HTTP_USER_AGENT"] . ".\n";
}

$emailHeaders = [
    "From: $settings->fromAddress",
    "Content-Type: text/plain",
    "X-Mailer: PHP/" . phpversion()
];

$subject = "Error Report";
if(isset($_SERVER["HTTP_USER_AGENT"]))
    $subject .= " from " . $_SERVER["HTTP_USER_AGENT"];

mail($settings->toAddress, $subject, $errorReport, implode("\r\n", $emailHeaders), "-t");

?>

(Pastebin, Raw)

The last job we need to do is to upload the PHP script to a PHP-enabled web server, and go back to the client and point it at the web address at which the PHP script is living.

If you have read this far, then you've done it! You should have by this point a simple working error reporting system. Here's an example error report email that I got whilst testing it:

***** Error Report *****
System.FormatException: Input string was not in a correct format.
at System.Number.ParseSingle (System.String value, NumberStyles options, System.Globalization.NumberFormatInfo numfmt) <0x7fe1c97de6c0 + 0x00158> in <filename unknown>:0
at System.Single.Parse (System.String s, NumberStyles style, System.Globalization.NumberFormatInfo info) <0x7fe1c9858690 + 0x00016> in <filename unknown>:0
at System.Single.Parse (System.String s) <0x7fe1c9858590 + 0x0001d> in <filename unknown>:0
at Program.Main (System.String[] args) <0x407d7d60 + 0x00180> in <filename unknown>:0

*** Details ***
a: 4
b: 0
c: 0

*** Server Information ***
Date / time reported: Mon, 11 Apr 2016 10:31:20 +0100
Reporting ip: 83.100.151.189
The reporting client identifies themselves as: Dividing program/0.1.

I mentioned at the beginning of this post that this approach has a flaw. The main problem lies in the fact that the PHP script can be abused by a knowledgeable attacker to send you lots of spam. I can't think of any real way to properly solve this, but I'd suggest storing the PHP script at a long and complicated URL that can't be easily guessed. There are probably other flaws as well, but I can't think of any at the moment.

Found a mistake? Got an improvement? Please leave a comment below!

## Converting Hashtags into Titles with PHP

Recently I have been working on a website for someone I know.
Mythdael (the awesome artist who created a design for this website!) did the design for this one too, and I felt that I had to bring it to life.

While I was writing it, I found that I needed to convert any given hashtag into a presentable title. Since hashtags are all one word (and usually lower case), it is more or less impossible to work out where one word starts and another ends. The solution: A wordlist (My choice was a modified enable1.txt).

Here is the solution I came up with:

/*
 * From https://terenceyim.wordpress.com/2011/02/01/all-purpose-binary-search-in-php/
 * Parameters:
 *   $a - The sort array.
 *   $first - First index of the array to be searched (inclusive).
 *   $last - Last index of the array to be searched (exclusive).
 *   $key - The key to be searched for.
 *   $compare - A user defined function for comparison. Same definition as the one in usort
 *
 * Return:
 *   index of the search key if found, otherwise return (-insert_index - 1).
 *   insert_index is the index of smallest element that is greater than $key or sizeof($a) if $key
 *   is larger than all elements in the array.
 */
function binary_search(array $a, $first, $last, $key, $compare) {
    $lo = $first;
    $hi = $last - 1;

    while ($lo <= $hi) {
        $mid = (int)(($hi - $lo) / 2) + $lo;
        $cmp = call_user_func($compare, $a[$mid], $key);

        if ($cmp < 0) {
            $lo = $mid + 1;
        } elseif ($cmp > 0) {
            $hi = $mid - 1;
        } else {
            return $mid;
        }
    }
    return -($lo + 1);
}

class hashtag_parser
{
    public $wordlist_length = 0;
    public $wordlist = [];

    function __construct($wordlist_path) {
        global $settings;

        $this->wordlist = file($wordlist_path, FILE_IGNORE_NEW_LINES);
        $this->wordlist_length = count($this->wordlist);
    }

    public function in_wordlist($word)
    {
        $word = strtolower($word);

        $result = binary_search($this->wordlist, 0, $this->wordlist_length, $word, "strcmp");

        if($result > -1)
            return true;
        else
            return false;
        /*
        if(in_array($word, $this->wordlist))
            return true;
        else
            return false;
        */
    }

    public function extract_words($hashtag)
    {
        global $settings;

        // Remove the hash from the beginning if it is present
        if(substr($hashtag, 0, 1) == "#")
            $hashtag = substr($hashtag, 1);

        // Create an array to hold the words we find
        $words = [];

        $length = strlen($hashtag); // Cache the length of the hashtag
        $pos = 0;
        while($pos < $length)
        {
//          echo("pos: $pos\n");
            // aim: find the length of the longest substring that is a valid
            // word according to the wordlist
            $longest_word_length = 0;
            for($scan_pos = $pos + 1; $scan_pos < $length + 1; $scan_pos++)
            {
//              echo("scan_pos: $scan_pos substring: " . substr($hashtag, $pos, $scan_pos - $pos) . "\n");
                if($this->in_wordlist(substr($hashtag, $pos, $scan_pos - $pos)))
                {
                    $longest_word_length = $scan_pos - $pos;
//                  echo("found word\n");
                }
            }

            // Set the length of the longest word to the remainder of the
            // string if we don't find any valid words
            if($longest_word_length == 0)
                $longest_word_length = $length - $pos;

            $words[] = substr($hashtag, $pos, $longest_word_length);

            $pos += $longest_word_length;
        }

        return ucwords(implode(" ", $words));
    }
}

The code is a bit messy (perhaps I should tidy it up a lot), but it does the job I intended it to do. I used a class because I was concerned that reading in the wordlist every time would cause the code to take too long to complete - it is slow enough as it is. Thankfully a binary search algorithm written in PHP by terenceyim helped speed things up enormously.

Anyway, it can be used like this:

$parser = new hashtag_parser("/path/to/wordlist.txt");
echo($parser->extract_words("sometext")); // Prints "Some Text" The algorithm I used is quite simple: 1. Loop over each character. 2. Scan ahead of the current character and figure out the length of the longest word in the input via the wordlist. 3. If no valid word can be found, assume that the rest of the input is all one word. 4. Extract the longest word we can find and add it to an array. 5. Add the length of the word we found to the character pointer. 6. If we haven't reached the end of the given input, go to step 2. 7. If we have reach the end of the output, return the words we found. I am posting this here in the hopes that someone else will find this code useful :) ## Probably the world's most advanced string splitting function When I was setting up this website, I foolishly picked a custom log file format that is rather hard for computers to parse. Because of this I haven't found a server log analysis tool that is intelligent enough to parse my logs (if you know of one please let me know in the comments!). Here is an example of a typical log file entry: [19/May/2015:00:47:52 +0100] "starbeamrainbowlabs.com" HTTP/1.1 GET 200 162.243.87.220 0s :443 /blog/article.php article=posts/013-Terminal-Reference.html "https://starbeamrainbowlabs.com/blog/?offset=60" "Mozilla/5.0 (compatible; spbot/4.4.2; +http://OpenLinkProfiler.org/bot )" It looks strange, doesn't it? Since I want to have some idea of how many people are visiting my site, I have finally gotten around to writing my own custom log parser. In order to do this, I needed a way to convert each line into an array of terms. None of the answers on stackoverflow seemed to cut it, so I wrote my own: <?php function explode_adv($openers, $closers,$togglers, $delimiters,$str)
{
    $chars = str_split($str);
    $parts = [];
    $nextpart = "";
    $toggle_states = array_fill_keys($togglers, false); // true = now inside, false = now outside
    $depth = 0;
    foreach($chars as $char)
    {
        if(in_array($char, $openers))
            $depth++;
        elseif(in_array($char, $closers))
            $depth--;
        elseif(in_array($char, $togglers))
        {
            if($toggle_states[$char])
                $depth--; // we are inside a toggle block, leave it and decrease the depth
            else
                // we are outside a toggle block, enter it and increase the depth
                $depth++;

            // invert the toggle block state
            $toggle_states[$char] = !$toggle_states[$char];
        }
        else
            $nextpart .= $char;

        if($depth < 0) $depth = 0;

        if(in_array($char, $delimiters) &&
            $depth == 0 &&
            !in_array($char, $closers))
        {
            $parts[] = substr($nextpart, 0, -1);
            $nextpart = "";
        }
    }
    if(strlen($nextpart) > 0)
        $parts[] = $nextpart;

    return $parts;
}
?>

I have also posted this on Stack Overflow. This function takes 5 parameters:

1. An array of characters that open a block - e.g. [, (, etc.
2. An array of characters that close a block - e.g. ], ), etc.
3. An array of characters that toggle a block - e.g. ", ', etc.
4. An array of characters that should cause a split into the next part.
5. The string to work on.

This function probably has flaws, but it works well enough for me. You can also find it on GitHub Gist - as always, suggestions and contributions are welcome :)

## IP version tester

You may have heard already - we have run out of IPv4 addresses. An IPv4 address is 32 bits long and looks like this: 37.187.192.179. If you count up all the possible combinations (each of the four sections may be between 0 and 255) and leave out the addresses reserved for special purposes, you get about 3,706,452,992 addresses.

The new system that the world is currently moving to (very slowly, mind you) is called IPv6. IPv6 addresses are 128 bits long and look like this: 2001:41d0:52:a00::68e. This gives us a virtually unlimited supply of addresses, so we should never run out. The problem is that the world is moving over to it far too slowly, and you can never be sure whether you have IPv6 connectivity or not.

I built a quick IP version tester to solve this problem. I know there are others out there, but I wanted to build one myself :) You can find it here: Ip Version Tester.

## Finding Favicons with PHP

There hasn't been a post here for a little while because I have been ill. I am back now though :)

While working more on Bloworm, I needed a function that would automatically detect the url of the favicon associated with a given url. I wrote a quick function to do this a while ago - and have been improving it little by little. I now have it at a point where it finds the correct url 99% of the time, so I thought that I would share it with you.
/*
 * @summary Given a url, this function will attempt to find its corresponding favicon.
 *
 * @returns The url of the corresponding favicon.
 */
function auto_find_favicon_url($url)
{
    if(!validate_url($url))
        senderror(new api_error(400, 520, "The url you specified for the favicon was invalid."));

    // todo protect against downloading large files
    // todo send HEAD request instead of GET request
    // note: get_headers() and file_get_contents() return false on failure, so the
    // catch blocks below only fire if warnings are converted into exceptions
    try {
        $headers = get_headers($url, true);
    } catch (Exception $e) {
        senderror(new api_error(502, 710, "Failed to fetch the headers from url: $url"));
    }
    $headers = array_change_key_case($headers);

    $urlparts = [];
    preg_match("/^([a-z]+)\:(?:\/\/)?([^\/?#]+)(.*)/i", $url, $urlparts);

    $content_type = $headers["content-type"];
    if(!is_string($content_type)) // account for arrays of content types
        $content_type = $content_type[0];

    $faviconurl = "images/favicon-default.png";
    if(strpos($content_type, "text/html") !== false)
    {
        try {
            $html = file_get_contents($url);
        } catch (Exception $e) {
            senderror(new api_error(502, 711, "Failed to fetch url: $url"));
        }

        $matches = [];
        if(preg_match("/rel=\"shortcut(?: icon)?\" (?:href=[\'\"]([^\'\"]+)[\'\"])/i", $html, $matches) === 1)
        {
            $faviconurl = $matches[1];
            // make sure that the favicon url is absolute
            if(preg_match("/^[a-z]+\:(?:\/\/)?/i", $faviconurl) === 0)
            {
                // the url is not absolute, make it absolute
                $basepath = dirname($urlparts[3]);
                // the path should not include the basepath if the favicon url begins with a slash
                if(substr($faviconurl, 0, 1) === "/")
                {
                    $faviconurl = "$urlparts[1]://$urlparts[2]$faviconurl";
                }
                else
                {
                    $faviconurl = "$urlparts[1]://$urlparts[2]$basepath/$faviconurl";
                }
            }
        }
    }

    if($faviconurl == "images/favicon-default.png")
    {
        // we have not found the url of the favicon yet, parse the url
        // todo guard against invalid urls

        $faviconurl = "$urlparts[1]://$urlparts[2]/favicon.ico";
        $faviconurl = follow_redirects($faviconurl);
        $favheaders = get_headers($faviconurl, true);
        $favheaders = array_change_key_case($favheaders);

        // fall back to the default icon if /favicon.ico did not answer with a 2xx status
        if(preg_match("/\s2\d{2}\b/", $favheaders[0]) === 0)
            return "images/favicon-default.png";
    }
    return $faviconurl;
}

This code is pulled directly from the Bloworm source code - so you will need to edit it slightly to suit your needs. It is not perfect, and will probably be updated from time to time.
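If you want to experiment with the two trickiest steps (finding the icon link in the HTML and turning it into an absolute url) without hitting the network, here is a minimal sketch. The names `extract_favicon_href` and `resolve_favicon_url` are hypothetical and are not part of Bloworm. It assumes the page url is a valid absolute url, and it does not handle `../` segments or a `<base>` tag.

```php
<?php
// Hypothetical helper (not from Bloworm): find the href of the first
// <link> tag whose rel attribute contains the token "icon".
function extract_favicon_href(string $html): ?string
{
    // Grab every <link ...> tag, then inspect its attributes one by one,
    // so the order of rel and href inside the tag does not matter.
    if (!preg_match_all('/<link\b[^>]*>/i', $html, $tags))
        return null;

    foreach ($tags[0] as $tag) {
        $is_icon = preg_match('/\brel\s*=\s*["\']([^"\']*)["\']/i', $tag, $rel) === 1
            && in_array("icon", preg_split('/\s+/', strtolower(trim($rel[1]))), true);
        if ($is_icon && preg_match('/\bhref\s*=\s*["\']([^"\']+)["\']/i', $tag, $href) === 1)
            return $href[1];
    }
    return null;
}

// Hypothetical helper (not from Bloworm): make a favicon href absolute,
// relative to the page it was found on.
function resolve_favicon_url(string $page_url, string $href): string
{
    // Already absolute (starts with a scheme): nothing to do
    if (preg_match('/^[a-z][a-z0-9+.-]*:/i', $href) === 1)
        return $href;

    $parts = parse_url($page_url);
    $origin = $parts["scheme"] . "://" . $parts["host"]
        . (isset($parts["port"]) ? ":" . $parts["port"] : "");

    // Protocol-relative (//host/path): reuse the page's scheme
    if (substr($href, 0, 2) === "//")
        return $parts["scheme"] . ":" . $href;

    // Root-relative (/path): attach straight to the origin
    if (substr($href, 0, 1) === "/")
        return $origin . $href;

    // Relative: resolve against the directory part of the page's path
    $path = $parts["path"] ?? "/";
    $dir = substr($path, 0, strrpos($path, "/") + 1);
    return $origin . $dir . $href;
}

$html = '<head><link href="/static/fav.png" rel="icon" type="image/png"></head>';
$href = extract_favicon_href($html);
echo resolve_favicon_url("https://example.com/blog/post.html", $href), "\n";
// Prints "https://example.com/static/fav.png"
```

Using `parse_url()` instead of a hand-written regex keeps the query string out of the base path and preserves a non-standard port, though a real HTML parser would be more robust than matching tags with regular expressions.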

## Following Redirects in PHP

Recently I have found that PHP sometimes doesn't follow redirects (e.g. the get_headers() function). So I wrote this quick function to follow a url's redirects to a certain depth:

/*
* @summary Follows a chain of redirects and returns the last url in the sequence.
*
* @param $url - The url to start at.
* @param $maxdepth - The maximum depth to which to travel following redirects.
*
* @returns The url at the end of the redirect chain.
*/
function follow_redirects($url, $maxdepth = 10, $depth = 0)
{
    //return the current url if we have hit the maximum depth
    if($depth >= $maxdepth)
        return $url;

    $headers = get_headers($url, true);
    $headers = array_change_key_case($headers);
    //we have a redirect if the location header is set
    if(isset($headers["location"]))
    {
        return follow_redirects($headers["location"], $maxdepth, $depth + 1);