Pepperminty Wiki Module API

BkTree
in package

A serialisable BK-Tree Implementation.

Ref: https://nullwords.wordpress.com/2013/03/13/the-bk-tree-a-data-structure-for-spell-checking/

Table of Contents

__construct()  : mixed
add()  : int
Adds a string to the tree.
clear()  : void
close()  : void
Saves changes to the tree back to disk.
edit_distance()  : int
A utility function for calculating edit distance.
get_costs()  : stdClass
Get the current levenshtein costs.
lookup()  : array<string|int, string>
Generator that walks the BK-Tree and iteratively yields results.
remove()  : bool
Removes a string from the tree.
set_costs()  : void
Set the levenshtien insert/delete/replace costs.
stats()  : array<string|int, mixed>
Calculate statistics about the BK-Tree.
trace()  : array<string|int, mixed>
walk()  : Generator<string|int, stdClass>
Iteratively walks the BkTree.

Methods

__construct()

public __construct(string $filename, string $seed_word) : mixed
Parameters
$filename : string
$seed_word : string
Return values
mixed

add()

Adds a string to the tree.

public add(string $string, int $starting_node_id) : int

Note that duplicates can be added if you're not careful!

Parameters
$string : string

The string to add.

$starting_node_id : int

The id fo node to start insertion from. Defaults to 0 - for internal use only.

Return values
int

The depth at which the new node was added.

edit_distance()

A utility function for calculating edit distance.

public edit_distance(string $a, string $b) : int

Warning: Do not use this internally! It is slow. It's much faster to do this directly. This exists only for external use.

Parameters
$a : string

The first string.

$b : string

The second string to compare against.

Return values
int

The computed edit distance.

get_costs()

Get the current levenshtein costs.

public get_costs() : stdClass
Return values
stdClass

The current levenshtein insert/delete/replace costs.

lookup()

Generator that walks the BK-Tree and iteratively yields results.

public lookup(string $string[, int $max_distance = 1 ], int $count) : array<string|int, string>

Note that the returned array is not sorted.

Parameters
$string : string

The search string.

$max_distance : int = 1

The maximum edit distance to search.

$count : int

The number of results to return. 0 = All results found. Note that results will be in a random order.

Return values
array<string|int, string>

Similar resultant strings from the BK-Tree.

remove()

Removes a string from the tree.

public remove(string $string) : bool

BUG: If this deletes the root node, then it's all over and it will crash

Parameters
$string : string

The string to remove.

Return values
bool

Whether the removal was successful.

set_costs()

Set the levenshtien insert/delete/replace costs.

public set_costs(int $insert, int $delete, int $replace) : void

Note that if these values change, the entire tree needs to be rebuilt.

Parameters
$insert : int

The insert cost.

$delete : int

The cost to delete a character.

$replace : int

The cost to replace a character.

Return values
void

stats()

Calculate statistics about the BK-Tree.

public stats() : array<string|int, mixed>

Useful for analysing a tree's structure. If the tree isn't balanced, you may need to insert items in a different order.

Return values
array<string|int, mixed>

An array of statistics about this BK-Tree.

trace()

public trace(string $string) : array<string|int, mixed>
Parameters
$string : string
Return values
array<string|int, mixed>

walk()

Iteratively walks the BkTree.

public walk() : Generator<string|int, stdClass>

Warning: This is slow

Return values
Generator<string|int, stdClass>

A generator that iteratively walks the tree and yields every item therein that's connected to the root node.

Search results