BkTree
in package
A serialisable BK-Tree Implementation.
Ref: https://nullwords.wordpress.com/2013/03/13/the-bk-tree-a-data-structure-for-spell-checking/
Table of Contents
- __construct() : mixed
- add() : int
- Adds a string to the tree.
- clear() : void
- close() : void
- Saves changes to the tree back to disk.
- edit_distance() : int
- A utility function for calculating edit distance.
- get_costs() : stdClass
- Get the current levenshtein costs.
- lookup() : array<string|int, string>
- Generator that walks the BK-Tree and iteratively yields results.
- remove() : bool
- Removes a string from the tree.
- set_costs() : void
- Set the levenshtien insert/delete/replace costs.
- stats() : array<string|int, mixed>
- Calculate statistics about the BK-Tree.
- trace() : array<string|int, mixed>
- walk() : Generator<string|int, stdClass>
- Iteratively walks the BkTree.
Methods
__construct()
public
__construct(string $filename, string $seed_word) : mixed
Parameters
- $filename : string
- $seed_word : string
Return values
mixed —add()
Adds a string to the tree.
public
add(string $string, int $starting_node_id) : int
Note that duplicates can be added if you're not careful!
Parameters
- $string : string
-
The string to add.
- $starting_node_id : int
-
The id fo node to start insertion from. Defaults to 0 - for internal use only.
Return values
int —The depth at which the new node was added.
clear()
public
clear() : void
Return values
void —close()
Saves changes to the tree back to disk.
public
close() : void
Return values
void —edit_distance()
A utility function for calculating edit distance.
public
edit_distance(string $a, string $b) : int
Warning: Do not use this internally! It is slow. It's much faster to do this directly. This exists only for external use.
Parameters
- $a : string
-
The first string.
- $b : string
-
The second string to compare against.
Return values
int —The computed edit distance.
get_costs()
Get the current levenshtein costs.
public
get_costs() : stdClass
Return values
stdClass —The current levenshtein insert/delete/replace costs.
lookup()
Generator that walks the BK-Tree and iteratively yields results.
public
lookup(string $string[, int $max_distance = 1 ], int $count) : array<string|int, string>
Note that the returned array is not sorted.
Parameters
- $string : string
-
The search string.
- $max_distance : int = 1
-
The maximum edit distance to search.
- $count : int
-
The number of results to return. 0 = All results found. Note that results will be in a random order.
Return values
array<string|int, string> —Similar resultant strings from the BK-Tree.
remove()
Removes a string from the tree.
public
remove(string $string) : bool
BUG: If this deletes the root node, then it's all over and it will crash
Parameters
- $string : string
-
The string to remove.
Return values
bool —Whether the removal was successful.
set_costs()
Set the levenshtien insert/delete/replace costs.
public
set_costs(int $insert, int $delete, int $replace) : void
Note that if these values change, the entire tree needs to be rebuilt.
Parameters
- $insert : int
-
The insert cost.
- $delete : int
-
The cost to delete a character.
- $replace : int
-
The cost to replace a character.
Return values
void —stats()
Calculate statistics about the BK-Tree.
public
stats() : array<string|int, mixed>
Useful for analysing a tree's structure. If the tree isn't balanced, you may need to insert items in a different order.
Return values
array<string|int, mixed> —An array of statistics about this BK-Tree.
trace()
public
trace(string $string) : array<string|int, mixed>
Parameters
- $string : string
Return values
array<string|int, mixed> —walk()
Iteratively walks the BkTree.
public
walk() : Generator<string|int, stdClass>
Warning: This is slow
Return values
Generator<string|int, stdClass> —A generator that iteratively walks the tree and yields every item therein that's connected to the root node.