PHP Library to calculate and compare Nilsimsa digests.
The Nilsimsa hash is a locality-sensitive hash function: similar documents generally have similar Nilsimsa digests. The Hamming distance between two digests can be used to approximate the similarity between the documents. For further information, consult http://en.wikipedia.org/wiki/Nilsimsa_Hash and its references (particularly Damiani et al.).
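To make the Hamming-distance idea concrete, here is a minimal sketch that counts the differing bits between two hex digests of equal length. The function name hamming_distance_hex() is hypothetical and is not part of this library; the sketch only assumes that digests are available as hex strings (a Nilsimsa digest is 256 bits, i.e. 64 hex characters). The score convention mentioned in the final comment (128 minus the number of differing bits) is stated as an assumption, not as this library's documented behavior.

```php
<?php
// Hypothetical helper, not part of the library: counts the number of
// differing bits between two hex-encoded digests of equal length.
function hamming_distance_hex(string $hexA, string $hexB): int
{
    // Validate up front so hex2bin() never sees malformed input.
    if (
        strlen($hexA) !== strlen($hexB)
        || strlen($hexA) % 2 !== 0
        || !ctype_xdigit($hexA)
        || !ctype_xdigit($hexB)
    ) {
        throw new InvalidArgumentException(
            'Digests must be valid hex strings of equal, even length.'
        );
    }

    $a = hex2bin($hexA);
    $b = hex2bin($hexB);

    $distance = 0;
    for ($i = 0, $n = strlen($a); $i < $n; $i++) {
        // XOR leaves a 1 bit wherever the two bytes differ.
        $diff = ord($a[$i]) ^ ord($b[$i]);
        // Count the set bits in the XOR result.
        while ($diff !== 0) {
            $distance += $diff & 1;
            $diff >>= 1;
        }
    }

    return $distance;
}

// Usage with two artificial 256-bit digests (64 hex characters each).
$allZeros = str_repeat('0', 64);
$allOnes  = str_repeat('f', 64);

echo hamming_distance_hex($allZeros, $allZeros), "\n"; // prints 0
echo hamming_distance_hex($allZeros, $allOnes), "\n";  // prints 256

// Assumption: under a "128 minus differing bits" convention, these two
// cases would map to scores of 128 (identical) and -128 (all bits differ).
```

A smaller distance means the digests, and therefore probably the documents, are more alike. Bit counting is done per byte here for clarity rather than speed.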
Implementation details: The Nilsimsa class takes a data parameter, which is the string contents of the document to digest. Calling the methods hexdigest() and digest() returns the Nilsimsa digest in hex or array format, respectively. The helper function compare_digests() takes two digests and computes the Nilsimsa score. You can also use compare_files() and compare_strings() to compare files and strings directly.
This code is a port of py-nilsimsa, located at https://code.google.com/p/py-nilsimsa/.
- The MIT License (MIT)
- lines_count() : mixed
lines_count(mixed $handle) : mixed
- $handle : mixed