Friday, June 1, 2007

getSimilarityScore() - uses Levenshtein Distance between two strings

I came up with this while working for Microsoft developing the Vista
Weather gadget. It was to be used to handle mis-spelings and to
otherwise find strings that were as close as possible to what had been
typed in.

This is useful code that is good to have around. It is a Perfect
Example of what I am trying to accomplish with this blog - - an archive
of Shannon Norrell cool webdood code. I write so much of it that lots
of little gems like this fall through the cracks. Half the time I don't
even remember that I wrote something, let alone recall how it was
written.

Enjoy.

-shannon norrell

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" >
<head>
<title>getSimilarityScore</title>
<script language="javascript" type="text/javascript">
String.prototype.getSimilarityScore = function(c){
// Uses Levenshtein Distance between two strings algorithm
var s, that=c, l = (s = this.split("")).length, t = (c =
c.split("")).length, i, j, m, n, dis, maxLen, retVal=.95;
if(!(l || t)) {
dis = Math.max(l, t);
} else {
for(var a = [], i = l + 1; i; a[--i] = [i]);
for(i = t + 1; a[0][--i] = i;);
for(i = -1, m = s.length; ++i < m;)
for(j = -1, n = c.length; ++j < n;)
a[(i *= 1) + 1][(j *= 1) + 1] = Math.min(a[i][j + 1] +
1, a[i + 1][j] + 1, a[i][j] + (s[i] != c[j]));
dis = a[l][t];
}
maxLen = Math.max( this.length, that.length );
if (maxLen > 0) {
retVal = (1 - dis/maxLen);
}
return retVal;
};

var foo="Newcastle,wa";
alert( foo.getSimilarityScore('castel') );

</script>
</head>
<body>

</body>
</html>

No comments: