com.swabunga.spell.engine

Class EditDistance

public class EditDistance extends Object

This class is based on the Levenshtein distance algorithm and calculates how similar two words are. If the words are identical, the distance is 0. The more the words have in common, the lower the distance value. The distance is based on how many operations it takes to get from one word to the other. The possible operations are swapping characters, adding a character, deleting a character, and substituting a character. The resulting distance is the sum of these operations weighted by their costs, which can be set in the Configuration object. When there are multiple ways to convert one word into the other, the lowest-cost distance is returned.
Another way to think about this: what are the cheapest operations that would have to be done on the "original" word to end up with the "similar" word? Each operation has a cost, and these are added up to get the distance.
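The description above can be sketched as a small dynamic-programming routine. This is an illustrative sketch only, not the class's actual source: the class name WeightedEditDistanceSketch, the method distance, and its four cost parameters are hypothetical, and it assumes that a swap exchanges two adjacent characters. In the real class the costs come from the Configuration object rather than from method arguments.

```java
// Illustrative sketch only; names and the adjacent-swap assumption are hypothetical.
final class WeightedEditDistanceSketch {

    private WeightedEditDistanceSketch() {
    }

    /**
     * Returns the cheapest total cost of turning {@code word} into {@code similar}
     * using removals, insertions, substitutions and adjacent swaps.
     */
    static int distance(String word, String similar,
                        int costRemove, int costInsert, int costSubst, int costSwap) {
        int n = word.length();
        int m = similar.length();

        // d[i][j] = cheapest cost of turning the first i chars of word
        // into the first j chars of similar.
        int[][] d = new int[n + 1][m + 1];

        // Turning a prefix of word into the empty string: remove every character.
        for (int i = 1; i <= n; i++) {
            d[i][0] = d[i - 1][0] + costRemove;
        }
        // Turning the empty string into a prefix of similar: insert every character.
        for (int j = 1; j <= m; j++) {
            d[0][j] = d[0][j - 1] + costInsert;
        }

        for (int i = 1; i <= n; i++) {
            char a = word.charAt(i - 1);
            for (int j = 1; j <= m; j++) {
                char b = similar.charAt(j - 1);

                // Keep the character if it matches, otherwise substitute it.
                int best = d[i - 1][j - 1] + (a == b ? 0 : costSubst);
                // Remove a character from word.
                best = Math.min(best, d[i - 1][j] + costRemove);
                // Insert a character from similar.
                best = Math.min(best, d[i][j - 1] + costInsert);
                // Swap two adjacent characters ("eh" -> "he").
                if (i > 1 && j > 1 && a != b
                        && a == similar.charAt(j - 2)
                        && word.charAt(i - 2) == b) {
                    best = Math.min(best, d[i - 2][j - 2] + costSwap);
                }

                d[i][j] = best;
            }
        }
        return d[n][m];
    }
}
```

With every cost set to 1, distance("teh", "the", 1, 1, 1, 1) is 1 (a single swap) and distance("kitten", "sitting", 1, 1, 1, 1) is 3. Raising one cost relative to the others makes that kind of edit count as a larger difference between the words.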

See Also: COST_REMOVE_CHAR, COST_INSERT_CHAR, COST_SUBST_CHARS, COST_SWAP_CHARS

Field Summary
static Configuration config
Fetches the spell engine configuration properties.
Method Summary
static int getDistance(String word, String similar)
Evaluates the distance between two words.
static int getDistance(String word, String similar, int[][] matrix)
Evaluates the distance between two words.
static void main(String[] args)
For testing edit distances.

Field Detail

config

public static Configuration config
Fetches the spell engine configuration properties.

Method Detail

getDistance

public static final int getDistance(String word, String similar)
Evaluates the distance between two words.

Parameters: word - one word to evaluate; similar - the other word to evaluate

Returns: a number representing how easy or complex it is to transform one word into the similar one; lower values mean the words are more alike, and 0 means they are identical.

getDistance

public static final int getDistance(String word, String similar, int[][] matrix)
Evaluates the distance between two words.

Parameters: word - one word to evaluate; similar - the other word to evaluate

Returns: a number representing how easy or complex it is to transform one word into the similar one; lower values mean the words are more alike, and 0 means they are identical.

main

public static void main(String[] args)
For testing edit distances.

Parameters: args - an array of two strings whose distance is to be evaluated

Throws: java.lang.Exception - when a problem occurs while reading args