com.swabunga.spell.engine

Class SpellDictionaryASpell

public abstract class SpellDictionaryASpell extends Object implements SpellDictionary

Container for various methods that any SpellDictionary will use. This class is based on the original Jazzy aspell port.

Derived classes need a word list file as their spell-checking reference. A word list file is a dictionary with one word per line. Many open source dictionary files are available, see: http://wordlist.sourceforge.net/

You can also take word lists from the aspell dictionaries, which cover many different languages. To obtain some, install aspell and the dictionaries you require. Then run aspell, specifying the name of the dictionary and the word list file to dump it into, for example:

 aspell --master=fr-40 dump master > fr-40.txt
 
Note: the number following the language code is a size indicator. A bigger number gives more extensive coverage of the language. Size 40 is more than adequate for most uses.
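
As an illustration of the one-word-per-line format, the following is a minimal sketch of how a dumped file such as fr-40.txt could be read into memory. The class WordListLoader and its load method are hypothetical names introduced here for illustration and are not part of Jazzy; the character set must match whatever encoding the dump was written in.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Jazzy: reads a word list file
// that holds one word per line.
class WordListLoader {

    // Returns the distinct, non-blank words of the file in file order.
    static Set<String> load(Path file, Charset charset) throws IOException {
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader in = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = in.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word); // duplicates are silently ignored by the set
                }
            }
        }
        return words;
    }
}
```

A caller might use it as WordListLoader.load(Paths.get("fr-40.txt"), StandardCharsets.ISO_8859_1), where the encoding shown is only an assumption about the dump.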

For some languages, Aspell can also supply the phonetic file. On Windows, go to the Aspell data directory and copy the phonetic file corresponding to your language, for example fr_phonet.dat for the fr language. On Unix, the phonetic file should be in the directory /usr/share/aspell.

See Also: GenericTransformator for information on phonetic files.

Field Summary
protected Transformator tf
The reference to a Transformator, used to transform a word into its phonetic code.
Constructor Summary
SpellDictionaryASpell(File phonetic)
Constructs a new SpellDictionaryASpell
SpellDictionaryASpell(File phonetic, String encoding)
Constructs a new SpellDictionaryASpell
SpellDictionaryASpell(Reader phonetic)
Constructs a new SpellDictionaryASpell
Method Summary
String getCode(String word)
Returns the phonetic code representing the word.
List getSuggestions(String word, int threshold)
Returns a list of Word objects that are the suggestions to an incorrect word.
List getSuggestions(String word, int threshold, int[][] matrix)
Returns a list of Word objects that are the suggestions to an incorrect word.
protected abstract List getWords(String phoneticCode)
Returns a list of words that have the same phonetic code.
boolean isCorrect(String word)
Returns true if the word is correctly spelled against the current word list.

Field Detail

tf

protected Transformator tf
The reference to a Transformator, used to transform a word into its phonetic code.

Constructor Detail

SpellDictionaryASpell

public SpellDictionaryASpell(File phonetic)
Constructs a new SpellDictionaryASpell

Parameters: phonetic - The file to use for phonetic transformation of the word list. If phonetic is null, the DoubleMeta transformation is used.

Throws: java.io.IOException - Indicates problems reading the phonetic information.

SpellDictionaryASpell

public SpellDictionaryASpell(File phonetic, String encoding)
Constructs a new SpellDictionaryASpell

Parameters:
phonetic - The file to use for phonetic transformation of the word list. If phonetic is null, the DoubleMeta transformation is used.
encoding - The character set encoding to use.

Throws: java.io.IOException - Indicates problems reading the phonetic information.

SpellDictionaryASpell

public SpellDictionaryASpell(Reader phonetic)
Constructs a new SpellDictionaryASpell

Parameters: phonetic - The Reader to use for phonetic transformation of the word list. If phonetic is null, the DoubleMeta transformation is used.

Throws: java.io.IOException - Indicates problems reading the phonetic information.

Method Detail

getCode

public String getCode(String word)
Returns the phonetic code representing the word.

Parameters: word - The word for which the phonetic code is wanted.

Returns: The value of the phonetic code for the word.

getSuggestions

public List getSuggestions(String word, int threshold)
Returns a list of Word objects that are the suggestions to an incorrect word.

This method is only needed to provide backward compatibility.

Parameters:
word - The misspelt word for which suggestions are wanted.
threshold - The lower boundary of similarity to the misspelt word.

Returns: A List of suggestions.

See Also: getSuggestions(String, int, int[][])

getSuggestions

public List getSuggestions(String word, int threshold, int[][] matrix)
Returns a list of Word objects that are the suggestions to an incorrect word.

Parameters:
word - The misspelt word for which suggestions are wanted.
threshold - The lower boundary of similarity to the misspelt word.
matrix - Two-dimensional int array used to calculate the edit distance. Allocating this memory outside of the method and reusing it across calls will greatly improve efficiency.

Returns: A List of suggestions.
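
To show why a caller-allocated matrix helps, here is a minimal sketch of an edit distance routine that writes into a matrix supplied by the caller, so that one allocation serves many comparisons. It is a plain Levenshtein distance with unit costs, used here only as a stand-in; it is not Jazzy's own distance calculation, and the class name MatrixReuseSketch is hypothetical.

```java
// Hypothetical sketch, not Jazzy code: plain Levenshtein distance that
// reuses a matrix allocated by the caller.
class MatrixReuseSketch {

    // matrix must have at least (a.length() + 1) rows and (b.length() + 1) columns.
    static int distance(String a, String b, int[][] matrix) {
        int n = a.length();
        int m = b.length();
        // First column and first row: distance from or to the empty prefix.
        for (int i = 0; i <= n; i++) {
            matrix[i][0] = i;
        }
        for (int j = 0; j <= m; j++) {
            matrix[0][j] = j;
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int substitution = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int viaDelete = matrix[i - 1][j] + 1;
                int viaInsert = matrix[i][j - 1] + 1;
                int viaSubstitute = matrix[i - 1][j - 1] + substitution;
                matrix[i][j] = Math.min(Math.min(viaDelete, viaInsert), viaSubstitute);
            }
        }
        return matrix[n][m];
    }
}
```

Every cell that is read is overwritten first, so the same array can be passed to successive calls without clearing it, provided it is large enough for the longest pair of words.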

getWords

protected abstract List getWords(String phoneticCode)
Returns a list of words that have the same phonetic code.

Parameters: phoneticCode - The phonetic code common to the list of words.

Returns: A list of words having the same phonetic code.
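
The idea behind this lookup can be sketched as a map from phonetic code to the words that share it. The example below is self-contained and does not extend SpellDictionaryASpell; the class PhoneticIndexSketch and its toyCode method are hypothetical, and toyCode is a deliberately crude stand-in for a real phonetic transformation such as the one a Transformator provides.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not Jazzy code: groups words by a phonetic code
// so that all words sharing a code can be fetched in one lookup.
class PhoneticIndexSketch {

    private final Map<String, List<String>> byCode = new HashMap<>();

    // Toy code: upper-case the word, keep the first letter,
    // and drop every later vowel (A, E, I, O, U, Y).
    static String toyCode(String word) {
        String upper = word.toUpperCase(Locale.ROOT);
        if (upper.isEmpty()) {
            return "";
        }
        return upper.charAt(0) + upper.substring(1).replaceAll("[AEIOUY]", "");
    }

    // Files the word under its toy phonetic code.
    void addWord(String word) {
        byCode.computeIfAbsent(toyCode(word), k -> new ArrayList<>()).add(word);
    }

    // Returns the words filed under the code, or an empty list if there are none.
    List<String> getWords(String phoneticCode) {
        List<String> words = byCode.get(phoneticCode);
        return words == null ? Collections.<String>emptyList() : words;
    }
}
```

With this toy code, "color" and "colour" are filed under the same key, which is the property a suggestion search relies on when it looks up candidates by the code of the misspelt word.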

isCorrect

public boolean isCorrect(String word)
Returns true if the word is correctly spelled against the current word list.