com.swabunga.spell.engine

Class SpellDictionaryHashMap

public class SpellDictionaryHashMap extends SpellDictionaryASpell

The SpellDictionaryHashMap holds the dictionary.

This class is thread safe. Derived classes should ensure that this is preserved.

There are many open source dictionary files. For just a few see: http://wordlist.sourceforge.net/

This dictionary class reads words one per line. Make sure that your word list is formatted in this way (most are).
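As a hedged illustration of the one-word-per-line format, the sketch below reads such a list with plain java.io classes. It is not the Jazzy implementation: the class name WordListFormatSketch, the method readWords, and the choice to trim whitespace and skip blank lines are all assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the Jazzy source.
final class WordListFormatSketch {

    // Reads one word per line. Trimming and skipping blank lines are
    // assumptions of this sketch, not documented library behaviour.
    static List<String> readWords(Reader source) {
        List<String> words = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        Reader sample = new StringReader("apple\nbanana\n\ncherry\n");
        System.out.println(readWords(sample));
    }
}
```

A list that puts several words on one line would be read here as a single entry, which is why the format matters.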

Note that you must create the dictionary with a word list for the added words to persist.

Field Summary
protected Hashtable mainDictionary
The hashmap that contains the word dictionary.
Constructor Summary
SpellDictionaryHashMap()
Dictionary Constructor.
SpellDictionaryHashMap(Reader wordList)
Dictionary Constructor.
SpellDictionaryHashMap(File wordList)
Dictionary convenience Constructor.
SpellDictionaryHashMap(File wordList, File phonetic)
Dictionary constructor that uses an aspell phonetic file to build the transformation table.
SpellDictionaryHashMap(File wordList, File phonetic, String phoneticEncoding)
Dictionary constructor that uses an aspell phonetic file to build the transformation table.
SpellDictionaryHashMap(Reader wordList, Reader phonetic)
Dictionary constructor that uses an aspell phonetic file to build the transformation table.
Method Summary
void addDictionary(File wordList)
Add words from a file to existing dictionary hashmap.
void addDictionary(Reader wordList)
Add words from a Reader to existing dictionary hashmap.
protected void addDictionaryHelper(BufferedReader in)
Adds to the existing dictionary from a word list file.
void addWord(String word)
Add a word permanently to the dictionary (and the dictionary file).
protected void createDictionary(BufferedReader in)
Constructs the dictionary from a word list file.
List getWords(String code)
Returns a list of strings (words) for the code.
boolean isCorrect(String word)
Returns true if the word is correctly spelled against the current word list.
protected void putWord(String word)
Allocates a word in the dictionary.
protected void putWordUnique(String word)
Allocates a word, if it is not already present in the dictionary.

Field Detail

mainDictionary

protected Hashtable mainDictionary
The hashmap that contains the word dictionary. The map is keyed on the double meta code. Each map entry contains a LinkedList of the words that share that double meta code.
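The layout described above can be pictured with a minimal sketch, assuming a Hashtable from code to LinkedList. This is not the library's code: the class PhoneticIndexSketch and its methods are hypothetical, and code() is a deliberately crude stand-in (upper-case, vowels dropped after the first letter) rather than the real phonetic transformation.

```java
import java.util.Hashtable;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch only; not the Jazzy source.
final class PhoneticIndexSketch {

    // Maps a phonetic code to every word that shares that code.
    private final Hashtable<String, LinkedList<String>> index = new Hashtable<>();

    // Hypothetical stand-in for the phonetic transformation:
    // upper-case the word and drop vowels after the first letter.
    static String code(String word) {
        String upper = word.toUpperCase();
        if (upper.isEmpty()) {
            return upper;
        }
        return upper.charAt(0) + upper.substring(1).replaceAll("[AEIOU]", "");
    }

    // Appends the word to the bucket for its code, creating the bucket if needed.
    void put(String word) {
        index.computeIfAbsent(code(word), k -> new LinkedList<>()).add(word);
    }

    // Returns the words stored under the code, or an empty list if there are none.
    List<String> getWords(String code) {
        LinkedList<String> bucket = index.get(code);
        return bucket == null ? new LinkedList<>() : bucket;
    }

    public static void main(String[] args) {
        PhoneticIndexSketch sketch = new PhoneticIndexSketch();
        sketch.put("hello");
        sketch.put("hullo");
        sketch.put("world");
        // "hallo" is not stored, but it shares a code with "hello" and "hullo".
        System.out.println(sketch.getWords(code("hallo")));
    }
}
```

Grouping by code is what lets a lookup for a misspelled word reach similar-sounding candidates in one map access.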

Constructor Detail

SpellDictionaryHashMap

public SpellDictionaryHashMap()
Dictionary Constructor.

Throws: java.io.IOException - indicates a problem with the file system

SpellDictionaryHashMap

public SpellDictionaryHashMap(Reader wordList)
Dictionary Constructor.

Parameters: wordList - The reader supplying the word list for the dictionary

Throws: java.io.IOException - indicates problems reading the word list

SpellDictionaryHashMap

public SpellDictionaryHashMap(File wordList)
Dictionary convenience Constructor.

Parameters: wordList - The file containing the word list for the dictionary

Throws:
java.io.FileNotFoundException - indicates problems locating the word list file on the system
java.io.IOException - indicates problems reading the word list file

SpellDictionaryHashMap

public SpellDictionaryHashMap(File wordList, File phonetic)
Dictionary constructor that uses an aspell phonetic file to build the transformation table.

Parameters:
wordList - The file containing the word list for the dictionary
phonetic - The file to use for phonetic transformation of the word list

Throws:
java.io.FileNotFoundException - indicates problems locating the file on the system
java.io.IOException - indicates problems reading the word list file

SpellDictionaryHashMap

public SpellDictionaryHashMap(File wordList, File phonetic, String phoneticEncoding)
Dictionary constructor that uses an aspell phonetic file to build the transformation table. The encoding is used for the phonetic file only; the default encoding is used for wordList.

Parameters:
wordList - The file containing the word list for the dictionary
phonetic - The file to use for phonetic transformation of the word list
phoneticEncoding - The character set encoding to use for the phonetic file

Throws:
java.io.FileNotFoundException - indicates problems locating the file on the system
java.io.IOException - indicates problems reading the word list or phonetic information
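To show why an explicit encoding for the phonetic file can matter, here is a small sketch using only standard java.io classes. It does not call this constructor: the class EncodingSketch, the helper firstLine, and the sample text are hypothetical and chosen purely for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; not the Jazzy source.
final class EncodingSketch {

    // Decodes the stream with the named character set and returns its first line.
    static String firstLine(InputStream raw, String encoding) {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(raw, encoding))) {
            return in.readLine();
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException for an unknown charset name.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample bytes encoded as ISO-8859-1 (hypothetical content).
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        String right = firstLine(new ByteArrayInputStream(latin1), "ISO-8859-1");
        String wrong = firstLine(new ByteArrayInputStream(latin1), "UTF-8");

        System.out.println(right.equals("caf\u00e9")); // decoded with the matching charset
        System.out.println(wrong.equals("caf\u00e9")); // decoded with a mismatched charset
    }
}
```

Reading the same bytes with a mismatched character set yields different text, so a phonetic file containing non-ASCII characters should be opened with the encoding it was saved in.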

SpellDictionaryHashMap

public SpellDictionaryHashMap(Reader wordList, Reader phonetic)
Dictionary constructor that uses an aspell phonetic file to build the transformation table.

Parameters:
wordList - The reader supplying the word list for the dictionary
phonetic - The reader to use for phonetic transformation of the word list

Throws: java.io.IOException - indicates problems reading the word list or phonetic information

Method Detail

addDictionary

public void addDictionary(File wordList)
Add words from a file to existing dictionary hashmap. This function can be called as many times as needed to build the internal word list. Duplicates are not added.

Note that adding a dictionary does not affect the target dictionary file for the addWord method. That is, addWord() continues to make additions to the dictionary file specified in createDictionary().

Parameters: wordList - a File object that contains the words, one word per line.

Throws:
FileNotFoundException
IOException

addDictionary

public void addDictionary(Reader wordList)
Add words from a Reader to existing dictionary hashmap. This function can be called as many times as needed to build the internal word list. Duplicates are not added.

Note that adding a dictionary does not affect the target dictionary file for the addWord method. That is, addWord() continues to make additions to the dictionary file specified in createDictionary().

Parameters: wordList - a Reader object that contains the words, one word per line.

Throws: IOException

addDictionaryHelper

protected void addDictionaryHelper(BufferedReader in)
Adds to the existing dictionary from a word list file. If the word already exists in the dictionary, a new entry is not added.

Each word in the reader should be on a separate line.

Note: for whatever reason that I haven't yet looked into, the phonetic codes for a particular word map to a vector of words rather than a hash table. This is a drag since in order to check for duplicates you have to iterate through all the words that use the phonetic code. If the vector-based implementation is important, it may be better to subclass for the cases where duplicates are bad.

addWord

public void addWord(String word)
Add a word permanently to the dictionary (and the dictionary file).

This needs to be made thread safe (synchronized).
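The persistence behaviour described for addWord can be sketched as follows, under stated assumptions. This is not the library's code: PersistenceSketch, its target field, and the demo() helper are hypothetical, the in-memory store is a plain HashSet rather than the phonetic map, and marking the methods synchronized is only one possible way to address the note above.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; not the Jazzy source.
final class PersistenceSketch {
    private final Set<String> words = new HashSet<>();
    private final File target; // file given at construction; null means nothing is persisted

    PersistenceSketch(File wordList) {
        this.target = wordList;
        if (wordList != null) {
            load(wordList);
        }
    }

    // Loads extra words into memory; the persistence target does not change.
    synchronized void addDictionary(File extra) {
        load(extra);
    }

    // Adds the word in memory and, when a target file exists, appends it there.
    synchronized void addWord(String word) {
        words.add(word);
        if (target == null) {
            return;
        }
        try (Writer out = new FileWriter(target, true)) { // true = append
            out.write(word + System.lineSeparator());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    synchronized boolean contains(String word) {
        return words.contains(word);
    }

    private void load(File file) {
        try (BufferedReader in = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = in.readLine()) != null) {
                words.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds two temporary word lists and returns the main file's lines after addWord.
    static List<String> demo() {
        try {
            Path main = Files.createTempFile("main", ".txt");
            Path extra = Files.createTempFile("extra", ".txt");
            Files.write(main, Arrays.asList("alpha"));
            Files.write(extra, Arrays.asList("beta"));
            PersistenceSketch dict = new PersistenceSketch(main.toFile());
            dict.addDictionary(extra.toFile());
            dict.addWord("gamma");
            List<String> mainLines = Files.readAllLines(main);
            Files.delete(main);
            Files.delete(extra);
            return mainLines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the added word is appended to the file supplied at construction, the file passed to addDictionary is only read, and a dictionary created without a file keeps added words in memory only.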

createDictionary

protected void createDictionary(BufferedReader in)
Constructs the dictionary from a word list file.

Each word in the reader should be on a separate line.

This is a very slow function. On my machine it takes quite a while to load the data. I suspect that we could speed this up quite a lot.

getWords

public List getWords(String code)
Returns a list of strings (words) for the code.

isCorrect

public boolean isCorrect(String word)
Returns true if the word is correctly spelled against the current word list.

putWord

protected void putWord(String word)
Allocates a word in the dictionary.

Parameters: word - The word to add

putWordUnique

protected void putWordUnique(String word)
Allocates a word, if it is not already present in the dictionary. A word with a different case is considered the same.

Parameters: word - The word to add
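The difference between putWord and putWordUnique can be sketched like this, assuming a bucket of words per code that is scanned linearly for a case-insensitive match. The class UniqueInsertSketch is hypothetical, and code() here simply uses the upper-cased first letter as a stand-in for the real phonetic code.

```java
import java.util.Hashtable;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch only; not the Jazzy source.
final class UniqueInsertSketch {
    private final Hashtable<String, LinkedList<String>> index = new Hashtable<>();

    // Hypothetical stand-in for the phonetic code: the upper-cased first letter.
    static String code(String word) {
        return word.isEmpty() ? "" : word.substring(0, 1).toUpperCase();
    }

    // Always appends the word, even if the bucket already holds it.
    void putWord(String word) {
        index.computeIfAbsent(code(word), k -> new LinkedList<>()).add(word);
    }

    // Appends the word only if no entry in its bucket matches it, ignoring case.
    void putWordUnique(String word) {
        LinkedList<String> bucket = index.computeIfAbsent(code(word), k -> new LinkedList<>());
        for (String existing : bucket) {
            if (existing.equalsIgnoreCase(word)) {
                return; // already present under a different or identical case
            }
        }
        bucket.add(word);
    }

    List<String> getWords(String code) {
        LinkedList<String> bucket = index.get(code);
        return bucket == null ? new LinkedList<>() : bucket;
    }

    public static void main(String[] args) {
        UniqueInsertSketch dict = new UniqueInsertSketch();
        dict.putWordUnique("Paris");
        dict.putWordUnique("paris");  // skipped: same word in a different case
        dict.putWordUnique("Prague");
        System.out.println(dict.getWords("P"));
        dict.putWord("paris");        // plain insert does no duplicate check
        System.out.println(dict.getWords("P"));
    }
}
```

The duplicate check walks every word in the bucket, so its cost grows with the number of words that share a code.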