Class PhoneticFilterFactory
- java.lang.Object
  - org.apache.lucene.analysis.util.AbstractAnalysisFactory
    - org.apache.lucene.analysis.util.TokenFilterFactory
      - org.apache.lucene.analysis.phonetic.PhoneticFilterFactory
- All Implemented Interfaces:
ResourceLoaderAware
public class PhoneticFilterFactory extends TokenFilterFactory implements ResourceLoaderAware
Factory for PhoneticFilter. Creates tokens based on phonetic encoders from Apache Commons Codec.
This takes one required argument, "encoder"; the rest are optional:
- encoder
- required, one of "DoubleMetaphone", "Metaphone", "Soundex", "RefinedSoundex", "Caverphone" (v2.0), "ColognePhonetic" or "Nysiis" (case insensitive). Any other value is resolved as a class name: as given if it already contains a '.', otherwise as a class in the same package as the encoders listed above.
- inject
- (default=true) add the encoded tokens to the stream as synonyms of the original tokens, i.e. at the same position (position increment 0)
- maxCodeLength
- The maximum length of the phonetic codes, as defined by the encoder. Specifying this for an encoder that does not support it is an error.
<fieldType name="text_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true"/>
  </analyzer>
</fieldType>
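The resolution rule for the "encoder" argument can be sketched in plain Java. This is a minimal illustration and not the factory's actual code: the class `EncoderNameSketch`, its method `resolveClassName`, and the assumption that the built-in encoders live in `org.apache.commons.codec.language` (with "Caverphone" mapping to `Caverphone2`) are all assumptions made for the example. It only computes class names as strings, so it runs without Commons Codec on the classpath.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: maps an "encoder" argument to a fully qualified class name. */
class EncoderNameSketch {
    // Assumed package of the Commons Codec phonetic encoders.
    static final String ENCODER_PACKAGE = "org.apache.commons.codec.language.";

    // Short names from the parameter list, keyed in upper case
    // so that the lookup is case insensitive.
    static final Map<String, String> REGISTRY = Map.of(
        "DOUBLEMETAPHONE", ENCODER_PACKAGE + "DoubleMetaphone",
        "METAPHONE", ENCODER_PACKAGE + "Metaphone",
        "SOUNDEX", ENCODER_PACKAGE + "Soundex",
        "REFINEDSOUNDEX", ENCODER_PACKAGE + "RefinedSoundex",
        "CAVERPHONE", ENCODER_PACKAGE + "Caverphone2", // assumed mapping for "v2.0"
        "COLOGNEPHONETIC", ENCODER_PACKAGE + "ColognePhonetic",
        "NYSIIS", ENCODER_PACKAGE + "Nysiis");

    /** Returns the class name that the given encoder argument would resolve to. */
    static String resolveClassName(String encoder) {
        String known = REGISTRY.get(encoder.toUpperCase(Locale.ROOT));
        if (known != null) {
            return known; // one of the documented short names
        }
        // Unknown value: use it as-is if it looks fully qualified,
        // otherwise assume it names a class in the encoder package.
        return encoder.contains(".") ? encoder : ENCODER_PACKAGE + encoder;
    }
}
```

Under these assumptions, `resolveClassName("soundex")` yields the Soundex class name, while a value such as `com.example.MyEncoder` (a made-up name) is returned unchanged.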
- Since:
- 3.1
- See Also:
PhoneticFilter
-
-
Field Summary
| Modifier and Type | Field | Description |
| --- | --- | --- |
| private java.lang.Class<? extends org.apache.commons.codec.Encoder> | clazz | |
| static java.lang.String | ENCODER | parameter name: either a short name or a full class name |
| (package private) boolean | inject | |
| static java.lang.String | INJECT | parameter name: true if encoded tokens should be added as synonyms |
| static java.lang.String | MAX_CODE_LENGTH | parameter name: restricts the length of the phonetic code |
| private java.lang.Integer | maxCodeLength | |
| private java.lang.String | name | |
| static java.lang.String | NAME | SPI name |
| private static java.lang.String | PACKAGE_CONTAINING_ENCODERS | |
| private static java.util.Map<java.lang.String,java.lang.Class<? extends org.apache.commons.codec.Encoder>> | registry | |
| private java.lang.reflect.Method | setMaxCodeLenMethod | |
-
Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
| Constructor | Description |
| --- | --- |
| PhoneticFilterFactory(java.util.Map<java.lang.String,java.lang.String> args) | Creates a new PhoneticFilterFactory |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| PhoneticFilter | create(TokenStream input) | Transform the specified input TokenStream |
| protected org.apache.commons.codec.Encoder | getEncoder() | Must be thread-safe. |
| void | inform(ResourceLoader loader) | Initializes this component with the provided ResourceLoader (used for loading classes, files, etc). |
| private java.lang.Class<? extends org.apache.commons.codec.Encoder> | resolveEncoder(java.lang.String name, ResourceLoader loader) | |
-
Methods inherited from class org.apache.lucene.analysis.util.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
-
Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
-
Field Detail
-
NAME
public static final java.lang.String NAME
SPI name
- See Also:
- Constant Field Values
-
ENCODER
public static final java.lang.String ENCODER
parameter name: either a short name or a full class name
- See Also:
- Constant Field Values
-
INJECT
public static final java.lang.String INJECT
parameter name: true if encoded tokens should be added as synonyms
- See Also:
- Constant Field Values
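To make "added as synonyms" concrete, the following sketch models a token as a term plus a position increment and shows the two modes side by side. Everything here is hypothetical: `InjectSketch`, `Token`, and `toyEncode` are illustration-only names, and `toyEncode` is a stand-in that merely upper-cases the term and drops vowels, not a real phonetic algorithm. The order in which the original and encoded tokens are emitted is an arbitrary choice in this sketch, and edge cases such as empty terms or codes identical to the term are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of the inject option; not the PhoneticFilter implementation. */
class InjectSketch {
    /** A token reduced to its text and its position increment. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Toy stand-in for an encoder: upper-case the term and drop vowels. */
    static String toyEncode(String term) {
        return term.toUpperCase(Locale.ROOT).replaceAll("[AEIOU]", "");
    }

    /**
     * With inject=true each term is kept and its code is added at the same
     * position (increment 0); with inject=false the code replaces the term.
     */
    static List<Token> apply(List<String> terms, UnaryOperator<String> encoder, boolean inject) {
        List<Token> out = new ArrayList<>();
        for (String term : terms) {
            String code = encoder.apply(term);
            if (inject) {
                out.add(new Token(term, 1)); // original token advances the position
                out.add(new Token(code, 0)); // encoded token stacked on the same position
            } else {
                out.add(new Token(code, 1)); // encoded token replaces the original
            }
        }
        return out;
    }
}
```

With `inject` set to true, a query for either the original spelling or its code can match the same position, which is what makes the encoded token behave like a synonym.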
-
MAX_CODE_LENGTH
public static final java.lang.String MAX_CODE_LENGTH
parameter name: restricts the length of the phonetic code
- See Also:
- Constant Field Values
-
PACKAGE_CONTAINING_ENCODERS
private static final java.lang.String PACKAGE_CONTAINING_ENCODERS
- See Also:
- Constant Field Values
-
registry
private static final java.util.Map<java.lang.String,java.lang.Class<? extends org.apache.commons.codec.Encoder>> registry
-
inject
final boolean inject
-
name
private final java.lang.String name
-
maxCodeLength
private final java.lang.Integer maxCodeLength
-
clazz
private java.lang.Class<? extends org.apache.commons.codec.Encoder> clazz
-
setMaxCodeLenMethod
private java.lang.reflect.Method setMaxCodeLenMethod
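The combination of a `java.lang.reflect.Method` field and the rule that maxCodeLength is an error for encoders that do not support it suggests a reflective setter lookup. The sketch below shows one way such a check could work; it is an assumption-laden illustration, not the factory's source. The setter name `setMaxCodeLen`, the classes `LimitedEncoder` and `FixedEncoder`, and the helper names are all hypothetical. Creating a fresh encoder per call is one simple way to avoid sharing mutable encoder state between threads.

```java
import java.lang.reflect.Method;

/** Hypothetical sketch of applying an optional maxCodeLength via reflection. */
class MaxCodeLengthSketch {
    /** Made-up encoder that supports a code-length limit. */
    public static class LimitedEncoder {
        private int maxCodeLen = 4;

        public void setMaxCodeLen(int maxCodeLen) {
            this.maxCodeLen = maxCodeLen;
        }

        public int getMaxCodeLen() {
            return maxCodeLen;
        }
    }

    /** Made-up encoder without such a setter. */
    public static class FixedEncoder {
    }

    /** Looks up the assumed setter; rejects encoders that do not declare it. */
    static Method findSetter(Class<?> encoderClass) {
        try {
            return encoderClass.getMethod("setMaxCodeLen", int.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(
                encoderClass.getName() + " does not support maxCodeLength", e);
        }
    }

    /** Creates a fresh encoder and applies the limit only if one was given. */
    static Object newEncoder(Class<?> encoderClass, Integer maxCodeLength) {
        // Validate before instantiating so misconfiguration fails early.
        Method setter = maxCodeLength == null ? null : findSetter(encoderClass);
        try {
            Object encoder = encoderClass.getDeclaredConstructor().newInstance();
            if (setter != null) {
                setter.invoke(encoder, maxCodeLength);
            }
            return encoder;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not create " + encoderClass.getName(), e);
        }
    }
}
```

Passing `null` for `maxCodeLength` leaves the encoder at its own default, so encoders without the setter remain usable as long as the option is not specified.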
-
-
Method Detail
-
inform
public void inform(ResourceLoader loader) throws java.io.IOException
Description copied from interface: ResourceLoaderAware
Initializes this component with the provided ResourceLoader (used for loading classes, files, etc).
- Specified by:
inform in interface ResourceLoaderAware
- Throws:
java.io.IOException
-
resolveEncoder
private java.lang.Class<? extends org.apache.commons.codec.Encoder> resolveEncoder(java.lang.String name, ResourceLoader loader)
-
getEncoder
protected org.apache.commons.codec.Encoder getEncoder()
Must be thread-safe.
-
create
public PhoneticFilter create(TokenStream input)
Description copied from class: TokenFilterFactory
Transform the specified input TokenStream
- Specified by:
create in class TokenFilterFactory
-
-