org.apache.lucene.analysis.ru

Class RussianAnalyzer


public final class RussianAnalyzer
extends Analyzer

Analyzer for the Russian language. Supports an external list of stop words (words that are not indexed at all). A default set of stop words is used unless an alternative list is specified.
Version:
$Id: RussianAnalyzer.java,v 1.7 2004/03/29 22:48:01 cutting Exp $
Author:
Boris Okner, b.okner@rogers.com

Constructor Summary

RussianAnalyzer()
RussianAnalyzer(char[] charset)
Builds an analyzer.
RussianAnalyzer(char[] charset, Hashtable stopwords)
Builds an analyzer with the given stop words.
RussianAnalyzer(char[] charset, String[] stopwords)
Builds an analyzer with the given stop words.

Method Summary

TokenStream
tokenStream(String fieldName, Reader reader)
Creates a TokenStream which tokenizes all the text in the provided Reader.

Methods inherited from class org.apache.lucene.analysis.Analyzer

tokenStream, tokenStream

Constructor Details

RussianAnalyzer

public RussianAnalyzer()

RussianAnalyzer

public RussianAnalyzer(char[] charset)
Builds an analyzer.

RussianAnalyzer

public RussianAnalyzer(char[] charset,
                       Hashtable stopwords)
Builds an analyzer with the given stop words.
To Do:
Create a Set version of this constructor.

RussianAnalyzer

public RussianAnalyzer(char[] charset,
                       String[] stopwords)
Builds an analyzer with the given stop words.

Method Details

tokenStream

public TokenStream tokenStream(String fieldName,
                               Reader reader)
Creates a TokenStream which tokenizes all the text in the provided Reader.
Overrides:
tokenStream in class Analyzer
Returns:
A TokenStream built from a RussianLetterTokenizer filtered with RussianLowerCaseFilter, StopFilter, and RussianStemFilter
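
The order of the stages listed above (tokenize on letters, lower-case, drop stop words, stem) can be illustrated with a standalone sketch. This is not Lucene code and uses no Lucene classes: the class name RussianPipelineSketch and the methods analyze and stemPlaceholder are hypothetical, Character.isLetter is only an approximation of what RussianLetterTokenizer accepts, and the stemming stage is a placeholder that returns the token unchanged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of the filter order; not part of Lucene.
final class RussianPipelineSketch {

    // Stand-in for the stemming stage (assumption: real stemming is out of
    // scope here, so the token is returned unchanged).
    static String stemPlaceholder(String token) {
        return token;
    }

    // Applies the stages in the documented order:
    // letter tokenizer -> lower-case filter -> stop filter -> stem filter.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();

        // Iterate one position past the end so the last token is flushed.
        for (int i = 0; i <= text.length(); i++) {
            boolean letter = i < text.length()
                    && Character.isLetter(text.charAt(i));
            if (letter) {
                // Stage 1: a token is a maximal run of letters.
                current.append(text.charAt(i));
            } else if (current.length() > 0) {
                // Stage 2: lower-case the finished token.
                String token = current.toString().toLowerCase(Locale.ROOT);
                current.setLength(0);
                // Stage 3: discard stop words so they are never indexed.
                if (!stopWords.contains(token)) {
                    // Stage 4: stem whatever survives the stop filter.
                    tokens.add(stemPlaceholder(token));
                }
            }
        }
        return tokens;
    }
}
```

Because lower-casing runs before the stop filter in this sketch, the stop word set is expected to contain lower-case entries. For example, calling analyze("Я и Москва", stopWords) with a set containing "я" and "и" yields a list holding only "москва".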

Copyright © 2000-2006 Apache Software Foundation. All Rights Reserved.