Class BlockTreeTermsReader
- java.lang.Object
  - org.apache.lucene.index.Fields
    - org.apache.lucene.codecs.FieldsProducer
      - org.apache.lucene.codecs.blocktree.BlockTreeTermsReader
- All Implemented Interfaces: java.io.Closeable, java.lang.AutoCloseable, java.lang.Iterable<java.lang.String>, Accountable
public final class BlockTreeTermsReader extends FieldsProducer
A block-based terms index and dictionary that assigns terms to variable-length blocks according to how they share prefixes. The terms index is a prefix trie whose leaves are term blocks. The advantage of this approach is that seekExact can often determine that a term does not exist without doing any IO, and intersection with Automata is very fast. Note that this terms dictionary has its own fixed terms index (i.e., it does not support a pluggable terms index implementation).
NOTE: this terms dictionary supports min/maxItemsPerBlock during indexing to control how much memory the terms index uses.
The data structure used by this implementation is very similar to a burst trie (http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3499), but with added logic to break up too-large blocks of all terms sharing a given prefix into smaller ones.
Use CheckIndex with the -verbose option to see summary statistics on the blocks in the dictionary. See BlockTreeTermsWriter.
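The prefix-based block assignment described above can be illustrated with a toy model. The sketch below is not Lucene code: ToyBlockIndex, its prefix-to-block map and the blockLoads counter are hypothetical names, and the splitting rule (descend one character whenever a group exceeds maxItemsPerBlock) is a simplification of what the writer does. It only shows why a lookup can often reject a missing term from the in-memory index alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model only: a prefix index over term blocks, not the Lucene implementation. */
final class ToyBlockIndex {
    // In-memory "terms index": prefixes that were too large and got split further.
    private final Set<String> innerPrefixes = new HashSet<>();
    // Stand-in for on-disk term blocks, keyed by the prefix their terms share.
    private final Map<String, List<String>> blocks = new TreeMap<>();
    // Counts simulated block reads, i.e. the IO a lookup would cost.
    int blockLoads = 0;

    /** maxItemsPerBlock is expected to be at least 1. */
    ToyBlockIndex(Collection<String> terms, int maxItemsPerBlock) {
        split("", new TreeSet<>(terms), maxItemsPerBlock);
    }

    private void split(String prefix, SortedSet<String> terms, int maxItems) {
        if (terms.size() <= maxItems) {
            // Small enough: all terms sharing this prefix go into one block.
            blocks.put(prefix, new ArrayList<>(terms));
            return;
        }
        // Too large: break the group up by one more character of prefix.
        innerPrefixes.add(prefix);
        Map<String, SortedSet<String>> children = new TreeMap<>();
        for (String term : terms) {
            if (term.length() == prefix.length()) {
                // The term equals the prefix itself; keep it in its own block.
                List<String> own = new ArrayList<>();
                own.add(term);
                blocks.put(prefix, own);
            } else {
                String childPrefix = term.substring(0, prefix.length() + 1);
                children.computeIfAbsent(childPrefix, k -> new TreeSet<>()).add(term);
            }
        }
        for (Map.Entry<String, SortedSet<String>> child : children.entrySet()) {
            split(child.getKey(), child.getValue(), maxItems);
        }
    }

    /** Walks the in-memory prefixes first; a block is "loaded" only if one exists. */
    boolean seekExact(String term) {
        String prefix = "";
        while (innerPrefixes.contains(prefix) && prefix.length() < term.length()) {
            prefix = term.substring(0, prefix.length() + 1);
        }
        List<String> block = blocks.get(prefix);
        if (block == null) {
            return false; // rejected from the index alone, no block read
        }
        blockLoads++;
        return block.contains(term);
    }

    public static void main(String[] args) {
        ToyBlockIndex index = new ToyBlockIndex(
            Arrays.asList("car", "card", "care", "cat", "do", "dog"), 2);
        System.out.println(index.seekExact("cow") + " loads=" + index.blockLoads);
        System.out.println(index.seekExact("card") + " loads=" + index.blockLoads);
        System.out.println(index.seekExact("dot") + " loads=" + index.blockLoads);
    }
}
```

In this toy, "cow" is rejected without touching any block because no block exists under the prefix "co", while "dot" still costs one block read because a block exists under "d". This mirrors the "often, but not always" wording of the description.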
-
-
Field Summary
Fields (modifier and type, field, description):
- private java.util.List<java.lang.String> fieldList
- private java.util.Map<java.lang.String,FieldReader> fieldMap
- (package private) static Outputs<BytesRef> FST_OUTPUTS
- (package private) IndexInput indexIn
- (package private) static BytesRef NO_OUTPUT
- (package private) static int OUTPUT_FLAG_HAS_TERMS
- (package private) static int OUTPUT_FLAG_IS_FLOOR
- (package private) static int OUTPUT_FLAGS_MASK
- (package private) static int OUTPUT_FLAGS_NUM_BITS
- (package private) PostingsReaderBase postingsReader
- (package private) java.lang.String segment
- (package private) static java.lang.String TERMS_CODEC_NAME
- (package private) static java.lang.String TERMS_EXTENSION: Extension of terms file.
- (package private) static java.lang.String TERMS_INDEX_CODEC_NAME
- (package private) static java.lang.String TERMS_INDEX_EXTENSION: Extension of terms index file.
- (package private) static java.lang.String TERMS_META_CODEC_NAME
- (package private) static java.lang.String TERMS_META_EXTENSION: Extension of terms meta file.
- (package private) IndexInput termsIn
- (package private) int version
- static int VERSION_COMPRESSED_SUFFIXES: Suffixes are compressed to save space.
- static int VERSION_CURRENT: Current terms format.
- static int VERSION_META_FILE: Metadata is written to its own file.
- static int VERSION_META_LONGS_REMOVED: The long[] + byte[] metadata has been replaced with a single byte[].
- static int VERSION_START: Initial terms format.
Fields inherited from class org.apache.lucene.index.Fields: EMPTY_ARRAY
-
-
Constructor Summary
Constructors (constructor, description):
- BlockTreeTermsReader(PostingsReaderBase postingsReader, SegmentReadState state): Sole constructor.
-
Method Summary
Methods (modifier and type, method, description):
- (package private) java.lang.String brToString(BytesRef b)
- void checkIntegrity(): Checks consistency of this reader.
- void close()
- java.util.Collection<Accountable> getChildResources(): Returns nested resources of this class.
- java.util.Iterator<java.lang.String> iterator(): Returns an iterator that will step through all field names.
- long ramBytesUsed(): Return the memory usage of this object in bytes.
- private static BytesRef readBytesRef(IndexInput in)
- private static void seekDir(IndexInput input): Seek input to the directory offset.
- int size(): Returns the number of fields or -1 if the number of distinct field names is unknown.
- Terms terms(java.lang.String field): Get the Terms for this field.
- java.lang.String toString()
Methods inherited from class org.apache.lucene.codecs.FieldsProducer: getMergeInstance
-
-
-
-
Field Detail
-
NO_OUTPUT
static final BytesRef NO_OUTPUT
-
OUTPUT_FLAGS_NUM_BITS
static final int OUTPUT_FLAGS_NUM_BITS
- See Also:
- Constant Field Values
-
OUTPUT_FLAGS_MASK
static final int OUTPUT_FLAGS_MASK
- See Also:
- Constant Field Values
-
OUTPUT_FLAG_IS_FLOOR
static final int OUTPUT_FLAG_IS_FLOOR
- See Also:
- Constant Field Values
-
OUTPUT_FLAG_HAS_TERMS
static final int OUTPUT_FLAG_HAS_TERMS
- See Also:
- Constant Field Values
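The four OUTPUT_FLAG constants above carry no description on this page. Their names suggest that a small number of flag bits is packed into the low bits of a larger value, such as a block file pointer. The sketch below illustrates that packing pattern only; the numeric values, the encode helper and the class name OutputFlagsSketch are assumptions for illustration, and the authoritative values are those listed under Constant Field Values.

```java
/** Illustration of low-bit flag packing; values and helper names are assumptions. */
final class OutputFlagsSketch {
    // Assumed values, chosen so that the mask covers exactly the two flag bits.
    static final int OUTPUT_FLAGS_NUM_BITS = 2;
    static final int OUTPUT_FLAGS_MASK = 0x3;
    static final int OUTPUT_FLAG_IS_FLOOR = 0x1;
    static final int OUTPUT_FLAG_HAS_TERMS = 0x2;

    /** Shifts the file pointer left and stores the two flags in the freed low bits. */
    static long encode(long filePointer, boolean hasTerms, boolean isFloor) {
        return (filePointer << OUTPUT_FLAGS_NUM_BITS)
            | (hasTerms ? OUTPUT_FLAG_HAS_TERMS : 0)
            | (isFloor ? OUTPUT_FLAG_IS_FLOOR : 0);
    }

    /** Recovers the file pointer by dropping the flag bits. */
    static long filePointer(long code) {
        return code >>> OUTPUT_FLAGS_NUM_BITS;
    }

    /** Extracts only the flag bits. */
    static int flags(long code) {
        return (int) (code & OUTPUT_FLAGS_MASK);
    }

    static boolean hasTerms(long code) {
        return (code & OUTPUT_FLAG_HAS_TERMS) != 0;
    }

    static boolean isFloor(long code) {
        return (code & OUTPUT_FLAG_IS_FLOOR) != 0;
    }

    public static void main(String[] args) {
        long code = encode(1234L, true, false);
        // prints "1234 true false"
        System.out.println(filePointer(code) + " " + hasTerms(code) + " " + isFloor(code));
    }
}
```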
-
TERMS_EXTENSION
static final java.lang.String TERMS_EXTENSION
Extension of terms file.
- See Also:
- Constant Field Values
-
TERMS_CODEC_NAME
static final java.lang.String TERMS_CODEC_NAME
- See Also:
- Constant Field Values
-
VERSION_START
public static final int VERSION_START
Initial terms format.
- See Also:
- Constant Field Values
-
VERSION_META_LONGS_REMOVED
public static final int VERSION_META_LONGS_REMOVED
The long[] + byte[] metadata has been replaced with a single byte[].
- See Also:
- Constant Field Values
-
VERSION_COMPRESSED_SUFFIXES
public static final int VERSION_COMPRESSED_SUFFIXES
Suffixes are compressed to save space.
- See Also:
- Constant Field Values
-
VERSION_META_FILE
public static final int VERSION_META_FILE
Metadata is written to its own file.
- See Also:
- Constant Field Values
-
VERSION_CURRENT
public static final int VERSION_CURRENT
Current terms format.
- See Also:
- Constant Field Values
-
TERMS_INDEX_EXTENSION
static final java.lang.String TERMS_INDEX_EXTENSION
Extension of terms index file.
- See Also:
- Constant Field Values
-
TERMS_INDEX_CODEC_NAME
static final java.lang.String TERMS_INDEX_CODEC_NAME
- See Also:
- Constant Field Values
-
TERMS_META_EXTENSION
static final java.lang.String TERMS_META_EXTENSION
Extension of terms meta file.
- See Also:
- Constant Field Values
-
TERMS_META_CODEC_NAME
static final java.lang.String TERMS_META_CODEC_NAME
- See Also:
- Constant Field Values
-
termsIn
final IndexInput termsIn
-
indexIn
final IndexInput indexIn
-
postingsReader
final PostingsReaderBase postingsReader
-
fieldMap
private final java.util.Map<java.lang.String,FieldReader> fieldMap
-
fieldList
private final java.util.List<java.lang.String> fieldList
-
segment
final java.lang.String segment
-
version
final int version
-
-
Constructor Detail
-
BlockTreeTermsReader
public BlockTreeTermsReader(PostingsReaderBase postingsReader, SegmentReadState state) throws java.io.IOException
Sole constructor.
- Throws:
java.io.IOException
-
-
Method Detail
-
readBytesRef
private static BytesRef readBytesRef(IndexInput in) throws java.io.IOException
- Throws:
java.io.IOException
-
seekDir
private static void seekDir(IndexInput input) throws java.io.IOException
Seek input to the directory offset.
- Throws:
java.io.IOException
-
close
public void close() throws java.io.IOException
- Specified by: close in interface java.lang.AutoCloseable
- Specified by: close in interface java.io.Closeable
- Specified by: close in class FieldsProducer
- Throws:
java.io.IOException
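Because close() is specified by java.lang.AutoCloseable and java.io.Closeable, an instance can be managed with try-with-resources. The sketch below shows the pattern with a hypothetical stand-in class (ToyReader) rather than a real BlockTreeTermsReader, so that it runs without an index on disk; whether callers own and close the reader themselves depends on how it was obtained.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Try-with-resources pattern, shown on a hypothetical stand-in for the reader. */
final class CloseSketch {
    /** Hypothetical stand-in: tracks only whether close() was called. */
    static final class ToyReader implements Closeable {
        private boolean closed;

        boolean isClosed() {
            return closed;
        }

        String describe() {
            if (closed) {
                throw new IllegalStateException("already closed");
            }
            return "open";
        }

        @Override
        public void close() throws IOException {
            closed = true; // a real reader would release its inputs here
        }
    }

    /** Uses the reader inside try-with-resources and reports whether it got closed. */
    static boolean useAndClose() {
        ToyReader reader = new ToyReader();
        try (ToyReader r = reader) {
            r.describe(); // work with the reader while it is open
        } catch (IOException e) {
            // close() declares IOException, so the caller must handle it
            throw new UncheckedIOException(e);
        }
        return reader.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("closed after block: " + useAndClose());
    }
}
```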
-
iterator
public java.util.Iterator<java.lang.String> iterator()
Description copied from class: Fields
Returns an iterator that will step through all field names. This will not return null.
-
terms
public Terms terms(java.lang.String field) throws java.io.IOException
Description copied from class: Fields
Get the Terms for this field. This will return null if the field does not exist.
-
size
public int size()
Description copied from class: Fields
Returns the number of fields or -1 if the number of distinct field names is unknown. If >= 0, Fields.iterator() will return as many field names.
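The contracts of iterator(), terms(String) and size() fit together as sketched below. This is a toy model, not the Lucene class: ToyFields and termCount are hypothetical, a plain List<String> stands in for Terms, and the sorted field list is an assumption. The point is the null check callers need around terms(field).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Toy model of the iterator()/terms()/size() contract; not the Lucene class. */
final class ToyFields implements Iterable<String> {
    // Field name -> its terms (a List<String> stands in for Terms here).
    private final Map<String, List<String>> fieldMap = new HashMap<>();
    // Field names in sorted order (an assumption of this sketch).
    private final List<String> fieldList;

    ToyFields(Map<String, List<String>> fields) {
        fieldMap.putAll(fields);
        List<String> names = new ArrayList<>(fieldMap.keySet());
        Collections.sort(names);
        fieldList = Collections.unmodifiableList(names);
    }

    /** Never returns null; yields exactly size() names. */
    @Override
    public Iterator<String> iterator() {
        return fieldList.iterator();
    }

    /** Returns null if the field does not exist. */
    List<String> terms(String field) {
        return fieldMap.get(field);
    }

    int size() {
        return fieldMap.size();
    }

    /** Caller-side pattern: treat a null result as "no such field". */
    static int termCount(ToyFields fields, String field) {
        List<String> terms = fields.terms(field);
        return terms == null ? 0 : terms.size();
    }

    public static void main(String[] args) {
        Map<String, List<String>> data = new HashMap<>();
        data.put("title", Arrays.asList("lucene"));
        data.put("body", Arrays.asList("block", "tree"));
        ToyFields fields = new ToyFields(data);
        for (String name : fields) {
            System.out.println(name + ": " + termCount(fields, name) + " terms");
        }
        System.out.println("missing: " + termCount(fields, "missing") + " terms");
    }
}
```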
-
brToString
java.lang.String brToString(BytesRef b)
-
ramBytesUsed
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes. Negative values are illegal.
-
getChildResources
public java.util.Collection<Accountable> getChildResources()
Description copied from interface: Accountable
Returns nested resources of this class. The result should be a point-in-time snapshot (to avoid race conditions).
- See Also: Accountables
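The relationship between ramBytesUsed() and getChildResources() can be sketched with a local mirror of the two methods. Everything below is hypothetical (the nested Accountable interface, ToyFieldReader, ToyTermsReader and the byte counts); it assumes the parent's total includes its children and shows one way to return a point-in-time snapshot.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Sketch of memory accounting with nested resources; all types here are stand-ins. */
final class AccountableSketch {
    /** Local mirror of the two methods documented above, not the Lucene interface. */
    interface Accountable {
        long ramBytesUsed();

        default Collection<Accountable> getChildResources() {
            return Collections.emptyList();
        }
    }

    /** Hypothetical per-field reader with a fixed footprint. */
    static final class ToyFieldReader implements Accountable {
        private final long bytes;

        ToyFieldReader(long bytes) {
            this.bytes = bytes;
        }

        @Override
        public long ramBytesUsed() {
            return bytes;
        }
    }

    /** Hypothetical parent whose total includes all of its children. */
    static final class ToyTermsReader implements Accountable {
        private static final long BASE_BYTES = 64; // made-up shallow size
        private final List<Accountable> fieldReaders = new ArrayList<>();

        void add(Accountable reader) {
            fieldReaders.add(reader);
        }

        @Override
        public long ramBytesUsed() {
            long total = BASE_BYTES;
            for (Accountable reader : fieldReaders) {
                total += reader.ramBytesUsed();
            }
            return total; // never negative
        }

        @Override
        public Collection<Accountable> getChildResources() {
            // Copy first, so later changes do not show up in the returned view.
            return Collections.unmodifiableList(new ArrayList<>(fieldReaders));
        }
    }

    public static void main(String[] args) {
        ToyTermsReader reader = new ToyTermsReader();
        reader.add(new ToyFieldReader(100));
        reader.add(new ToyFieldReader(200));
        System.out.println(reader.ramBytesUsed() + " bytes in "
            + reader.getChildResources().size() + " child resources");
    }
}
```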
-
checkIntegrity
public void checkIntegrity() throws java.io.IOException
Description copied from class: FieldsProducer
Checks consistency of this reader. Note that this may be costly in terms of I/O, e.g. it may involve computing a checksum value against large data files.
- Specified by: checkIntegrity in class FieldsProducer
- Throws:
java.io.IOException
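The cost mentioned above comes from reading the whole file to recompute a checksum. A minimal sketch of that idea follows, using java.util.zip.CRC32 over an in-memory byte array. The class ChecksumSketch, the isIntact helper and the choice of CRC32 are assumptions for illustration; this page does not specify how the reader stores or verifies its checksum.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Sketch of checksum verification; not the reader's actual integrity check. */
final class ChecksumSketch {
    /** Reads every byte of the data, which is why the check scales with file size. */
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** Compares a freshly computed checksum with the one recorded at write time. */
    static boolean isIntact(byte[] data, long storedChecksum) {
        return checksum(data) == storedChecksum;
    }

    public static void main(String[] args) {
        byte[] data = "terms dictionary bytes".getBytes(StandardCharsets.UTF_8);
        long stored = checksum(data); // would be recorded when the file is written

        System.out.println("intact: " + isIntact(data, stored));

        data[3] ^= 0x01; // simulate a single corrupted bit
        System.out.println("intact after corruption: " + isIntact(data, stored));
    }
}
```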
-
toString
public java.lang.String toString()
- Overrides: toString in class java.lang.Object
-
-