org.apache.lucene.index

Class IndexReader

Known Direct Subclasses:
FilterIndexReader, MultiReader

public abstract class IndexReader
extends java.lang.Object

IndexReader is an abstract class, providing an interface for accessing an index. Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable.

Concrete subclasses of IndexReader are usually constructed with a call to the static method open.

For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral--they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.
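Why document numbers should not be stored between sessions can be seen in a small model. The sketch below is not Lucene code: it assumes, purely for illustration, that a document number is a position in a list and that deleted positions are eventually squeezed out. The class EphemeralDocNumbers and its compact method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Toy model (not Lucene code): a document number is just a position in a list. */
final class EphemeralDocNumbers {

    /** Returns the surviving IDs in order, dropping every position marked deleted. */
    static List<String> compact(List<String> ids, BitSet deleted) {
        List<String> live = new ArrayList<>();
        for (int docNum = 0; docNum < ids.size(); docNum++) {
            if (!deleted.get(docNum)) {
                live.add(ids.get(docNum));
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("doc-a", "doc-b", "doc-c");
        BitSet deleted = new BitSet();
        deleted.set(1); // mark "doc-b" as deleted

        // The same document answers to a different number after compaction.
        int before = ids.indexOf("doc-c");
        int after = compact(ids, deleted).indexOf("doc-c");
        System.out.println("doc-c: " + before + " -> " + after);
    }
}
```

In this model "doc-c" moves from number 2 to number 1, so a client that needs a stable handle would keep its own ID in a field rather than remember the number.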

Version:
$Id: IndexReader.java,v 1.32 2004/04/21 16:46:30 goller Exp $

Author:
Doug Cutting

Constructor Summary

IndexReader(Directory directory)
Constructor used if IndexReader is not owner of its directory.

Method Summary

void
close()
Closes files associated with this index.
protected void
commit()
Commits changes resulting from delete, undeleteAll, or setNorm operations.
void
delete(int docNum)
Deletes the document numbered docNum.
int
delete(Term term)
Deletes all documents containing term.
Directory
directory()
Returns the directory this index resides in.
protected void
doClose()
Implements close.
protected void
doCommit()
Implements commit.
protected void
doDelete(int docNum)
Implements deletion of the document numbered docNum.
protected void
doSetNorm(int doc, String field, byte value)
Implements setNorm in subclass.
protected void
doUndeleteAll()
Implements actual undeleteAll() in subclass.
int
docFreq(Term t)
Returns the number of documents containing the term t.
Document
document(int n)
Returns the stored fields of the nth Document in this index.
protected void
finalize()
Release the write lock, if needed.
static long
getCurrentVersion(File directory)
Reads version number from segments files.
static long
getCurrentVersion(String directory)
Reads version number from segments files.
static long
getCurrentVersion(Directory directory)
Reads version number from segments files.
Collection
getFieldNames()
Returns a list of all unique field names that exist in the index pointed to by this IndexReader.
Collection
getFieldNames(boolean indexed)
Returns a list of all unique field names that exist in the index pointed to by this IndexReader.
Collection
getIndexedFieldNames(boolean storedTermVector)
TermFreqVector
getTermFreqVector(int docNumber, String field)
Return a term frequency vector for the specified document and field.
TermFreqVector[]
getTermFreqVectors(int docNumber)
Return an array of term frequency vectors for the specified document.
boolean
hasDeletions()
Returns true if any documents have been deleted.
static boolean
indexExists(File directory)
Returns true if an index exists at the specified directory.
static boolean
indexExists(String directory)
Returns true if an index exists at the specified directory.
static boolean
indexExists(Directory directory)
Returns true if an index exists at the specified directory.
boolean
isDeleted(int n)
Returns true if document n has been deleted.
static boolean
isLocked(String directory)
Returns true iff the index in the named directory is currently locked.
static boolean
isLocked(Directory directory)
Returns true iff the index in the named directory is currently locked.
static long
lastModified(File directory)
Deprecated. Replaced by getCurrentVersion(File)
static long
lastModified(String directory)
Deprecated. Replaced by getCurrentVersion(String)
static long
lastModified(Directory directory)
Deprecated. Replaced by getCurrentVersion(Directory)
int
maxDoc()
Returns one greater than the largest possible document number.
byte[]
norms(String field)
Returns the byte-encoded normalization factor for the named field of every document.
void
norms(String field, byte[] bytes, int offset)
Reads the byte-encoded normalization factor for the named field of every document.
int
numDocs()
Returns the number of documents in this index.
static IndexReader
open(File path)
Returns an IndexReader reading the index in an FSDirectory in the named path.
static IndexReader
open(String path)
Returns an IndexReader reading the index in an FSDirectory in the named path.
static IndexReader
open(Directory directory)
Returns an IndexReader reading the index in the given Directory.
void
setNorm(int doc, String field, byte value)
Expert: Resets the normalization factor for the named field of the named document.
void
setNorm(int doc, String field, float value)
Expert: Resets the normalization factor for the named field of the named document.
TermDocs
termDocs()
Returns an unpositioned TermDocs enumerator.
TermDocs
termDocs(Term term)
Returns an enumeration of all the documents which contain term.
TermPositions
termPositions()
Returns an unpositioned TermPositions enumerator.
TermPositions
termPositions(Term term)
Returns an enumeration of all the documents which contain term.
TermEnum
terms()
Returns an enumeration of all the terms in the index.
TermEnum
terms(Term t)
Returns an enumeration of all terms after a given term.
void
undeleteAll()
Undeletes all documents currently marked as deleted in this index.
static void
unlock(Directory directory)
Forcibly unlocks the index in the named directory.

Constructor Details

IndexReader

protected IndexReader(Directory directory)
Constructor used if IndexReader is not owner of its directory. This is used for IndexReaders that are used within other IndexReaders that take care of locking directories.

Parameters:
directory - Directory where IndexReader files reside.

Method Details

close

public final void close()
            throws IOException
Closes files associated with this index. Also saves any new deletions to disk. No other methods should be called after this has been called.


commit

protected final void commit()
            throws IOException
Commits changes resulting from delete, undeleteAll, or setNorm operations.


delete

public final void delete(int docNum)
            throws IOException
Deletes the document numbered docNum. Once a document is deleted it will not appear in TermDocs or TermPositions enumerations. Attempts to read its fields with the document(int) method will result in an error. The presence of this document may still be reflected in the docFreq(Term) statistic, though this will be corrected eventually as the index is further modified.


delete

public final int delete(Term term)
            throws IOException
Deletes all documents containing term. This is useful if one uses a document field to hold a unique ID string for the document. Then to delete such a document, one merely constructs a term with the appropriate field and the unique ID string as its text and passes it to this method. Returns the number of documents deleted.
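The unique-ID pattern described above can be modelled without an index. This is a hedged sketch, not Lucene code: ToyDeleteByTerm is a hypothetical class, the array stands in for the ID field of each document, and the BitSet stands in for the set of deletion marks. It assumes that documents already marked as deleted are not counted again.

```java
import java.util.BitSet;

/** Toy model (not Lucene code) of deleting every document that carries a given ID. */
final class ToyDeleteByTerm {

    /**
     * Marks each document whose ID equals idText as deleted.
     * Returns how many documents were newly marked.
     */
    static int delete(String[] idByDocNum, String idText, BitSet deleted) {
        int count = 0;
        for (int docNum = 0; docNum < idByDocNum.length; docNum++) {
            if (idByDocNum[docNum].equals(idText) && !deleted.get(docNum)) {
                deleted.set(docNum);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] ids = {"u1", "u2", "u1"}; // document numbers 0, 1, 2
        BitSet deleted = new BitSet();
        System.out.println("deleted: " + delete(ids, "u1", deleted));
        System.out.println("marks:   " + deleted);
    }
}
```

The caller never has to know document numbers; the ID text alone selects the documents, which is what makes the pattern robust against renumbering.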


directory

public Directory directory()
Returns the directory this index resides in.


doClose

protected void doClose()
            throws IOException
Implements close.


doCommit

protected void doCommit()
            throws IOException
Implements commit.


doDelete

protected void doDelete(int docNum)
            throws IOException
Implements deletion of the document numbered docNum.


doSetNorm

protected void doSetNorm(int doc,
                         String field,
                         byte value)
            throws IOException
Implements setNorm in subclass.


doUndeleteAll

protected void doUndeleteAll()
            throws IOException
Implements actual undeleteAll() in subclass.


docFreq

public int docFreq(Term t)
            throws IOException
Returns the number of documents containing the term t.


document

public Document document(int n)
            throws IOException
Returns the stored fields of the nth Document in this index.


finalize

protected final void finalize()
            throws IOException
Release the write lock, if needed.


getCurrentVersion

public static long getCurrentVersion(File directory)
            throws IOException
Reads version number from segments files. The version number counts the number of changes of the index.

Parameters:
directory - where the index resides.

Returns:
version number.


getCurrentVersion

public static long getCurrentVersion(String directory)
            throws IOException
Reads version number from segments files. The version number counts the number of changes of the index.

Parameters:
directory - where the index resides.

Returns:
version number.


getCurrentVersion

public static long getCurrentVersion(Directory directory)
            throws IOException
Reads version number from segments files. The version number counts the number of changes of the index.

Parameters:
directory - where the index resides.

Returns:
version number.


getFieldNames

public Collection getFieldNames()
            throws IOException
Returns a list of all unique field names that exist in the index pointed to by this IndexReader.

Returns:
Collection of Strings indicating the names of the fields


getFieldNames

public Collection getFieldNames(boolean indexed)
            throws IOException
Returns a list of all unique field names that exist in the index pointed to by this IndexReader. The boolean argument specifies whether the fields returned are indexed or not.

Parameters:
indexed - true if only indexed fields should be returned; false if only unindexed fields should be returned.

Returns:
Collection of Strings indicating the names of the fields


getIndexedFieldNames

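public Collection getIndexedFieldNames(boolean storedTermVector)
Returns the names of the indexed fields in this index, selected by whether term vector information is stored for them.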

Parameters:
storedTermVector - if true, returns only indexed fields that have term vector info; otherwise, only indexed fields without term vector info.

Returns:
Collection of Strings indicating the names of the fields


getTermFreqVector

public TermFreqVector getTermFreqVector(int docNumber,
                                        String field)
            throws IOException
Return a term frequency vector for the specified document and field. The vector returned contains terms and frequencies for those terms in the specified field of this document, if the field had storeTermVector flag set. If the flag was not set, the method returns null.

See Also:
Field.isTermVectorStored()


getTermFreqVectors

public TermFreqVector[] getTermFreqVectors(int docNumber)
            throws IOException
Return an array of term frequency vectors for the specified document. The array contains a vector for each vectorized field in the document. Each vector contains terms and frequencies for all terms in a given vectorized field. If no such fields existed, the method returns null.

See Also:
Field.isTermVectorStored()


hasDeletions

public boolean hasDeletions()
Returns true if any documents have been deleted.


indexExists

public static boolean indexExists(File directory)
Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.

Parameters:
directory - the directory to check for an index

Returns:
true if an index exists; false otherwise


indexExists

public static boolean indexExists(String directory)
Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.

Parameters:
directory - the directory to check for an index

Returns:
true if an index exists; false otherwise


indexExists

public static boolean indexExists(Directory directory)
            throws IOException
Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.

Parameters:
directory - the directory to check for an index

Returns:
true if an index exists; false otherwise


isDeleted

public boolean isDeleted(int n)
Returns true if document n has been deleted.


isLocked

public static boolean isLocked(String directory)
            throws IOException
Returns true iff the index in the named directory is currently locked.

Parameters:
directory - the directory to check for a lock


isLocked

public static boolean isLocked(Directory directory)
            throws IOException
Returns true iff the index in the named directory is currently locked.

Parameters:
directory - the directory to check for a lock


lastModified

public static long lastModified(File directory)
            throws IOException

Deprecated. Replaced by getCurrentVersion(File)

Returns the time the index in the named directory was last modified.

Synchronization of IndexReader and IndexWriter instances is no longer done via time stamps of the segments file, since the time resolution depends on the hardware platform. Instead, a version number is maintained within the segments file and incremented every time the index is changed.


lastModified

public static long lastModified(String directory)
            throws IOException

Deprecated. Replaced by getCurrentVersion(String)

Returns the time the index in the named directory was last modified.

Synchronization of IndexReader and IndexWriter instances is no longer done via time stamps of the segments file, since the time resolution depends on the hardware platform. Instead, a version number is maintained within the segments file and incremented every time the index is changed.


lastModified

public static long lastModified(Directory directory)
            throws IOException

Deprecated. Replaced by getCurrentVersion(Directory)

Returns the time the index in the named directory was last modified.

Synchronization of IndexReader and IndexWriter instances is no longer done via time stamps of the segments file, since the time resolution depends on the hardware platform. Instead, a version number is maintained within the segments file and incremented every time the index is changed.


maxDoc

public int maxDoc()
Returns one greater than the largest possible document number. This may be used to, e.g., determine how big to allocate an array which will have an element for every document number in an index.
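The array-sizing use mentioned above can be sketched as follows. This is a toy model, not Lucene code: PerDocumentArray, fieldLengths and the -1 sentinel are hypothetical, and the relation between the two counts (live documents equal numbered slots minus deleted ones) is an assumption of the model.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Toy model (not Lucene code) of sizing a per-document array by maxDoc. */
final class PerDocumentArray {

    /**
     * Builds an array with one slot per document number in [0, maxDoc).
     * Slots that belong to deleted documents hold the sentinel -1.
     */
    static int[] fieldLengths(int maxDoc, BitSet deleted, int[] lengthByDocNum) {
        int[] result = new int[maxDoc];
        for (int docNum = 0; docNum < maxDoc; docNum++) {
            result[docNum] = deleted.get(docNum) ? -1 : lengthByDocNum[docNum];
        }
        return result;
    }

    /** In this model, live documents are the numbered slots minus the deleted ones. */
    static int numDocs(int maxDoc, BitSet deleted) {
        return maxDoc - deleted.cardinality();
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(1);
        deleted.set(3);
        int[] lengths = {10, 20, 30, 40, 50};

        System.out.println(Arrays.toString(fieldLengths(5, deleted, lengths)));
        System.out.println("numDocs = " + numDocs(5, deleted));
    }
}
```

Sizing by the largest possible document number rather than by the document count keeps every valid number a legal array index, even when some numbers belong to deleted documents.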


norms

public byte[] norms(String field)
            throws IOException
Returns the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents.

See Also:
Field.setBoost(float)


norms

public void norms(String field,
                  byte[] bytes,
                  int offset)
            throws IOException
Reads the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents.

See Also:
Field.setBoost(float)


numDocs

public int numDocs()
Returns the number of documents in this index.


open

public static IndexReader open(File path)
            throws IOException
Returns an IndexReader reading the index in an FSDirectory in the named path.


open

public static IndexReader open(String path)
            throws IOException
Returns an IndexReader reading the index in an FSDirectory in the named path.


open

public static IndexReader open(Directory directory)
            throws IOException
Returns an IndexReader reading the index in the given Directory.


setNorm

public final void setNorm(int doc,
                          String field,
                          byte value)
            throws IOException
Expert: Resets the normalization factor for the named field of the named document. The norm represents the product of the field's boost and its length normalization. Thus, to preserve the length normalization values when resetting this, one should base the new value upon the old.

See Also:
norms(String), Similarity.decodeNorm(byte)


setNorm

public void setNorm(int doc,
                    String field,
                    float value)
            throws IOException
Expert: Resets the normalization factor for the named field of the named document.

See Also:
norms(String), Similarity.decodeNorm(byte)


termDocs

public TermDocs termDocs()
            throws IOException
Returns an unpositioned TermDocs enumerator.


termDocs

public TermDocs termDocs(Term term)
            throws IOException
Returns an enumeration of all the documents which contain term. For each document, the document number and the frequency of the term in that document are provided, for use in search scoring. Thus, this method implements the mapping:

    Term    =>    <docNum, freq>*

The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
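The mapping above can be illustrated with a small in-memory structure. This is a sketch under stated assumptions, not how Lucene stores postings: ToyPostings is a hypothetical class, documents are plain token arrays, and each posting is an int pair {docNum, freq}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model (not Lucene code) of the mapping Term => <docNum, freq>*. */
final class ToyPostings {

    /** Builds term -> list of {docNum, freq}; each list is in ascending docNum order. */
    static Map<String, List<int[]>> build(String[][] docs) {
        Map<String, List<int[]>> postings = new TreeMap<>();
        for (int docNum = 0; docNum < docs.length; docNum++) {
            for (String token : docs[docNum]) {
                List<int[]> list = postings.computeIfAbsent(token, k -> new ArrayList<>());
                int[] last = list.isEmpty() ? null : list.get(list.size() - 1);
                if (last != null && last[0] == docNum) {
                    last[1]++; // same document again: bump its frequency
                } else {
                    list.add(new int[] {docNum, 1}); // first occurrence in this document
                }
            }
        }
        return postings;
    }

    public static void main(String[] args) {
        String[][] docs = {{"a", "b", "a"}, {"b"}, {"a"}};
        for (Map.Entry<String, List<int[]>> e : build(docs).entrySet()) {
            StringBuilder line = new StringBuilder(e.getKey() + " =>");
            for (int[] posting : e.getValue()) {
                line.append(" <").append(posting[0]).append(", ").append(posting[1]).append('>');
            }
            System.out.println(line);
        }
    }
}
```

Because documents are visited in increasing number, every posting list comes out ordered by document number, matching the ordering guarantee stated above.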


termPositions

public TermPositions termPositions()
            throws IOException
Returns an unpositioned TermPositions enumerator.


termPositions

public TermPositions termPositions(Term term)
            throws IOException
Returns an enumeration of all the documents which contain term. For each document, in addition to the document number and frequency of the term in that document, a list of all of the ordinal positions of the term in the document is available. Thus, this method implements the mapping:

    Term    =>    <docNum, freq, <pos_1, pos_2, ... pos_(freq-1)> >*

This positional information facilitates phrase and proximity searching.

The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
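How positions support phrase searching can be shown on a single document. The sketch below is a toy model, not Lucene code: ToyPhraseMatch is a hypothetical class, positions are zero-based token offsets (an assumption of the model), and a two-word phrase matches when the second term occurs exactly one position after the first.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene code) of phrase matching with ordinal term positions. */
final class ToyPhraseMatch {

    /** Returns the zero-based positions at which term occurs in the token stream. */
    static List<Integer> positions(String[] tokens, String term) {
        List<Integer> result = new ArrayList<>();
        for (int pos = 0; pos < tokens.length; pos++) {
            if (tokens[pos].equals(term)) {
                result.add(pos);
            }
        }
        return result;
    }

    /** True if second occurs immediately after first somewhere in the document. */
    static boolean containsPhrase(String[] tokens, String first, String second) {
        List<Integer> secondPositions = positions(tokens, second);
        for (int pos : positions(tokens, first)) {
            if (secondPositions.contains(pos + 1)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] doc = {"quick", "brown", "fox", "quick", "fox"};
        System.out.println(positions(doc, "quick"));
        System.out.println(containsPhrase(doc, "quick", "brown"));
        System.out.println(containsPhrase(doc, "brown", "quick"));
    }
}
```

A proximity search follows the same idea with a looser test, accepting any pair of positions whose distance is within a chosen window instead of exactly one.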


terms

public TermEnum terms()
            throws IOException
Returns an enumeration of all the terms in the index. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration.


terms

public TermEnum terms(Term t)
            throws IOException
Returns an enumeration of all terms after a given term. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration.


undeleteAll

public final void undeleteAll()
            throws IOException
Undeletes all documents currently marked as deleted in this index.


unlock

public static void unlock(Directory directory)
            throws IOException
Forcibly unlocks the index in the named directory.

Caution: this should only be used by failure recovery code, when it is known that no other process or thread is in fact currently accessing this index.


Copyright © 2000-2005 Apache Software Foundation. All Rights Reserved.