Class Lucene90CompressingStoredFieldsWriter

java.lang.Object
org.apache.lucene.codecs.StoredFieldsWriter
org.apache.lucene.codecs.lucene90.compressing.Lucene90CompressingStoredFieldsWriter
All Implemented Interfaces:
Closeable, AutoCloseable, Accountable

public final class Lucene90CompressingStoredFieldsWriter extends StoredFieldsWriter
  • Field Details

    • FIELDS_EXTENSION

      public static final String FIELDS_EXTENSION
      Extension of stored fields file
    • INDEX_EXTENSION

      public static final String INDEX_EXTENSION
      Extension of stored fields index
    • META_EXTENSION

      public static final String META_EXTENSION
      Extension of stored fields meta
    • INDEX_CODEC_NAME

      public static final String INDEX_CODEC_NAME
      Codec name for the index.
    • STRING

      static final int STRING
    • BYTE_ARR

      static final int BYTE_ARR
    • NUMERIC_INT

      static final int NUMERIC_INT
    • NUMERIC_FLOAT

      static final int NUMERIC_FLOAT
    • NUMERIC_LONG

      static final int NUMERIC_LONG
    • NUMERIC_DOUBLE

      static final int NUMERIC_DOUBLE
    • TYPE_BITS

      static final int TYPE_BITS
    • TYPE_MASK

      static final int TYPE_MASK
    • VERSION_START

      static final int VERSION_START
    • VERSION_CURRENT

      static final int VERSION_CURRENT
    • META_VERSION_START

      static final int META_VERSION_START
    • segment

      private final String segment
    • indexWriter

      private FieldsIndexWriter indexWriter
    • metaStream

      private IndexOutput metaStream
    • fieldsStream

      private IndexOutput fieldsStream
    • compressor

      private Compressor compressor
    • compressionMode

      private final CompressionMode compressionMode
    • chunkSize

      private final int chunkSize
    • maxDocsPerChunk

      private final int maxDocsPerChunk
    • bufferedDocs

      private final ByteBuffersDataOutput bufferedDocs
    • numStoredFields

      private int[] numStoredFields
    • endOffsets

      private int[] endOffsets
    • docBase

      private int docBase
    • numBufferedDocs

      private int numBufferedDocs
    • numChunks

      private long numChunks
    • numDirtyChunks

      private long numDirtyChunks
    • numDirtyDocs

      private long numDirtyDocs
    • numStoredFieldsInDoc

      private int numStoredFieldsInDoc
    • NEGATIVE_ZERO_FLOAT

      static final int NEGATIVE_ZERO_FLOAT
    • NEGATIVE_ZERO_DOUBLE

      static final long NEGATIVE_ZERO_DOUBLE
    • SECOND

      static final long SECOND
    • HOUR

      static final long HOUR
    • DAY

      static final long DAY
    • SECOND_ENCODING

      static final int SECOND_ENCODING
    • HOUR_ENCODING

      static final int HOUR_ENCODING
    • DAY_ENCODING

      static final int DAY_ENCODING
    • BULK_MERGE_ENABLED_SYSPROP

      static final String BULK_MERGE_ENABLED_SYSPROP
    • BULK_MERGE_ENABLED

      static final boolean BULK_MERGE_ENABLED
  • Constructor Details

  • Method Details

    • close

      public void close() throws IOException
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Specified by:
      close in class StoredFieldsWriter
      Throws:
      IOException
    • startDocument

      public void startDocument() throws IOException
      Description copied from class: StoredFieldsWriter
      Called before writing the stored fields of the document. StoredFieldsWriter.writeField(FieldInfo, IndexableField) will be called for each stored field. Note that this is called even if the document has no stored fields.
      Specified by:
      startDocument in class StoredFieldsWriter
      Throws:
      IOException
    • finishDocument

      public void finishDocument() throws IOException
      Description copied from class: StoredFieldsWriter
      Called when a document and all its fields have been added.
      Overrides:
      finishDocument in class StoredFieldsWriter
      Throws:
      IOException
    • saveInts

      private static void saveInts(int[] values, int length, DataOutput out) throws IOException
      Throws:
      IOException
    • writeHeader

      private void writeHeader(int docBase, int numBufferedDocs, int[] numStoredFields, int[] lengths, boolean sliced, boolean dirtyChunk) throws IOException
      Throws:
      IOException
    • triggerFlush

      private boolean triggerFlush()
    • flush

      private void flush(boolean force) throws IOException
      Throws:
      IOException
    • writeField

      public void writeField(FieldInfo info, IndexableField field) throws IOException
      Description copied from class: StoredFieldsWriter
      Writes a single stored field.
      Specified by:
      writeField in class StoredFieldsWriter
      Throws:
      IOException
    • writeZFloat

      static void writeZFloat(DataOutput out, float f) throws IOException
      Writes a float in a variable-length format. Writes between one and five bytes. Small integral values typically take fewer bytes.

      ZFloat --> Header, Bytes*?

      • Header --> Uint8. When it is equal to 0xFF, the value is negative and stored in the next 4 bytes. Otherwise, if the high bit is set, the remaining 7 bits of the header encode the value plus one and no other bytes are read. Otherwise, the value is a positive float whose first byte is the header, and 3 more bytes need to be read to complete it.
      • Bytes --> Potential additional bytes to read depending on the header.
      Throws:
      IOException
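
      The header rules above can be sketched as a small stand-alone encoder and decoder. This is only an illustration, not the Lucene implementation: the class name ZFloatSketch, the use of java.nio.ByteBuffer, and the big-endian byte order of the payload are assumptions made here. The upper bound 0x7D for single-byte values is inferred from the need to keep the header distinct from 0xFF.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical sketch of the ZFloat layout described above; not Lucene code. */
final class ZFloatSketch {
  // Bit pattern of -0.0f, which must not be collapsed into the small integer 0.
  private static final int NEGATIVE_ZERO_FLOAT = Float.floatToIntBits(-0f);

  static byte[] encode(float f) {
    ByteBuffer out = ByteBuffer.allocate(5); // big-endian by default (an assumption of this sketch)
    int intVal = (int) f;
    int floatBits = Float.floatToIntBits(f);
    if (f == intVal && intVal >= -1 && intVal <= 0x7D && floatBits != NEGATIVE_ZERO_FLOAT) {
      // Small integral value: high bit set, low 7 bits hold value + 1. One byte.
      out.put((byte) (0x80 | (intVal + 1)));
    } else if ((floatBits >>> 31) == 0) {
      // Positive float: the 4 raw bytes; the first has its high bit clear and acts as the header.
      out.putInt(floatBits);
    } else {
      // Negative float: 0xFF marker followed by the 4 raw bytes. Five bytes.
      out.put((byte) 0xFF).putInt(floatBits);
    }
    return Arrays.copyOf(out.array(), out.position());
  }

  static float decode(byte[] data) {
    ByteBuffer in = ByteBuffer.wrap(data);
    int header = in.get() & 0xFF;
    if (header == 0xFF) {
      return Float.intBitsToFloat(in.getInt()); // negative value in the next 4 bytes
    } else if ((header & 0x80) != 0) {
      return (header & 0x7F) - 1; // small integral value, stored as value + 1
    } else {
      return Float.intBitsToFloat(in.getInt(0)); // header byte plus the 3 following bytes
    }
  }
}
```

      In this sketch, values such as 1.0f take one byte, other positive floats take four, and negative floats (including -0.0f) take five.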
    • writeZDouble

      static void writeZDouble(DataOutput out, double d) throws IOException
      Writes a double in a variable-length format. Writes between one and nine bytes. Small integral values typically take fewer bytes.

      ZDouble --> Header, Bytes*?

      • Header --> Uint8. When it is equal to 0xFF, the value is negative and stored in the next 8 bytes. When it is equal to 0xFE, the value is stored as a float in the next 4 bytes. Otherwise, if the high bit is set, the remaining 7 bits of the header encode the value plus one and no other bytes are read. Otherwise, the value is a positive double whose first byte is the header, and 7 more bytes need to be read to complete it.
      • Bytes --> Potential additional bytes to read depending on the header.
      Throws:
      IOException
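
      A minimal sketch of the header rules above for doubles follows. It is an illustration under stated assumptions rather than the Lucene implementation: the class name ZDoubleSketch, the use of java.nio.ByteBuffer, and the big-endian payload order are choices made here, and the upper bound 0x7C for single-byte values is inferred from the need to keep the header distinct from 0xFE and 0xFF.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical sketch of the variable-length double layout described above; not Lucene code. */
final class ZDoubleSketch {
  // Bit pattern of -0.0d, which must not be collapsed into the small integer 0.
  private static final long NEGATIVE_ZERO_DOUBLE = Double.doubleToLongBits(-0d);

  static byte[] encode(double d) {
    ByteBuffer out = ByteBuffer.allocate(9); // big-endian by default (an assumption of this sketch)
    int intVal = (int) d;
    long doubleBits = Double.doubleToLongBits(d);
    if (d == intVal && intVal >= -1 && intVal <= 0x7C && doubleBits != NEGATIVE_ZERO_DOUBLE) {
      // Small integral value: high bit set, low 7 bits hold value + 1. One byte.
      out.put((byte) (0x80 | (intVal + 1)));
    } else if (d == (float) d) {
      // Exactly representable as a float: 0xFE marker plus 4 bytes. Five bytes.
      out.put((byte) 0xFE).putInt(Float.floatToIntBits((float) d));
    } else if ((doubleBits >>> 63) == 0) {
      // Positive double: the 8 raw bytes; the first has its high bit clear and acts as the header.
      out.putLong(doubleBits);
    } else {
      // Negative double: 0xFF marker followed by the 8 raw bytes. Nine bytes.
      out.put((byte) 0xFF).putLong(doubleBits);
    }
    return Arrays.copyOf(out.array(), out.position());
  }

  static double decode(byte[] data) {
    ByteBuffer in = ByteBuffer.wrap(data);
    int header = in.get() & 0xFF;
    if (header == 0xFF) {
      return Double.longBitsToDouble(in.getLong()); // negative value in the next 8 bytes
    } else if (header == 0xFE) {
      return Float.intBitsToFloat(in.getInt()); // float payload, widened to double
    } else if ((header & 0x80) != 0) {
      return (header & 0x7F) - 1; // small integral value, stored as value + 1
    } else {
      return Double.longBitsToDouble(in.getLong(0)); // header byte plus the 7 following bytes
    }
  }
}
```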
    • writeTLong

      static void writeTLong(DataOutput out, long l) throws IOException
      Writes a long in a variable-length format. Writes between one and ten bytes. Small values or values representing timestamps with day, hour or second precision typically require fewer bytes.

      TLong --> Header, Bytes*?

      • Header --> The first two bits indicate the compression scheme:
        • 00 - uncompressed
        • 01 - multiple of 1000 (second)
        • 10 - multiple of 3600000 (hour)
        • 11 - multiple of 86400000 (day)
        Then the next bit is a continuation bit, indicating whether more bytes need to be read, and the last 5 bits are the lower bits of the encoded value. In order to reconstruct the value, you need to combine the 5 lower bits of the header with a vLong in the next bytes (if the continuation bit is set to 1). Then zigzag-decode it and finally multiply by the multiple corresponding to the compression scheme.
      • Bytes --> Potential additional bytes to read depending on the header.
      Throws:
      IOException
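
      The header layout and reconstruction steps above can be sketched as follows. This is a stand-alone illustration, not the Lucene implementation: the class name TLongSketch is hypothetical, the constant values are derived from the multiples and two-bit scheme codes listed above, and the vLong is written here as 7 bits per byte, low bits first, with the high bit as a continuation flag.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch of the timestamp-friendly long layout described above; not Lucene code. */
final class TLongSketch {
  static final long SECOND = 1000L;
  static final long HOUR = 60 * 60 * SECOND; // 3600000
  static final long DAY = 24 * HOUR;         // 86400000
  static final int SECOND_ENCODING = 0x40;   // top two bits 01
  static final int HOUR_ENCODING = 0x80;     // top two bits 10
  static final int DAY_ENCODING = 0xC0;      // top two bits 11

  static byte[] encode(long l) {
    int header;
    if (l % SECOND != 0) {
      header = 0; // uncompressed
    } else if (l % DAY == 0) {
      header = DAY_ENCODING;
      l /= DAY;
    } else if (l % HOUR == 0) {
      header = HOUR_ENCODING;
      l /= HOUR;
    } else {
      header = SECOND_ENCODING;
      l /= SECOND;
    }
    long zigZag = (l >> 63) ^ (l << 1); // zigzag-encode so small negatives stay small
    header |= (int) (zigZag & 0x1F);    // lower 5 bits go into the header
    long upperBits = zigZag >>> 5;
    if (upperBits != 0) {
      header |= 0x20;                   // continuation bit: a vLong follows
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(header);
    if (upperBits != 0) {
      while ((upperBits & ~0x7FL) != 0) {
        out.write((int) ((upperBits & 0x7F) | 0x80));
        upperBits >>>= 7;
      }
      out.write((int) upperBits);
    }
    return out.toByteArray();
  }

  static long decode(byte[] data) {
    int header = data[0] & 0xFF;
    long bits = header & 0x1F;
    if ((header & 0x20) != 0) {
      long upperBits = 0;
      for (int i = 1, shift = 0; i < data.length; i++, shift += 7) {
        upperBits |= (long) (data[i] & 0x7F) << shift;
        if ((data[i] & 0x80) == 0) {
          break; // last vLong byte
        }
      }
      bits |= upperBits << 5;
    }
    long l = (bits >>> 1) ^ -(bits & 1); // zigzag-decode
    switch (header & 0xC0) {
      case SECOND_ENCODING: return l * SECOND;
      case HOUR_ENCODING: return l * HOUR;
      case DAY_ENCODING: return l * DAY;
      default: return l;
    }
  }
}
```

      In this sketch a timestamp that falls exactly on a day boundary a few days from the epoch fits in the header byte alone, while an arbitrary 64-bit value needs the header plus up to nine vLong bytes.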
    • finish

      public void finish(int numDocs) throws IOException
      Description copied from class: StoredFieldsWriter
      Called before StoredFieldsWriter.close(), passing in the number of documents that were written. Note that this is intentionally redundant (equivalent to the number of calls to StoredFieldsWriter.startDocument()), but a Codec should check that this is the case to detect the JRE bug described in LUCENE-1282.
      Specified by:
      finish in class StoredFieldsWriter
      Throws:
      IOException
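
      The redundancy check described above amounts to counting startDocument() calls and comparing the count in finish(int). The following is a minimal sketch of that idea only; the class DocCountCheckSketch, its field name, and the choice of IllegalStateException are hypothetical and do not describe how this class reports the mismatch.

```java
/** Hypothetical sketch of the document-count consistency check; not Lucene code. */
final class DocCountCheckSketch {
  // Number of documents started so far.
  private int numStartedDocs;

  void startDocument() {
    numStartedDocs++;
  }

  void finish(int numDocs) {
    // The caller's count and our own count should always agree; a mismatch signals corruption.
    if (numStartedDocs != numDocs) {
      throw new IllegalStateException(
          "Wrote " + numStartedDocs + " docs, finish called with numDocs=" + numDocs);
    }
  }
}
```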
    • copyOneDoc

      private void copyOneDoc(Lucene90CompressingStoredFieldsReader reader, int docID) throws IOException
      Throws:
      IOException
    • copyChunks

      private void copyChunks(MergeState mergeState, Lucene90CompressingStoredFieldsWriter.CompressingStoredFieldsMergeSub sub, int fromDocID, int toDocID) throws IOException
      Throws:
      IOException
    • merge

      public int merge(MergeState mergeState) throws IOException
      Description copied from class: StoredFieldsWriter
      Merges in the stored fields from the readers in mergeState. The default implementation skips over deleted documents, and uses StoredFieldsWriter.startDocument(), StoredFieldsWriter.writeField(FieldInfo, IndexableField), and StoredFieldsWriter.finish(int), returning the number of documents that were written. Implementations can override this method for more sophisticated merging (bulk-byte copying, etc).
      Overrides:
      merge in class StoredFieldsWriter
      Throws:
      IOException
    • tooDirty

      boolean tooDirty(Lucene90CompressingStoredFieldsReader candidate)
      Returns true if we should recompress this reader, even though we could bulk merge compressed data.

      The last chunk written for a segment is typically incomplete, so without recompressing, in some worst-case situations (e.g. frequent reopen with tiny flushes), over time the compression ratio can degrade. This is a safety switch.
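
      One way such a safety switch could be expressed is shown below. This is a sketch under loud assumptions: the page does not state the actual criteria, so the two thresholds used here (more dirty documents than fit in one chunk, and more than 1% of chunks dirty) are illustrative, and the class and parameter names are hypothetical stand-ins for the numDirtyDocs, numDirtyChunks, numChunks and maxDocsPerChunk counters listed among the fields.

```java
/** Hypothetical sketch of a dirty-chunk heuristic; thresholds are assumptions, not Lucene's documented values. */
final class DirtyChunkHeuristicSketch {
  static boolean tooDirty(long numDirtyDocs, long numDirtyChunks, long numChunks, int maxDocsPerChunk) {
    // Recompress only if the dirty docs could fill at least one full chunk
    // AND more than 1% of all chunks are dirty (assumed thresholds).
    return numDirtyDocs > maxDocsPerChunk && numDirtyChunks * 100 > numChunks;
  }
}
```

      Requiring both conditions keeps a segment with a single incomplete trailing chunk eligible for bulk merging, while segments built from many tiny flushes get recompressed.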

    • getMergeStrategy

      private Lucene90CompressingStoredFieldsWriter.MergeStrategy getMergeStrategy(MergeState mergeState, MatchingReaders matchingReaders, int readerIndex)
    • ramBytesUsed

      public long ramBytesUsed()
      Description copied from interface: Accountable
      Return the memory usage of this object in bytes. Negative values are illegal.