public class HyperLogLogPlus extends Object implements ICardinality
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/40671.pdf
Brief HyperLogLog++ overview:
- Uses 64-bit hashing instead of 32-bit.
- Has two representation modes: sparse and normal.
- 'normal' is approximately the same as regular HyperLogLog (still uses 64 bits).
- 'sparse' handles lower cardinality values with a highly accurate but poorly scaling strategy and leverages data compression to compete with 'normal' for as long as possible (sparse has the advantage on accuracy per unit of memory at low cardinality but quickly falls behind).

Modifier and Type | Class and Description |
---|---|
static class |
HyperLogLogPlus.Builder |
protected static class |
HyperLogLogPlus.HyperLogLogPlusMergeException |
Constructor and Description |
---|
HyperLogLogPlus(int p)
This constructor disables the sparse set.
|
HyperLogLogPlus(int p,
int sp)
Basic constructor for creating an instance that supports sparse and normal
representations.
|
HyperLogLogPlus(int p,
int sp,
List<byte[]> deltaByteSet)
Constructor to support instances serialized with the legacy sparse
encoding scheme.
|
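
The constructor details on this page state that p must be a value between 4 and sp, and that sp must be less than 32. A minimal sketch of that check is shown here. The class PrecisionCheck and its methods are hypothetical and not part of this API, the bounds are assumed to be inclusive, and the library's own validation and exception types may differ.

```java
// Hypothetical helper for illustration; not part of the HyperLogLogPlus API.
final class PrecisionCheck {

    /** Throws IllegalArgumentException unless 4 <= p <= sp < 32 (bounds assumed inclusive). */
    static void validate(int p, int sp) {
        if (p < 4) {
            throw new IllegalArgumentException("p must be at least 4, got " + p);
        }
        if (sp >= 32) {
            throw new IllegalArgumentException("sp must be less than 32, got " + sp);
        }
        if (p > sp) {
            throw new IllegalArgumentException("p must not exceed sp, got p=" + p + ", sp=" + sp);
        }
    }

    /** A precision of p index bits addresses 2^p registers in a HyperLogLog structure. */
    static int registerCount(int p) {
        return 1 << p;
    }

    public static void main(String[] args) {
        validate(14, 25);
        System.out.println("registers for p=14: " + registerCount(14));
    }
}
```

Larger p values trade memory for accuracy, since each extra bit of precision doubles the number of registers.
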
Modifier and Type | Method and Description |
---|---|
void |
addAll(HyperLogLogPlus other)
Add all the elements of the other set to this set.
|
long |
cardinality()
Gather the cardinality estimate from this estimator.
|
byte[] |
getBytes() |
protected RegisterSet |
getRegisterSet()
exposed for testing
|
protected int[] |
getSparseSet()
exposed for testing
|
ICardinality |
merge(ICardinality... estimators)
Merge this HLL++ with a bunch of others! The power of minions!
Most of the logic consists of case analysis about the state of this HLL++ and each one it wants to merge
with.
|
protected void |
mergeTempList()
Script-esque function that prepares for and then executes the merge of the sparse set
and the temp list.
|
boolean |
offer(Object o)
Add data to estimator based on the mode it is in
|
boolean |
offerHashed(int hashedInt)
Offer the value as a hashed int value
|
boolean |
offerHashed(long hashedLong)
Offer the value as a hashed long value
|
int |
sizeof() |
public HyperLogLogPlus(int p)
Parameters:
p - the precision value for the normal set
public HyperLogLogPlus(int p, int sp)
p and sp define the precision of the normal and sparse set representations for the data structure. p must be a value between 4 and sp, and sp must be less than 32.
Parameters:
p - the precision value for the normal set
sp - the precision value for the sparse set
public HyperLogLogPlus(int p, int sp, List<byte[]> deltaByteSet)
Parameters:
p - the precision value for the normal set
sp - the precision value for the sparse set
deltaByteSet - a list of varint byte arrays encoded using a delta encoding scheme
public boolean offerHashed(long hashedLong)
Specified by:
offerHashed in interface ICardinality
Parameters:
hashedLong - the hash of the item to offer to the estimator
public boolean offerHashed(int hashedInt)
Specified by:
offerHashed in interface ICardinality
Parameters:
hashedInt - the hash of the item to offer to the estimator
public boolean offer(Object o)
Specified by:
offer in interface ICardinality
Parameters:
o - stream element
public long cardinality()
Specified by:
cardinality in interface ICardinality
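
To make the 'normal' representation from the overview concrete, here is a minimal sketch of a plain HyperLogLog over 64-bit hashes: the top p bits of the hash select a register, and the register keeps the largest rank seen in the remaining bits. The class TinyHll, its p range of 4 to 18, and the mix function are assumptions made for illustration. The sketch leaves out the sparse mode and the bias correction that HyperLogLog++ adds, so it is not this class's implementation.

```java
// Illustrative only: a plain HyperLogLog over 64-bit hashes.
// It has no sparse mode and no HLL++ bias correction.
final class TinyHll {
    private final int p;            // precision: number of index bits
    private final byte[] registers; // m = 2^p registers

    TinyHll(int p) {
        if (p < 4 || p > 18) {
            throw new IllegalArgumentException("p out of range for this sketch: " + p);
        }
        this.p = p;
        this.registers = new byte[1 << p];
    }

    /** Offers an already hashed value; returns true if a register changed. */
    boolean offerHashed(long hash) {
        int idx = (int) (hash >>> (64 - p)); // top p bits select the register
        long rest = hash << p;               // remaining 64 - p bits, left-aligned
        // Rank is the position of the first 1 bit in the remaining bits, capped at 64 - p + 1.
        int rank = Math.min(Long.numberOfLeadingZeros(rest), 64 - p) + 1;
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
            return true;
        }
        return false;
    }

    /** Raw HyperLogLog estimate, with linear counting for small cardinalities. */
    long cardinality() {
        int m = registers.length;
        double sum = 0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.scalb(1.0, -r); // adds 2^-r
            if (r == 0) {
                zeros++;
            }
        }
        double raw = alpha(m) * m * (double) m / sum;
        if (raw <= 2.5 * m && zeros > 0) {
            return Math.round(m * Math.log((double) m / zeros));
        }
        return Math.round(raw);
    }

    private static double alpha(int m) {
        switch (m) {
            case 16: return 0.673;
            case 32: return 0.697;
            case 64: return 0.709;
            default: return 0.7213 / (1 + 1.079 / m);
        }
    }

    /** SplitMix64-style bit mixer, used here as a stand-in 64-bit hash. */
    static long mix(long x) {
        long z = x + 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    public static void main(String[] args) {
        TinyHll hll = new TinyHll(14);
        for (long i = 0; i < 100_000; i++) {
            hll.offerHashed(mix(i));
        }
        System.out.println("estimate for 100000 distinct items: " + hll.cardinality());
    }
}
```

With p = 14 the sketch keeps 16384 one-byte registers, and the estimate should land within a few percent of the true count at this scale.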
public int sizeof()
Specified by:
sizeof in interface ICardinality
public byte[] getBytes() throws IOException
Specified by:
getBytes in interface ICardinality
Throws:
IOException
protected void mergeTempList()
public void addAll(HyperLogLogPlus other) throws HyperLogLogPlus.HyperLogLogPlusMergeException
Parameters:
other - a compatible HyperLogLog++ instance (same p and sp)
Throws:
CardinalityMergeException - if other is not compatible
HyperLogLogPlus.HyperLogLogPlusMergeException
public ICardinality merge(ICardinality... estimators) throws CardinalityMergeException
Specified by:
merge in interface ICardinality
Parameters:
estimators - the estimators to merge with this one
Throws:
CardinalityMergeException
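
The merge description mentions case analysis over the state of each estimator. The simplest of those cases, two estimators that are both in normal mode with the same p, reduces to taking the element-wise maximum of their registers. The sketch below shows only that case. RegisterMerge is a hypothetical helper that works on plain byte arrays rather than this library's RegisterSet, and it signals a mismatch with IllegalArgumentException instead of CardinalityMergeException.

```java
// Hypothetical helper: merges two HyperLogLog register arrays of equal precision.
final class RegisterMerge {

    /** Returns a new array holding the element-wise maximum of a and b. */
    static byte[] merge(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "precision mismatch: " + a.length + " vs " + b.length + " registers");
        }
        byte[] out = new byte[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = (byte) Math.max(a[i], b[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] merged = merge(new byte[] {1, 5, 0, 2}, new byte[] {3, 2, 0, 2});
        System.out.println(java.util.Arrays.toString(merged));
    }
}
```

Because each register stores a maximum, the merged array is the same one a single estimator would have built from the union of both input streams, which is why merging loses no accuracy.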
protected RegisterSet getRegisterSet()
protected int[] getSparseSet()
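
The overview notes that the sparse representation leverages data compression, and the legacy constructor accepts varint byte arrays that use a delta encoding scheme. As a rough illustration of that idea, the sketch below delta-encodes a sorted int array into base-128 varints and decodes it again. DeltaVarint is hypothetical, and the byte layout shown is an assumption that may not match the format this class reads or writes.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical helper: delta plus varint coding of a sorted, non-negative int array.
final class DeltaVarint {

    /** Encodes each value as the varint of its difference from the previous value. */
    static byte[] encode(int[] sorted) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int v : sorted) {
            int delta = v - prev;
            if (delta < 0) {
                throw new IllegalArgumentException("input must be non-negative and sorted ascending");
            }
            writeVarint(out, delta);
            prev = v;
        }
        return out.toByteArray();
    }

    /** Writes 7 bits per byte, low bits first; the high bit marks a continuation. */
    private static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    /** Reverses encode(); assumes the input is a complete, well-formed encoding. */
    static int[] decode(byte[] bytes) {
        int[] tmp = new int[bytes.length]; // at most one value per byte
        int n = 0;
        int prev = 0;
        int i = 0;
        while (i < bytes.length) {
            int value = 0;
            int shift = 0;
            int b;
            do {
                b = bytes[i++];
                value |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            prev += value; // undo the delta step
            tmp[n++] = prev;
        }
        return Arrays.copyOf(tmp, n);
    }

    public static void main(String[] args) {
        int[] values = {3, 10, 10, 300, 70000};
        byte[] encoded = encode(values);
        System.out.println(values.length * 4 + " raw bytes, " + encoded.length + " encoded bytes");
        System.out.println(Arrays.toString(decode(encoded)));
    }
}
```

Sorted values that sit close together produce small deltas, and small deltas fit in one or two bytes each, which is where the memory saving at low cardinality comes from.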
Copyright © 2019. All rights reserved.