Struct metrics_util::Summary
pub struct Summary { /* private fields */ }
A quantile sketch with relative-error guarantees.
Based on DDSketch, Summary provides quantiles over an arbitrary distribution of floating-point numbers, including negative numbers, using a space-efficient sketch that provides relative-error guarantees regardless of the absolute range between the smallest and largest values.
Summary is similar to HDRHistogram in practice, but supports an arbitrary range of values, and supports floating-point numbers.
Numbers with an absolute value smaller than the given min_value will be treated as zeroes.
Memory usage for Summary should be nearly identical to DDSketch.
Summary::estimated_size provides a rough estimate of summary size based on the current values that have been added to it.
As mentioned above, this sketch provides relative-error guarantees across quantiles falling within 0 <= q <= 1, but trades some accuracy at the lowest quantiles as part of the collapsing scheme that allows for automatically handling arbitrary ranges of values, even when the maximum number of bins has been allocated. Typically, q=0.05 and below is where this error will be noticed, if present.
For cases when all values are positive, you can simply use Summary::min in lieu of checking these quantiles, as the minimum value will be closer to the true value. For cases when values range from negative to positive, the aforementioned collapsing will perturb the estimated true value for quantiles that conceptually fall within this collapsed band.
For example, for a distribution that spans from -25 to 75, we would intuitively expect q=0 to be -25, q=0.25 to be 0, q=0.5 to be 25, and so on. Internally, negative numbers and positive numbers are handled in two separate containers. Based on this example, one container would handle -25 to 0, and another would handle the 0 to 75 range. As the containers are mapped “back to back”, q=0.25 for this hypothetical summary would actually be q=0 within the negative container, which may return an estimated true value that exceeds the relative error guarantees.
Of course, as these problems are related to the estimation aspect of this data structure, users can allow the summary to allocate more bins to compensate for these edge cases, if desired.
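The "back to back" mapping in the -25 to 75 example can be sketched with a small standalone model. This is not the crate's internal code: the Region type and the locate function are hypothetical names, and the orientation of the negative container (local quantile 0 sits next to zero) is an assumption chosen so that the model reproduces the statement above that q=0.25 lands on q=0 within the negative container.

```rust
/// Which container a global quantile falls into, plus the local quantile
/// inside that container. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum Region {
    /// Local quantile by magnitude: 0.0 is nearest zero, 1.0 is most negative.
    Negative(f64),
    Zero,
    /// Local quantile: 0.0 is nearest zero, 1.0 is the largest value.
    Positive(f64),
}

/// Maps a global quantile `q` onto a container, given per-container counts.
/// Assumed model of two containers laid out back to back around zero.
fn locate(q: f64, neg: usize, zero: usize, pos: usize) -> Option<Region> {
    let total = neg + zero + pos;
    if total == 0 || !(0.0..=1.0).contains(&q) {
        return None;
    }
    let total = total as f64;
    let neg_frac = neg as f64 / total;
    let zero_frac = zero as f64 / total;
    let pos_frac = pos as f64 / total;

    if neg > 0 && q <= neg_frac {
        // Global q = 0 is the most negative sample, so the local quantile
        // (measured by magnitude) runs in the opposite direction.
        return Some(Region::Negative(1.0 - q / neg_frac));
    }
    if pos == 0 || (zero > 0 && q <= neg_frac + zero_frac) {
        return Some(Region::Zero);
    }
    let local = (q - neg_frac - zero_frac) / pos_frac;
    Some(Region::Positive(local.clamp(0.0, 1.0)))
}
```

With 25 negative and 75 positive samples, locate(0.25, 25, 0, 75) falls on local quantile 0 of the negative container, which is the edge where the collapsing described above concentrates its error.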
Implementations
impl Summary
pub fn new(alpha: f64, max_buckets: u32, min_value: f64) -> Summary
Creates a new Summary.
alpha represents the desired relative error for this summary. An alpha of 0.0001 represents a desired relative error of 0.01%. For example, if the true value at quantile q0 was 1, the estimated value at that quantile would be within 0.01% of the true value, that is, a value between 0.9999 and 1.0001.
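As a worked version of that arithmetic, the helper below computes the interval an estimate should fall in for a given true value and alpha. The function name relative_error_bounds is hypothetical and not part of Summary; it only restates the guarantee described above.

```rust
/// Returns the (lower, upper) interval that an estimate should fall in,
/// given the true value and a relative error `alpha`.
/// Hypothetical helper, not part of the Summary API.
fn relative_error_bounds(true_value: f64, alpha: f64) -> (f64, f64) {
    // The allowed deviation scales with the magnitude of the true value,
    // which is what makes the guarantee relative rather than absolute.
    let delta = true_value.abs() * alpha;
    (true_value - delta, true_value + delta)
}
```

For a true value of 1 and an alpha of 0.0001 this gives roughly (0.9999, 1.0001); for a true value of 1000 the same alpha allows a deviation of about 0.1 in either direction.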
max_buckets controls how many subbuckets are created, which directly influences memory usage. Each bucket “costs” eight bytes, so a summary with 2048 buckets would consume a maximum of around 16 KiB. Depending on how many samples have been added to the summary, the number of subbuckets allocated may be far below max_buckets, and the summary will allocate more as needed to fulfill the relative error guarantee.
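The bucket cost above translates into a simple upper bound. The sketch below only multiplies out the eight-bytes-per-bucket figure quoted in this description; the constant and function names are hypothetical, and any per-summary bookkeeping overhead is deliberately ignored.

```rust
/// Per-bucket cost quoted in the description above.
const BYTES_PER_BUCKET: usize = 8;

/// Upper bound on bucket storage, in bytes, for a given `max_buckets`.
/// Hypothetical helper; ignores any fixed overhead of the summary itself.
fn max_bucket_bytes(max_buckets: u32) -> usize {
    max_buckets as usize * BYTES_PER_BUCKET
}
```

For 2048 buckets this yields 16,384 bytes, which is the 16 KiB figure given above.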
min_value controls the smallest value that will be recognized distinctly from zero. Said another way, any value between -min_value and min_value will be counted as zero.
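The zero threshold can be illustrated with a one-line check. This is a sketch of the rule as stated here, not the crate's code: the function name snap_to_zero is hypothetical, and the strict comparison follows the wording "smaller than" used for min_value elsewhere in this description.

```rust
/// Returns 0.0 when `value` lies strictly inside (-min_value, min_value),
/// otherwise returns `value` unchanged.
/// Hypothetical helper illustrating the documented threshold rule.
fn snap_to_zero(value: f64, min_value: f64) -> f64 {
    if value.abs() < min_value {
        0.0
    } else {
        value
    }
}
```

With a min_value of 1.0e-9, both 5.0e-10 and -5.0e-10 would be counted as zero, while 2.0e-9 would be kept as a distinct value.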
pub fn with_defaults() -> Summary
Creates a new Summary with default values.
alpha is 0.0001, max_buckets is 32,768, and min_value is 1.0e-9.
This will yield a summary that is roughly equivalent in memory usage to an HDRHistogram with 3 significant digits, and will support values down to a single nanosecond.
In practice, when using only positive values, maximum memory usage can be expected to hover around 200 KiB, while usage of negative values can lead to an average maximum size of around 400 KiB.
pub fn add(&mut self, value: f64)
Adds a sample to the summary.
If the absolute value of value is smaller than the given min_value, it will be added as a zero.
pub fn quantile(&self, q: f64) -> Option<f64>
Gets the estimated value at the given quantile.
If the sketch is empty, or if the quantile is less than 0.0 or greater than 1.0, then the result will be None.
If the 0.0 or 1.0 quantile is requested, this function will return self.min() or self.max() instead of the estimated value.
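The return contract described above (None for an empty input or an out-of-range q, exact minimum and maximum at q=0.0 and q=1.0) can be mirrored by an exact stand-in over a sorted slice. The function quantile_contract is hypothetical and uses nearest-rank selection on the raw samples; it does not reproduce the sketch's estimation, only the shape of its results.

```rust
/// Exact quantile over an ascending-sorted slice, following the same
/// return contract as described for `quantile` above.
/// Hypothetical stand-in: it stores every sample, which a sketch avoids.
fn quantile_contract(sorted: &[f64], q: f64) -> Option<f64> {
    // Empty input or q outside [0, 1] (including NaN) yields None.
    if sorted.is_empty() || !(0.0..=1.0).contains(&q) {
        return None;
    }
    // The extremes are answered exactly from the minimum and maximum.
    if q == 0.0 {
        return sorted.first().copied();
    }
    if q == 1.0 {
        return sorted.last().copied();
    }
    // Nearest-rank selection for everything in between.
    let rank = (q * (sorted.len() - 1) as f64).round() as usize;
    Some(sorted[rank])
}
```

Callers of the real quantile method need to handle the same Option, for example with a match or with unwrap_or on a fallback value.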
pub fn detailed_count(&self) -> (usize, usize, usize)
Gets the number of samples in this summary, broken down into zero, negative, and positive counts, in that order.
pub fn estimated_size(&self) -> usize
Gets the estimated size of this summary, in bytes.
In practice, this value should be very close to the actual size, but will not be entirely precise.