Module anomaly


Anomaly detection window functions.

This module provides statistical anomaly scoring functions that operate as window UDFs (User Defined Window Functions):

  • anomaly_score_mad(value) — MAD-based scoring
  • anomaly_score_iqr(value, k) — IQR-based scoring with configurable fence multiplier
  • anomaly_score_zscore(value) — Z-Score-based scoring

These functions return a floating-point anomaly score rather than a boolean, allowing users to set their own threshold via WHERE score > N.
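As a rough illustration of what thresholding on a score means, the sketch below mirrors `WHERE score > N` over a column of already-computed scores. The helper name `flagged_rows` is hypothetical and not part of this module. It assumes the usual SQL behaviour that a NULL score never satisfies the comparison, while a `+inf` score exceeds every finite threshold.

```rust
/// Hypothetical helper mirroring `WHERE score > threshold`.
/// `None` stands for a NULL score and is never flagged.
fn flagged_rows(scores: &[Option<f64>], threshold: f64) -> Vec<usize> {
    scores
        .iter()
        .enumerate()
        .filter_map(|(i, s)| match s {
            // Strict comparison, as in `score > N`.
            Some(v) if *v > threshold => Some(i),
            _ => None,
        })
        .collect()
}
```

With a threshold of 3.0, a score of exactly 3.0 is not flagged, a score of `+inf` is, and NULL rows are skipped.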

§Minimum Samples

Each function has its own minimum based on statistical validity:

| Function | min_samples | Rationale |
|---|---|---|
| anomaly_score_zscore | 2 | stddev requires n >= 2 (aligned with STDDEV_SAMP) |
| anomaly_score_mad | 3 | n <= 2 makes MAD almost always 0, yielding spurious +inf |
| anomaly_score_iqr | 3 | linear-interpolated Q1 != Q3 is possible at n >= 3 |

§Return Values

| Condition | zscore | mad | iqr | Result |
|---|---|---|---|---|
| insufficient valid points | n < 2 | n < 3 | n < 3 | NULL |
| stddev / MAD / IQR = 0, value = center | distance = 0 | distance = 0 | on fence | 0.0 |
| stddev / MAD / IQR = 0, value ≠ center | distance > 0 | distance > 0 | outside fence | +inf |
| normal case | stddev > 0 | MAD > 0 | IQR > 0 | finite positive |
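The z-score column of the table can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's code: the function name `zscore_sketch` is hypothetical, the score is assumed to be the absolute distance from the mean divided by the sample standard deviation, and NULL inputs are modelled as `None`.

```rust
/// Hypothetical sketch of the z-score return-value rules.
/// `window` holds the frame's values (None = NULL); `current` is the scored value.
fn zscore_sketch(window: &[Option<f64>], current: f64) -> Option<f64> {
    // Only non-NULL values count as valid points.
    let valid: Vec<f64> = window.iter().flatten().copied().collect();
    let n = valid.len();
    if n < 2 {
        return None; // insufficient valid points -> NULL
    }
    let mean = valid.iter().sum::<f64>() / n as f64;
    // Sample variance (n - 1 denominator), matching STDDEV_SAMP.
    let var = valid.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / (n - 1) as f64;
    let stddev = var.sqrt();
    let distance = (current - mean).abs();
    if stddev == 0.0 {
        // Zero spread: 0.0 at the center, +inf anywhere else.
        return Some(if distance == 0.0 { 0.0 } else { f64::INFINITY });
    }
    Some(distance / stddev) // normal case
}
```

For the window 1, 2, 3 the mean is 2 and the sample standard deviation is 1, so scoring the value 3 gives 1.0.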

§Window Frame Semantics

The functions score the current row in the partition, regardless of window frame type. This works correctly for all frame specifications:

-- Trailing window
anomaly_score_mad(cpu) OVER (ORDER BY ts ROWS 100 PRECEDING)
-- Centered window
anomaly_score_mad(cpu) OVER (ORDER BY ts ROWS BETWEEN 50 PRECEDING AND 50 FOLLOWING)

Internally, a row counter tracks which partition row is being evaluated. The range parameter determines only which rows participate in computing the window statistics (median, MAD, IQR, mean, stddev).
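The separation between "which row is scored" and "which rows feed the statistics" can be sketched like this. All names here (`trailing`, `centered`, `distance_from_frame_median`) are hypothetical, and the median convention (averaging the two middle values for even counts) is an assumption; the module's own choice is not documented here.

```rust
use std::ops::Range;

/// Hypothetical frame helper for `ROWS n PRECEDING` (ends at the current row).
fn trailing(row: usize, n: usize) -> Range<usize> {
    row.saturating_sub(n)..row + 1
}

/// Hypothetical frame helper for `ROWS BETWEEN n PRECEDING AND n FOLLOWING`.
fn centered(row: usize, n: usize, len: usize) -> Range<usize> {
    row.saturating_sub(n)..(row + n + 1).min(len)
}

/// Median, averaging the two middle values for even counts (assumed convention).
fn median(mut v: Vec<f64>) -> f64 {
    assert!(!v.is_empty(), "frame must contain at least one row");
    v.sort_by(|a, b| a.total_cmp(b));
    let mid = v.len() / 2;
    if v.len() % 2 == 0 { (v[mid - 1] + v[mid]) / 2.0 } else { v[mid] }
}

/// Distance of the current row from the frame median.
/// The frame only selects the rows used for the statistic;
/// the scored value always comes from `partition[row]`.
fn distance_from_frame_median(partition: &[f64], row: usize, frame: Range<usize>) -> f64 {
    let center = median(partition[frame].to_vec());
    (partition[row] - center).abs()
}
```

For the partition 1, 2, 3, 100, 5 and row index 3, a trailing frame of two preceding rows uses 2, 3, 100 (median 3), while a centered frame of one row on each side uses 3, 100, 5 (median 5). The scored value is 100 in both cases.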

§Performance Notes

The current implementation evaluates each row separately, giving O(N × W) complexity, where N is the partition size and W is the window size. This is acceptable for typical window sizes (W ≤ a few thousand).

Future optimizations could include:

  • Incremental computation using order-statistic trees or two-heap median maintenance, reducing to O(N × log W)
  • Batch evaluate_all for fixed-size windows
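To make the two-heap idea concrete, here is a minimal sketch of median maintenance with O(log W) work per inserted value. It is not taken from this module: the type `RunningMedian` is hypothetical, it uses `i64` values because `f64` does not implement `Ord`, and it only supports insertion. A sliding window would also need to remove values that leave the frame (for example with lazy deletion), which is not shown.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical insert-only running median over two heaps.
#[derive(Default)]
struct RunningMedian {
    low: BinaryHeap<i64>,           // max-heap holding the smaller half
    high: BinaryHeap<Reverse<i64>>, // min-heap holding the larger half
}

impl RunningMedian {
    fn push(&mut self, x: i64) {
        // Route through `low` so every element of `low` <= every element of `high`.
        self.low.push(x);
        if let Some(top) = self.low.pop() {
            self.high.push(Reverse(top));
        }
        // Rebalance so `low` holds the extra element when the count is odd.
        if self.high.len() > self.low.len() {
            if let Some(Reverse(v)) = self.high.pop() {
                self.low.push(v);
            }
        }
    }

    /// Median of everything pushed so far, or None if empty.
    fn median(&self) -> Option<f64> {
        let lo = *self.low.peek()?;
        if self.low.len() > self.high.len() {
            Some(lo as f64)
        } else {
            let Reverse(hi) = *self.high.peek()?;
            Some((lo as f64 + hi as f64) / 2.0)
        }
    }
}
```

Each `push` performs a constant number of heap operations, so the cost per value is logarithmic in the number of stored values rather than requiring a full sort of the window.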

Modules§

iqr 🔒
anomaly_score_iqr window function — IQR-based anomaly scoring.
mad 🔒
anomaly_score_mad window function — MAD-based anomaly scoring.
utils 🔒
Shared statistical utilities for anomaly detection window functions.
zscore 🔒
anomaly_score_zscore window function — Z-Score-based anomaly scoring.

Structs§

AnomalyFunction