Statistics Operators

These dataflow operators compute statistics over a window of events. By default, they operate on the window of the chart that queries them.

They optionally take a duration argument to use instead of the chart's window.

Statistics operators are useful for calculating signal statistics. Their output can be charted directly, used to drive annotations, or used as building blocks in transformer pipelines for custom signal analysis.

Internally, statistics operators use online algorithms, such as Welford's for computing variance. They each need O(N) memory to maintain state.
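
As an illustration of the kind of online algorithm mentioned above, here is a minimal TypeScript sketch of Welford's method for a running variance. The class name RunningVariance and its methods are hypothetical and are not part of @electricui/dataflow; the library's internals may differ.

```typescript
// Illustrative running-variance accumulator using Welford's method.
// Standalone sketch only, not the @electricui/dataflow implementation.
class RunningVariance {
  private n = 0
  private mean = 0
  private m2 = 0 // running sum of squared differences from the current mean

  push(x: number): void {
    this.n += 1
    const delta = x - this.mean
    this.mean += delta / this.n
    // Uses the difference from the old mean and from the updated mean
    this.m2 += delta * (x - this.mean)
  }

  // Population variance (divide by n)
  variance(): number {
    return this.n > 0 ? this.m2 / this.n : NaN
  }

  // Sample variance (divide by n - 1)
  sampleVariance(): number {
    return this.n > 1 ? this.m2 / (this.n - 1) : NaN
  }
}

const acc = new RunningVariance()
for (const x of [2, 4, 4, 4, 5, 5, 7, 9]) acc.push(x)
console.log(acc.variance()) // approximately 4
console.log(acc.sampleVariance()) // approximately 4.57
```

Each value is folded into the accumulator as it arrives, so the variance is available at any point without a second pass over the data.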

count

Returns the number of events in the query window.

import { useDataTransformer } from '@electricui/timeseries-react'
import { count } from '@electricui/dataflow'
 
const countDS = useDataTransformer(() => {
  return count(dataSource)
})

geometricMean

The geometric mean (Wikipedia) is often used when averaging ratios or rates of change.

Calculated as the "n-th root of the product of n numbers". For two and three values respectively:

sqrt( a x b )
cbrt( a x b x c )

import { useDataTransformer } from '@electricui/timeseries-react'
import { geometricMean } from '@electricui/dataflow'
 
const gMeanDS = useDataTransformer(() => {
  return geometricMean(dataSource)
})
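
To make the "n-th root of the product" definition concrete, here is a small standalone sketch. The helper name geometricMeanOf is hypothetical and is not part of the library. It sums logarithms rather than multiplying directly, which is one common way to avoid overflow on long inputs, and it assumes all values are positive.

```typescript
// Geometric mean of positive numbers: exp(mean(log(x))).
// Equivalent to the n-th root of the product of n numbers.
function geometricMeanOf(values: number[]): number {
  if (values.length === 0) return NaN
  let logSum = 0
  for (const v of values) logSum += Math.log(v)
  return Math.exp(logSum / values.length)
}

console.log(geometricMeanOf([2, 8])) // approximately 4, i.e. sqrt( 2 x 8 )
console.log(geometricMeanOf([1, 3, 9])) // approximately 3, i.e. cbrt( 1 x 3 x 9 )
```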

harmonicMean

The harmonic mean (Wikipedia) is used for correctly averaging rates or ratios, and for calculating equivalent parallel resistance.

A useful property of the harmonic mean is that it gives less weight to large values and more weight to small values.

For three values a, b and c:

3 / ( 1/a + 1/b + 1/c )

import { useDataTransformer } from '@electricui/timeseries-react'
import { harmonicMean } from '@electricui/dataflow'
 
const hMeanDS = useDataTransformer(() => {
  return harmonicMean(dataSource)
})
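
The two use cases mentioned above, averaging rates and parallel resistance, can be sketched in plain TypeScript. The helper name harmonicMeanOf is hypothetical and is not a library function.

```typescript
// Harmonic mean: n divided by the sum of reciprocals.
function harmonicMeanOf(values: number[]): number {
  if (values.length === 0) return NaN
  let reciprocalSum = 0
  for (const v of values) reciprocalSum += 1 / v
  return values.length / reciprocalSum
}

// Average speed over two equal-distance legs driven at 30 and 60 km/h
console.log(harmonicMeanOf([30, 60])) // approximately 40

// Equivalent resistance of n parallel resistors is the harmonic mean divided by n
const resistors = [100, 100]
console.log(harmonicMeanOf(resistors) / resistors.length) // approximately 50
```

Note that the arithmetic mean of 30 and 60 is 45, which overstates the average speed because more time is spent on the slower leg.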

histogram

The histogram operator is a windowed aggregation function that groups data into a fixed number of buckets and, for each event, returns the collection of counts in each bucket.

The min and max arguments specify the expected range of the data; data outside this range will be ignored.

numBuckets determines how many buckets within this range will be used.

The accessor allows picking the specific piece of data in the event to aggregate. The operator also takes an optional duration, but defaults to the window of the chart that queried the operator.

import { useDataTransformer } from '@electricui/timeseries-react'
import { histogram } from '@electricui/dataflow'
 
const dataTransformer = useDataTransformer(() => {
  // 100 buckets of values between 0 and 200
  const histo = histogram(dataSource, 0, 200, 100)

  return histo
})
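
The bucketing described above can be sketched as a plain function. This is an illustration only: the name bucketCounts is hypothetical, and equal-width buckets with a value equal to max landing in the last bucket are assumptions rather than documented behaviour of the histogram operator.

```typescript
// Count values into numBuckets equal-width buckets spanning [min, max].
// Assumption: values outside the range are skipped, and v === max is
// placed in the last bucket.
function bucketCounts(values: number[], min: number, max: number, numBuckets: number): number[] {
  const counts = new Array<number>(numBuckets).fill(0)
  const width = (max - min) / numBuckets
  for (const v of values) {
    if (v < min || v > max) continue // outside the expected range: ignored
    const index = Math.min(Math.floor((v - min) / width), numBuckets - 1)
    counts[index] += 1
  }
  return counts
}

// 4 buckets of width 50 between 0 and 200; the value 250 is ignored
console.log(bucketCounts([1, 2, 3, 50, 199, 250], 0, 200, 4)) // [3, 1, 0, 1]
```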

interquartileRange

The Interquartile Range (IQR, Wikipedia) describes the spread of the middle 50% of the data, i.e. the difference between the 75th and 25th percentiles.

This transform can be used to represent the spread of your signal without impact from outliers, or as part of custom statistics calculations and visualisations.

Our implementation uses Reservoir Sampling to maintain a bounded memory footprint.

import { useDataTransformer } from '@electricui/timeseries-react'
import { interquartileRange } from '@electricui/dataflow'
 
const iqrDS = useDataTransformer(() => {
  return interquartileRange(dataSource)
})
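
To show how reservoir sampling keeps memory bounded, here is a minimal sketch of the classic "Algorithm R" variant. Which variant the library uses is not stated, so treat this as an assumption; the Reservoir class and its capacity parameter are hypothetical names.

```typescript
// Reservoir sampling (Algorithm R): keeps a uniform random sample of at
// most `capacity` values from a stream of unknown length.
class Reservoir {
  private readonly samples: number[] = []
  private readonly capacity: number
  private seen = 0

  constructor(capacity: number) {
    this.capacity = capacity
  }

  push(x: number): void {
    this.seen += 1
    if (this.samples.length < this.capacity) {
      this.samples.push(x) // still filling the reservoir
      return
    }
    // Keep the new value with probability capacity / seen,
    // overwriting a uniformly chosen existing slot
    const j = Math.floor(Math.random() * this.seen)
    if (j < this.capacity) this.samples[j] = x
  }

  values(): number[] {
    return [...this.samples]
  }
}

const reservoir = new Reservoir(100)
for (let i = 0; i < 10000; i++) reservoir.push(Math.sin(i))
console.log(reservoir.values().length) // 100
```

Order statistics such as quartiles can then be estimated from the retained sample instead of from every event in the window.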

max

Returns the maximum value found in the query window.

Useful for drawing realtime annotations on charts.

import { useDataTransformer } from '@electricui/timeseries-react'
import { max } from '@electricui/dataflow'
 
const maxDS = useDataTransformer(() => {
  return max(dataSource)
})

min

Returns the minimum value found in the query window.

Useful for drawing realtime annotations on charts.

import { useDataTransformer } from '@electricui/timeseries-react'
import { min } from '@electricui/dataflow'
 
const minDS = useDataTransformer(() => {
  return min(dataSource)
})

mean

Returns the arithmetic mean (Wikipedia) of events across the query window. This is what most people would call the average.

This is calculated by summing all values, then dividing the sum by the number of values:

( a + b + c ) / 3

import { useDataTransformer } from '@electricui/timeseries-react'
import { mean } from '@electricui/dataflow'
 
const averageDS = useDataTransformer(() => {
  return mean(dataSource)
})

meanAbsoluteDeviation

The Mean Absolute Deviation (also called the average absolute deviation, AAD, or MAD) is the mean of the absolute deviations of the values from their mean.

This is done by calculating the mean of the data, calculating the absolute difference between each value and that mean, then calculating the mean of those differences.

Our implementation uses Reservoir Sampling to maintain a bounded memory footprint.

Commonly used to describe the variability of a signal.

import { useDataTransformer } from '@electricui/timeseries-react'
import { meanAbsoluteDeviation } from '@electricui/dataflow'
 
const meanAbsDevDS = useDataTransformer(() => {
  return meanAbsoluteDeviation(dataSource)
})
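
The three steps described above can be written out directly. This sketch works on a complete array rather than a sampled stream, and the helper name meanAbsoluteDeviationOf is hypothetical.

```typescript
// Mean absolute deviation: the mean of |x - mean(x)|.
function meanAbsoluteDeviationOf(values: number[]): number {
  if (values.length === 0) return NaN
  // Step 1: mean of the data
  const mean = values.reduce((acc, v) => acc + v, 0) / values.length
  // Step 2: absolute difference between each value and the mean, summed
  const totalDeviation = values.reduce((acc, v) => acc + Math.abs(v - mean), 0)
  // Step 3: mean of those differences
  return totalDeviation / values.length
}

// mean = 5, absolute deviations = 3, 3, 2, 1, 9
console.log(meanAbsoluteDeviationOf([2, 2, 3, 4, 14])) // 3.6
```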

median

Returns the middle value across the window of data.

This is calculated internally by finding the value at the 0.5 quantile.

import { useDataTransformer } from '@electricui/timeseries-react'
import { median } from '@electricui/dataflow'
 
const medianDS = useDataTransformer(() => {
  return median(dataSource)
})

medianAbsoluteDeviation

The Median Absolute Deviation (MAD) represents the middle value of the absolute deviations from the median.

This is calculated by finding the data's median value, storing the absolute difference between each value and the median (in an AVLTree), then finding the median of those deviation values.

Our implementation uses Reservoir Sampling to maintain a bounded memory footprint.

Commonly used to describe signal variability, and can reject outliers more effectively than the standard deviation.

import { useDataTransformer } from '@electricui/timeseries-react'
import { medianAbsoluteDeviation } from '@electricui/dataflow'
 
const medianAbsDevDS = useDataTransformer(() => {
  return medianAbsoluteDeviation(dataSource)
})
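
The calculation above, and its resistance to outliers, can be illustrated with a short sketch. It sorts plain arrays instead of using an AVLTree or a sampled reservoir, and the helper names medianOf and medianAbsoluteDeviationOf are hypothetical.

```typescript
// Median of an array (average of the two middle values for even lengths).
function medianOf(values: number[]): number {
  if (values.length === 0) return NaN
  const sorted = [...values].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}

// Median absolute deviation: median of |x - median(x)|.
function medianAbsoluteDeviationOf(values: number[]): number {
  const med = medianOf(values)
  return medianOf(values.map(v => Math.abs(v - med)))
}

console.log(medianAbsoluteDeviationOf([1, 1, 2, 2, 4, 6, 9])) // 1
// Replacing the largest value with an extreme outlier leaves the result unchanged
console.log(medianAbsoluteDeviationOf([1, 1, 2, 2, 4, 6, 900])) // 1
```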

mode

Returns a collection of the values which occurred most often in the query window.

We do not guarantee the results will be sorted.

import { useDataTransformer } from '@electricui/timeseries-react'
import { mode } from '@electricui/dataflow'
 
const modeDS = useDataTransformer(() => {
  return mode(dataSource)
})

If needed, consider chaining applicable operators such as min or max to return a single value in the event of multiple mode values.
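
As a plain TypeScript illustration of why several modes can be returned and how they reduce to a single value, consider the sketch below. It operates on an ordinary array, uses the built-in Math.min rather than the dataflow min operator, and the helper name modesOf is hypothetical.

```typescript
// Collect every value that shares the highest occurrence count.
function modesOf(values: number[]): number[] {
  const counts = new Map<number, number>()
  let best = 0
  for (const v of values) {
    const c = (counts.get(v) ?? 0) + 1
    counts.set(v, c)
    if (c > best) best = c
  }
  const modes: number[] = []
  counts.forEach((c, v) => {
    if (c === best) modes.push(v)
  })
  return modes // order follows first occurrence, not numeric order
}

const modes = modesOf([3, 1, 3, 1, 2])
console.log(modes) // [3, 1]
console.log(Math.min(...modes)) // 1
```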

product

Returns the product of all values in the query window.

import { useDataTransformer } from '@electricui/timeseries-react'
import { product } from '@electricui/dataflow'
 
const multipliedDS = useDataTransformer(() => {
  return product(dataSource)
})

quantile

Returns the value at the user-specified probability p, interpolating between samples where needed.

quantile accepts a numeric argument p between 0.0 and 1.0.

This implementation is Type 7 (h = (N − 1)p + 1), which matches the default in Julia and R, Excel's PERCENTILE function, and Python's numpy.percentile().

import { useDataTransformer } from '@electricui/timeseries-react'
import { quantile } from '@electricui/dataflow'
 
const quantileDS = useDataTransformer(() => {
  return quantile(dataSource, 0.8)
})

Our implementation uses Reservoir Sampling to maintain a bounded memory footprint. Internally, this is implemented with an AVLTree.
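
The Type 7 rule can be sketched in plain TypeScript as follows. For simplicity this sorts a complete array rather than using a reservoir or an AVLTree, and the function name quantileType7 is hypothetical.

```typescript
// Type 7 quantile with linear interpolation between neighbouring samples.
function quantileType7(values: number[], p: number): number {
  if (values.length === 0 || p < 0 || p > 1) return NaN
  const sorted = [...values].sort((a, b) => a - b)
  // h = (N - 1)p + 1 is a 1-based rank, so the 0-based position is (N - 1)p
  const position = (sorted.length - 1) * p
  const lower = Math.floor(position)
  const upper = Math.ceil(position)
  const fraction = position - lower
  return sorted[lower] + fraction * (sorted[upper] - sorted[lower])
}

console.log(quantileType7([1, 2, 3, 4, 5], 0.5)) // 3
console.log(quantileType7([10, 20, 30, 40], 0.25)) // 17.5
```

In the second call the position is 0.75, so the result lies three quarters of the way from 10 to 20.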

quantiles

Returns an array of values at the user-requested probabilities.

quantiles accepts an argument ps, an array of numbers between 0.0 and 1.0.

This implementation is Type 7 (h = (N − 1)p + 1), which matches the default in Julia and R, Excel's PERCENTILE function, and Python's numpy.percentile().

import { useDataTransformer } from '@electricui/timeseries-react'
import { quantiles } from '@electricui/dataflow'
 
const quantilesDS = useDataTransformer(() => {
  return quantiles(dataSource, [0.1, 0.4, 0.6, 0.9])
})

Our implementation uses Reservoir Sampling to maintain a bounded memory footprint.

rootMeanSquare

Returns the root mean square (RMS) value for the queried window of data.

Calculated by summing the square of each value, dividing by the number of values, then taking the square root.

import { useDataTransformer } from '@electricui/timeseries-react'
import { rootMeanSquare } from '@electricui/dataflow'
 
const rmsDS = useDataTransformer(() => {
  return rootMeanSquare(dataSource)
})
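
The RMS calculation described above, written out as a standalone sketch. The helper name rootMeanSquareOf is hypothetical and is not a library function.

```typescript
// Root mean square: sqrt( (x1^2 + x2^2 + ... + xn^2) / n )
function rootMeanSquareOf(values: number[]): number {
  if (values.length === 0) return NaN
  let sumOfSquares = 0
  for (const v of values) sumOfSquares += v * v
  return Math.sqrt(sumOfSquares / values.length)
}

// A signal alternating between -5 and 5 has a mean of 0 but an RMS of 5
console.log(rootMeanSquareOf([-5, 5, -5, 5])) // 5
console.log(rootMeanSquareOf([3, 4])) // sqrt(12.5), roughly 3.54
```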

sampleStandardDeviation

Returns the standard deviation of the input signal, assuming the window is a representative sample of the population (dividing by n - 1).

import { useDataTransformer } from '@electricui/timeseries-react'
import { sampleStandardDeviation } from '@electricui/dataflow'
 
const sampleStdevDS = useDataTransformer(() => {
  return sampleStandardDeviation(dataSource)
})

sampleVariance

Returns the variance of the input signal (the average squared deviation from the mean), assuming the window is a representative sample of the population (dividing by n - 1).

Be aware that the result's units will be input units squared!

For example, the motor speed tracking signal used in this page's examples shows a noisy band of ±200 RPM and has a resulting variance of just over 10,000 RPM².

import { useDataTransformer } from '@electricui/timeseries-react'
import { sampleVariance } from '@electricui/dataflow'
 
const sampleVarianceDS = useDataTransformer(() => {
  return sampleVariance(dataSource)
})

standardDeviation

Returns the standard deviation of the input signal, where the window's data represents the entire population (dividing by n).

Our implementation uses Welford's Method.

import { useDataTransformer } from '@electricui/timeseries-react'
import { standardDeviation } from '@electricui/dataflow'
 
const stdevDS = useDataTransformer(() => {
  return standardDeviation(dataSource)
})

sum

Returns the sum total of values across the query window.

import { useDataTransformer } from '@electricui/timeseries-react'
import { sum } from '@electricui/dataflow'
 
const sumDS = useDataTransformer(() => {
  return sum(dataSource)
})

variance

Returns the variance of the input signal, where the window's data represents the entire population (dividing by n).

Be aware that the result's units will be input units squared!

For example, the motor speed tracking signal used in this page's examples shows a noisy band of ±200 RPM and has a resulting variance of just over 10,000 RPM².

import { useDataTransformer } from '@electricui/timeseries-react'
import { variance } from '@electricui/dataflow'
 
const varianceDS = useDataTransformer(() => {
  return variance(dataSource)
})