Statically typing DataFlows using advanced TypeScript concepts

Michael

Electric UI utilises a streaming computation model for performing transformations on inbound data. We call this model DataFlow.

By expressing the transformation as an incremental computation on a stream of inputs, the transformation can be run across historical, static, and future incoming data. DataFlows can consume DataSources or other DataFlows as inputs, allowing for composition of complex transformations from simple operators such as map, filter and interleave. The computation cost is incremental and time sliceable. If the browser has an input pending, the computation may be paused and resumed once the main thread is idle. If data is added out of order, computation can restart, guaranteeing correctness.

To make this API as ergonomic as possible, DataFlows are statically typed with TypeScript, with as much type inference as feasible. Some of this typing involved non-trivial concepts that may be useful to others.

In this article we'll take a look at lookup types, the unknown and never types, overriding the type of class constructors, the infer declaration, conditional types, grabbing the inner type out of a generic, and forming unions from the inner members of an array of generic types.

A DataFlow

The following DataFlow combines two DataSources: one that provides XYZ position information, and one that provides global state change information in the form of a color. The colorMixer DataFlow combines these XYZ positions with the latest state color.

type XYZEvent = {
  x: number
  y: number
  z: number
}
type ColorEvent = {
  color: string
}
type MixedEvent = XYZEvent & ColorEvent
function colorMixer(colorQueryable: Queryable<ColorEvent>, xyzQueryable: Queryable<XYZEvent>) {
  let currentColorState = 'blue'
  const colorSetter = forEach(colorQueryable, (data, time) => {
    currentColorState = data.color
  })
  const colorXYZ = map(xyzQueryable, (data, time) => ({
    x: data.x,
    y: data.y,
    z: data.z,
    color: currentColorState,
  }))
  return interleave([colorXYZ, colorSetter])
}

Events are passed through the DataFlow in temporal order, modifying currentColorState and emitting new mixed events when new positions are received.

┌─────────┐
│         │   ┌─────┐                        ┌───┐
│  Color  ├───►Green├────────────────────────►Red│
│         │   └──┬──┘                        └─┬─┘
└─────────┘      │                             │
                 │                             │
                 │                             │
              ┌──▼──┐     ┌─────┐     ┌─────┐  │  ┌─────┐
              │Green│     │Green│     │Green│  │  │ Red │
              │  1  ├─────►  2  ├─────►  3  ├──┴──►  4  │
              └──▲──┘     └──▲──┘     └──▲──┘     └──▲──┘
                 │           │           │           │
                 │           │           │           │
┌─────────┐      │           │           │           │
│         │     ┌┴┐         ┌┴┐         ┌┴┐         ┌┴┐
│   XYZ   ├─────►1├─────────►2├─────────►3├─────────►4│
│         │     └─┘         └─┘         └─┘         └─┘
└─────────┘

Typing the MessageDataSource

Electric UI maintains a key-value store of the current state of hardware. The key is the MessageID and the value can be any arbitrary data. This interface is declared globally per project.

// An easy way to declare zero-runtime-cost opaque types:
type TemperatureCelcius = number & { __temp_celcius: true }
declare global {
  interface ElectricUIDeveloperState {
    pub_time: number
    quat: [x: number, y: number, z: number, w: number]
    orient: [p: number, y: number, r: number]
    lin_acc: [x: number, y: number, z: number]
    ang_vel: [p: number, y: number, r: number]
    baro: number // kPa
    temp: TemperatureCelcius
    // In general
    [MessageID: string]: any
  }
}
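As a rough illustration of the opaque type trick mentioned in the comment above, the sketch below brands a plain number at a single entry point. The celcius and formatTemperature helpers are hypothetical names used only for this example and are not part of Electric UI.

```typescript
// Opaque ("branded") type: a number at runtime, a distinct type at compile time.
type TemperatureCelcius = number & { __temp_celcius: true }

// Hypothetical helper: the one place where a plain number receives the brand.
function celcius(value: number): TemperatureCelcius {
  return value as TemperatureCelcius
}

// Hypothetical consumer that only accepts branded temperatures.
function formatTemperature(temp: TemperatureCelcius): string {
  return temp.toFixed(1) + ' C'
}

const reading = celcius(21.5)
const formatted = formatTemperature(reading) // accepted: reading carries the brand
const doubled = reading * 2 // arithmetic still works, the result is a plain number

// formatTemperature(21.5) would be rejected by the compiler,
// because a plain number is missing the __temp_celcius property.
```

The brand only exists in the type system, so nothing is added to the value at runtime.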

MessageDataSources are tied to specific keys via the MessageID:

import { MessageDataSource } from '@electricui/core-timeseries'
const temperature = new MessageDataSource('temp')
const runtimeSent = new MessageDataSource<number>('runtime')

Ideally, the temperature MessageDataSource correctly infers its type as MessageDataSource<TemperatureCelcius>. It is also desirable for type overrides to be available as an escape hatch, such as with runtimeSent.

The type of a key can be looked up using a lookup type.

interface Store {
  test: 42
}
type TestType = Store['test']
type LookupKeyType<K extends keyof Store> = Store[K]

With a function this is relatively simple. However, the version below doesn't work as well as one would hope: the union of all possible value types is returned instead of the exact type.

interface Store {
  test: 42
  foo: string
}
const store: Store = {
  test: 42,
  foo: 'bar',
}
type TestType = Store['test']
function lookup(key: keyof Store): Store[typeof key] {
  return store[key]
}
const res = lookup('test')
//    ^?: string | 42

A generic type argument is required to have the type narrowed exactly.

// ...
function lookup<K extends keyof Store>(key: K): Store[K] {
  return store[key]
}
const res = lookup('test')
//    ^?: 42

Class constructor return types cannot be overridden, and it doesn't seem like this feature will be added any time soon.

As a result, the following does not work:

interface Store {
  key: 42
}
class Container<M extends keyof Store> {
  constructor(key: M): Container<Store[M]> {
    //                 ^?
    // Error: Type annotation cannot appear on a constructor declaration. (1093)
  }
}

Instead, we alias the class and override the constructor function in a type assertion.

// Note the underscore
class _MessageDataSource<
  T = unknown // the type of the events of this MessageID
> implements Queryable<T> {
  constructor(public messageID: MessageID) {}
  // ... trimmed for brevity
}
type MessageDataSource<T> = _MessageDataSource<T>
export const MessageDataSource = _MessageDataSource as {
  new <
    M extends keyof ElectricUIDeveloperState = keyof ElectricUIDeveloperState
  >(
    messageID: M,
  ): MessageDataSource<ElectricUIDeveloperState[M]>
}

A generic type argument, M, captures the type of the messageID argument, which is then used to look up the value type in ElectricUIDeveloperState.

interface ElectricUIDeveloperState {
  key: string
  foo: 42
}
const test = new MessageDataSource('foo')
//    ^?: MessageDataSource<42>

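To support type overrides, an additional type argument (O in the constructor signature, T on the class) is allowed, defaulting to unknown. If it's left as unknown, the conditional type falls back to the above extraction. If it's any other type, it provides the override. One caveat: any also satisfies unknown extends O, so an explicit any falls back to the extraction rather than overriding it.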

// Note the underscore
class _MessageDataSource<
  T = unknown, // the type of the events of this MessageID
  M = keyof ElectricUIDeveloperState // the MessageID
> implements Queryable<T> {
  constructor(public messageID: M & MessageID) {}
  // ... trimmed for brevity
}
type MessageDataSource<T, M> = _MessageDataSource<T, M>
export const MessageDataSource = _MessageDataSource as {
  new <
    O = unknown,
    M extends keyof ElectricUIDeveloperState = keyof ElectricUIDeveloperState
  >(
    messageID: M,
  ): MessageDataSource<unknown extends O ? ElectricUIDeveloperState[M] : O, M>
}

This results in the following behaviour:

interface ElectricUIDeveloperState {
  temp: number
  runtime: number
}
const temperature = new MessageDataSource('temp')
//    ^?: MessageDataSource<number, 'temp'>
const runtimeSent = new MessageDataSource<number>('runtime')
//    ^?: MessageDataSource<number, string>

Unfortunately, TypeScript doesn't support partial type argument inference. It isn't a huge deal in this case; it just results in the second type parameter defaulting to string when doing a type override.
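The constructor override can be tried in isolation with a small self-contained sketch. The DeveloperState interface and the _Source class below are hypothetical stand-ins for ElectricUIDeveloperState and _MessageDataSource, and the latest field exists only so that the type parameter is visible on instances.

```typescript
// Hypothetical stand-in for the global ElectricUIDeveloperState interface.
interface DeveloperState {
  temp: number
  label: string
}

// The underlying class, which knows nothing about the lookup.
class _Source<T = unknown, M = keyof DeveloperState> {
  // Using T in a field keeps the type parameter structurally visible.
  latest: T | undefined = undefined
  constructor(public messageID: M) {}
}

type Source<T, M> = _Source<T, M>

// Re-expose the class under a constructor signature that performs the lookup.
const Source = _Source as {
  new <O = unknown, M extends keyof DeveloperState = keyof DeveloperState>(
    messageID: M,
  ): Source<unknown extends O ? DeveloperState[M] : O, M>
}

// No type argument: O stays unknown, so the value type is looked up from the key.
const temperature = new Source('temp')
const inferred: number | undefined = temperature.latest

// Explicit type argument: the override wins, and M falls back to its default.
const overridden = new Source<string>('temp')
const forced: string | undefined = overridden.latest
```

The two annotated assignments only compile if the conditional type resolves to number in the first case and string in the second.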

Typing the DataFlow

DataFlows take an input and produce an output. They are composed from primitive operators, some of which are listed below with their type signatures:

// Maps an event from one form to another
map<I, O>(queryable: Queryable<I>, mapper: (data: I, time: Time) => O): Queryable<O>
// Executes a closure for each event, potentially consuming it
forEach<I>(queryable: Queryable<I>, func: (data: I, time: Time) => void, consuming: true): Queryable<never>
forEach<I>(queryable: Queryable<I>, func: (data: I, time: Time) => void, consuming: false): Queryable<I>
// Filters incoming events based on a predicate function
filter<I>(queryable: Queryable<I>, predicate: (data: I, time: Time) => boolean): Queryable<I>

Internally, the DataFlow keeps track of its inputs and outputs to provide type inference for callback functions; externally, it only exposes its output type. All DataFlows alias themselves to Queryable<Output>.

We can use the infer keyword and a conditional type to extract the inner type from a Queryable or DataFlow.

// For plain Queryables
type GetQueryableInner<T> = T extends Queryable<infer I> ? I : never
// For DataFlows and Queryables
type GetDataFlowInput<T> = T extends DataFlow<infer A, infer B>
  ? A // input if DataFlow
  : T extends Queryable<infer C>
  ? C // inner type if Queryable
  : never
type GetDataFlowOutput<T> = T extends DataFlow<infer A, infer B>
  ? B // output if DataFlow
  : T extends Queryable<infer C>
  ? C // inner type if Queryable
  : never

The forEach operator has an overload that determines if it consumes events without re-emitting them. This consumption can be represented with an output of the never type. In a union, the never type evaporates away, consumed by the other members.

forEach<I>(queryable: Queryable<I>, func: (data: I, time: Time) => void, consuming: true): Queryable<never>
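The way never disappears from a union can be checked at compile time. The sketch below uses a hypothetical, minimal Queryable interface and a commonly used Equal helper type; neither is taken from the Electric UI source.

```typescript
// Hypothetical minimal stand-in for the library's Queryable type.
interface Queryable<T> {
  events: Array<{ time: number; data: T }>
}

type GetQueryableInner<T> = T extends Queryable<infer I> ? I : never

// Compile-time equality check between two types.
type Equal<A, B> =
  (<T>() => T extends A ? 1 : 2) extends (<T>() => T extends B ? 1 : 2) ? true : false

type Mixed = { x: number; color: string }

// A consuming forEach contributes Queryable<never> to the union of inputs.
type Inputs = Queryable<never> | Queryable<Mixed>

// The helper distributes over the union: never | Mixed collapses to Mixed.
type Output = GetQueryableInner<Inputs>

// Each assignment only compiles if the two types are identical.
const neverDisappears: Equal<Output, Mixed> = true
const plainUnion: Equal<string | never, string> = true
```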

This behaviour is useful with the interleave operator.

Interleave

The interleave operator combines multiple Queryables, ordering their events temporally.

Imagine the original DataFlow again:

function colorMixer(colorQueryable: Queryable<ColorEvent>, xyzQueryable: Queryable<XYZEvent>) {
  let currentColorState = 'blue'
  const colorSetter = forEach(colorQueryable, (data, time) => {
    currentColorState = data.color
  })
  const colorXYZ = map(xyzQueryable, (data, time) => ({
    x: data.x,
    y: data.y,
    z: data.z,
    color: currentColorState,
  }))
  return interleave([colorXYZ, colorSetter])
}

The expected return type of this DataFlow would be:

colorMixer(colorQueryable: Queryable<ColorEvent>, xyzQueryable: Queryable<XYZEvent>): DataFlow<ColorEvent | XYZEvent, MixedEvent>

To achieve type inference of the members of the array of inputs to the interleave function, a generic type parameter Q is used, extending any array of Queryables.

function interleave<Q extends Queryable<any>[]>(
  queryables: Q,
): DataFlow<GetDataFlowInput<Q[number]>, GetDataFlowOutput<Q[number]>> {}

An array can be indexed by numbers, so Q[number] gives us the union of Queryables (including their inner types), in this case:

Q[number] = Queryable<ColorEvent> | Queryable<XYZEvent>

The GetDataFlowInput and GetDataFlowOutput helpers can be used to extract the relevant inner types.
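The indexing step can be sketched with a simplified, array-backed model. The Queryable interface and the interleave body below are assumptions made for illustration: the real operator is incremental and returns a DataFlow, whereas this version only tracks the inner type with GetQueryableInner and merges everything eagerly.

```typescript
// Hypothetical minimal stand-in for the library's Queryable type.
interface Queryable<T> {
  events: Array<{ time: number; data: T }>
}

type GetQueryableInner<T> = T extends Queryable<infer I> ? I : never

// Simplified interleave: merges all events and orders them by time.
function interleave<Q extends Queryable<any>[]>(
  queryables: Q,
): Queryable<GetQueryableInner<Q[number]>> {
  let merged: Array<{ time: number; data: any }> = []
  queryables.forEach(queryable => {
    merged = merged.concat(queryable.events)
  })
  merged.sort((a, b) => a.time - b.time)
  return { events: merged }
}

type ColorEvent = { color: string }
type XYZEvent = { x: number; y: number; z: number }

const colors: Queryable<ColorEvent> = {
  events: [{ time: 2, data: { color: 'red' } }],
}
const positions: Queryable<XYZEvent> = {
  events: [
    { time: 1, data: { x: 0, y: 0, z: 0 } },
    { time: 3, data: { x: 1, y: 1, z: 1 } },
  ],
}

// Q is inferred as an array of the union, so Q[number] is
// Queryable<ColorEvent> | Queryable<XYZEvent>.
const combined = interleave([colors, positions])

// Type-checks because the inner type was inferred as ColorEvent | XYZEvent.
const typed: Queryable<ColorEvent | XYZEvent> = combined
const times = combined.events.map(e => e.time)
```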

Finally, the naive typing for forEach would result in DataFlow<ColorEvent | XYZEvent, ColorEvent | MixedEvent>. However, since the forEach in this case consumes its events, its output type is never, which is absorbed by MixedEvent in the union, resulting in the correct final type:

colorMixer(/* ... */): DataFlow<ColorEvent | XYZEvent, MixedEvent>

While DataFlows internally know both their inputs and outputs, they are presented as Queryables for downstream use, and as such the final signature is simply:

colorMixer(/* ... */): Queryable<MixedEvent>

Coalesce

The coalesce operator combines multiple Queryables into a keyed object. It emits new Events when any of its constituent members update. In the above color mixer implementation, Events are only emitted when the position changes. In the following coalesce based implementation, the DataFlow also emits events at the previous position if the color alone changes.

function colorMixer(colorQueryable: Queryable<ColorEvent>, xyzQueryable: Queryable<XYZEvent>) {
  return coalesce({
    x: map(xyzQueryable, data => data.x),
    y: map(xyzQueryable, data => data.y),
    z: map(xyzQueryable, data => data.z),
    color: map(colorQueryable, data => data.color),
  })
}

Again the expected return type is a Queryable<MixedEvent>.

To achieve this, the coalesce function is generic over the object structure it receives.

function coalesce<S extends KeyedQueryables>(structure: S)

The KeyedQueryables type is a non-nested object with string keys and Queryable values:

type KeyedQueryables = {
  [key: string]: DataFlow<any> | Queryable<any>
}

An additional helper type is created to extract the output values of an object of Queryables of the type KeyedQueryables.

type UnwrapKeyedQueryables<T extends KeyedQueryables> = {
  [K in keyof T]: GetDataFlowOutput<T[K]>
}

Each key matches a key in the original structure, and each value is inferred using the conditional infer type, GetDataFlowOutput, created above.

This results in the output of:

Queryable<UnwrapKeyedQueryables<{
  x: Queryable<number>;
  y: Queryable<number>;
  z: Queryable<number>;
  color: Queryable<string>;
}>>

Which results in:

Queryable<{
  x: number;
  y: number;
  z: number;
  color: string;
}>

Which matches MixedEvent.
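To see the mapped type at work on real values, here is a small sketch under simplifying assumptions. The array-backed Queryable interface and the latestValues helper are hypothetical: instead of emitting an event whenever a member updates, as coalesce does, the helper just reads the most recent value of each member once.

```typescript
// Hypothetical minimal stand-in for the library's Queryable type.
interface Queryable<T> {
  events: Array<{ time: number; data: T }>
}

type GetQueryableInner<T> = T extends Queryable<infer I> ? I : never

type KeyedQueryables = { [key: string]: Queryable<any> }

// Same keys as the input structure, with each Queryable unwrapped.
type UnwrapKeyedQueryables<T extends KeyedQueryables> = {
  [K in keyof T]: GetQueryableInner<T[K]>
}

// Hypothetical helper: collects the latest value of every member.
function latestValues<S extends KeyedQueryables>(structure: S): UnwrapKeyedQueryables<S> {
  const result = {} as UnwrapKeyedQueryables<S>
  for (const key in structure) {
    const events = structure[key].events
    const last = events[events.length - 1]
    if (last !== undefined) {
      result[key] = last.data
    }
  }
  return result
}

const xs: Queryable<number> = { events: [{ time: 1, data: 3 }] }
const colorNames: Queryable<string> = {
  events: [
    { time: 1, data: 'red' },
    { time: 2, data: 'green' },
  ],
}

// mixed is typed as { x: number; color: string }, so both method calls type-check.
const mixed = latestValues({ x: xs, color: colorNames })
const summary = mixed.color.toUpperCase() + ' ' + mixed.x.toFixed(0)
```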

Map

Events cannot have a data field which is purely the undefined value, because undefined is used to signal "don't emit an event" in callbacks that return the event data alone. This pattern is used due to a limitation in TypeScript described in the Limitations section below.

function map<T, O>(
  queryable: Queryable<T>,
  mapper: (data: T, time: Time) => O extends undefined ? never : O,
): Queryable<O>

Using the never type in a conditional type, undefined can be disallowed as a return type for the mapper.

However, the implicit return of undefined by a bare return statement, or by the omission of a return statement, is considered void instead of undefined. As a result, the following doesn't error:

function foo() {
  return map(new DataSource<number>(), (data, time) => {
    return
  })
}

To capture our intent, we use void instead, which covers undefined, a bare return, and a missing return statement.

function map<T, O>(
  queryable: Queryable<T>,
  mapper: (data: T, time: Time) => O extends void ? never : O,
): Queryable<O>

This now errors correctly:

function foo() {
  return map(new DataSource<number>(), (data, time) => {
    return // Type 'void' is not assignable to type 'never'. ts(2345)
  })
}
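The difference between the two guards can be isolated at the type level. In the sketch below, RejectUndefined and RejectVoid are hypothetical names for the two conditional types used in the mapper signatures, and Equal is a commonly used compile-time equality helper.

```typescript
// Compile-time equality check between two types.
type Equal<A, B> =
  (<T>() => T extends A ? 1 : 2) extends (<T>() => T extends B ? 1 : 2) ? true : false

// The two guards from the mapper signatures, lifted into standalone helpers.
type RejectUndefined<O> = O extends undefined ? never : O
type RejectVoid<O> = O extends void ? never : O

// void is not assignable to undefined, so the undefined-based guard
// lets a void return type through unchanged.
const voidSlipsThrough: Equal<RejectUndefined<void>, void> = true

// The void-based guard catches both void and undefined.
const voidRejected: Equal<RejectVoid<void>, never> = true
const undefinedRejected: Equal<RejectVoid<undefined>, never> = true

// Ordinary return types are unaffected.
const numberKept: Equal<RejectVoid<number>, number> = true
```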

Advance

The advance operator is one such operator that uses an undefined return value to signal that no Event should be emitted that round.

function advance<R, O>(
  queryable: Queryable<R>,
  callback: (time: number) => O | undefined,
): Queryable<O extends void ? never : O>

If the return of an advance operator is statically analysable as always being void, the Queryable can be typed as Queryable<never> to remove it from the union created by a later interleave operator.

function foo() {
  return advance(new DataSource<number>(), time => {
    return undefined
  })
}
// function foo(): Queryable<never>

Limitations

The iterateEmit operator gives raw access to the underlying API that powers the majority of other operators. It simply receives each event, and is allowed to emit other events.

function iterateEmit<T, O>(
  queryable: Queryable<T>,
  iterate: (event: Event<T>, emit: (event: Event<O>) => void) => void,
): Queryable<O>

Unfortunately, its return type cannot be inferred automatically from the usage of the emit callback. The Promise constructor suffers from a similar limitation.

function foo(queryable: Queryable<any>) {
  return iterateEmit(queryable, (event, emit) => {
    emit(new Event(event.time, 42)) // The output of iterateEmit should be Queryable<number>
  }) // Instead it is inferred as Queryable<unknown>
}

Maybe one day this will be possible, but for now when using the operator, it must be manually type annotated. As a result of this limitation, operators like map require the event data to be returned by the callback, instead of having a separate emit callback.
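The limitation can be reproduced without the library. The array-based iterateEmit below is a hypothetical stand-in with the same callback shape; it is only meant to show where inference stops and what the manual annotation looks like.

```typescript
// Hypothetical, array-based stand-in for iterateEmit.
function iterateEmit<T, O>(
  source: T[],
  iterate: (value: T, emit: (output: O) => void) => void,
): O[] {
  const outputs: O[] = []
  source.forEach(value => {
    iterate(value, output => {
      outputs.push(output)
    })
  })
  return outputs
}

// Compile-time equality check between two types.
type Equal<A, B> =
  (<T>() => T extends A ? 1 : 2) extends (<T>() => T extends B ? 1 : 2) ? true : false

// Calls to emit are not an inference site, so O falls back to unknown.
const inferred = iterateEmit([1, 2, 3], (value, emit) => emit(value * 2))
const fellBackToUnknown: Equal<typeof inferred, unknown[]> = true

// Annotating the call restores the output type. Both type arguments must be
// given, since type arguments cannot be partially inferred.
const annotated = iterateEmit<number, number>([1, 2, 3], (value, emit) => emit(value * 2))
const total = annotated.reduce((sum, value) => sum + value, 0)
```

The runtime behaviour is identical in both calls; only the static type of the result differs.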

Results

All charts, loggers and other consumers of Queryables are generic over the return type, resulting in autocomplete and compile-time checking of inputs and accessors.

interface LineChartProps<T> {
  /**
   * A reference to a `Queryable` for event ingestion.
   */
  dataSource: Queryable<T>
  /**
   * An accessor on the `Event`'s data to produce a column of data. If the event is produced by a MessageQueryable,
   * the eventData argument will be the payload of the message.
   */
  accessor?: (data: T, time: number) => number
  /**
   * An accessor on the `Event`'s data to produce the color for this point.
   */
  colorAccessor?: (data: T, time: number) => Color
}

If complex DataFlows are used to process incoming data, their types are inferred and maintained throughout the pipeline.

const pos = new DataSource<XYZEvent>()
const col = new DataSource<ColorEvent>()
const mixed = colorMixer(col, pos)
const Page = () => {
  return (
    <ChartContainer>
      <LineChart
        dataSource={mixed}
        accessor={data => data.q} // Errors!
        colorAccessor={data => data.color} // Autocompletes!
      />
    </ChartContainer>
  )
}

Finally, here's the color mixer in action, combining the liftoff, thrusting, coasting and chutes deployed state with the XYZ position to color the flight path of a model rocket.

Rocket UI