Concepts
Learn about the core concepts of Synnax.
Synnax is a streaming time-series database designed to acquire, store, and transmit real-time data from hardware systems. It scales horizontally, and can be deployed on edge services for data acquisition or in cloud environments for high-performance analysis.
Synnax inherits a hybrid pedigree from hardware data acquisition (DAQ) systems and cloud-native, horizontally scalable databases. This page introduces the core concepts needed to effectively integrate a Synnax cluster into your systems.
Distribution Components
Nodes
A node is an individual, running instance of the Synnax executable. The host machine can be an edge device, VM, container, or bare metal server. The only requirement is that it can store data on disk and has an address reachable by other nodes in the cluster.
Clusters
Nodes communicate with each other to form a cluster. The nodes in a cluster collaborate to read, write, and exchange data. Nodes expose the cluster as a monolithic data space, meaning that a user can query a single node for the entire cluster’s data without being aware of where it is actually stored.
Data Components
The data components of a cluster are the building blocks for storing, retrieving, and streaming hardware telemetry data. There are five data components in Synnax: samples, channels, ranges, series, and frames.
Samples
A sample is a single value recorded at a specific point in time. When dealing with hardware, samples are most commonly numeric sensor readings, such as temperature or pressure. Samples can also be actuator commands, status codes, or any other strongly typed value.
Channels
A channel is a logical collection of samples emitted by or representing the values of a single source. This source is typically a sensor, although channels can also hold values like actuator commands, status codes, log messages, or enumerations.
In Synnax, samples written to a channel are both stored permanently and streamed in real time. Live transmission allows consumers (called streamers) to receive and process data as it arrives. Common use cases for data streaming include visualization, anomaly detection, and automated control.
Combined data storage and streaming is a key feature of Synnax, enabling many powerful workflows that would otherwise require multiple systems to implement.
Channels are comparable to tags in a time-series database like InfluxDB or a SCADA system like Ignition.
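The relationship between samples and channels can be sketched in a few lines of plain Python. This is an illustrative in-memory model only, not the Synnax client API: the `Sample` and `Channel` classes, the `timestamp_ns` field, and the `tank_pressure` name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sample:
    """A single value recorded at a specific point in time."""
    timestamp_ns: int  # assumed convention: nanoseconds since the Unix epoch
    value: float


@dataclass
class Channel:
    """A named, logical collection of samples from a single source."""
    name: str
    samples: list[Sample] = field(default_factory=list)

    def write(self, sample: Sample) -> None:
        # Append the sample to this channel's history.
        self.samples.append(sample)


# Hypothetical pressure sensor emitting two readings.
pressure = Channel(name="tank_pressure")
pressure.write(Sample(timestamp_ns=1_000, value=101.3))
pressure.write(Sample(timestamp_ns=2_000, value=101.9))
values = [s.value for s in pressure.samples]
```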
Ranges
A range (short for time range) is a user-defined time interval that labels interesting events in your data. While channels are used to group samples by source, ranges are used to group related samples by time period.
In addition to being the primary method for querying data from a cluster, ranges are also a powerful tool for attaching crucial metadata to telemetry values.
In a test operations context, ranges can be used to label specific test runs, attaching information like calibration data, test parameters, and procedures. In manufacturing and monitoring, ranges can be used to label anomalies, identify maintenance periods, or annotate automated control actions.
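As a rough illustration of how a range groups samples by time period and carries metadata, consider the sketch below. The `TimeRange` class, its field names, and the inclusive-start, exclusive-end convention are assumptions made for this example and are not taken from the Synnax API.

```python
from dataclasses import dataclass, field


@dataclass
class TimeRange:
    """A labeled time interval with optional metadata attached."""
    name: str
    start_ns: int  # inclusive (assumed convention)
    end_ns: int    # exclusive (assumed convention)
    metadata: dict = field(default_factory=dict)

    def contains(self, timestamp_ns: int) -> bool:
        return self.start_ns <= timestamp_ns < self.end_ns


# (timestamp_ns, value) pairs from a hypothetical temperature channel.
samples = [(1_000, 20.1), (2_000, 20.4), (3_000, 35.7), (4_000, 36.2)]

# Label a test run and attach metadata describing it.
test_run = TimeRange(
    name="test_run_01",
    start_ns=3_000,
    end_ns=5_000,
    metadata={"procedure": "example procedure"},
)

# Querying by range selects only the samples inside the interval.
selected = [(t, v) for t, v in samples if test_run.contains(t)]
```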
Series
While channels and ranges group related data, series do the heavy lifting in terms of storing and transmitting actual samples. A series is comparable to a list or array in most programming languages. When reading or writing data to Synnax, you’ll frequently work with channel data as a collection of one or more series.
Frames
A frame is a collection of related series. These series form a table-like structure comparable to a pandas DataFrame in Python or a data.frame in R. Each column is identified by a channel, holding one or more series for that channel. Frames are the fundamental unit of data transfer within Synnax, and are very useful for effectively working with data from multiple channels at once.
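The structure described above can be modeled, as a simplified sketch, with a dictionary that maps each channel name to a list of series. The names `Series`, `frame`, and `column` are hypothetical and chosen only for this example; the actual frame type in a Synnax client may look different.

```python
# A series is modeled here as a plain list of values.
Series = list[float]

# A frame maps each channel (column) to one or more series.
frame: dict[str, list[Series]] = {
    "temperature": [[20.1, 20.4], [20.9]],  # two series for one channel
    "pressure": [[101.3, 101.9, 102.4]],    # a single series
}


def column(frame: dict[str, list[Series]], channel: str) -> list[float]:
    """Flatten every series for one channel into a single list of samples."""
    return [value for series in frame[channel] for value in series]


temperature = column(frame, "temperature")
pressure = column(frame, "pressure")
```

Flattening a column this way gives the same shape of data you would expect from a single DataFrame column.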
Operation Components
The operation components are key interfaces that allow users to access and modify the samples in a cluster.
Writers
Writers are used to write samples to a cluster. They can be used to write static data in large batches or stream data in real time. A writer can be opened on multiple channels, in which case each frame it writes contains series with samples for those channels. Writers support atomic transactions, meaning that either all samples in a frame are written to the cluster or none are. This is particularly useful when reading in data from large files.
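The all-or-nothing behavior can be illustrated with a validate-then-commit pattern. The sketch below is a toy in-memory store, not how Synnax implements transactions; `InMemoryStore` and `write_frame` are hypothetical names.

```python
class InMemoryStore:
    """Toy store that writes a frame atomically: all samples or none."""

    def __init__(self, channels: list[str]) -> None:
        self.data: dict[str, list[float]] = {name: [] for name in channels}

    def write_frame(self, frame: dict[str, list[float]]) -> None:
        # Phase 1: validate the whole frame before touching stored data.
        for channel, series in frame.items():
            if channel not in self.data:
                raise KeyError(f"unknown channel: {channel}")
            for value in series:
                if not isinstance(value, (int, float)):
                    raise TypeError(f"non-numeric sample in {channel}: {value!r}")
        # Phase 2: commit. Reached only if every sample passed validation.
        for channel, series in frame.items():
            self.data[channel].extend(series)


store = InMemoryStore(["temperature", "pressure"])
store.write_frame({"temperature": [20.1, 20.4], "pressure": [101.3, 101.9]})

# The second frame contains an invalid sample, so none of it is written.
try:
    store.write_frame({"temperature": [20.9], "pressure": ["bad"]})
    rejected = False
except TypeError:
    rejected = True
```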
Writers also support dynamic control handoff, in which multiple writers are opened on the same channel but only a subset of them (typically one) is allowed to write to the channel at any given time. This is useful for transitioning control between manual operators and automated systems.
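One way to picture control handoff is to give each writer a numeric authority and let only the highest-authority writer write. The sketch below assumes that scheme purely for illustration; the `ControlledChannel` class, the authority values, and the tie-breaking rule are hypothetical and do not describe the actual Synnax mechanism.

```python
from typing import Optional


class ControlledChannel:
    """Toy channel where only the highest-authority writer may write."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.samples: list[float] = []
        self.authorities: dict[str, int] = {}  # writer name -> authority

    def open_writer(self, writer: str, authority: int) -> None:
        self.authorities[writer] = authority

    def set_authority(self, writer: str, authority: int) -> None:
        self.authorities[writer] = authority

    def controller(self) -> Optional[str]:
        # On a tie, the writer that was opened first keeps control.
        if not self.authorities:
            return None
        return max(self.authorities, key=self.authorities.get)

    def write(self, writer: str, value: float) -> bool:
        # Writes from writers that do not hold control are rejected.
        if self.controller() != writer:
            return False
        self.samples.append(value)
        return True


valve = ControlledChannel("valve_command")
valve.open_writer("automation", authority=100)
valve.open_writer("operator", authority=200)

automation_first = valve.write("automation", 1.0)  # rejected: operator in control
operator_first = valve.write("operator", 0.0)      # accepted

# The operator hands control back by lowering their authority.
valve.set_authority("operator", 50)
automation_second = valve.write("automation", 1.0)  # now accepted
```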
Iterators / Readers
The primary method for reading data from a cluster is through an iterator. Iterators read data in a streaming fashion, allowing users to efficiently query and process large quantities of data. They can be opened on one or more channels to read historical data across a specific range of time. Iterators are sometimes called “readers”.
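Reading in a streaming fashion means the caller receives data in bounded chunks rather than loading an entire time range into memory. A minimal sketch using a Python generator follows; the `iterate` function, its parameters, and the half-open time interval are assumptions for illustration, not the Synnax iterator API.

```python
from typing import Iterable, Iterator


def iterate(
    samples: Iterable[tuple[int, float]],
    start_ns: int,
    end_ns: int,
    chunk_size: int,
) -> Iterator[list[tuple[int, float]]]:
    """Yield samples in [start_ns, end_ns) in chunks of at most chunk_size."""
    chunk: list[tuple[int, float]] = []
    for timestamp_ns, value in samples:
        if start_ns <= timestamp_ns < end_ns:
            chunk.append((timestamp_ns, value))
            if len(chunk) == chunk_size:
                yield chunk
                chunk = []
    if chunk:
        # Emit the final, possibly smaller, chunk.
        yield chunk


# Ten historical samples, one every 1000 ns.
history = [(i * 1_000, float(i)) for i in range(10)]
chunks = list(iterate(history, start_ns=2_000, end_ns=7_000, chunk_size=2))
```

Because the generator yields one chunk at a time, a consumer can process each chunk and discard it before the next one is produced.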
Streamers
Streamers are used to stream data in real time. They can be thought of as subscribers in a traditional publish-subscribe system. Like iterators, streamers can be opened on one or more channels to receive data as it is being written. Streamers are useful for live plotting, control, and real-time post-processing.
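The publish-subscribe analogy can be sketched with the standard library's `queue` module. In this toy model each write is both stored and forwarded to every open streamer; `LiveChannel` and `open_streamer` are hypothetical names, and a real streamer would receive data over the network rather than from an in-process queue.

```python
import queue


class LiveChannel:
    """Toy channel that stores each sample and forwards it to subscribers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.stored: list[float] = []
        self.subscribers: list[queue.Queue] = []

    def open_streamer(self) -> queue.Queue:
        # Each streamer gets its own queue of samples written from now on.
        q: queue.Queue = queue.Queue()
        self.subscribers.append(q)
        return q

    def write(self, sample: float) -> None:
        self.stored.append(sample)  # persisted for later historical reads
        for q in self.subscribers:  # delivered live to every streamer
            q.put(sample)


temperature = LiveChannel("temperature")
temperature.write(20.1)  # written before any streamer is open

streamer = temperature.open_streamer()
temperature.write(20.4)
temperature.write(20.9)

# The streamer sees only samples written after it was opened.
received = [streamer.get_nowait(), streamer.get_nowait()]
```

Samples written before the streamer was opened remain available in storage, which is where an iterator would read them from.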
Deletes
Deletes are used to remove samples from a cluster.