Glossary

This page defines central terms in the Tenzir ecosystem.

Web user interface to access the platform at app.tenzir.com.

The app is a web application that partially runs in the user’s browser. It is written in Svelte.

Maintains partition ownership and metadata.

The catalog is a component in the node that owns the partitions, keeps metadata about them, and maintains a set of sparse secondary indexes to identify relevant partitions for a given query. It offers a transactional interface for adding and removing partitions.
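As a rough illustration of how a sparse index can narrow a query down to relevant partitions, the following Python sketch keeps a time range per partition and returns only the partitions that overlap a queried range. The names `PartitionMeta`, `Catalog`, `add`, `remove`, and `candidates` are hypothetical and do not reflect the node's actual implementation; the sketch also leaves out the transactional aspect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionMeta:
    """Hypothetical per-partition metadata: an ID plus the time range covered."""
    partition_id: str
    min_ts: int  # earliest event timestamp in the partition
    max_ts: int  # latest event timestamp in the partition


class Catalog:
    """Toy catalog that owns partition metadata and prunes by time range."""

    def __init__(self):
        self._partitions = {}

    def add(self, meta):
        self._partitions[meta.partition_id] = meta

    def remove(self, partition_id):
        self._partitions.pop(partition_id, None)

    def candidates(self, start, end):
        """Return sorted IDs of partitions whose range overlaps [start, end]."""
        return sorted(
            p.partition_id
            for p in self._partitions.values()
            if p.min_ts <= end and p.max_ts >= start
        )


catalog = Catalog()
catalog.add(PartitionMeta("p1", 0, 99))
catalog.add(PartitionMeta("p2", 100, 199))
catalog.add(PartitionMeta("p3", 200, 299))
```

A query for the range 150 to 250 would only need to open `p2` and `p3`; the index is sparse in the sense that it stores one summary per partition rather than one entry per event.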

Manages chunks of raw bytes by interacting with a resource.

A connector is either a loader that acquires bytes from a resource, or a saver that sends bytes to a resource. Loaders are implemented as ordinary operators prefixed with load_* while savers are prefixed with save_*.
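The loader/saver split can be pictured with two small Python functions that move raw bytes to and from a file. This is only a conceptual sketch: the function names borrow the load_* and save_* prefixes from the text, but they are plain Python helpers written for illustration, not the operators themselves.

```python
import os
import tempfile


def load_file(path):
    """Loader sketch: acquire bytes from a resource (here, a local file)."""
    with open(path, "rb") as f:
        return f.read()


def save_file(path, data):
    """Saver sketch: send bytes to a resource (here, a local file)."""
    with open(path, "wb") as f:
        f.write(data)


# Round trip through a temporary file to show both directions.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "events.bin")
    save_file(path, b"raw bytes")
    loaded = load_file(path)
```

Neither function knows anything about events; interpreting the bytes is left to a format.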

A stateful object used for in-band enrichment.

Contexts come in various types, such as a lookup table, Bloom filter, and GeoIP database. They live inside a node and you can enrich with them in other pipelines.
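A lookup-table context can be approximated in Python as a dictionary that is consulted for every event passing through. The helper `enrich`, the field names, and the table contents below are made up for illustration and are not part of any Tenzir API.

```python
def enrich(event, table, key_field, into):
    """Return a copy of the event with the lookup result stored under `into`."""
    enriched = dict(event)
    enriched[into] = table.get(event.get(key_field))
    return enriched


# Hypothetical lookup table mapping IP addresses to annotations.
threat_intel = {"203.0.113.7": {"category": "scanner"}}

events = [
    {"src_ip": "203.0.113.7", "dst_port": 22},
    {"src_ip": "198.51.100.1", "dst_port": 443},
]

enriched = [enrich(e, threat_intel, "src_ip", "intel") for e in events]
```

The table is the state, and the enrichment happens in-band: each event is annotated as it flows by, and events without a match receive `None`.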

A pipeline ending with an output operator preceded by a subscribe input operator.

The indexed storage that pipelines can use at the node. Every node has a lightweight storage engine for importing and exporting events. You must mount the storage into the node so that pipelines can use it via the import and export operators. The storage engine comes with a catalog that tracks partitions and keeps sparse indexes to accelerate historical queries.

A record of typed data. Think of events as JSON objects, but with a richer type system that also has timestamps, durations, IP addresses, and more. Events have fields and can take numerous shapes; the types of an event's fields describe its shape (= the schema).
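To illustrate what a richer type system than JSON means in practice, here is a small Python sketch that models an event as a dictionary with native timestamp, duration, and IP address values. The field names and the `schema_of` helper are hypothetical; the sketch only approximates the idea of deriving a schema from field types.

```python
from datetime import datetime, timedelta, timezone
from ipaddress import ip_address

# An event with typed values that plain JSON could only carry as strings.
event = {
    "timestamp": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    "duration": timedelta(milliseconds=250),
    "src_ip": ip_address("10.0.0.1"),
    "bytes": 1024,
}


def schema_of(record):
    """Map each field name to the name of its value's type."""
    return {field: type(value).__name__ for field, value in record.items()}


schema = schema_of(event)
```

Because the values are typed, operations such as comparing timestamps or adding durations work directly, without re-parsing strings.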

Translates between bytes and events.

A format is supported by either a parser that converts bytes to events, or a printer that converts events to bytes. Example formats include JSON, CEF, and PCAP.
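The two directions of a format can be sketched in Python with newline-delimited JSON. The function names `parse_ndjson` and `print_ndjson` are invented for this example and are not Tenzir operators.

```python
import json


def parse_ndjson(data):
    """Parser sketch: bytes -> events, one JSON object per non-empty line."""
    lines = data.decode("utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


def print_ndjson(events):
    """Printer sketch: events -> bytes, one compact JSON object per line."""
    text = "".join(json.dumps(e, separators=(",", ":")) + "\n" for e in events)
    return text.encode("utf-8")


raw = b'{"id":1,"msg":"hello"}\n{"id":2,"msg":"world"}\n'
events = parse_ndjson(raw)
roundtrip = print_ndjson(events)
```

For this input, printing the parsed events reproduces the original bytes, which shows the two directions acting as inverses of each other.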

Computes something over a value in an event. Unlike operators that work on streams of events, functions can only act on single values.

Optional data structures for accelerating queries involving the node’s edge storage.

Tenzir features in-memory sparse indexes that point to partitions.

An operator that only produces data, without consuming anything.

A set of pipelines to integrate with a third-party product.

An integration describes use cases in combination with a specific product or tool. Depending on the depth of the integration, this may require configuration on either end.

A collection of packages.

Our community library is freely available on GitHub.

A connector that acquires bytes.

A loader is the dual to a saver. It has no input and only performs a side effect that acquires bytes. Use a loader implicitly with the from operator or explicitly with the load_* operators.

A host for pipelines and storage reachable over the network.

The tenzir-node binary starts a node in a dedicated server process that listens on TCP port 5158.

Runtime statistics about the node and pipeline execution.

The Open Cybersecurity Schema Framework (OCSF) is a cross-vendor schema for security event data. Our community library contains packages that map data sources to OCSF.

The building block of a pipeline.

An operator is an input, a transformation, or an output.
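The three operator kinds can be modeled with Python generators: an input only yields, a transformation consumes and yields, and an output only consumes. The functions `read_numbers`, `keep_even`, and `collect` are hypothetical stand-ins, not real operators.

```python
def read_numbers():
    """Input sketch: produces events without consuming any."""
    for n in range(5):
        yield {"n": n}


def keep_even(events):
    """Transformation sketch: consumes events and produces events."""
    return (e for e in events if e["n"] % 2 == 0)


def collect(events, sink):
    """Output sketch: consumes events without producing any."""
    sink.extend(events)


# Chain input -> transformation -> output.
sink = []
collect(keep_even(read_numbers()), sink)
```

Chaining the three calls mirrors how operators compose: the events flow lazily from the input through the transformation into the output.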

An operator consuming data, without producing anything.

The acronym PaC stands for Pipelines as Code. It is meant as an adaptation of Infrastructure as Code (IaC), where pipelines represent the (data) infrastructure that is provisioned as code.

A collection of pipelines and contexts.

A bytes-to-events operator.

A parser is the dual to a printer. You use a parser implicitly in the from operator, or explicitly via the read_* operators. There are also functions for applying parsers to string values.

The horizontal scaling unit of the storage attached to a node.

A partition contains the raw data and optionally a set of indexes. Supported formats are Parquet and Feather.

Combines a set of operators into a dataflow graph.

The control plane for nodes and pipelines, accessible at app.tenzir.com.

An events-to-bytes operator.

A format that translates events into bytes.

A printer is the dual to a parser. Use a printer implicitly in the to operator.

A connector that emits bytes.

A saver is the dual to a loader. It has no output and only performs a side effect that emits bytes. Use a saver implicitly with the to operator or explicitly with the save_* operators.

A top-level record type of an event.

A pipeline starting with an input operator followed by a publish output operator.
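The relationship between a publishing pipeline and a subscribing one can be pictured with an in-memory topic bus in Python. Everything here is an assumption made for illustration: the `topics` dictionary, the `publish` and `subscribe` helpers, and the topic name `alerts` are invented, and a real system would stream events continuously rather than hand over a snapshot.

```python
from collections import defaultdict

# Hypothetical in-memory topic bus: topic name -> list of published events.
topics = defaultdict(list)


def publish(topic, events):
    """Append all events to the named topic."""
    topics[topic].extend(events)


def subscribe(topic):
    """Return a snapshot of the events published to the topic so far."""
    return list(topics[topic])


# Publishing side: an input (the generator) followed by publish.
publish("alerts", ({"id": i} for i in range(3)))

# Subscribing side: subscribe followed by an output (the list).
delivered = []
delivered.extend(subscribe("alerts"))
```

The topic decouples the two pipelines: the publishing side does not need to know who consumes the events, and the subscribing side does not need to know where they came from.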

An acronym for Tenzir Query Language.

TQL is the language in which users write pipelines.

An operator that both consumes input and produces output.