Node
A node is a running process that manages and executes pipelines.
When a node starts, it automatically attempts to connect to the platform, which lets you manage and deploy pipelines through a web interface. Using the platform is optional: you can also manage pipelines manually via the node’s REST API.
Standalone vs. Managed Pipeline Execution
To understand the benefits of a node, let’s first consider how you can run
pipelines without one. You run a single pipeline directly from the command line
using the tenzir binary:
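As a minimal sketch of such an invocation: the pipeline definition is passed to the binary as a single quoted argument. The `version` pipeline here is only a placeholder chosen for illustration, and the snippet assumes that `tenzir` is installed and on your `PATH`.

```shell
# Placeholder pipeline (an assumption for illustration); substitute your own.
pipeline='version'

# Run the pipeline once in standalone mode if the binary is available.
if command -v tenzir >/dev/null 2>&1; then
  tenzir "$pipeline"
else
  echo "tenzir not found on PATH" >&2
fi
```

The process runs the one pipeline and exits when it completes; nothing keeps running afterwards.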
This standalone execution mode of a pipeline is ideal for ad-hoc data
transformations akin to how one would use jq, but with much broader data
handling capabilities.
Continuous, dependable data processing needs more than that: scheduled execution, automatic restarts, monitoring of warnings and errors, and more advanced execution capabilities, such as real-time enrichment with contextual data or correlation with historical data.
This is where a node comes into play, offering a vehicle to execute one or more
pipelines in a managed fashion. You can spawn a node with the tenzir-node
binary or by running the Docker container that contains this binary:
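The two options might look roughly like the sketch below. The function names are hypothetical helpers, and the Docker image name `tenzir/tenzir` and the `--entrypoint` override are assumptions about a typical setup, so check them against your installation.

```shell
# Start a node from a local installation.
# Assumes the tenzir-node binary is on PATH.
start_node_local() {
  tenzir-node
}

# Start a node inside a Docker container.
# Assumption: the image is tenzir/tenzir and the entrypoint must be
# overridden to select the tenzir-node binary.
start_node_docker() {
  docker run --rm -it --entrypoint=tenzir-node tenzir/tenzir
}
```

Call either `start_node_local` or `start_node_docker`. In both cases the node is a long-running process, so the command stays in the foreground until you stop it.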