# from

Obtains events from a URI, inferring the source, compression, and format.

```tql
from uri:string, [loader_args… { … }]
from events…
```
## Description

The `from` operator is an easy way to get data into Tenzir. It will try to infer the connector, compression, and format based on the given URI. Alternatively, it can be used to create events from records.
### uri: string

The URI to load from.
### loader_args… (optional)

An optional set of arguments passed to the loader. This can be used, e.g., to pass credentials to a connector:

```tql
from "https://example.org/file.json", headers={Token: "XYZ"}
```
### { … } (optional)

The optional pipeline argument allows for explicitly specifying how `from` decompresses and parses data. By default, the pipeline is inferred based on a set of rules. If inference is not possible or not sufficient, this argument can be used to control the decompression and parsing. Providing this pipeline disables the inference. See the Examples section below.
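For instance, a minimal sketch of bypassing inference (the file name is hypothetical):

```tql
from "path/to/data.log.zst" {
  decompress_zstd  // inference is disabled, so decompression must be explicit
  read_json        // likewise, the parser must be named explicitly
}
```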
### events…
Instead of a URI, you can also provide one or more records, which become the operator's output. This is mostly useful for testing pipelines without loading actual data.
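For example, a pipeline that emits two hand-written events (the values are made up):

```tql
from {x: 1}, {x: 2}
```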
## Explanation

Loading a resource into Tenzir consists of three steps:
- Loading the raw bytes
- Decompressing (optional)
- Reading the bytes as structured data
The `from` operator tries to infer all three steps from the given URI.
### Loading

The connector is inferred based on the URI `scheme://`. See the URI schemes section for supported schemes. If no scheme is present, the operator attempts to load from the filesystem.
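A sketch of this fallback, using a made-up local path:

```tql
// No scheme: the path is read from the local filesystem…
from "local/events.json"

// …which is equivalent to spelling out the file scheme:
from "file://local/events.json"
```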
### Decompressing

The compression is inferred from the file ending in the URI. Under the hood, this uses the `decompress_*` operators. Supported compressions can be found in the list of compression extensions.

The decompression step is optional and only happens if a compression could be inferred. If you know that the source is compressed but the compression cannot be inferred, you can use the pipeline argument to specify the decompression manually.
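For example, a sketch for an endpoint that serves gzip-compressed JSON without a telltale file ending (the URL is hypothetical):

```tql
from "https://example.org/export" {
  decompress_gzip  // the compression cannot be inferred from the URI
  read_json        // the explicit pipeline also disables format inference
}
```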
### Reading

The format to read is, just like the compression, inferred from the file ending. Supported file formats are the common file endings for our `read_*` operators.

If you want to provide additional arguments to the parser, you can use the pipeline argument to specify the parsing manually. This can be useful if you, e.g., know that the input is `suricata` or `ndjson` instead of just plain `json`.
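A small sketch of the NDJSON case, with a made-up path whose `.json` ending would otherwise select `read_json`:

```tql
from "path/to/events.json" {
  read_ndjson  // override the format inferred from the file ending
}
```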
### The pipeline argument & its relation to the loader

Some loaders, such as the `load_tcp` operator, accept a sub-pipeline directly. If the selected loader accepts a sub-pipeline, the `from` operator will dispatch decompression and parsing into that sub-pipeline. If an explicit pipeline argument is provided, it is forwarded as-is. If the loader does not accept a sub-pipeline, the decompression and parsing steps are simply performed as part of the regular pipeline.
#### Example transformation:

```tql
from "myfile.json.gz"
```

expands to

```tql
load_file "myfile.json.gz"
decompress_gzip
read_json
```
#### Example with pipeline argument:

```tql
from "tcp://0.0.0.0:12345", parallel=10 { read_gelf }
```

expands to

```tql
load_tcp "tcp://0.0.0.0:12345", parallel=10 { read_gelf }
```
## Supported Deductions

### URI schemes

| Scheme | Operator | Example |
| --- | --- | --- |
| `abfs`, `abfss` | `load_azure_blob_storage` | `from "abfs://path/to/file.json"` |
| `amqp` | `load_amqp` | `from "amqp://…"` |
| `elasticsearch` | `from_opensearch` | `from "elasticsearch://1.2.3.4:9200"` |
| `file` | `load_file` | `from "file://path/to/file.json"` |
| `fluent-bit` | `from_fluent_bit` | `from "fluent-bit://elasticsearch"` |
| `ftp`, `ftps` | `load_ftp` | `from "ftp://example.com/file.json"` |
| `gcps` | `load_google_cloud_pubsub` | `from "gcps://project_id/subscription_id" { … }` |
| `gs` | `load_gcs` | `from "gs://bucket/object.json"` |
| `http`, `https` | `load_http` | `from "http://example.com/file.json"` |
| `inproc` | `load_zmq` | `from "inproc://127.0.0.1:56789" { read_json }` |
| `kafka` | `load_kafka` | `from "kafka://topic" { read_json }` |
| `opensearch` | `from_opensearch` | `from "opensearch://1.2.3.4:9200"` |
| `s3` | `load_s3` | `from "s3://bucket/file.json"` |
| `sqs` | `load_sqs` | `from "sqs://my-queue" { read_json }` |
| `tcp` | `load_tcp` | `from "tcp://127.0.0.1:13245" { read_json }` |
| `udp` | `load_udp` | `from "udp://127.0.0.1:56789" { read_json }` |
| `zmq` | `load_zmq` | `from "zmq://127.0.0.1:56789" { read_json }` |
Please see the respective operator pages for details on the URI’s locator format.
### File extensions

#### Format

The `from` operator can deduce the file format based on these file endings:

| Format | File Endings | Operator |
| --- | --- | --- |
| CSV | `.csv` | `read_csv` |
| Feather | `.feather`, `.arrow` | `read_feather` |
| JSON | `.json` | `read_json` |
| NDJSON | `.ndjson`, `.jsonl` | `read_ndjson` |
| Parquet | `.parquet` | `read_parquet` |
| Pcap | `.pcap` | `read_pcap` |
| SSV | `.ssv` | `read_ssv` |
| TSV | `.tsv` | `read_tsv` |
| YAML | `.yaml` | `read_yaml` |
#### Compression

The `from` operator can deduce the following compressions based on these file endings:

| Compression | File Endings |
| --- | --- |
| Brotli | `.br`, `.brotli` |
| Bzip2 | `.bz2` |
| Gzip | `.gz`, `.gzip` |
| LZ4 | `.lz4` |
| Zstd | `.zst`, `.zstd` |
## Examples

### Load a local file

```tql
from "path/to/my/load/file.csv"
```
### Load a compressed file

```tql
from "path/to/my/load/file.json.bz2"
```
### Load a file with parser arguments

Provide an explicit header to the CSV parser:

```tql
from "path/to/my/load/file.csv.bz2" {
  decompress_bz2 // this is now necessary due to the pipeline argument
  read_csv header="col1,col2,col3"
}
```
### Pick a more suitable parser

The file `eve.json` contains Suricata logs, but the `from` operator does not know this. We provide an explicit `read_suricata` instead:

```tql
from "path/to/my/load/eve.json" {
  read_suricata
}
```
### Load from HTTP with a header

```tql
from "https://example.org/file.json", headers={Token: "1234"}
```
### Create events from records

```tql
from {message: "Value", endpoint: {ip: 127.0.0.1, port: 42}},
     {message: "Value", endpoint: {ip: 127.0.0.1, port: 42}, raw: "text"},
     {message: "Value", endpoint: null}
```

Output:

```tql
{message: "Value", endpoint: {ip: 127.0.0.1, port: 42}}
{message: "Value", endpoint: {ip: 127.0.0.1, port: 42}, raw: "text"}
{message: "Value", endpoint: null}
```