Shape data

Tenzir comes with numerous transformation operators that change the shape of their input and produce a new output. Here is a visual overview of the transformations that you can perform on a data frame:

[Figure: overview of shaping operators and what they do: where (filter events), head (first N), tail (last N), select and drop (project fields), taste (sample schemas), set (add fields, mutate values, rename fields), summarize (aggregate), sort (reorder), deduplicate (remove dupes), reverse (flip order).]

We’ll walk through examples for the depicted operators. Each example creates its input events inline with from, so you can run the pipelines without importing any data first.

Filter events with where

Use where to filter events in the input with an expression:

from {x: 1, y: "foo"},
{x: 2, y: "bar"},
{x: 3, y: "baz"}
where x != 2 and y.starts_with("b")
{x: 3, y: "baz"}

Slice events with head, tail, and slice

Use the head and tail operators to get the first or last N records of the input.

Get the first event:

from {x: 1, y: "foo"},
{x: 2, y: "bar"},
{x: 3, y: "baz"}
head 1
{x: 1, y: "foo"}

Get the last two events:

from {x: 1, y: "foo"},
{x: 2, y: "bar"},
{x: 3, y: "baz"}
tail 2
{x: 2, y: "bar"}
{x: 3, y: "baz"}

The slice operator generalizes head and tail by allowing for more flexible slicing. For example, to skip the first three events and return every other event from there on:

from {x: 1, y: "foo"},
{x: 2, y: "bar"},
{x: 3, y: "baz"},
{x: 4, y: "qux"},
{x: 5, y: "corge"},
{x: 6, y: "grault"}
slice begin=3, stride=2
{x: 4, y: "qux"}
{x: 6, y: "grault"}
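
As a further sketch, slice can also pick a contiguous window out of the input. This example assumes that slice accepts an end argument next to begin, and that both are zero-based offsets with end being exclusive; consult the slice operator reference if your version behaves differently.

```tql
from {x: 1, y: "foo"},
{x: 2, y: "bar"},
{x: 3, y: "baz"},
{x: 4, y: "qux"}
// Assumption: begin and end are zero-based, and end is exclusive.
slice begin=1, end=3
```

Under these assumptions, the pipeline returns the second and third event:

```tql
{x: 2, y: "bar"}
{x: 3, y: "baz"}
```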

Pick fields with select and drop

Use the select operator to pick fields:

from {x: 1, y: "foo"},
{x: 2, y: "bar"},
{x: 3, y: "baz"}
select x
{x: 1}
{x: 2}
{x: 3}

The drop operator is the dual to select and removes the specified fields:

from {x: 1, y: "foo"},
{x: 2, y: "bar"},
{x: 3, y: "baz"}
drop x
{y: "foo"}
{y: "bar"}
{y: "baz"}
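
Operators compose, so you can filter and project in a single pipeline. The following is a minimal sketch that only combines operators introduced above; the comparison x >= 2 is chosen purely for illustration.

```tql
from {x: 1, y: "foo"},
{x: 2, y: "bar"},
{x: 3, y: "baz"}
// First keep only the events that match the expression ...
where x >= 2
// ... then keep only the field y of the remaining events.
select y
```

This should yield:

```tql
{y: "bar"}
{y: "baz"}
```

The order matters here: placing select y before where would remove x, and the filter would no longer have a field to compare against.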

Sample schemas with taste

The taste operator provides a sample of the first N events of every unique schema. For example, to get one sample event for each of the three schemas in the input:

from {x: 1, y: "foo"},
{x: 2, y: "bar"},
{x: 1},
{x: 2},
{y: "foo"}
taste 1
{x: 1, y: "foo"}
{x: 1}
{y: "foo"}

Add and rename fields with set assignment

Use the set operator to add new fields to the output.

from {x: 1},
{x: 2}
set y = x + 1
{x: 1, y: 2}
{x: 2, y: 3}
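
Assigning to a field that already exists overwrites its value instead of adding a new field. A small sketch using the same input:

```tql
from {x: 1},
{x: 2}
// x already exists, so its value is replaced rather than a new field added.
set x = x * 2
```

This should yield:

```tql
{x: 2}
{x: 4}
```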

Rename fields by combining set with drop:

from {x: 1},
{x: 2}
set y=x
drop x
{y: 1}
{y: 2}

Similarly, you can rename and project at the same time with select:

from {x: 1, y: "foo"},
{x: 2, y: "bar"}
select y=x
{y: 1}
{y: 2}

Aggregate events with summarize

Use summarize to group and aggregate data. The following pipeline groups events by y and computes the sum of x within each group:

from {x: 0, y: 0, z: 1},
{x: 1, y: 1, z: 2},
{x: 1, y: 1, z: 3}
summarize y, x=sum(x)
{y: 0, x: 0}
{y: 1, x: 2}

A variety of aggregation functions make it possible to combine grouped data.
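
As an illustration, a single summarize can compute several aggregates per group. This sketch assumes that count() is among the available aggregation functions; the output field names n and total are arbitrary choices for this example. The trailing sort makes the order of the groups deterministic.

```tql
from {x: 0, y: 0, z: 1},
{x: 1, y: 1, z: 2},
{x: 1, y: 1, z: 3}
// Group by y, then count the events and add up z within each group.
// Assumption: count() is available as an aggregation function.
summarize y, n=count(), total=sum(z)
// Order the groups by their key.
sort y
```

Under that assumption, the result is:

```tql
{y: 0, n: 1, total: 1}
{y: 1, n: 2, total: 5}
```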

Reorder events with sort

Use sort to order the output events by the values of a field:

from {x: 2, y: "bar"},
{x: 3, y: "baz"},
{x: 1, y: "foo"}
sort -x
{x: 3, y: "baz"}
{x: 2, y: "bar"}
{x: 1, y: "foo"}

Prefixing the field with - reverses the sort order, so that events come out in descending instead of ascending order.
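
For comparison, here is the same input sorted without the - prefix, which orders the events by ascending x:

```tql
from {x: 2, y: "bar"},
{x: 3, y: "baz"},
{x: 1, y: "foo"}
// No - prefix: sort by ascending x.
sort x
```

This should yield:

```tql
{x: 1, y: "foo"}
{x: 2, y: "bar"}
{x: 3, y: "baz"}
```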