Skip to content

Working With Data

Lyft Data treats each event as a JSON object. Text-first inputs typically wrap payloads in _raw (or another configured field), and actions mutate the event in place. This guide shows common patterns for turning raw streams into structured fields and preparing payloads for downstream systems.

JSON as the baseline

  • For text inputs, the raw payload usually lands in _raw.
  • Prefer json: true on inputs that already emit JSON so fields are available without an extra parsing step.
  • The exec input can place stdout/stderr/exit status into dedicated fields via the result block.

Example: capture a command’s stdout in uptime_raw:

input:
exec:
command: uptime
result:
stdout-field: uptime_raw

Extract unstructured text with extract

Use extract when a regular expression is the fastest path to structure. Captured groups become fields (either by name or by output-fields order).

actions:
- extract:
input-field: uptime_raw
pattern: 'load average: (\S+), (\S+), (\S+)'
output-fields:
- load1
- load5
- load15
remove: true
drop: false

By default, extract emits warnings when the pattern does not match. Set suppress-warnings: true to silence them, or drop: true to stop non-matching events.

Convert values and units

Convert extracted strings into typed values:

actions:
- convert:
conversions:
- field: load1
conversion: num
- field: load5
conversion: num
- field: load15
conversion: num

Unit conversions work the same way:

actions:
- convert:
units:
- field: latency
from: ms
to: s

If conversions may see empty values, pick a null-behaviour (keep or default) to avoid surprises.

Parse structured text

CSV

Parse CSV text into JSON fields with the csv action:

actions:
- csv:
input-field: _raw
header: true
delim: ','
fields:
port: num
throughput: num
remove: true

Key/value logs

Parse k=v style payloads with key-value:

actions:
- key-value:
input-field: _raw
delim: ' '
key-value-delim: '='
autoconvert: true
remove: true

If the producer repeats keys, pick a strategy with multiple: array (collect all values), first, or last.

Split one record into many with expand-events

Use expand-events to fan out embedded arrays into multiple events:

actions:
- expand-events:
input-field: results
output-split-field: result
skip-list:
- /metadata
remove: true

Parse embedded JSON strings

If a field contains JSON as a string, use the json action:

actions:
- json:
input-field: payload
remove: true

Preparing non-JSON output

Outputs are JSON-first, but you can send custom HTTP bodies with http-post:

  • Build a payload field using add (strings support ${} runtime expansions).
  • Send that field via http-post.body.field.
actions:
- add:
output-fields:
payload: |
token={{secrets.pushover_token}}&user={{secrets.pushover_user_key}}
message=Hello from LyftData! host=${host||unknown}
output:
http-post:
url: https://api.example.com/ingest
headers:
Content-Type: application/x-www-form-urlencoded
body:
field:
field: payload

Handy toggles and safeguards

  • remove: delete source fields once parsing succeeds.
  • suppress-warnings: silence parsing warnings when a data source is noisy.
  • drop (extract): stop events when validation fails.