Job Inputs

Inputs determine how events enter a job. Every job selects exactly one input and can optionally attach a trigger to control cadence. The visual editor exposes the most common settings; the DSL reference covers the full schema for advanced scenarios.

Supported inputs

| Input | Trigger style | Common uses | Notes |
|---|---|---|---|
| azure-blob | Scheduled poll | Ingest blobs from Azure storage accounts. | Supports wildcard object selection and incremental checkpoints. |
| echo | Manual or triggerless | Seed jobs with inline sample data. | Handy for quick tests and template jobs. |
| exec | Scheduled poll | Run a command and capture STDOUT as events. | Ship logs from legacy tools without building wrappers. |
| file-store | Scheduled poll | Read from managed FileStore buckets. | Best for hybrid deployments that mirror data into the control plane. |
| files | Continuous | Tail directories or batch uploaded files. | Detects new files and streams them line by line. |
| gcs | Scheduled poll | Pull objects from Google Cloud Storage. | Shares batching semantics with S3 and Azure connectors. |
| http-poll | Scheduled poll | Call REST APIs on a cadence. | Configure headers, query params, and request bodies; supports retries and backoff. |
| http-server | Event-driven | Accept inbound webhooks. | Runs an embedded HTTP listener on the worker. |
| internal-messages | Event-driven | React to platform events or job-emitted messages. | Filter by kind, source, type, job, or tag. |
| s3 | Scheduled poll | Read from Amazon S3 buckets. | Handles compressed objects and streams large files. |
| windows-event-log | Event-driven | Capture Windows system events. | Available on Windows workers only. |
| worker-channel | Event-driven | Chain jobs together inside a worker. | Consumes messages emitted by upstream jobs in the same worker. |

If a connector does not appear in the UI, it may be gated behind a feature flag or licensing tier; contact Lyft Data support to enable it.

Scheduling and triggers

Inputs that poll external systems expose a Trigger block. Choose between cron expressions, fixed intervals, or internal message triggers. The scheduler guarantees that each job run receives a distinct window, and you can offset or jitter schedules to avoid thundering herds. Continuous inputs such as files, http-server, and worker-channel ignore the Trigger block and stay active for as long as the job runtime is running.
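As a rough illustration of why offsets and jitter help, here is a minimal sketch in plain Python (not the platform's scheduler; the function and parameter names are hypothetical) that spreads poll runs of many jobs sharing the same interval:

```python
import random
import time

def next_run(base_interval_s: float, offset_s: float = 0.0, jitter_s: float = 0.0) -> float:
    """Compute the next poll time: a fixed interval, shifted by a per-job offset
    plus a small random jitter so jobs on the same cadence do not all fire at once."""
    return time.time() + base_interval_s + offset_s + random.uniform(0.0, jitter_s)

# Example: 100 jobs polling every 60 s, each with a random offset and up to 5 s of
# jitter, land spread across the whole minute instead of all at second zero.
schedules = [next_run(60.0, offset_s=random.uniform(0, 60), jitter_s=5.0) for _ in range(100)]
```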

Parsing options

Most inputs share two common parsing switches; the sketch after this list shows how they interact:

  • JSON: Treat payloads as JSON documents so fields are available without extra parsing. Leave disabled when ingesting free-form text; the runtime will wrap the payload in _raw instead.
  • Ignore line breaks: Combine the entire payload into a single event. Enable when a single response spans multiple lines (for example, pretty-printed JSON from an HTTP API).
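To make the two switches concrete, here is a minimal sketch of the resulting events, assuming a simplified runtime; the function name and signature are illustrative, not the platform's actual parsing code:

```python
import json

def to_events(payload: str, parse_json: bool = False, ignore_line_breaks: bool = False) -> list[dict]:
    """Turn a raw input payload into one or more events, mimicking the two switches."""
    # Ignore line breaks: the whole payload becomes a single event;
    # otherwise each line is its own event.
    chunks = [payload] if ignore_line_breaks else payload.splitlines()

    events = []
    for chunk in chunks:
        if parse_json:
            # JSON enabled: fields are available on the event without extra parsing.
            events.append(json.loads(chunk))
        else:
            # JSON disabled: free-form text is wrapped in _raw.
            events.append({"_raw": chunk})
    return events

# Pretty-printed JSON from an HTTP API spans multiple lines, so enable both switches:
pretty = '{\n  "status": "ok",\n  "items": 3\n}'
print(to_events(pretty, parse_json=True, ignore_line_breaks=True))
# -> [{'status': 'ok', 'items': 3}]
```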

Document-oriented connectors (exec, http-poll, the object stores) group the events produced during one fetch into a document. Downstream actions can reference document metadata or preserve the grouping by using output batching in Document mode.
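As a hedged sketch of what that grouping might look like (the field names here are illustrative, not the platform's document schema), events from one fetch carry shared metadata that downstream actions could reference:

```python
import uuid
from datetime import datetime, timezone

def wrap_as_document(events: list[dict], source: str) -> dict:
    """Group the events produced by one fetch under shared document metadata."""
    return {
        "document": {
            "id": str(uuid.uuid4()),  # one id per fetch
            "source": source,         # e.g. the polled URL or object key
            "fetched_at": datetime.now(timezone.utc).isoformat(),
            "event_count": len(events),
        },
        "events": events,
    }

doc = wrap_as_document([{"status": "ok"}, {"status": "ok"}], source="https://api.example.com/health")
```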

Reliability controls

Inputs that reach out to external services include retry settings. Configure max_attempts and backoff to tune resilience without overwhelming the upstream system. Backoff doubles on each retry up to a 15-second ceiling. Pair these settings with monitoring alerts so you know when a connector is repeatedly failing.
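With an assumed initial delay of one second (not specified above), the doubling rule yields waits of 1 s, 2 s, 4 s, 8 s, and then 15 s for every subsequent retry. A minimal sketch of that retry loop, with hypothetical parameter names:

```python
import time

def fetch_with_retries(fetch, max_attempts: int = 5, initial_backoff_s: float = 1.0):
    """Call fetch(), retrying on failure with a doubling delay capped at 15 seconds."""
    backoff = initial_backoff_s
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                raise  # give up; this is where a monitoring alert should fire
            time.sleep(backoff)
            backoff = min(backoff * 2, 15.0)  # double on each retry, 15-second ceiling
```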

For field-level schemas and advanced options, consult the DSL input reference.