# Job Inputs
Inputs determine how events enter a job. Every job selects exactly one input and can optionally attach a trigger to control cadence. The visual editor exposes the most common settings; the DSL reference covers the full schema for advanced scenarios.
## Supported inputs
| Input | Trigger style | Common uses | Notes |
|---|---|---|---|
| `azure-blob` | Scheduled poll | Ingest blobs from Azure storage accounts. | Supports wildcard object selection and incremental checkpoints. |
| `echo` | Manual or triggerless | Seed jobs with inline sample data. | Handy for quick tests and template jobs. |
| `exec` | Scheduled poll | Run a command and capture STDOUT as events. | Ship logs from legacy tools without building wrappers. |
| `file-store` | Scheduled poll | Read from managed FileStore buckets. | Best for hybrid deployments that mirror data into the control plane. |
| `files` | Continuous | Tail directories or batch uploaded files. | Detects new files and streams them line by line. |
| `gcs` | Scheduled poll | Pull objects from Google Cloud Storage. | Shares batching semantics with the S3 and Azure connectors. |
| `http-poll` | Scheduled poll | Call REST APIs on a cadence. | Configure headers, query params, and request bodies; supports retries and timeouts. |
| `http-server` | Event-driven | Accept inbound webhooks. | Runs an embedded HTTP listener on the worker. |
| `internal-messages` | Event-driven | React to platform events or job-emitted messages. | Filter by kind, source, type, job, or tag. |
| `s3` | Scheduled poll | Read from Amazon S3 buckets. | Handles compressed objects and streams large files. |
| `windows-event-log` | Event-driven | Capture Windows system events. | Available on Windows workers only. |
| `worker-channel` | Event-driven | Chain jobs together inside a worker. | Consumes messages emitted by upstream jobs in the same worker. |
Some inputs are gated by feature flags. If a connector does not appear in the UI, check your licensing or contact Lyft Data support.
## Scheduling and triggers
Inputs that poll external systems expose a Trigger block. Choose between cron expressions, fixed intervals, or internal message triggers. The scheduler guarantees that each job run receives a distinct window, and you can offset or jitter schedules to avoid thundering herds. Continuous inputs such as `files`, `http-server`, and `worker-channel` ignore the trigger block; they stay active for as long as the job runtime is running.
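The platform's scheduler internals are not documented here, but the offset-and-jitter idea can be sketched in a few lines. Everything below (function and parameter names included) is illustrative, not part of the product API:

```python
import random

def next_run(base_epoch: float, interval_s: float,
             offset_s: float = 0.0, jitter_s: float = 0.0) -> float:
    """Compute the next poll time for a fixed-interval trigger.

    A static offset staggers jobs relative to each other, and a random
    jitter spreads jobs that share the same cadence, so many pollers do
    not all hit the upstream system at the same instant.
    """
    return base_epoch + interval_s + offset_s + random.uniform(0.0, jitter_s)

# Two jobs polling every 60s: jitter spreads their start times across
# a 5-second window instead of firing them simultaneously.
t_a = next_run(0.0, 60.0, jitter_s=5.0)
t_b = next_run(0.0, 60.0, jitter_s=5.0)
```

Even a small jitter window is usually enough to break up a thundering herd when many jobs share one cron expression.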
## Parsing options
Most inputs share two common switches:
- JSON: Treat payloads as JSON documents so fields are available without extra parsing. Leave disabled when ingesting free-form text; the runtime will wrap the payload in `_raw` instead.
- Ignore line breaks: Combine the entire payload into a single event. Enable when a single response spans multiple lines (for example, pretty-printed JSON from an HTTP API).
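The effect of the JSON switch can be sketched as follows. The helper is hypothetical; only the `_raw` field name comes from the documentation above:

```python
import json

def build_event(payload: str, parse_json: bool) -> dict:
    """Illustrative only: the two parsing modes described above.

    With the JSON switch on, the payload's fields become directly
    addressable; with it off, the payload is kept verbatim under a
    single _raw field.
    """
    if parse_json:
        return json.loads(payload)
    return {"_raw": payload}

build_event('{"level": "info", "msg": "ok"}', parse_json=True)
# → {'level': 'info', 'msg': 'ok'}
build_event('plain text line', parse_json=False)
# → {'_raw': 'plain text line'}
```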
Document-oriented connectors (`exec`, `http-poll`, and the object stores) group the events produced during one fetch into a document. Downstream actions can reference document metadata or preserve the grouping by using output batching in Document mode.
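As a rough mental model (the class and field names here are hypothetical, not the platform's schema), one fetch yields one document that carries its events plus shared metadata:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """Hypothetical model: one fetch's events plus metadata that
    downstream actions can reference."""
    source: str
    events: list = field(default_factory=list)

def fetch_as_document(source: str, lines: list) -> Document:
    # Every event produced by a single fetch lands in the same
    # document, preserving the grouping for Document-mode batching.
    return Document(source=source, events=[{"_raw": ln} for ln in lines])

doc = fetch_as_document("s3://bucket/key", ["line a", "line b"])
```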
## Reliability controls
Inputs that reach out to external services include a `retry` block. Use:
- `retry.timeout`: per-attempt timeout (defaults to 30s).
- `retry.retries`: number of retry attempts (omit for unlimited).
Most connectors apply a bounded backoff internally between attempts; keep retries low to avoid overwhelming an unhealthy upstream system.
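The connectors' internal backoff is not specified in detail, but the per-attempt timeout plus bounded-backoff pattern might look like this sketch (function names and the exponential schedule are assumptions, not the product's implementation):

```python
import time

def call_with_retries(fn, timeout_s: float = 30.0, retries: int = 3,
                      base_backoff_s: float = 1.0, max_backoff_s: float = 30.0):
    """Sketch of the semantics above: a per-attempt timeout
    (retry.timeout), a bounded attempt count (retry.retries), and a
    capped exponential backoff between attempts."""
    last_exc = None
    for attempt in range(retries + 1):
        try:
            return fn(timeout=timeout_s)
        except Exception as exc:  # real connectors narrow this to transient errors
            last_exc = exc
            if attempt < retries:
                # bounded backoff: base, 2x, 4x, ... capped at max_backoff_s
                time.sleep(min(base_backoff_s * 2 ** attempt, max_backoff_s))
    raise last_exc

# Hypothetical flaky upstream: fails twice, then succeeds.
attempts = {"count": 0}
def flaky(timeout):
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient upstream error")
    return "ok"

result = call_with_retries(flaky, retries=3, base_backoff_s=0.01)
# succeeds on the third attempt after two short backoffs
```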
For field-level schemas and advanced options, consult the DSL input reference.