# Job Inputs
Inputs determine how events enter a job. Every job selects exactly one input and can optionally attach a trigger to control cadence. The visual editor exposes the most common settings; the DSL reference covers the full schema for advanced scenarios.
## Supported inputs
| Input | Trigger style | Common uses | Notes |
|---|---|---|---|
| `azure-blob` | Scheduled poll | Ingest blobs from Azure storage accounts. | Supports wildcard object selection and incremental checkpoints. |
| `echo` | Manual or triggerless | Seed jobs with inline sample data. | Handy for quick tests and template jobs. |
| `exec` | Scheduled poll | Run a command and capture STDOUT as events. | Ship logs from legacy tools without building wrappers. |
| `file-store` | Scheduled poll | Read from managed FileStore buckets. | Best for hybrid deployments that mirror data into the control plane. |
| `files` | Continuous | Tail directories or batch uploaded files. | Detects new files and streams them line by line. |
| `gcs` | Scheduled poll | Pull objects from Google Cloud Storage. | Shares batching semantics with the S3 and Azure connectors. |
| `http-poll` | Scheduled poll | Call REST APIs on a cadence. | Configure headers, query params, and request bodies; supports retries and backoff. |
| `http-server` | Event-driven | Accept inbound webhooks. | Runs an embedded HTTP listener on the worker. |
| `internal-messages` | Event-driven | React to platform events or job-emitted messages. | Filter by kind, source, type, job, or tag. |
| `s3` | Scheduled poll | Read from Amazon S3 buckets. | Handles compressed objects and streams large files. |
| `windows-event-log` | Event-driven | Capture Windows system events. | Available on Windows workers only. |
| `worker-channel` | Event-driven | Chain jobs together inside a worker. | Consumes messages emitted by upstream jobs in the same worker. |
If you do not see a specific connector in the UI, check licensing or contact Lyft Data support; some inputs are gated by feature flags.
## Scheduling and triggers
Inputs that poll external systems expose a Trigger block. Choose between cron expressions, fixed intervals, or internal message triggers. The scheduler guarantees that each job run receives a distinct window, and you can offset or jitter schedules to avoid thundering herds. Continuous inputs such as `files`, `http-server`, and `worker-channel` ignore the Trigger block and stay active as long as the job runtime is running.
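The jitter idea can be sketched in Python. The function below is illustrative only, hashing a job ID to derive a stable offset; it is not the scheduler's actual implementation:

```python
import hashlib

def jittered_interval(job_id: str, interval_s: int, max_jitter_s: int = 30) -> int:
    """Add a deterministic per-job offset (0..max_jitter_s-1 seconds) to a
    polling interval so jobs sharing the same cadence fire at staggered times."""
    digest = hashlib.sha256(job_id.encode("utf-8")).digest()
    offset = int.from_bytes(digest[:4], "big") % max_jitter_s
    return interval_s + offset
```

Because the offset is derived from the job ID rather than a random draw, each job's schedule stays stable across restarts while still spreading load across jobs.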
## Parsing options
Most inputs share two common switches:
- JSON: Treat payloads as JSON documents so fields are available without extra parsing. Leave disabled when ingesting free-form text; the runtime will wrap the payload in `_raw` instead.
- Ignore line breaks: Combine the entire payload into a single event. Enable when a single response spans multiple lines (for example, pretty-printed JSON from an HTTP API).
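A minimal Python sketch of how these two switches interact; the function is hypothetical, with only the `_raw` field name taken from the behavior described above:

```python
import json

def to_events(payload: str, json_mode: bool, ignore_line_breaks: bool) -> list[dict]:
    """Split a payload into events, then parse each one.

    ignore_line_breaks=True keeps the whole payload as one event;
    json_mode=False wraps each chunk of text under _raw instead of parsing it."""
    chunks = [payload] if ignore_line_breaks else payload.splitlines()
    if json_mode:
        return [json.loads(chunk) for chunk in chunks]
    return [{"_raw": chunk} for chunk in chunks]
```

Note that a pretty-printed JSON response only parses cleanly when both switches are enabled: with line splitting on, each fragment of the document would fail to parse on its own.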
Document-oriented connectors (`exec`, `http-poll`, and the object stores) group the events produced during one fetch into a document. Downstream actions can reference document metadata or preserve the grouping by using output batching in Document mode.
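As a sketch of what that grouping looks like, the shape below is hypothetical; consult the DSL reference for the real document schema:

```python
def group_into_document(source: str, events: list[dict]) -> dict:
    """Bundle the events from one fetch under shared document metadata
    so downstream actions can reference or preserve the grouping."""
    return {
        "document": {"source": source, "event_count": len(events)},
        "events": events,
    }
```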
## Reliability controls
Inputs that reach out to external services include retry settings. Configure `max_attempts` and `backoff` to tune resilience without overwhelming the upstream system. The backoff delay doubles on each retry, up to a 15-second ceiling. Pair these settings with monitoring alerts so you know when a connector is failing repeatedly.
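The delay sequence can be sketched as follows. The 1-second starting delay is an assumption; the doubling and the 15-second cap come from the behavior described above:

```python
def backoff_delays(max_attempts: int, base_s: float = 1.0, ceiling_s: float = 15.0) -> list[float]:
    """Seconds to wait before each retry: the delay doubles per retry and
    is capped at ceiling_s. The first attempt happens with no delay."""
    delays, delay = [], base_s
    for _ in range(max_attempts - 1):
        delays.append(min(delay, ceiling_s))
        delay *= 2
    return delays
```

With `max_attempts=6` and a 1-second base, the retries wait 1, 2, 4, 8, and 15 seconds, so the ceiling bounds per-retry wait even for high attempt counts.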
For field-level schemas and advanced options, consult the DSL input reference.