# Job Inputs
Inputs determine how events enter a job. Every job selects exactly one input and can optionally attach a trigger to control cadence. The visual editor exposes the most common settings; the DSL reference covers the full schema for advanced scenarios.
## Supported inputs
| Input | Trigger style | Common uses | Notes |
|---|---|---|---|
| `azure-blob` | Scheduled poll | Ingest blobs from Azure storage accounts. | Supports wildcard object selection and incremental checkpoints. |
| `echo` | Manual or triggerless | Seed jobs with inline sample data. | Handy for quick tests and template jobs. |
| `exec` | Scheduled poll | Run a command and capture STDOUT as events. | Ship logs from legacy tools without building wrappers. |
| `file-store` | Scheduled poll | Read from managed FileStore buckets. | Best for hybrid deployments that mirror data into the control plane. |
| `files` | Continuous | Tail directories or batch uploaded files. | Detects new files and streams them line by line. |
| `gcs` | Scheduled poll | Pull objects from Google Cloud Storage. | Shares batching semantics with the S3 and Azure connectors. |
| `http-poll` | Scheduled poll | Call REST APIs on a cadence. | Configure headers, query params, and request bodies; supports retries and timeouts. |
| `http-server` | Event-driven | Accept inbound webhooks. | Runs an embedded HTTP listener on the worker. |
| `internal-messages` | Event-driven | React to platform events or job-emitted messages. | Filter by kind, source, type, job, or tag. |
| `s3` | Scheduled poll | Read from Amazon S3 buckets. | Handles compressed objects and streams large files. |
| `windows-event-log` | Event-driven | Capture Windows system events. | Available on Windows workers only. |
| `worker-channel` | Event-driven | Chain jobs together inside a worker. | Consumes messages emitted by upstream jobs in the same worker. |
Some inputs are gated by feature flags. If a connector does not appear in the UI, check your licensing or contact Lyft Data support.
## Scheduling and triggers
Inputs that poll external systems expose a Trigger block. Choose between cron expressions, fixed intervals, or internal message triggers. The scheduler guarantees that each job run receives a distinct window, and you can offset or jitter schedules to avoid thundering herds. Continuous inputs such as `files`, `http-server`, and `worker-channel` ignore the trigger block; they stay active for as long as the job runtime is running.
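The platform's scheduler internals are not documented here, but the offset-and-jitter idea can be sketched in a few lines. Everything below (function and parameter names included) is illustrative, not part of the product API:

```python
import random

def next_run(base_epoch: float, interval_s: float,
             offset_s: float = 0.0, jitter_s: float = 0.0) -> float:
    """Compute the next poll time for a fixed-interval trigger.

    A static offset staggers jobs relative to each other, and a random
    jitter spreads jobs that share the same cadence, so many pollers do
    not all hit the upstream system at the same instant.
    """
    return base_epoch + interval_s + offset_s + random.uniform(0.0, jitter_s)

# Two jobs polling every 60s: jitter spreads their start times across
# a 5-second window instead of firing them simultaneously.
t_a = next_run(0.0, 60.0, jitter_s=5.0)
t_b = next_run(0.0, 60.0, jitter_s=5.0)
```

Even a small jitter window is usually enough to break up a thundering herd when many jobs share one cron expression.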
## Parsing options
Most inputs share two common switches:
- JSON: Treat payloads as JSON documents so fields are available without extra parsing. Leave disabled when ingesting free-form text; the runtime will wrap the payload in `_raw` instead.
- Ignore line breaks: Combine the entire payload into a single event. Enable when a single response spans multiple lines (for example, pretty-printed JSON from an HTTP API).
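The effect of the JSON switch can be sketched as follows. The helper is hypothetical; only the `_raw` field name comes from the documentation above:

```python
import json

def build_event(payload: str, parse_json: bool) -> dict:
    """Illustrative only: the two parsing modes described above.

    With the JSON switch on, the payload's fields become directly
    addressable; with it off, the payload is kept verbatim under a
    single _raw field.
    """
    if parse_json:
        return json.loads(payload)
    return {"_raw": payload}

build_event('{"level": "info", "msg": "ok"}', parse_json=True)
# → {'level': 'info', 'msg': 'ok'}
build_event('plain text line', parse_json=False)
# → {'_raw': 'plain text line'}
```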
Document-oriented connectors (`exec`, `http-poll`, and the object stores) group the events produced during one fetch into a document. Downstream actions can reference document metadata or preserve the grouping by using output batching in Document mode.
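As a rough mental model (the class and field names here are hypothetical, not the platform's schema), one fetch yields one document that carries its events plus shared metadata:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """Hypothetical model: one fetch's events plus metadata that
    downstream actions can reference."""
    source: str
    events: list = field(default_factory=list)

def fetch_as_document(source: str, lines: list) -> Document:
    # Every event produced by a single fetch lands in the same
    # document, preserving the grouping for Document-mode batching.
    return Document(source=source, events=[{"_raw": ln} for ln in lines])

doc = fetch_as_document("s3://bucket/key", ["line a", "line b"])
```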
## Reliability controls
Inputs that reach out to external services include a `retry` block. Use:
- `retry.timeout`: per-attempt timeout (defaults to 30s).
- `retry.retries`: number of retry attempts (omit for unlimited).
Most connectors apply a bounded backoff internally between attempts; keep retries low to avoid overwhelming an unhealthy upstream system.
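The connectors' internal backoff is not specified in detail, but the per-attempt timeout plus bounded-backoff pattern might look like this sketch (function names and the exponential schedule are assumptions, not the product's implementation):

```python
import time

def call_with_retries(fn, timeout_s: float = 30.0, retries: int = 3,
                      base_backoff_s: float = 1.0, max_backoff_s: float = 30.0):
    """Sketch of the semantics above: a per-attempt timeout
    (retry.timeout), a bounded attempt count (retry.retries), and a
    capped exponential backoff between attempts."""
    last_exc = None
    for attempt in range(retries + 1):
        try:
            return fn(timeout=timeout_s)
        except Exception as exc:  # real connectors narrow this to transient errors
            last_exc = exc
            if attempt < retries:
                # bounded backoff: base, 2x, 4x, ... capped at max_backoff_s
                time.sleep(min(base_backoff_s * 2 ** attempt, max_backoff_s))
    raise last_exc

# Hypothetical flaky upstream: fails twice, then succeeds.
attempts = {"count": 0}
def flaky(timeout):
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient upstream error")
    return "ok"

result = call_with_retries(flaky, retries=3, base_backoff_s=0.01)
# succeeds on the third attempt after two short backoffs
```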
For field-level schemas and advanced options, consult the DSL input reference.