Overview

LyftData is a unified data operations platform that lets teams connect sources, transform records, and deliver results without stitching together bespoke scripts. Describe the pipeline once and the platform owns scheduling, retries, and observability.

Quick start checklist

  • Confirm prerequisites – Review system requirements to make sure the target machine has enough CPU, RAM, and disk headroom.
  • Download the release bundle – Grab the latest binaries from Downloads; Community Edition runs without a license, but you can review licensing upgrades if you plan a production rollout. If you just need a trial, jump straight to the Evaluation Quickstart.
  • Bring up the control plane – Follow the platform-specific guide (e.g., Linux server install) to start the server; the built-in worker comes online automatically in Community Edition. Evaluators can complete the single-host setup in ~15 minutes using the quickstart.
  • Ship your first job – Use Getting Started for a guided tour of the UI, then complete Running a Job to validate end-to-end execution.
  • Harden for production – Consult the Install & configure section for TLS, backups, and multi-worker runbooks before touching real datasets.

Each step above links to a short walkthrough so you can stay in-context without hunting across the docs.

Why teams choose LyftData

  • Connect data fast: pull from databases, SaaS APIs, files, or object stores using built-in connectors.
  • Shape pipelines safely: combine built-in actions (filters, enrichers, preprocessors, etc.) and verify every change with Run & Trace.
  • Operate with confidence: centralized logging, metrics, and alerting keep operations predictable even as workloads grow.

Architecture basics (for trials)

In an evaluation setup, you can run everything on a single machine:

Browser / CLI ──► Server (UI + API + scheduling)
                  └─► Built-in worker ──► Sources / Destinations

  • The server is the control plane: UI + API, job definitions, scheduling, and deployment coordination.
  • The worker runs the pipelines: it connects to sources, transforms events, and writes to destinations.
  • In Community Edition, the built-in worker runs alongside the server by default, so you don’t need extra hosts to validate end-to-end execution.

Example pipeline

name: analytics-sync
input:
  http-poll:
    url: https://api.example.com/events
    json: true
    response:
      status-field: http_status
      response-field: body
trigger:
  interval:
    duration: 1h
actions:
  - filter:
      how:
        expression: http_status == 200
  - add:
      output-fields:
        exported_at: "${time|now_time_iso}"
output:
  s3:
    bucket-name: lyftdata-exports
    object-name:
      name: "analytics/${time|now_time_fmt %Y/%m/%d}/events.jsonl"
    disable-object-name-guid: true

This job polls an API hourly, filters successful responses, stamps each record, and persists to S3. Chain additional jobs through Worker channels when you need fan-out, enrichment, or multi-step processing.
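The per-record work this job performs (keep successful responses, stamp each record, build a dated object key) can be sketched in plain Python. This is an illustrative model only, not a platform API: the function names, the sample records, and the JSON-lines string standing in for the S3 write are all assumptions for the sketch.

```python
import json
from datetime import datetime, timezone

def transform(records):
    """Mirror the filter and add actions: keep records whose
    http_status is 200 and stamp each with an export timestamp."""
    now = datetime.now(timezone.utc).isoformat()
    return [
        {**r, "exported_at": now}
        for r in records
        if r.get("http_status") == 200
    ]

def object_key(when):
    """Build a dated key like the output's object-name template."""
    return f"analytics/{when:%Y/%m/%d}/events.jsonl"

def to_jsonl(records):
    """Serialize records as newline-delimited JSON, one per line."""
    return "".join(json.dumps(r) + "\n" for r in records)

if __name__ == "__main__":
    # Hypothetical polled responses, shaped like the input's
    # status-field / response-field mapping.
    polled = [
        {"http_status": 200, "body": {"event": "signup"}},
        {"http_status": 503, "body": None},
    ]
    print(object_key(datetime(2024, 5, 1)))  # analytics/2024/05/01/events.jsonl
    print(to_jsonl(transform(polled)), end="")
```

Only the 200-status record survives the filter, and the dated key gives you one object per hourly run grouped by day, matching the partition-style path in the template above.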

Your first five-minute pipeline

  • Prerequisites – Confirm the server is running and the built-in worker shows Online using the Getting Started checklist. You do not need a license while you remain on Community Edition.
  • Walkthrough – Follow the First pipeline guide for the full download, install, and deploy flow.
  • What you’ll build – Create a simple pipeline in the visual editor, inspect it with Run & Trace, then deploy it to your worker.

Want the UI-only tour? Jump to Running a Job.

Preparing for production

  • Observability baseline – Turn on metrics shipping and review the Troubleshooting guide so you know what “healthy” looks like before launch.
  • Scaling playbooks – Plan how many workers you need once you upgrade beyond Community Edition, and decide whether they live near sources or destinations. The advanced scheduling guide covers common patterns.
  • Security review – Walk through TLS, RBAC, and secret storage checklists in the Install section prior to onboarding real customers.

Where to go next

With these foundations in place, you can iterate quickly on pipelines while the platform handles orchestration, scaling, and observability.