Overview
Lyft Data is a unified data operations platform that lets teams connect sources, transform records, and deliver results without stitching together bespoke scripts. Describe the pipeline once and the platform owns scheduling, retries, and observability.
Quick start checklist
- Confirm prerequisites – Review system requirements to make sure the target machine has enough CPU, RAM, and disk headroom.
- Download the release bundle – Grab the latest binaries from Downloads; Community Edition runs without a license, but you can review licensing upgrades if you plan a production rollout. If you just need a trial, jump straight to the Evaluation Quickstart.
- Bring up the control plane – Follow the platform-specific guide (e.g., Linux server install) to start the server; the built-in worker comes online automatically in Community Edition. Evaluators can complete the single-host setup in ~15 minutes using the quickstart (a rough configuration sketch follows this checklist).
- Ship your first job – Use Getting Started for a guided tour of the UI, then complete Running a job to validate end-to-end execution.
- Harden for production – Consult the Install & configure section for TLS, backups, and multi-worker runbooks before touching real datasets.
Each step above links to a short walkthrough so you can stay in-context without hunting across the docs.
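Before you run the installer, it can help to see roughly what a control-plane configuration might contain. The sketch below is illustrative only; the file layout and every key in it are assumptions rather than documented syntax, so treat the platform-specific install guide as the source of truth.

```yaml
# Hypothetical minimal control-plane config; every field name here is
# an illustrative assumption, not documented Lyft Data syntax.
server:
  listen: 0.0.0.0:8080        # UI and API served from one port (assumed)
  data-dir: /var/lib/lyftdata
worker:
  built-in: true              # Community Edition's bundled worker
```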
Why teams choose Lyft Data
- Connect data fast: pull from databases, SaaS APIs, files, or object stores using built-in connectors.
- Shape pipelines safely: mix built-in actions (filters, enrichers, preprocessors, etc.) with Run & Trace so you can verify every change.
- Operate with confidence: centralized logging, metrics, and alerting keep operations predictable even as workloads grow.
Architecture at a glance
```
Sources ──► Lyft Data Server ──► Destinations
  │              │                   │
  │              ├─ Jobs             ├─ Warehouses / lakes
  ├─ APIs        ├─ Actions          ├─ Search indexes / queues
  └─ Files       └─ Workers          └─ Downstream services
```

The Server stores job definitions, coordinates deployments, and exposes both the UI and API. Jobs describe how data moves, and Workers run the pipelines close to the data or users. Add more Workers to scale horizontally or to place execution near external systems.
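To make the worker-placement idea concrete, the sketch below shows one plausible way a deployment might register additional workers near specific systems. The `workers` list and its fields are assumptions for illustration, not confirmed configuration syntax.

```yaml
# Hypothetical worker registration; field names are assumptions.
workers:
  - name: worker-eu-sources    # placed next to the EU databases
    url: https://worker-eu.internal:9090
  - name: worker-us-exports    # placed close to the S3 export bucket
    url: https://worker-us.internal:9090
```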
Example pipeline
```yaml
name: analytics-sync
input:
  http-poll:
    url: https://api.example.com/events
    json: true
    response:
      status-field: http_status
      response-field: body
    trigger:
      interval:
        duration: 1h
actions:
  - filter:
      condition: http_status == 200
  - add:
      output-fields:
        exported_at: "${time|now_time_iso}"
output:
  s3:
    bucket: lyftdata-exports
    path: analytics/${time|now_time_fmt %Y/%m/%d}/events.jsonl
```

This job polls an API hourly, filters successful responses, stamps each record, and persists to S3. Chain additional jobs through Worker channels when you need fan-out, enrichment, or multi-step processing.
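As a rough illustration of that chaining pattern, the job below consumes records from one named channel and republishes them to another. The `channel` input and output blocks are assumed syntax modeled on the example above; see the Build overview for the platform's actual fan-out constructs.

```yaml
# Hypothetical downstream job; the `channel` connector is assumed syntax.
name: events-enrich
input:
  channel:
    name: raw-events               # published by an upstream job's output
actions:
  - add:
      output-fields:
        pipeline_stage: "enriched"   # illustrative enrichment step
output:
  channel:
    name: enriched-events          # any number of downstream jobs can subscribe
```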
Your first five-minute pipeline
- Prerequisites – Confirm the server is running and the built-in worker shows Online using the Getting Started checklist. You do not need a license while you remain on Community Edition.
- Walkthrough – Follow the Day 0 → First pipeline guide for the full download, install, and deploy flow.
- What you’ll build – Create a simple pipeline in the visual editor, inspect it with Run & Trace, then deploy it to your worker (a rough sketch of such a pipeline follows below).
Want the UI-only tour? Jump to Running a job.
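For a feel of what the five-minute pipeline might look like as configuration, here is a deliberately tiny sketch modeled on the example pipeline above. The `file` connector and its fields are assumptions; in practice the visual editor generates the real definition for you.

```yaml
# Hypothetical minimal job; the `file` connector fields are assumed.
name: hello-pipeline
input:
  file:
    path: /tmp/demo/input.jsonl
    json: true
actions:
  - filter:
      condition: status == "ok"
output:
  file:
    path: /tmp/demo/output.jsonl
```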
Preparing for production
- Observability baseline – Turn on metrics shipping and review the Troubleshooting guide so you know what “healthy” looks like before launch (a starting-point sketch follows this list).
- Scaling playbooks – Plan how many workers you need once you upgrade beyond Community Edition, and decide whether they live near sources or destinations. The advanced scheduling guide covers common patterns.
- Security review – Walk through TLS, RBAC, and secret storage checklists in the Install section prior to onboarding real customers.
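As a starting point for that observability baseline, a configuration block might look like the sketch below. The `metrics` and `logging` sections and all of their keys are illustrative assumptions; confirm the exact settings in the Install and Troubleshooting guides.

```yaml
# Hypothetical observability settings; section and key names are assumptions.
metrics:
  enabled: true
  ship-to: https://metrics.internal:9090   # your metrics backend
  interval: 30s
logging:
  level: info
  retain-days: 14
```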
Where to go next
- Install & configure – Deep dives on system requirements, TLS, and deployment patterns live in the Install overview.
- Build resilient jobs – The Build overview and advanced scheduling pages walk through orchestration patterns such as fan-out and retries.
- Connect integrations – Start with the Integrations catalog for per-source quick starts.
- Operate at scale – Use the Troubleshooting guide plus upcoming monitoring, backup, and security playbooks to stay ahead of incidents.
- Work as a team – Share the Team benefits guide with data engineers, operators, and reviewers to align on workflows.
With these foundations in place, you can iterate quickly on pipelines while the platform handles orchestration, scaling, and observability.