Twilio Segment

Lyft Data supports ingesting data exported by Twilio Segment via object storage (S3, Azure Blob, GCS) or the Segment HTTP delivery API.

Configure Lyft Data to read Segment exports from Amazon S3

Segment warehouses frequently mirror data into S3. Configure the s3 input:

  • bucket-name – Segment export bucket.
  • object-names – prefix where Segment writes files (for example segment/events/).
  • mode – set to list-and-download to stream files as they arrive.
  • include-regex – match .json.gz or .csv.gz depending on export format.
  • preprocessors – include extension (or gzip) so gzip-compressed files are decompressed.
  • fingerprinting – avoid reprocessing already ingested files.
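As a rough illustration of what extension-based preprocessing amounts to, the Python sketch below decompresses an object only when its name ends in .gz and then parses the result. It is not Lyft Data's implementation: the function name decode_export is hypothetical, and the sketch assumes the export is newline-delimited JSON with one event per line.

```python
import gzip
import json


def decode_export(name: str, raw: bytes) -> list:
    """Decode one export object into a list of JSON events (illustrative only)."""
    # Choose the decompression step from the file extension.
    if name.endswith(".gz"):
        raw = gzip.decompress(raw)
    # Assumption: newline-delimited JSON, one event per line; blank lines are skipped.
    text = raw.decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Usage: a gzip-compressed object and a plain object go through the same call.
compressed = gzip.compress(b'{"type": "track", "event": "Signup"}\n')
events = decode_export("exports/part-0001.json.gz", compressed)
```

Files without a .gz suffix pass through unchanged, which is why one setting can cover a bucket that mixes compressed and uncompressed exports.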

Example: ingest Segment JSON exports from S3

input:
  s3:
    bucket-name: segment-prod
    object-names:
      - exports/
    mode: list-and-download
    include-regex:
      # Matches keys ending in .json or .json.gz
      - "\\.json(\\.gz)?$"
    fingerprinting: true
    timestamp-mode: last-modified
    access-key: ${secrets.segment_s3_access_key}
    secret-key: ${secrets.segment_s3_secret_key}
    preprocessors:
      # Decompress based on the file extension
      - extension
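Before deploying, it can help to check the include pattern against a few object keys. The Python sketch below assumes the pattern is applied as an unanchored search and that Python's regex dialect behaves the same as Lyft Data's for this simple expression. The sample keys are hypothetical.

```python
import re

# The include-regex from the example, written as a Python raw string.
INCLUDE = re.compile(r"\.json(\.gz)?$")


def is_included(key: str) -> bool:
    """Return True when the key ends in .json or .json.gz."""
    return INCLUDE.search(key) is not None


# Hypothetical object keys under the exports/ prefix.
sample_keys = [
    "exports/2024-05-01/events-0001.json.gz",
    "exports/2024-05-01/events-0002.json",
    "exports/2024-05-01/users.csv.gz",
    "exports/_SUCCESS",
]
included = [key for key in sample_keys if is_included(key)]
```

With these samples only the two JSON keys are selected. The CSV export and the marker file are skipped.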

Configure Lyft Data to read Segment exports from Azure Blob or GCS

The same pattern applies to azure-blob and gcs inputs. Substitute the appropriate credentials (storage-account/storage-master-key or GcsCredentials) and field names (container-name, blob-destination, etc.) following the examples in the Azure and GCS guides. Use include-regex to capture .json.gz exports and enable fingerprinting to dedupe.

Configure Lyft Data to receive Segment events via HTTP

For real-time delivery, Segment can send events to an HTTP endpoint. Use the http-server input when Segment posts events to Lyft Data, or the http-poll input when Lyft Data polls Segment APIs on a schedule.
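When Segment pushes events, each request body is a JSON object whose type field names the call (track, identify, page, screen, group, or alias in the Segment Spec). The Python sketch below shows the kind of validation a downstream consumer might apply to such a body. The function name parse_segment_event is hypothetical and is not part of Lyft Data.

```python
import json

# Call types defined by the Segment Spec.
SEGMENT_EVENT_TYPES = {"track", "identify", "page", "screen", "group", "alias"}


def parse_segment_event(body: bytes) -> dict:
    """Parse one posted body and check that it looks like a Segment event."""
    event = json.loads(body)  # raises ValueError on malformed JSON
    if not isinstance(event, dict):
        raise ValueError("expected a JSON object")
    if event.get("type") not in SEGMENT_EVENT_TYPES:
        raise ValueError(f"unexpected Segment event type: {event.get('type')!r}")
    return event


# Usage with a minimal track event.
sample = b'{"type": "track", "event": "Order Completed", "userId": "u-1"}'
parsed = parse_segment_event(sample)
```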

Example: poll the Segment delivery API

input:
  http-poll:
    url: https://api.segmentapis.com/export/v1/jobs/{job_id}/download
    method: GET
    headers:
      Authorization: "Bearer ${secrets.segment_token}"
    json: true
    trigger:
      interval:
        duration: 15m
    retry:
      max-attempts: 5

Adjust the URL (replacing the {job_id} placeholder), authentication headers, and trigger to match the Segment API in use (Batch/Delivery). Set json: true to parse Segment payloads automatically.
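To make the poll settings concrete, the Python sketch below builds the same GET request with a bearer token and retries a failed fetch up to a maximum number of attempts. It is an illustration under stated assumptions, not how Lyft Data implements http-poll. The URL template is copied from the example, build_request and fetch_with_retry are hypothetical names, and the transport is passed in as a callable so the logic can be exercised without network access.

```python
import json
import urllib.request

# URL template copied from the example; {job_id} is a placeholder.
EXPORT_URL = "https://api.segmentapis.com/export/v1/jobs/{job_id}/download"


def build_request(job_id: str, token: str) -> urllib.request.Request:
    """Build the GET request with a bearer token header."""
    return urllib.request.Request(
        EXPORT_URL.format(job_id=job_id),
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )


def fetch_with_retry(send, request, max_attempts: int = 5):
    """Call send(request) until it succeeds or max_attempts is reached.

    send is any callable returning the response body as bytes or str.
    The body is parsed as JSON, mirroring json: true. No backoff is
    applied between attempts in this sketch.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return json.loads(send(request))
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            last_error = exc
    raise last_error
```

For a real call, send could be `lambda req: urllib.request.urlopen(req).read()`. A scheduler would then invoke fetch_with_retry once per interval (15 minutes in the example).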