Google Analytics (GA4)

Lyft Data supports importing Google Analytics 4 exports that land in Google Cloud Storage or Amazon S3.

Configure Lyft Data to read GA4 exports from Google Cloud Storage

Add the gcs input to a job. Common fields:

  • bucket-name – GA4 export bucket (required).
  • object-names – prefix pointing at the export folder (e.g., analytics_123456/events/).
  • mode – set to list-and-download to enumerate daily files.
  • include-regex – narrow results to GA4 export format, typically "\\.parquet$".
  • timestamp-mode – set to last-modified to process the newest exports first.
  • fingerprinting – deduplicate files across reruns.
  • credentials – GA4 exports live in GCP; use a service account with roles/storage.objectViewer.
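How include-regex and timestamp-mode interact can be sketched in plain Python. The object names and modification times below are invented for illustration; they are not real GA4 export output:

```python
import re

# Hypothetical listing of the export prefix; GA4 exports also drop
# non-data files (e.g. markers) that the regex should exclude.
objects = [
    {"name": "analytics_123456/events/events_2024-05-01.parquet", "last_modified": 1714550400},
    {"name": "analytics_123456/events/events_2024-05-02.parquet", "last_modified": 1714636800},
    {"name": "analytics_123456/events/_SUCCESS", "last_modified": 1714636801},
]

# include-regex keeps only files matching the export format.
include = re.compile(r"\.parquet$")
matched = [o for o in objects if include.search(o["name"])]

# timestamp-mode: last-modified orders candidates by modification
# time, newest first.
matched.sort(key=lambda o: o["last_modified"], reverse=True)
for o in matched:
    print(o["name"])
```

Note that `"\\.parquet$"` in YAML becomes the regex `\.parquet$`: the backslash escapes the dot so it matches a literal `.` rather than any character.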

Example: ingest daily parquet exports from GCS

```yaml
input:
  gcs:
    bucket-name: analytics-prod
    object-names:
      - analytics_123456/events/
    mode: list-and-download
    include-regex:
      - "\\.parquet$"
    maximum-age: 3d
    fingerprinting: true
    timestamp-mode: last-modified
    credentials:
      service-account:
        key: ${secrets.ga4_gcs_reader}
preprocessors:
  - parquet
```

Configure Lyft Data to read GA4 exports from Amazon S3

If GA4 exports are mirrored to S3, use the s3 input:

  • bucket-name – destination bucket.
  • object-names – export prefix (for example ga4/events/).
  • mode – set to list-and-download.
  • include-regex – match .parquet or .json.gz depending on the export.
  • access-key / secret-key – credentials with list/get access.
  • preprocessors – include parquet or extension so events are decoded into JSON.
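For the .json.gz case, the decoding a preprocessor performs can be sketched with the standard library: strip the gzip layer, then parse one JSON event per line. The event fields below are illustrative, not the full GA4 schema:

```python
import gzip
import json

# Hypothetical GA4-style payload; in a real .json.gz export each line
# is one event record.
raw = "\n".join(
    json.dumps(e)
    for e in [
        {"event_name": "page_view", "event_timestamp": 1714636800000000},
        {"event_name": "purchase", "event_timestamp": 1714636860000000},
    ]
).encode()
payload = gzip.compress(raw)

# Conceptually what an extension-based preprocessor does: detect the
# gzip layer, decompress, and emit one JSON object per line.
events = [json.loads(line) for line in gzip.decompress(payload).splitlines()]
for e in events:
    print(e["event_name"])
```

Parquet files need a columnar reader rather than line-by-line decoding, which is why the parquet preprocessor is listed separately in the examples.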

Example: ingest GA4 parquet exports from S3

```yaml
input:
  s3:
    bucket-name: analytics-s3-mirror
    object-names:
      - ga4/events/
    mode: list-and-download
    include-regex:
      - "\\.parquet$"
    maximum-age: 3d
    fingerprinting: true
    timestamp-mode: last-modified
    access-key: ${secrets.ga4_s3_access_key}
    secret-key: ${secrets.ga4_s3_secret_key}
preprocessors:
  - parquet
```
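The effect of fingerprinting: true can be sketched as deriving a stable ID per object so reruns skip files already ingested. The hashing scheme below is an assumption for illustration, not Lyft Data's documented implementation:

```python
import hashlib

def fingerprint(bucket: str, key: str, etag: str) -> str:
    # Assumed scheme: a stable hash over bucket, key, and ETag, so the
    # same object yields the same ID across reruns.
    return hashlib.sha256(f"{bucket}/{key}@{etag}".encode()).hexdigest()

seen = set()
listing = [
    ("analytics-s3-mirror", "ga4/events/events_2024-05-01.parquet", "abc123"),
    ("analytics-s3-mirror", "ga4/events/events_2024-05-01.parquet", "abc123"),  # same file on a rerun
]
for obj in listing:
    fp = fingerprint(*obj)
    if fp in seen:
        continue  # already processed in an earlier run
    seen.add(fp)
    print(obj[1])
```

Because the ID incorporates the ETag, a re-exported file with new contents produces a new fingerprint and is ingested again, while unchanged files are skipped.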