Define architectural direction for AI Platform training data flow and governance
Set standards for ML engineers; ensure tenant isolation and PII-free data
Own data lake design: storage tiers, CMEK, lifecycle rules
Oversee batch and streaming pipelines, catalog, and orchestration
Ensure multi-tenant isolation and data quality controls

🎯 Requirements

8+ years of data engineering; degree in CS/Engineering or equivalent
Deep PySpark/Scala expertise, including Spark performance tuning
Hands-on Apache Beam on Dataflow
Schema registry experience: Protobuf/Avro
Orchestration at scale: Flyte, Kubeflow Pipelines, Airflow
Multi-tenant data architecture and governance

🎁 Benefits

Work on a large-scale Kubernetes-based SaaS platform
Solve challenging cloud and reliability problems
Collaborate with top reliability engineers
Competitive compensation and growth opportunities
