Senior ML Platform Engineer II

Added: 5 hours ago
Type: Full time
Salary: Not provided

Related skills

java terraform redis python pytorch

📋 Description

  • Design, build, and operate standardized training-to-serving pipelines with Airflow for SageMaker endpoints.
  • Run real-time and batch inference on SageMaker, including multi-model endpoints and autoscaling.
  • Deliver ultra-low-latency serving with Redis/Valkey for feature caching and online retrieval.
  • Provision ML infrastructure with Terraform: SageMaker endpoints, ECR/ECS/EKS, VPC, IAM.
  • Build platform abstractions and golden paths: Airflow DAGs, CLI/SDKs, CI/CD pipelines.
  • Govern model lifecycle: registries, approvals, lineage, and audits.

🎯 Requirements

  • 5+ years building production-grade ML/data platforms.
  • Strong software engineering in Python, Go, or Java with APIs and tooling.
  • Deep experience with AWS SageMaker inference: endpoint config, containerization, autoscaling.
  • Expertise with online feature stores like Redis/Valkey for ML serving.
  • Terraform experience managing ML and data infrastructure end-to-end (GitOps preferred).
  • Airflow orchestration at scale: DAGs, sensors, retries, SLAs, backfills.

๐ŸŽ Benefits

  • Team lunches and game nights
  • Company-wide events and socials
  • Hybrid in-office model in Canada (Toronto & Montreal)
  • Growth-focused culture with data-driven learning