Staff Platform Engineer - AI Infrastructure

Type: Full time
Salary: Not provided

Related skills

Terraform, Python, Kubernetes, Go, CDK

📋 Description

  • Design and operate GPU infrastructure for model hosting and scheduling.
  • Build and scale model serving with vLLM, TensorRT-LLM, and Triton for real-time inference.
  • Implement multi-model routing across modalities on shared infrastructure.
  • Own end-to-end model lifecycle: download, deploy, serve, monitor, scale.
  • Drive inference optimization: quantization, batching, caching, cold-start reduction.
  • Build self-service platforms to provision compute, storage, and model endpoints via APIs.

🎯 Requirements

  • 8+ years of software engineering experience; 3+ years building infrastructure platforms or ML/AI infrastructure.
  • Deep experience with AWS, GCP, and Kubernetes.
  • Hands-on with GPU workloads and model serving (vLLM, TensorRT-LLM, Triton).
  • Proficiency in Python, Go, or C++.
  • IaC experience: Terraform, Pulumi, CDK.
  • Experience leading cross-team technical initiatives and influencing direction.
