Related skills: MLOps, observability, LLM
Description
- Lead and scale the Infrastructure/Core Platform domain across multiple teams.
- Enable AI-first delivery with foundations for LLM workloads and data pipelines.
- Set the technical and operational vision, balancing reliability, velocity, security, and cost.
- Partner with Product, Security, and Engineering Leadership to align platform investments.
- Drive operational excellence: availability, incident management, capacity planning, disaster recovery, and SLOs.
- Establish platform standards for CI/CD, observability, and infrastructure-as-code.
Requirements
- 5+ years of engineering management experience, leading managers and multiple teams.
- Strong experience as a software, platform, or infrastructure engineer, with depth in technical strategy.
- Proven experience owning production infrastructure at scale: cloud, reliability, security, and developer enablement.
- Bonus: AI/ML exposure (LLM workloads, data pipelines, MLOps, evaluation frameworks).
- Data-driven decision making across reliability, cost, capacity, and delivery.
- Excellent communication; able to influence across Product, Security, and Engineering.
Benefits
- Health (medical, vision, dental), life, and disability insurance
- Equity stock options
- Retirement plans
- Paid public holidays and unlimited PTO
- Paid maternity and parental leave
- Leaves of absence (including caregiver leave and leave under Colorado's Healthy Families and Workplaces Act)