Software Engineer, Compute Infrastructure

Added
2 hours ago
Type
Full time

Related skills

networking, Kubernetes, distributed systems, observability, firmware

📋 Description

  • Spin up and scale large Kubernetes clusters via automation
  • Build abstractions to unify clusters for training workloads
  • Own bare-metal bring-up and firmware upgrades at massive scale
  • Improve operational metrics: reduce cluster restart times and shorten upgrade cycles
  • Integrate networking and hardware health for end-to-end reliability
  • Develop monitoring and observability to detect issues under extreme load

🎯 Requirements

  • Experience as an infrastructure, systems, or distributed systems engineer in large-scale or high-availability environments
  • Kubernetes internals, cluster scaling, and containerized workloads
  • Compute infrastructure concepts and automation of cluster/data-center ops
  • Bonus: GPU workloads, firmware management, HPC