Storage Reliability Engineer

Related skills

Linux, S3, Kubernetes, Go, observability

📋 Description

  • Operate and support mission-critical storage systems powering AI workloads.
  • Triage incidents, debug across application, system, and kernel layers, and contribute fixes.
  • Turn production learnings into reliability improvements via tooling and automation.
  • Develop observability and tooling to boost reliability and performance.
  • Collaborate with internal teams and customers to diagnose deployment issues.
  • Work hands-on with production infrastructure in a cross-functional role.

🎯 Requirements

  • Bachelor's degree in Computer Science, Engineering, or equivalent
  • 5+ years operating storage or distributed infrastructure in production
  • Strong debugging skills across user and kernel space; experience with core dumps
  • Hands-on experience with Kubernetes and CSI drivers
  • Experience with NFS or S3 storage protocols
  • Proficiency in Go or a similar systems language

🎁 Benefits

  • Medical, dental, and vision insurance - 100% paid
  • Company-paid Life Insurance
  • Short and long-term disability insurance
  • 401(k) with generous employer match
  • Flexible PTO
  • Paid Parental Leave