Sr. Staff Production Engineer

Added
less than a minute ago
Type
Full time

Related skills

Azure, Ansible, Terraform, AWS, Python

📋 Description

  • Design and implement scalable infrastructure across AWS, Azure, GCP, and bare-metal environments
  • Drive an automation-first culture, using Python/Go to build self-healing systems
  • Implement observability with Prometheus, Grafana, OpenTelemetry; define SLIs/SLOs
  • Lead incidents as Incident Commander; develop response playbooks; conduct post-incident analyses
  • Collaborate with Engineering and partner teams for operability reviews

🎯 Requirements

  • 8+ years of experience ensuring reliability, scalability, and availability for large-scale services
  • Deep expertise in Python, Go, or C/C++
  • Strong background in networking, Linux/FreeBSD, and distributed architecture
  • Experience in high-stakes incident management and 24/7 on-call rotation
  • ITIL framework experience and data-driven operability improvements

🎁 Benefits

  • Various health plans
  • Time off for vacation and sick leave
  • Parental leave options
  • Retirement options
  • Education reimbursement
  • In-office perks, and more!