Staff Site Reliability Engineer

Added: 8 days ago
Type: Full time

Related skills

ansible linux grafana prometheus python

📋 Description

  • Manage large-scale distributed systems using infrastructure-as-code (IaC)
  • Develop and enhance automation tools for deployment of large-scale services
  • Diagnose and resolve issues by editing code and infrastructure configurations
  • Develop automation solutions and manage services with version-controlled IaC
  • Conduct performance and network analysis; create reusable tools
  • Support mission-critical services and participate in on-call rotations

🎯 Requirements

  • 5+ years of experience in site reliability or systems engineering
  • Proficiency in Python or Ansible for automation, including API interactions
  • Demonstrated experience building and maintaining automation solutions
  • Strong Linux/system administration background
  • Bachelor's degree in CS or related field, or equivalent practical experience
  • Experience with PXE-based Kickstart provisioning and with monitoring using Prometheus, Grafana, and Nagios

🎁 Benefits

  • Various health plans
  • Time off for vacation and sick leave
  • Parental leave options
  • Retirement options
  • Education reimbursement
  • In-office perks, and more