Site Reliability Engineer

Added: 18 days ago
Type: Full time

Related skills

ansible puppet grafana python kubernetes

📋 Description

  • Design and maintain monitoring solutions covering metrics, logs, and traces
  • Deploy and manage monitoring infrastructure using Grafana, ICINGA2, and Site24x7
  • Respond to automated alerts and production incidents
  • Participate in on-call rotations supporting global operations
  • Build automation to reduce operational overhead and improve reliability

🎯 Requirements

  • 5+ years in Site Reliability Engineering, DevOps, or infrastructure operations
  • 5+ years of Linux/Unix systems expertise and performance troubleshooting
  • 3+ years of experience with observability platforms (metrics, logging, tracing, visualization)
  • 3+ years of configuration management experience (Ansible, Puppet, or similar)
  • 3+ years of professional experience in scripting (Python, Go, Bash, JavaScript, etc.)
  • 3+ years of professional experience with event correlation or incident platforms

🎁 Benefits

  • Competitive health benefits and retirement plans
  • Diversity and inclusion commitments and ERGs
  • Flexible remote work options with occasional office visits