Senior Site Reliability Engineer

Added
39 minutes ago
Type
Full time

Related skills

datadog aws prometheus python kubernetes

📋 Description

  • Own and evolve observability strategy: monitoring, alerting, dashboards, logging, tracing.
  • Define and manage SLIs, SLOs, and reliability metrics.
  • Lead incident response, postmortems, and continuous improvement.
  • Improve MTTD and MTTR through automation and operational excellence.
  • Integrate observability into CI/CD pipelines and software delivery workflows.
  • Build and maintain reliable cloud infrastructure on AWS and Kubernetes.

🎯 Requirements

  • 8+ years in software engineering, infrastructure, or operations.
  • 5+ years of Site Reliability Engineering experience.
  • Deep expertise with observability platforms (New Relic, Datadog, Dynatrace, Grafana, Prometheus).
  • Strong skills in monitoring, alerting, incident management, and reliability engineering.
  • Hands-on experience with AWS, Kubernetes, and cloud-native technologies.
  • Proficiency in Python, Bash, PowerShell, or a similar scripting language; excellent communication skills.

🎁 Benefits

  • Medical, Dental, and Vision Insurance for full-time employees.
  • Competitive pay.
  • Maternity and paternity leave for full-time staff.
  • Short- and long-term disability.
  • Opportunity to learn from a dedicated leadership team.
  • Top-of-the-line company swag.