Staff Site Reliability & DevOps Engineer

Type
Full time

Related skills

terraform linux grafana prometheus kubernetes

📋 Description

  • Design, build, and operate observability platforms using Grafana and Prometheus.
  • Define and maintain metrics standards, dashboards, alerts, and SLOs.
  • Improve signal quality: reduce alert noise, tune thresholds, and improve runbooks.
  • Support incident response with actionable telemetry and post-incident analysis.
  • Instrument services and automate observability using infrastructure as code.
  • Collaborate with platform, infrastructure, and application teams.

🎯 Requirements

  • Strong experience with Prometheus: scraping, federation, alerting.
  • Grafana dashboards, alerting, templating, RBAC.
  • Linux and networking fundamentals.
  • Observability stacks in Kubernetes environments.
  • Infrastructure as code experience (Terraform preferred).
  • Incident management and on-call practices.

🎁 Benefits

  • Inclusive, global culture that values every voice.
  • Hybrid work model with teams across Europe.
  • Opportunities to influence and improve major brands.
  • Growth opportunities in observability and SRE.