Staff Site Reliability & DevOps Engineer

Type
Full time

Related skills

terraform linux grafana prometheus kubernetes

📋 Description

  • Design, operate, and evolve observability platforms using Grafana and Prometheus
  • Define dashboards, alerts, metrics standards, and SLOs
  • Reduce alert noise; tune thresholds and runbooks
  • Support incident response with actionable telemetry and post-incident analysis
  • Instrument services and integrate metrics, logs, and traces across distributed systems
  • Automate observability configuration with infrastructure as code

🎯 Requirements

  • Strong experience with Prometheus (scraping, federation, recording rules, alerting)
  • Strong experience with Grafana (dashboards, alerting, templating, RBAC)
  • Linux and networking fundamentals
  • Experience running observability stacks in Kubernetes environments
  • Infrastructure as code experience (Terraform preferred)
  • Familiarity with incident management and on-call practices

🎁 Benefits

  • Inclusive, diverse workplace with a culture of belonging
  • Global team collaboration across regions
  • Growth-focused culture and innovation
  • Work with award-winning PR and analytics tech
  • Commitment to accessibility and accommodations
  • Equal opportunity employer