Staff Site Reliability Engineer

Added: 16 days ago
Type: Full time
Salary: Not provided

Related skills

Node.js, Azure, Terraform, AWS, Prometheus

📋 Description

  • Engage with teams to improve reliability across the lifecycle.
  • Monitor production systems for availability, latency, health.
  • Identify root causes and drive better reliability in cloud services.
  • Collaborate with product and platform teams to boost observability.
  • Reduce toil through automation and innovative tooling.
  • On-call and off-hours duties are required.

🎯 Requirements

  • Experience designing and operating observability for cloud platforms.
  • Terraform (preferred) or Ansible; cloud SDKs.
  • AWS and Azure; containers and orchestration (Kubernetes).
  • Observability tools: New Relic, Splunk, CloudWatch, Prometheus.
  • JavaScript/Node.js/TypeScript development on Linux/macOS.
  • Blameless incident response and on-call readiness.

🎁 Benefits

  • Remote-first, with a distributed team.
  • Flexible schedule and work-life balance.
  • Opportunities for growth and learning.
  • Collaborative and playful culture.
  • Inclusive workplace and global colleagues.
  • Access to Cribl's learning resources.