Senior Site Reliability Engineer, Wikimedia Enterprise

Added
28 minutes ago
Type
Full time

Related skills

GitLab, Ansible, Terraform, AWS, Prometheus

📋 Description

  • Define SLOs/SLIs and error budgets for APIs.
  • Build observability: metrics, logs, tracing.
  • Drive capacity planning, load testing, resilience validation.
  • Improve developer experience with self-service infra.
  • Collaborate across teams and participate in on-call.

🎯 Requirements

  • IaC automation: Terraform, Ansible; Python or Go.
  • Cloud infra on AWS/Azure/GCP; scalable, reliable, cost-efficient.
  • CI/CD and GitOps: GitLab, ArgoCD; progressive delivery.
  • Incident management: on-call, postmortems, continuous improvement.
  • SRE & observability: SLOs/SLIs; metrics, logs, tracing (Prometheus/OpenTelemetry).
  • Collaborate in distributed teams; strong documentation.

🎁 Benefits

  • Remote-first with staff in 40+ countries
  • Global, diverse, inclusive workplace
  • Disability accommodations during hiring
  • Competitive, equitable benefits package
  • U.S. Benefits & Perks