Site Reliability Engineer

Type
Full time

Related skills

docker terraform cloudformation aws python

📋 Description

  • Own reliability and performance of Gamma's production systems on AWS.
  • Build observability with metrics, logs, tracing, and alerts for system health.
  • Design automation to reduce toil and speed deployments.
  • Lead incident response and blameless post-mortems; fix systemic issues.
  • Partner with engineering on architecture reviews and SLO/SLI design.
  • Manage and optimize compute, networking, databases, and managed services.

🎯 Requirements

  • 5+ years in SRE/DevOps with hands-on AWS.
  • Python, Go, or TypeScript/Node.js.
  • Terraform or CloudFormation; end-to-end observability.
  • Reliability improvements via automation and monitoring.
  • Networking, distributed systems, Docker/Kubernetes, DB performance.
  • Strong incident management and debugging skills.

🎁 Benefits

  • Strong in-office culture in San Francisco.
  • Hybrid work with 4–5 days on-site; focus-based WFH.
  • Collaborative, high-impact engineering team.
  • Opportunity to shape reliability for millions of users.
  • Growth and learning opportunities.
  • Exposure to AWS, observability, and modern tooling.