Added: 2 days ago
Type: Full time
Salary: Not provided

Related skills

datadog java postgresql mysql grafana

📋 Description

  • We are seeking a Site Reliability Engineer to ensure systems run smoothly and scale reliably.
  • Participate in on-call rotations; triage issues; act as Incident Commander.
  • Create dashboards and alerts; instrument code for visibility with development teams.
  • Automate toil to improve reliability across apps and infra.
  • Implement and track SLIs and SLOs; investigate escalated reliability issues.

🎯 Requirements

  • 3+ years as SRE or similar; proficient in Java, Go, or Python.
  • Strong understanding of distributed systems and microservices.
  • Hands-on Kubernetes experience; experience managing and troubleshooting apps on GCP/AWS/Azure.
  • Grafana dashboards; APM tools (Datadog, New Relic, SigNoz); metrics/logs/traces.
  • SQL proficiency (PostgreSQL, MySQL); write complex queries; DB performance.

🎁 Benefits

  • Culture - People-first, inclusive environment.
  • Learning - Regular internal talks and knowledge sharing.
  • Compensation - Salary, pension, health insurance, annual bonus.