Software Engineer - Site Reliability Engineering

Added: 20 minutes ago
Type: Full time
Salary: not provided

Related skills

Terraform, Grafana, Prometheus, Python, Kubernetes

📋 Description

  • Automate for insight and scale: build scalable troubleshooting tools for thousands of instances.
  • Treat operations as a software problem: codify best practices into repeatable tools.
  • Design for resilience, learn from failure: own tooling for incident response.
  • Champion reliability as a product feature: define and act on SLIs and SLOs.
  • Create signals, not noise: shape observability to surface meaningful issues.

🎯 Requirements

  • Build backend tooling and automation in Go, with strong architecture and testing skills.
  • Apply SRE practices: define SLIs/SLOs; reduce toil via automation.
  • Promote observability, ownership, and service level objectives.
  • Troubleshoot large-scale cloud systems with confidence.
  • Deploy and manage Kubernetes apps; cluster administration experience is a plus.
  • Manage infrastructure with Kustomize and Terraform.

🎁 Benefits

  • Hybrid work model with flexible office options.
  • Collaborative culture focused on inclusivity and innovation.
  • Opportunity to shape reliability for Neo4j Aura and beyond.