Software Engineer - Site Reliability Engineering

Added: 21 minutes ago
Type: Full time
Salary: Not provided

Related skills

Terraform, GitHub Actions, Grafana, Prometheus, Python

📋 Description

  • Automate for insight and scale across thousands of Neo4j instances.
  • Treat operations as software: codify best practices for repeatable ops.
  • Design for resilience; own tooling behind incident response.
  • Champion reliability as a product feature with SLIs/SLOs.
  • Create signals, not noise, with an observability stack.
  • Deploy and manage apps on Kubernetes; IaC with Kustomize and Terraform.

🎯 Requirements

  • Backend tooling and automation in Go; strong Python skills welcome.
  • Apply SRE practices: define SLIs/SLOs and reduce toil through automation.
  • Observability emphasis: promote ownership and clear metrics.
  • Troubleshoot large-scale cloud systems; monitor distributed workloads.
  • Deploy on Kubernetes; cluster administration experience is a plus; IaC with Terraform.
  • Build and maintain CI/CD pipelines with GitHub Actions.