Senior Site Reliability Engineer

Added: 3 hours ago
Type: Full time
Salary: Not provided

Related skills

Ansible, Terraform, Helm, AWS, Prometheus

📋 Description

  • Maintain and harden AWS infrastructure (EC2, ALB/NLB, WAF, IAM, CloudWatch)
  • Operate and evolve our EKS clusters powering Python-based AI services
  • Migrate existing services to Kubernetes using Terraform and Helm
  • Codify infrastructure with Terraform and manage host-level automation via Ansible
  • Build and improve CI/CD pipelines with GitHub Actions
  • Own observability efforts: Prometheus, Grafana, alerting, and on-call readiness

🎯 Requirements

  • 5+ years of experience managing Linux in production (Ubuntu, Amazon Linux)
  • Strong experience with Kubernetes (ideally EKS), Helm, and Terraform
  • Comfort with running and debugging Python workloads in containers
  • Solid understanding of networking, IAM, and cloud security best practices
  • Hands-on Nginx experience (Ingress and reverse proxy setups)
  • Excellent communication skills; you can explain complex infra to devs clearly

🎁 Benefits

  • Hybrid onboarding that lets you start working remotely, plus relocation support for you and your family
  • Comprehensive health insurance for both you and your family
  • Professional development budget for conferences, courses, and resources
  • Flexible benefits package so you can choose the perks that matter most to you
  • Hybrid work and generous leave options to prioritize your work-life balance
  • In-office perks, including free meals and snacks

🚚 Relocation support
