Sr Staff Site Reliability Engineer

Added
3 days ago
Type
Full time

Related skills

docker powershell bash aws python

📋 Description

  • Build and maintain internal LLM-powered chat infra (OpenRouter or similar).
  • Maintain highly available, scalable, secure cloud-native infra on AWS EKS.
  • Develop observability strategies: monitoring, logging, alerting.
  • Architect and optimize data pipelines for reliable data flow.
  • Drive CI/CD improvements with automated testing and release management.
  • Design and maintain Docker-based containerization for apps.

🎯 Requirements

  • 3+ years in SRE/DevOps with a focus on reliability.
  • Deep expertise with Amazon EKS: provisioning, management, troubleshooting.
  • Observability with Prometheus, Grafana, ELK, or similar.
  • Data pipeline design with Kafka, Airflow, Spark.
  • CI/CD practices with Jenkins, GitLab CI, ArgoCD.
  • Docker expertise and cloud security best practices.

🎁 Benefits

  • Equal Opportunity employer committed to diversity and inclusivity.
  • Reasonable accommodations available for disabilities and religious beliefs.