Related skills
datadog, terraform, github actions, python, kubernetes
Description
- Define and maintain SLOs/SLAs balancing UX with velocity
- Implement monitoring and alerting in Datadog for prod issues
- Build resilient architectures that gracefully handle failures
- Establish error budgets for feature velocity vs stability
- Lead incident response as primary on-call for infrastructure
- Conduct blameless post-mortems to prevent recurrence
Requirements
- Infrastructure as Code (Terraform or CloudFormation)
- Kubernetes expertise: networking, storage, security contexts
- Python or Go for tooling and automation
- CI/CD pipelines and deployment strategies (GitHub Actions, GitLab CI, Jenkins)
- Observability: Datadog, Prometheus, Grafana
- Linux/Unix administration and cloud providers (AWS, GCP, Azure)
Benefits
- 401K and health/dental/vision benefits
- Flexible remote work with home office stipend
- Company laptop and role-specific tech
- Hybrid NYC office with amenities
- Competitive PTO and team socials
- Unlimited professional development fund