Related skills
AWS, SQL, Python, distributed systems, Google Cloud Platform
Description
- Improve reliability and performance of ClickHouse core.
- Create metrics and alerts to detect production issues early.
- Identify root causes; submit bug fixes and improvements.
- Refine incident response and postmortem processes; coordinate with teams.
- Plan and drive chaos engineering initiatives across the engineering organization.
- Manage on-call processes and establish escalation best practices.
Requirements
- Bachelor's or Master's degree in Computer Science or a related field.
- 5+ years in Reliability Engineering, QA, or customer-facing engineering.
- Production experience operating ClickHouse or other SQL databases.
- Strong understanding of distributed DB internals and SQL (ClickHouse a plus).
- Scripting with Shell or Python; ability to read C++ code.
- Experience with cloud platforms: AWS, Azure, Google Cloud Platform.
Benefits
- Flexible work environment; remote-friendly worldwide.
- Contributions toward healthcare coverage.
- Equity via stock options.
- Flexible time off in the US; generous time off elsewhere.
- $500 home office setup for remote staff.
- Global gatherings and offsites.