Related skills
Azure, AWS, SQL, reliability engineering, distributed databases
Description
- Continuously improve the reliability and performance of ClickHouse core.
- Improve metrics and alerts to detect production issues early.
- Investigate top customer issues, identify root causes, and file fixes.
- Refine incident response and post-mortems; coordinate with teams.
- Plan and drive Chaos initiatives across Engineering.
- Manage on-call processes and establish best practices.
Requirements
- Bachelor's or Master's in Computer Science or related field.
- 5+ years in reliability engineering, QA, or customer-facing engineering.
- Experience operating ClickHouse or SQL databases in production.
- Strong understanding of distributed DB internals and SQL; ClickHouse a plus.
- Scripting with Shell or Python; reading C++ code.
- Knowledge of AWS, Azure, or Google Cloud Platform.
- Strong problem-solver with solid production debugging skills.
- Thrives in a fast-paced global team and partners with the business.
- Shows high responsibility, ownership, and accountability.
- Excellent communication skills.
Benefits
- Flexible work environment; remote-friendly and global reach.
- Healthcare - Employer contributions towards healthcare.
- Equity in the company - Stock options for every new employee.
- Time off - Flexible time off in the US; generous time off elsewhere.
- Home office setup - $500 for remote employees.
- Global Gatherings - Company-wide offsites.