Strategy & Team Leadership: Directly manage the DevOps, SRE, and DBRE infrastructure teams and align their priorities under a unified reliability strategy. Set objectives, drive execution, and ensure resources are focused on high-impact reliability investments.
Platform Reliability & Incident Prevention: Conduct ongoing risk assessments of Filevine's platform; identify areas of fragility and drive proactive hardening to reduce unplanned downtime.
Reliability Metrics & Reporting: Define and track uptime/availability, MTTD, MTTR, and incident frequency. Own reporting to make platform health visible to leadership and product teams.
Status Page & Incident Communication: Manage updates to status.filevine.com during reliability events; define criteria for posting incidents and ensure timely, accurate updates for customers and internal stakeholders.
Cross-Functional Alignment: Bridge SRE, Product, Engineering, and customer-facing teams so that plans reflect reliability priorities; translate reliability trends into actionable insights for non-technical stakeholders.
Infrastructure & Tooling: Evaluate, implement, and manage the reliability and observability stack; drive decisions on monitoring, alerting, test environments, and tooling to scale the platform.

5+ years of experience in SRE, DevOps, platform engineering, or reliability-focused product/program management in SaaS.
Software Engineering Background: Hands-on experience as a software engineer; comfortable reading code and discussing architecture.
SRE & Infrastructure Expertise: Strong understanding of site reliability principles, cloud infrastructure, database reliability, container orchestration, and modern DevOps practices.
Risk Assessment & Data Proficiency: Strong analytical skills, with the ability to use data sources (monitoring platforms, Pendo, Domo, Salesforce, incident logs) to prioritize reliability work by business impact.
Communication Mastery: Ability to translate complex reliability data into clear narratives for leadership and cross-functional partners; experience leading incident reviews.
SDLC & Release Lifecycle Knowledge: Deep understanding of SDLC, release protocols, and incident response; ability to identify high-leverage reliability investments.
Education: B.S./M.S. in computer science or a related field, or equivalent direct work experience with a demonstrated track record in software engineering and/or SRE.

Sr Technical Product Manager
