Overview
CoreWeave is seeking a Senior Systems Engineer, OS Automation to own and advance OS automation across our fleet of GPU-accelerated infrastructure. This is a hands-on role based at one of our hub locations (Livingston, NJ; New York City, NY; Sunnyvale, CA; Bellevue, WA) or under a flexible arrangement, depending on business needs.
Responsibilities
- Design, implement, and maintain OS automation pipelines for provisioning, configuration management, and lifecycle management of servers across data centers and cloud environments.
- Create and maintain golden OS images and baseline configurations; manage patching, security updates, and compliance.
- Develop tooling using Python, Bash, and industry-standard configuration management tools (Ansible, Puppet, Chef).
- Build infrastructure-as-code for OS-related components (Packer, Terraform, etc.).
- Collaborate with SRE, DevOps, and security teams to improve reliability, security, and compliance.
- Enhance monitoring, logging, alerting, and incident response through automation; contribute to CI/CD workflows for OS updates.
- Document processes, standards, runbooks, and best practices.
- Participate in on-call rotation and incident response as needed.
Requirements
- 5+ years of systems engineering or SRE experience with OS automation.
- Strong experience with Linux (RHEL/CentOS/Ubuntu) and automation tooling.
- Proficiency in scripting languages (Python, Bash) and data-driven approaches.
- Experience with configuration management tools (Ansible, Puppet, Chef) and CI/CD pipelines.
- Experience with virtualization/containerization (Docker, Kubernetes) and cloud platforms (AWS, GCP, Azure).
- Familiarity with OS image creation (Packer) and provisioning tools.
- Strong problem-solving, communication, and teamwork skills.
Nice to have
- Experience in GPU compute environments or data center operations.
- Security-focused mindset, with experience in patch management and vulnerability scanning.
- Knowledge of security standards and compliance.
About CoreWeave
CoreWeave provides scalable GPU cloud infrastructure for AI research and production workloads. This role offers an opportunity to shape the OS automation strategy for a fast-growing cloud provider.