Principal DevOps Engineer, Infrastructure Performance

Added: 2 days ago
Type: Full time
Salary: Not specified

Upgrade helps customers move in the right direction with affordable and responsible financial products. Since 2017, we’ve helped over 7 million customers access over $40 billion in consumer credit. With a relentless focus on improving our customers' financial well-being, we build products that put more money in their pockets and support their journey toward a better financial future. We’re backed by some of the most prominent technology investors and were most recently valued at $6.3B.

We’re consistently recognized for our collaborative and inclusive culture. Most recently, we were named one of the World’s Top Fintech Companies by CNBC, Best Places to Work by Built In, Best Places to Work by the San Francisco Business Times, America’s Greatest Workplaces by Newsweek, Best Startup Employer by Forbes, and Healthiest Employers by Phoenix Business Journal. 

We’re looking for new team members who get excited about designing and delivering new and better products. Come join us and help build a better financial future for millions of people.

What You'll Do:

  • Build a resilient, secure, and efficient cloud-based observability platform.

  • Monitor and troubleshoot platform issues, and develop solutions that reduce known issues.

  • Build and scale the observability infrastructure to meet rapidly increasing demand.

  • Develop and improve operational practices and procedures.

  • Sample projects:

    • Improve database monitoring: develop custom Prometheus exporters in Go for use cases that go beyond what is possible with SQL exporter. Create Grafana dashboards and alerts for these new metrics.

    • MCP servers for observability: deploy an MCP server to integrate our observability stack with our LLM tools.

What We Look For:

  • 8+ years of relevant production-level experience.

  • Experience with VictoriaMetrics.

  • Experience with Sumo Logic.

  • Experience with tracing tools (e.g. OpenTelemetry, Honeycomb, Tempo).

  • Experience with profiling tools (e.g. Pyroscope).

  • Knowledge of cloud monitoring, logging and cost management tools.

  • Programming/scripting knowledge (Go, Java, or Python) and understanding of JVM concepts.

  • In-depth knowledge of AWS services and hands-on experience provisioning AWS resources with Terraform.

  • Experience with containerized applications and Kubernetes / EKS, including creating and maintaining Helm charts.

  • Understanding of microservices architecture and debugging/investigation techniques.

  • Strong understanding of systems, networking and troubleshooting techniques.

  • Experience with automated build pipelines, continuous integration, and continuous deployment.

  • Ability to operate in an agile, entrepreneurial start-up environment.

  • Experience with running Linux in production.

Our Tech Stack:

  • Monitoring: VictoriaMetrics, Grafana, Prometheus, OpenTelemetry, Honeycomb, Sumo Logic.

  • Infrastructure as code: Terraform.

  • CD: GitOps, Argo CD, Argo Rollouts.

  • CI: Tekton.

  • Scripting: Bash.

  • Programming: Golang (preferred).

  • AWS: EKS, CloudWatch, S3, DynamoDB, RDS, SNS, SQS, Lambda.

What We Offer You: 

  • Competitive salary and stock option plan.

  • 100% paid coverage of medical, dental and vision insurance.

  • Flexible PTO.

  • Learning stipend for personal growth and development. 

  • Paid parental leave.

  • Health & wellness initiatives. 

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Upgrade does not accept unsolicited resumes from staffing agencies, search firms, or any third parties. Any resume submitted to any employee of Upgrade without a prior written agreement in place will be considered the property of Upgrade, and Upgrade will not be obligated to pay any referral or placement fee. Agencies must obtain advance written approval from Upgrade's Talent Acquisition department to submit resumes and only in conjunction with a valid, fully executed agreement. English is required for all positions, as they involve interacting with staff at Upgrade's offices worldwide.
