Software Engineer, Data Infrastructure

Added: 1 day ago
Type: Full time
Salary: Not provided

Related skills

Python, Kafka, Spark, Iceberg, Flink

📋 Description

  • Build and evolve the data systems powering Cursor’s product.
  • Design and operate large-scale batch data systems with Spark and Ray Data.
  • Scale data ingestion pipelines to billions of rows per day.
  • Re-architect prompt and model storage on S3, focusing on cost, performance, and usability.
  • Build and maintain streaming data infrastructure (Kafka, Flink, or similar).
  • Work across warehouses and lakehouse formats like Iceberg/Delta Lake.
  • Improve data developer experience for Python-heavy workflows.
  • Support replication and change data capture pipelines (DMS, Debezium).

🎯 Requirements

  • Deep experience with Spark (Databricks or open-source)
  • Production experience with Ray Data
  • Ownership of large data pipelines and storage systems
  • Comfort debugging performance across compute, storage, networking
  • Clear thinking about data modeling and maintainability
  • Experience running or scaling ClickHouse
  • Familiarity with dbt, Dagster, or similar tooling