Related skills
AWS, SQL, Python, Databricks, Airflow
Description
- Design scalable data pipelines and lakehouse infrastructure on AWS (PySpark, Databricks, Airflow).
- Improve data platform dev experience with abstractions, tooling, and docs.
- Own core data pipelines and models powering dashboards, ML, and products.
- Own Data & ML platform infrastructure with Terraform; manage Databricks workspaces and access.
- Lead projects to improve data quality, testing, observability, and cost efficiency.
- Partner with the Data Science team to design scalable solutions and provide end-to-end support.
Requirements
- 4+ years of software engineering experience, with a focus on data infrastructure, pipelines, and distributed systems.
- Advanced proficiency in Python and SQL; hands-on Spark experience.
- Experience with AWS (S3, RDS), Databricks, Airflow, and lakehouse architectures.
- Hands-on experience with Hadoop, Hive, Kafka (or a similar streaming platform), Delta Lake/Iceberg, and Trino/Presto.
- Familiarity with ingestion frameworks, data versioning, lineage, partitioning, and clustering.
- Strong problem-solving, ownership, and collaboration skills.
Benefits
- Equity grant
- Medical, dental & vision insurance
- Work from home flexibility
- Unlimited PTO
- 401(k)
- Paid parental leave