Related skills
sql python airflow kafka spark
Description
- Build scalable, fault-tolerant batch and streaming data processing systems that handle tens of terabytes daily.
- Design robust data solutions and transform datasets into self-service models.
- Develop pipelines ensuring data quality and resilience to imperfect sources.
- Define and maintain data mappings, transformations, and data quality standards.
- Debug low-level systems, measure performance, and optimize large production clusters.
- Influence architecture decisions and own initiatives from concept to delivery.
Requirements
- Strong SQL skills.
- Proficiency in Python.
- Proficiency in at least one object-oriented language.
- Experience with big data technologies (HDFS, YARN, MapReduce, Hive, Kafka, Spark).
- AWS or GCP experience is a plus.
- Looker experience is a plus.
- BS in Computer Science required; MS preferred.
Benefits
- Global mental health and financial wellness resources
- Medical, dental, and vision coverage along with life and disability insurance
- 401(k)/pension options
- Generous time off and leave policies
- Hybrid work with Cambridge office and flexible Fridays