Overview
Speechify is seeking a Software Engineer, Data Infrastructure & Acquisition to join our team in Penang, Malaysia. In this role, you will design, build, and maintain data infrastructure and data ingestion pipelines to support analytics, ML initiatives, and product features.
Responsibilities
- Design, implement, and maintain data ingestion pipelines from internal and external sources
- Build and optimize data infrastructure including data warehouses and data lakes
- Collaborate with data science, analytics, and product teams to understand data needs
- Monitor pipelines and improve data quality, reliability, and performance
- Write clean, well-tested code; document data workflows
- Ensure data processing meets security, privacy, and compliance requirements
Requirements
- Experience building data pipelines and working with large-scale datasets
- Proficiency in Python and SQL
- Experience with ETL/ELT processes, data ingestion, and data modeling
- Familiarity with cloud platforms (e.g., AWS)
- Experience with data orchestration tools (e.g., Apache Airflow)
- Experience with version control using Git; strong problem-solving and communication skills
- Bachelor's degree in Computer Science or a related field
Nice-to-have
- Experience with Spark, Hadoop, Snowflake, Redshift, or other modern data platforms
- Experience with streaming data and real-time processing
About Speechify
Speechify helps people unlock the power of multi-sensory reading and learning through advanced text-to-speech technology. This position is based in Penang, Malaysia, and offers an opportunity to shape data-driven decision-making across the company.