Related skills
nlp, python, computer vision, pytorch, data quality
📋 Description
- Own the synthetic data generation track for ABBYY’s Document AI Data team.
- Build generative pipelines for high-quality, diverse training data at scale.
- Ensure synthetic data aligns with real-world document structures.
- Develop evaluation frameworks to measure data quality and model impact.
- Collaborate with Modeling teams to assess downstream performance.
- Own end-to-end architecture from data generation to validation.
🎯 Requirements
- 5+ years of ML/AI experience, including deep generative modeling.
- Strong expertise in deep generative modeling and data quality.
- Experience with Vision-Language Models (VLMs) and NLP.
- Proficiency in Python and PyTorch for research and production.
- Experience building large-scale data pipelines and production systems.
- Ability to design robust data quality metrics and dashboards.
🎁 Benefits
- Remote and hybrid working options.
- Flexible hours to support balance across teams.
- Two paid volunteering days per year.
- Paid parental leave in all locations.
- Comprehensive medical, accident, and life insurance.
- Generous paid time off and wellness programs.