Related skills
nlp, python, pytorch, data pipelines, scikit-learn
📋 Description
- Define evaluation methodology for content classification; require model cards before ship.
- Lead the multiclass refactor: convert binary classifiers into multi-label, multi-class classifiers across categories (Adult Content, Violence, Self-Harm, Social Media).
- Build labeled evaluation datasets; address imbalance; document dataset curation decisions.
- Connect offline evaluation to production monitoring; surface classification drift and error patterns early.
- Investigate misclassifications; produce root-cause analyses and corrective actions.
- Build training data pipelines: ingestion, cleaning, labeling, and versioning at scale.
🎯 Requirements
- Multi-label/multi-class ML with robust evaluation; handle class imbalance.
- Python ML stack: scikit-learn, PyTorch or TF, pandas, numpy.
- NLP features for web content: URL tokenization, domain analysis, TF-IDF or embeddings.
- Rigorous evaluation: precision/recall tradeoffs, confusion matrices, A/B testing.
- Data engineering for ML: training pipelines, data versioning, labeling workflows.
- Technical communication with engineers and leadership; ability to influence stakeholders.
🎁 Benefits
- Comprehensive Health Insurance (employee, parents, spouse, children)
- Accidental & Term Life Insurance
- Learning & Development reimbursement
- Paid Time Off
- Public Holidays (10+ per year)
- Retirement Benefits (EPF & gratuity)