Related skills
BigQuery, GitHub Actions, Python, SharePoint Online, Copilot Studio
Description
- Design and maintain business test suites for Master Agent and domain agents.
- Build evaluation datasets (PL/EN) with edge cases.
- Perform response quality evaluation using metrics: accuracy, top-k recall, groundedness.
- Conduct PII and compliance testing: masking, anonymization, sensitive data handling.
- Test guardrails: undesired output handling, prompt security, and "I don't know" policy enforcement.
- Validate conversational UX and test integrations with Copilot Studio, Azure AI Search, Azure OpenAI / Foundry, Document Intelligence, SharePoint Online.
Requirements
- Experience testing LLM-based or agent systems, or a strong QA background with an AI focus.
- Ability to design test scenarios, cases, and evaluation datasets.
- Basic Python skills (pandas, REST APIs, simple scripts).
- Familiarity with Copilot Studio and domain-agent integrations.
- Basic knowledge of Azure AI Search, SharePoint Online, and Document Intelligence.
- Understanding of automated evaluation methods (LLM scoring, benchmarks).
Benefits
- Flexible collaboration model based on a B2B contract.
- Opportunity to work on diverse projects.
- Hybrid work: 3 days/week on-site in Warsaw.