QA / Validation Engineer – Agentic AI & Machine Learning (2 Positions)
Toronto (Hybrid)
Full-time (FTE) or 6-month contract-to-hire (CTH)
We are seeking a QA / Validation Engineer to assure the quality, safety, and reliability of Machine Learning and Agentic AI solutions (LLM/RAG/tool-using agents) from development through production. This is a hands-on engineering role focused on designing test strategies, building automated evaluation pipelines, and implementing quality gates for data, models, prompts, tools, and end-to-end agent workflows. You will work primarily in Python and leverage an open-source AI/ML stack, collaborating closely with ML/GenAI engineers, data engineering, and platform teams in environments that may include Databricks and Spark.
Required Skills & Experience
- Strong QA engineering experience with a focus on AI/ML systems, including validation strategies beyond traditional functional testing.
- Strong Python skills (test design, automation frameworks, packaging, code quality, and performance awareness).
- Experience validating ML models and pipelines: dataset splits, leakage checks, metric selection, thresholding, and regression testing.
- Hands-on familiarity with an open-source AI stack (examples: scikit-learn, PyTorch/TensorFlow, XGBoost/LightGBM, Hugging Face ecosystem).
- Experience testing GenAI/LLM/agentic systems: prompt/version management, evaluation harnesses, and quality metrics for non-deterministic outputs.
- Understanding of RAG concepts (embeddings, vector search, retrieval, reranking) and how to evaluate them.
- Working knowledge of MLOps/LLMOps practices: experiment tracking, model/prompt versioning, reproducibility, and monitoring (e.g., MLflow or equivalent).
- Experience with CI/CD, containerization (Docker), and test reporting; ability to integrate evaluations into automated pipelines.
- Strong data skills: SQL fundamentals and experience with data analysis/validation using pandas/NumPy.
- Clear communication and stakeholder management, including the ability to translate quality risks into actionable engineering work.