QA/Validation Engineer with AI/ML

Arkhya Tech. Inc.

Toronto, CA; US

Hybrid

Job Description

QA / Validation Engineer – Agentic AI & Machine Learning (2 Positions)

Toronto (Hybrid)

Full-time (FTE) or 6-month contract-to-hire (CTH)

We are seeking a QA / Validation Engineer to assure the quality, safety, and reliability of Machine Learning and Agentic AI solutions (LLM/RAG/tool-using agents) from development through production. This is a hands-on engineering role focused on designing test strategies, building automated evaluation pipelines, and implementing quality gates for data, models, prompts, tools, and end-to-end agent workflows. You will work primarily in Python and leverage an open-source AI/ML stack, collaborating closely with ML/GenAI engineers, data engineering, and platform teams in environments that may include Databricks and Spark.

Required Skills & Experience

Strong QA engineering experience with a focus on AI/ML systems, including validation strategies beyond traditional functional testing.
Strong Python skills (test design, automation frameworks, packaging, code quality, and performance awareness).
Experience validating ML models and pipelines: dataset splits, leakage checks, metric selection, thresholding, and regression testing.
Hands-on familiarity with an open-source AI stack (examples: scikit-learn, PyTorch/TensorFlow, XGBoost/LightGBM, Hugging Face ecosystem).
Experience testing GenAI/LLM/agentic systems: prompt/version management, evaluation harnesses, and quality metrics for non-deterministic outputs.
Understanding of RAG concepts (embeddings, vector search, retrieval, reranking) and how to evaluate them.
Working knowledge of MLOps/LLMOps practices: experiment tracking, model/prompt versioning, reproducibility, and monitoring (e.g., MLflow or equivalent).
Experience with CI/CD, containerization (Docker), and test reporting; ability to integrate evaluations into automated pipelines.
Strong data skills: SQL fundamentals and experience with data analysis/validation using pandas/NumPy.
Clear communication and stakeholder management, with the ability to translate quality risks into actionable engineering work.

Skills & Requirements

Technical Skills

Python, scikit-learn, PyTorch, TensorFlow, XGBoost, LightGBM, Hugging Face, Docker, MLflow, Leadership, Communication, AI, ML

Employment Type

Full-time

Level

Mid

Posted

4/6/2026
