Machine Learning with AWS SageMaker -- DWIDC5532895
Compunnel Inc.
Toronto, CA; US
Job Description
Model Development & Training: Building and refining ML models using frameworks like TensorFlow, PyTorch, and scikit-learn within SageMaker Studio.
Data Engineering & Labeling: Designing automated data pipelines and managing high-quality datasets using tools like SageMaker Ground Truth and SageMaker Data Wrangler.
Operationalizing ML (MLOps): Implementing CI/CD for machine learning through SageMaker Pipelines, automating model retraining, and managing model versions in the SageMaker Model Registry.
Deployment & Inference: Deploying models for real-time or batch inference and managing multi-model endpoints to ensure low latency and high availability.
Performance Monitoring: Using SageMaker Model Monitor and Clarify to track model quality, detect bias, and identify feature drift in production.
Optimization: Tuning hyperparameters and optimizing training costs using Managed Spot Training and distributed training libraries.
Essential Skills & Qualifications
AWS Expertise: Proficiency in Amazon SageMaker and related services such as S3, Lambda, IAM, and Step Functions.
Programming: Strong command of Python (specifically the SageMaker Python SDK) or R, plus SQL.
ML Frameworks: Deep experience with modern libraries including PyTorch, TensorFlow, and XGBoost.
Mathematical Foundation: Solid understanding of statistics, linear algebra, and predictive modeling.
Cloud Infrastructure: Experience managing compute clusters and VPCs and applying security best practices.