ML Ops Engineer

Software Technology Inc.
Austin, US
Hybrid

Job Description

Hi,

This is Vamshi from Software Technology. We have a job opening with our client for the position of ML Ops Engineer. If you are available and looking for new opportunities, please send me your updated resume for the position below ASAP.

Job Title: ML Ops Engineer

Location: Austin, TX (Hybrid)

Duration: Long-term Contract

Looking for locals in Texas

Looking for senior, hands-on consultants

Key must-have skills: Python, AWS SageMaker, SageMaker Pipelines, MLflow, Kubeflow, Docker, Kubernetes, Amazon EKS, CI/CD, CodePipeline, CodeBuild, MLOps, Model Registry, Model Monitoring, Drift Detection, Step Functions, Lambda, S3, CloudWatch, CloudFormation, Infrastructure-as-Code

Job Description

We are looking for an experienced MLOps Engineer to design, build, and manage scalable machine learning infrastructure on AWS. This role will drive the end-to-end operationalization of ML models — from automated training pipelines and experiment tracking to production deployment, monitoring, and continuous retraining. The ideal candidate will bridge the gap between data science and engineering, establishing robust MLOps practices that ensure reliable, repeatable, and efficient delivery of ML solutions at scale using AWS-native services and industry-leading tools like SageMaker, Kubeflow, and MLflow.

Required Skills & Qualifications

  • Proficiency in production-grade machine learning system development, deployment, and MLOps practices on AWS
  • Strong experience with Python and ML frameworks such as TensorFlow or PyTorch
  • Familiarity with containerization and orchestration tools like Docker and Kubernetes (including Amazon EKS)
  • Hands-on experience with CI/CD pipelines using AWS-native tools such as CodePipeline, CodeBuild, and CodeDeploy
  • Advanced knowledge of AWS cloud services, particularly SageMaker, Bedrock, Lambda, Step Functions, and S3
  • Expertise in MLOps tools and platforms including Kubeflow, MLflow, and AWS SageMaker Pipelines for end-to-end model lifecycle management
  • Experience with model versioning, experiment tracking, model registry, and automated retraining workflows
  • Familiarity with AWS infrastructure-as-code tools such as CloudFormation or CDK
  • Strong understanding of model monitoring, drift detection, and A/B testing in production environments
  • Strong analytical and troubleshooting skills to maintain high system reliability using CloudWatch, X-Ray, and AWS-native observability tools

Roles & Responsibilities

  • Design, implement, and manage AWS-based MLOps infrastructure to support large-scale machine learning workflows
  • Build and maintain end-to-end ML pipelines using SageMaker Pipelines, Step Functions, and Kubeflow for automated training, validation, and deployment
  • Implement model versioning, experiment tracking, and model registry practices using MLflow and SageMaker Model Registry
  • Develop and maintain CI/CD pipelines for ML models, ensuring seamless integration from development to production
  • Demonstrate hands-on expertise in Python and frameworks like TensorFlow or PyTorch, with deployment on SageMaker endpoints
  • Utilize Docker, Amazon EKS, and AWS-native CI/CD tools to streamline ML deployment and operations
  • Leverage core AWS services such as S3, EC2, Lambda, Glue, and Athena for building and scaling data and ML infrastructure
  • Deploy, manage, and optimize machine learning models in production using SageMaker real-time endpoints and batch inference (Batch Transform) jobs
  • Implement automated model monitoring, drift detection, and retraining triggers to maintain model health in production
  • Set up A/B testing and canary deployment strategies for safe model rollouts
  • Collaborate with data scientists and engineering teams to standardize MLOps practices and enhance performance across the AWS ecosystem
  • Monitor system and model performance using CloudWatch, CloudTrail, and X-Ray, troubleshoot issues, and ensure high availability and reliability
  • Stay informed about the latest AWS service releases, MLOps best practices, and advancements in ML operations tooling

Skills To Be Evaluated On

Python, AWS SageMaker, SageMaker Pipelines, MLflow, Kubeflow, Docker, Kubernetes, Amazon EKS, CI/CD, CodePipeline, CodeBuild, MLOps, Model Registry, Model Monitoring, Drift Detection, Step Functions, CloudFormation, Infrastructure-as-Code

Thanks,

Vamshi Thangadpalli

Technical Recruiter

Email: vamshi.t@stiorg.com | Web: www.stiorg.com

https://www.linkedin.com/in/vamshi-thangadpalli-3a0415251/

100 Overlook Center, Suite 200

Princeton, NJ 08540.

Skills & Requirements

Technical Skills

Python, AWS SageMaker, SageMaker Pipelines, MLflow, Kubeflow, Docker, Kubernetes, Amazon EKS, CI/CD, CodePipeline, CodeBuild, MLOps, Model Registry, Model Monitoring, Drift Detection, Step Functions, Lambda, S3, CloudWatch, CloudFormation, Infrastructure-as-Code, machine learning, cloud computing

Employment Type

CONTRACT

Level

mid

Posted

4/9/2026
