Data Engineer – AI/ML Data Infrastructure

Yogi Careers

Boston, US

Job Description

Overview:

We’re looking for a Data Engineer to build and maintain the data infrastructure that powers machine learning initiatives. You’ll work at the intersection of software engineering and data science.

Responsibilities:

Develop and maintain feature stores and ML-ready datasets
Automate data preprocessing pipelines for ML model training and evaluation
Collaborate with ML engineers to enable scalable experimentation workflows
Monitor and improve data reliability, lineage, and reproducibility

Requirements:

BS/MS in Computer Science, Data Engineering, or a similar field
Experience with ML platforms (Databricks, Amazon SageMaker, Vertex AI)
Strong Python and SQL skills, plus familiarity with Spark or Dask
Experience with Airflow, MLflow, or Kubeflow pipelines
Solid understanding of MLOps, data validation, and model versioning

Job Category: Data Engineer

Job Type: Full Time

Job Location: Boston

Skills & Requirements

Technical Skills

Python, SQL, Spark, Dask, Airflow, MLflow, Kubeflow, MLOps, Data validation, Model versioning, Data engineering, Machine learning, Data science

Employment Type

Full Time

Level

Mid-Level

Posted

4/24/2026
