AI Data Infrastructure Engineer

VirtualVocations

Washington, US

Remote
Career-pivot friendly

Why this role

Pace

Steady

Collaboration

Medium

Autonomy

Medium

Decision Impact

Individual

Role Level

Individual Contributor

Career Pivot Friendly

Welcomes transferable skills

Derived from job-description analysis by Serendipath's career intelligence engine.

What success looks like

large-scale data infrastructure
AI training and evaluation pipelines

Typical background

data engineering
software engineering

Transferable backgrounds

Coming from a data scientist role
Coming from a software engineer role

Skills & requirements

Required

Data-pipelines
Data-quality-assurance
Large-scale-systems
Python
Data-processing-frameworks

Preferred

Cloud-computing
Machine-learning

Stack & domain

Python
Spark
Ray
Beam

About the role

Original posting from VirtualVocations

The AI Data Infrastructure Engineer is a full-time remote position requiring over six years of experience. The role focuses on building and operating large-scale data systems for AI training and evaluation pipelines.

Key Responsibilities

Design and operate large-scale data pipelines supporting AI training and evaluation workflows
Build ingestion systems for various data modalities, including text, image, and audio
Implement data cleaning and quality assurance processes at petabyte scale

Required Qualifications

Bachelor's or Master's degree in Computer Science or a related field
Six or more years of data engineering experience, particularly with ML or AI workloads
Strong proficiency in Python and at least one JVM or systems language
Deep experience with modern data processing frameworks such as Spark, Ray, or Beam
Hands-on experience with petabyte-scale storage and pipeline systems

Source: VirtualVocations careers