AI Data Infrastructure Engineer

VirtualVocations
Los Angeles, US
Remote

Why this role

Pace: Steady
Collaboration: Medium
Autonomy: Medium
Decision Impact: Team
Role Level: Individual Contributor

Derived from job-description analysis by Serendipath's career intelligence engine.

What success looks like

  • successful deployment of large-scale data pipelines
  • high data quality and integrity

Typical background

  • 6+ years of data engineering experience
  • Degree in Computer Science or related field

Transferable backgrounds

  • Coming from data engineering
  • Coming from AI infrastructure

Skills & requirements

Required

  • Large-scale Data Systems
  • AI Training And Evaluation Pipelines
  • Data Cleaning
  • Petabyte-scale Storage

Preferred

  • Data Visualization
  • Cloud Computing

Stack & domain

  • Python
  • JVM Or Systems Language
  • Spark
  • Ray
  • Beam
  • Petabyte-scale Storage And Pipeline Systems
  • AI
  • Data Infrastructure
  • Data Engineering
  • Machine Learning
  • Data Processing Frameworks
  • Petabyte-scale Storage

About the role

Original posting from VirtualVocations

The AI Data Infrastructure Engineer is a full-time remote position requiring over six years of experience, focused on building and operating large-scale data systems for AI training and evaluation pipelines.

Key Responsibilities

  • Design and operate large-scale data pipelines supporting AI training and evaluation workflows
  • Build ingestion systems for various data modalities, including text, image, and audio
  • Implement data cleaning and quality assurance processes at petabyte scale

Required Qualifications

  • Bachelor's or Master's degree in Computer Science or a related field
  • Six or more years of data engineering experience, particularly with ML or AI workloads
  • Strong proficiency in Python and at least one JVM or systems language
  • Deep experience with modern data processing frameworks such as Spark, Ray, or Beam
  • Hands-on experience with petabyte-scale storage and pipeline systems

Source: VirtualVocations careers
