Derived from job-description analysis by Serendipath's career intelligence engine.
Original posting from VirtualVocations
AI Data Infrastructure Engineer

This is a full-time remote position requiring six or more years of experience. The role focuses on building and operating large-scale data systems for AI training and evaluation pipelines.

Key Responsibilities
- Design and operate large-scale data pipelines supporting AI training and evaluation workflows
- Build ingestion systems for multiple data modalities, including text, image, and audio
- Implement data cleaning and quality assurance processes at petabyte scale

Required Qualifications
- Bachelor's or Master's degree in Computer Science or a related field
- Six or more years of data engineering experience, particularly with ML or AI workloads
- Strong proficiency in Python and at least one JVM or systems language
- Deep experience with modern data processing frameworks such as Spark, Ray, or Beam
- Hands-on experience with petabyte-scale storage and pipeline systems