Lead Data Engineer, AI

Salesforce, Inc.
Washington, US
Remote

Job Description

Salesforce is looking for a Data Engineer to join the Data & Analytics organization and help power the future of intelligent, agentic Customer Success. In this role, you'll build and scale the data infrastructure that grounds our AI agents in accurate, real-time, high-context data — bridging raw data systems with the agentic layer driving business impact. You'll serve as the lead technical individual contributor and primary architect for our retrieval systems, partnering across engineering, data science, and product to define and execute the technical vision for agentic data delivery.

What You'll Do

  • Design Search & Retrieval Systems — Build robust search indices that enable AI agents to perform complex, high-precision retrievals across the Salesforce data ecosystem.
  • Architect the Agentic Retrieval Layer — Serve as primary architect for the semantic layer and embedding pipelines that ground Agentic AI in Customer Success data.
  • Build Inference Infrastructure — Partner with Decision Scientists to develop specialized infrastructure for attribution and causal modeling.
  • Drive Operational Excellence — Set and enforce rigorous standards for data quality, latency, and index freshness so agents deliver reliable, real-time insights.
  • Lead AI Integration & Automation — Automate the data delivery pipeline, ensuring seamless integration across internal databases, third-party APIs, and the AI orchestration layer.
  • Define the Technical Roadmap — Collaborate with product managers and engineering leaders to shape the long-term vision for agentic retrieval, aligned with the broader migration to Data Cloud.
  • Mentor & Elevate the Team — Act as a technical pillar for a specialized team of data and AI engineers, fostering a culture of technical excellence and continuous growth.
  • Enable Cross-Functional Impact — Translate complex business needs from Decision Scientists, Data Scientists, PMs, and Engineering Leaders into scalable, production-ready solutions.

Required Qualifications

  • 8+ years of experience in data engineering or a closely related role.
  • Proficiency in Python, SQL, and distributed processing frameworks (e.g., Spark).
  • Hands-on experience with ETL/ELT tools such as Airflow, dbt, or Informatica.
  • Strong foundation in data modeling, database concepts, and data warehousing (SQL and NoSQL).
  • Experience with cloud data platforms (AWS, Azure, or Google Cloud).
  • Experience with the Salesforce ecosystem, including Data Cloud.
  • Proven ability to communicate technical concepts clearly and collaborate across cross-functional teams.
  • A degree in a related technical field is required.

Preferred Qualifications

  • Experience building semantic search indices, embedding pipelines, or retrieval-augmented generation (RAG) systems.
  • Familiarity with vector databases and AI/ML infrastructure.
  • Experience supporting or partnering with Decision Science or Data Science teams on causal or attribution modeling.
  • Track record of mentoring engineers and driving technical standards at a team or org level.

Skills & Requirements

Technical Skills

Python, SQL, Spark, Airflow, dbt, Informatica, ETL, ELT, Data modeling, Database concepts, Data warehousing, Cloud data platforms, Salesforce ecosystem, Data Cloud, Semantic search indices, Embedding pipelines, Retrieval-augmented generation, Vector databases, AI/ML infrastructure, Decision science, Data science, Causal modeling, Attribution modeling, Agentic retrieval, AI orchestration layer, Agentic data delivery, Agentic layer, High-context data, Agentic systems, LLMs, Prompt engineering, Fine-tuning language models, RAG systems, Embedding models, NLP similarity/search systems, A/B testing, Benchmarking, CI/CD delivery, GitHub, ADO, Kubernetes, Docker, Modern methodologies, Communication, Collaboration, Technical leadership, Mentorship, Technical standards, Technical vision, Cross-functional impact, Technical excellence, Continuous growth, Technical pillar, Team building, Technical requirements, Fast-paced research, Product environment, Scientific scrutiny, Scientific accuracy, Scientific rigor, AWS, CFA, AI, Data engineering, Data infrastructure, Agentic AI, Customer success, Data delivery, AI orchestration, Biotech, Pharma, Computational biology, Bioinformatics, Biomedical workflows, Biomedical tools, Biomedical data integrations, Biomedical domains, Genomics, Proteomics, Drug discovery, Clinical data, AI/ML systems, Biomedical problems, Biomedical discovery

Level

Mid-Level

Posted

4/24/2026
