AI/ML Engineer, MyHealthTeam

Swoop
San Francisco, US
On-site

Job Description

Position: Staff AI/ML Engineer, MyHealthTeam

Join us to redefine the patient experience

MyHealthTeam builds communities for people living with chronic and rare conditions. We reach millions of people each month, and we're investing deeply in AI to help members find the right support, content, and recommendations - safely, responsibly, and at scale.

We're looking for a Staff AI/ML Engineer who is product-minded, hands-on, and excited to ship production ML/LLM systems that turn messy, real-world text and behavioral data into reliable experiences for patients.

What you get to do every day

  • Build end-to-end ML/LLM features from problem definition → data → modeling → evaluation → deployment → monitoring.
  • Develop LLM applications with retrieval and tool use (e.g., RAG, orchestration/workflows, structured extraction) to deliver trustworthy consumer health experiences.
  • Convert unstructured text (posts, comments, messages, search queries) into structured signals (topics, entities, intent, sentiment, safety flags) using a mix of classical NLP and modern LLMs.
  • Create and maintain data pipelines for training, inference, evaluation, and analytics (batch and/or streaming as needed).
  • Design evaluation systems that measure quality and safety: offline metrics, golden datasets, human review workflows, and alignment with online A/B testing.
  • Implement production guardrails to reduce harm and misinformation risk (policy constraints, refusal behavior, citations/attribution when appropriate, red-teaming, monitoring, and incident response).
  • Set up monitoring for model + system health (latency, cost, drift, regressions, quality metrics).
  • Partner closely with the Product, Engineering, and Data teams and clinical/subject-matter experts to validate outputs and define what "correct" means for sensitive, health-adjacent use cases.
  • (Staff scope) Lead architecture and technical direction for applied AI across the organization; mentor engineers; establish best practices and reusable platforms.

Examples of problems you might work on

  • Personalized recommendations for communities, posts, resources, or next-best actions
  • Safer content understanding: detection of misleading/high-risk health claims, escalation workflows
  • Search and discovery improvements using embeddings, hybrid retrieval, and ranking
  • Summarization and structuring of long threads into navigable insights (with safety constraints)
  • Member intent understanding from behavioral + text signals

Must-have qualifications

  • 8+ years building and shipping production ML systems (or equivalent experience with demonstrable impact)
  • Strong Python skills and experience with ML/LLM libraries and tooling (e.g., Hugging Face ecosystem, LangChain/LangGraph, or equivalent)
  • Proven ability to design production-grade pipelines (training/inference/eval) and operate models in real systems (monitoring, rollbacks, incident handling)
  • Solid grounding in ML fundamentals (NLP, deep learning, statistical reasoning, evaluation)
  • Experience with MLOps best practices: versioning, reproducibility, CI/CD, model registry patterns, feature/data management, and infrastructure collaboration
  • Experience working with large-scale data using Databricks/Spark or equivalent distributed processing
  • Strong product and stakeholder instincts: you can translate ambiguous business needs into measurable ML outcomes

Nice-to-have qualifications

  • Experience building RAG and retrieval systems: vector databases, hybrid search, ranking, recommendation, query understanding
  • Experience in healthcare or regulated environments, including privacy-by-design, auditability, and safety reviews (HIPAA/PHI familiarity a plus)
  • Experience with streaming/clickstream data, experimentation platforms, and causal/measurement thinking
  • Ability to prototype end-to-end experiences (e.g., Streamlit, Gradio, lightweight frontends)
  • Experience designing LLM safety systems: red-teaming, adversarial testing, prompt injection mitigation, output filtering, human-in-the-loop review

Some tools we use

  • Databricks/Spark for distributed processing
  • Redshift and BI tools (Looker/Tableau) for analytics
  • Terraform for infrastructure-as-code; Airflow for orchestration; GitHub Actions for CI/CD

  • AWS (including Bedrock) and a mix of private and open-weight…

Skills & Requirements

Technical Skills

Python, Hugging Face ecosystem, LangChain/LangGraph, Databricks/Spark, Redshift, Looker/Tableau, Terraform, Airflow, GitHub Actions, AWS Bedrock, Leadership, Communication, Stakeholder management, Healthcare

Employment Type

FULL TIME

Level

mid

Posted

4/15/2026
