Applied Scientist - Multimodal Guardrails & Evaluation

Adobe
Seattle, US

Job Description

The success of generative AI guardrails is not determined solely by model architecture; it is defined by the quality of the concepts we protect, the robustness of our detection science, and the realism of our evaluation methodology.

The Adobe Firefly Applied Science & Machine Learning team is expanding its IP guardrail systems for building safe and compliant image, video, and audio generative models. A central challenge is long-tail IP coverage: foundation models cannot internalize every protected concept. Advancing our guardrails therefore requires principled approaches to multimodal knowledge integration, detection modeling, and real-world evaluation science.

We are seeking a P40 Applied Scientist to drive the scientific development of IP concept modeling, detection strategies, and evaluation frameworks that determine the robustness ceiling of our guardrail systems. This is a research-oriented role focused on advancing multimodal detection and evaluation methodologies at production scale.

Research Areas You Will Drive

Multimodal Concept Modeling & Knowledge Integration

  • Develop principled methods for representing and organizing large-scale IP concept spaces across text, image, and audio.
  • Study how retrieval-augmented generation (RAG), embedding alignment, and structured knowledge can complement multimodal foundation models.
  • Investigate strategies for improving long-tail concept coverage beyond what VLMs inherently encode.
  • Design concept modeling techniques that meaningfully influence downstream guardrail decisions.

Data Acquisition & Curation for Detection

  • Develop scalable approaches for acquiring and curating high-quality multimodal datasets that improve detection coverage for IP-sensitive concepts.
  • Drive long-tail expansion through targeted data collection, web-scale sourcing, and synthetic data generation.
  • Analyze failure cases in generative outputs to inform targeted data acquisition and dataset refinement.
  • Design efficient data curation and labeling strategies that improve signal quality and robustness of downstream detection systems.

Evaluation Methodology & Benchmark Design

  • Define evaluation frameworks that reflect real-world Firefly usage patterns.
  • Design multimodal benchmark datasets that stress-test guardrails under realistic and adversarial scenarios.
  • Develop metrics that capture over-blocking, under-blocking, semantic similarity, and near-miss generation.
  • Establish statistically rigorous offline and online evaluation strategies that guide research prioritization.
  • Study how evaluation quality constrains and enables system-level progress.

Scientific Iteration Guided by Product Feedback

  • Leverage large-scale product feedback signals to identify systematic weaknesses in guardrail behavior.
  • Translate real-world interaction patterns into structured evaluation hypotheses.
  • Build reproducible experimental pipelines that enable continuous scientific iteration.

What You'll Do

  • Lead scientific design of IP concept expansion and detection methodologies.
  • Formulate hypotheses around long-tail coverage, detection robustness, and evaluation gaps.
  • Run rigorous experiments to quantify performance ceilings and identify high-leverage improvements.
  • Partner closely with generative model scientists to ensure alignment between detection, guidance, and evaluation systems.
  • Contribute to intellectual property and potential publications in multimodal learning, evaluation science, or AI safety.

What You'll Need to Succeed

Research & Technical Depth

  • PhD or MS in Computer Science, Machine Learning, AI, or related field.
  • 5+ years of experience in applied ML, multimodal systems, or evaluation research.
  • Strong understanding of Vision-Language Models, multimodal transformers, and embedding-based retrieval systems.
  • Experience designing and analyzing large-scale benchmarks and evaluation datasets.
  • Solid background in statistical analysis, experimental design, and performance trade-off evaluation.
  • Proficiency in Python and modern ML frameworks (e.g., PyTorch).

Scientific Approach

  • Ability to reason about long-tail distributions and concept coverage gaps.
  • Experience analyzing complex multimodal system failure modes.
  • Strong intuition for measurement quality and evaluation bias.
  • Comfort operating in ambiguous, research-driven problem spaces.

AI-Accelerated Research Execution

  • Demonstrated ability to use AI coding tools and AI-assisted workflows to rapidly prototype evaluation frameworks, detection experiments, and data analysis.
  • Ability to scale scientific insight through high-velocity experimentation.

Preferred Qualifications

  • Experience in safety evaluation, trust & safety systems, or content moderation science.

#FireflyGenAI

About Adobe

Adobe empowers everyone to create through innovative platforms and tools that unleash creativity, productivity, and personalized customer experiences. Adobe's industry-leading offerings include Adobe Acrobat Studio, Adobe Express, Adobe Firefly, and Creative Cloud.

Skills & Requirements

Technical Skills

Multimodal Concept Modeling, Knowledge Integration, Data Acquisition, Curation, Evaluation Methodology, Benchmark Design, Safety Evaluation, Trust & Safety Systems, Content Moderation Science, Communication, AI Safety, Machine Learning

Level

Senior

Posted

3/18/2026
