Principal AI Research Scientist

Cerebro
San Francisco, US

Job Description

We are an innovative AI startup developing a revolutionary infrastructure layer for AI systems that allows models and agents to interact with critical real-world data accurately.

Today, over 80% of enterprise data is unstructured and multimodal (documents, tables, charts, and images), yet current systems such as LLMs, OCR, and RAG pipelines struggle to process it effectively.

Our mission is to address this gap by creating cutting-edge vision-language models and systems that convert intricate, real-world inputs into structured, machine-readable formats for downstream models to use efficiently.

Our solutions are currently deployed in high-stakes fields (finance, legal, healthcare) where accuracy is critical, and they are outperforming existing solutions on real-world benchmarks.

We view this task—transforming complex realities into understandable formats for AI systems—as one of the most pressing unsolved challenges in our space.

The Role

We are looking for a Founding ML Researcher to collaborate with our founders in shaping and executing our core research agenda.

This position offers a unique opportunity to work at the nexus of:

  • Multimodal foundation models
  • Vision-language reasoning
  • Structured generation and parsing
  • Reliability and determinism in AI systems

You'll have complete ownership of the research lifecycle, from first-principles design and model building to training and production deployment.

What You'll Work On

  • Creating innovative architectures for multimodal understanding (documents, tables, layouts, graphs)
  • Advancing beyond traditional LLM approaches into structure-aware and layout-aware models
  • Enhancing factual accuracy, determinism, and reliability of model outputs
  • Integrating systems that combine:
      • Vision models
      • Language models
      • Structured decoding and constrained generation
  • Establishing evaluation frameworks for real-world correctness (not just benchmark results)
  • Shipping research directly into production systems that serve real customers

Who This Is For

We are particularly interested in candidates who currently work at (or would be competitive for roles at):

  • Leading labs (e.g., Anthropic, OpenAI, DeepMind, Meta)
  • Premier research teams or high-caliber startups engaged in foundational ML work

You should possess:

  • Strong expertise in deep learning and ML research
  • Experience with at least one of the following:
      • Multimodal models (VLMs, vision transformers, etc.)
      • LLMs / generative models
      • Representation learning or structured prediction
  • A demonstrated history of building or deploying real systems, not solely academic publications
  • A preference for first-principles thinking over incremental advancements

What Makes This Different

  • Greenfield research opportunities: You will help shape key parts of the roadmap
  • Fast feedback loops: Your work will be implemented quickly in production
  • Challenging, unsolved problems:
      • Transforming perception into structured reasoning
      • Connecting vision, language, and symbolic structure
      • Ensuring the reliability of AI systems in real-world conditions
  • Small, close-knit team: Collaborate closely with highly skilled founders
  • Broad impact potential: This area is crucial to emerging fields like RAG, AI agents, and enterprise applications

Why Join Us

While many frontier labs focus on scaling general models, we are dedicated to solving an equally critical challenge:

Ensuring that models perform well with real-world data.

Addressing this issue could pave the way for:

  • Reliable AI agents
  • Scalable automation solutions
  • New categories of specialized AI applications

This is your chance to own a foundational component of the AI ecosystem right from the start.

Skills & Requirements

Technical Skills

  • Multimodal foundation models
  • Vision-language reasoning
  • Structured generation and parsing
  • Reliability and determinism in AI systems
  • Vision models
  • Language models
  • Structured decoding and constrained generation
  • Evaluation frameworks
  • Real-world correctness
  • First-principles thinking
  • AI
  • Enterprise data
  • Finance
  • Legal
  • Healthcare

Level

Senior

Posted

4/28/2026
