Lead Applied Scientist, Document Understanding

THOMSON REUTERS
Toronto, Ontario, CA; US
On-site

Job Description

New Position: This position is open due to an existing vacancy to support our evolving business needs.

About the Role

This role sits within the applied science function. You will own the design, development, and production deployment of document understanding systems that directly power Westlaw, Practical Law, and CoCounsel. The problems are real, the scale is large, and the expectation is shipped, reliable, measurable impact.

You will work across semantic chunking, document enrichment, knowledge graph construction, and synthetic data generation for complex legal, tax, and accounting content. Multiple product teams depend on what this function delivers.

About You

You hold a PhD in Computer Science, AI, NLP, or a related field, with 8+ years of post-degree industry experience taking NLP and document understanding systems from development to production at scale. You have hands-on depth across the full applied arc: model development, distillation, evaluation, and deployment. You publish, you mentor, and you measure success by what ships and performs in production.

What You'll Do

  • Design and deploy semantic chunking models for lengthy, non-uniformly structured legal documents with adjustable granularity across use cases
  • Build document enrichment systems using legal and customer-defined taxonomies
  • Develop LLM-based knowledge graph construction pipelines that extract and link citations, entities, and legal concepts across diverse legal content
  • Lead knowledge distillation efforts to compress large models into latency-constrained, production-ready SLMs
  • Design component-level and end-to-end evaluation frameworks using expert annotation and synthetic data
  • Own technical decisions on architecture, chunking strategy, classification approach, and knowledge extraction methods
  • Partner with engineering on delivery, reliability, and scale across multiple product lines
  • Provide technical input to senior leadership on AI strategy and roadmap
  • Mentor applied scientists and ML practitioners on the team

Required Qualifications

  • PhD in Computer Science, AI, NLP, or a related field
  • 8+ years of post-degree industry experience shipping document understanding, information extraction, or knowledge graph systems into production (research-only experience does not qualify)
  • Publications at ACL, EMNLP, ICLR, NeurIPS, SIGIR, KDD, or equivalent
  • Production Python and experience with PyTorch, Hugging Face Transformers, and DeepSpeed

Hands-on production depth required in:

  • Document layout analysis and semantic chunking beyond fixed-size or paragraph-based methods
  • Hierarchical, multi-label document classification with domain-specific and customer-defined schemas
  • Entity recognition and linking, relation extraction, citation parsing, and knowledge graph construction from unstructured text
  • LLM-based information extraction, few-shot and multi-task learning, and post-training
  • Knowledge distillation, model compression, and SLM deployment under latency constraints
  • Synthetic data generation and annotation workflow design
  • End-to-end evaluation framework design for document understanding

Preferred Qualifications

  • Legal document understanding, legal IE, or legal AI experience
  • Complex document structures: nested hierarchies, cross-references, non-uniform formatting
  • Retrieval or QA systems over large document collections
  • RAG and agentic workflows in enterprise settings
  • Knowledge graph frameworks for legal or enterprise applications
  • AzureML or AWS SageMaker

What's in it For You?

  • Flexibility & Work-Life Balance: Flex My Way is a set of supportive workplace policies designed to help manage personal and professional responsibilities, whether caring for family, giving back to the community, or finding time to refresh and reset. This builds upon our flexible work arrangements, including work from anywhere for up to 8 weeks per year, empowering employees to achieve a better work-life balance.
  • Career Development and Growth: By fostering a culture of continuous learning and skill development, we prepare our talent to tackle tomorrow's challenges and deliver real-world solutions. Our Grow My Way programming and skills-first approach ensures you have the tools and knowledge to grow, lead, and thrive in an AI-enabled future.
  • Industry Competitive Benefits: We offer comprehensive benefit plans that include flexible vacation, two company-wide Mental Health Days off, access to the Headspace app, retirement savings, tuition reimbursement, employee incentive programs, and resources for mental, physical, and financial wellbeing.
  • Culture: Globally recognized, award-winning reputation for inclusion and belonging, flexibility, work-life balance, and more. We live by our values: Obsess over our Customers, Compete to Win, Challenge (Y)our Thinking, Act Fast / Learn Fast, and Stronger Together.
  • Social Impact: Make an impact in your commun

Skills & Requirements

Technical Skills

Semantic chunking, Document enrichment, Knowledge graph construction, LLM-based information extraction, Knowledge distillation, Synthetic data generation, Evaluation framework design, Mentoring, Technical input, Collaboration, Document understanding, AI, NLP

Soft Skills

Technical leadership, Mentorship, Collaboration

Domain Knowledge

Document understanding, NLP, AI

Employment Type

Full-time

Level

Lead

Posted

4/13/2026
