Senior Applied Scientist, Document Understanding

Thomson Reuters
Toronto; Ontario, CA; US

Job Description

New Position: This position is open due to an existing vacancy to support our evolving business needs.

About The Role

This is an applied science position focused on designing, building, and deploying production-grade document understanding systems that power Westlaw, Practical Law, and CoCounsel.

You will work across semantic chunking, document enrichment, and knowledge graph construction for complex legal, tax, and accounting content — delivering foundational intelligence that multiple product teams depend on at scale.

About You

You hold a PhD or Master's in Computer Science, AI, NLP, or a related field, with 5+ years of post-degree industry experience shipping document understanding, information extraction, or knowledge graph systems into production. You have hands-on depth across model development, distillation, evaluation, and deployment. You work independently, lead through influence in an applied research setting, and measure success by what ships and performs in production.

What You'll Do

  • Design and deploy semantic chunking models for lengthy, non-uniformly structured legal documents with adjustable granularity across use cases
  • Build document enrichment systems that classify documents according to legal and customer-defined taxonomies and extract rich metadata
  • Develop LLM-based knowledge graph construction pipelines that extract and link citations, entities, and legal concepts across diverse legal content
  • Build scalable synthetic data generation systems for model training, multi-hop query simulation, and hallucination-free answer generation
  • Apply knowledge distillation techniques to compress large models into latency-constrained, production-ready SLMs
  • Design evaluation frameworks — component-level and end-to-end — using expert annotation and synthetic data
  • Drive independent technical decisions on chunking strategy, classification approach, knowledge extraction methods, and multi-document reasoning architecture
  • Partner with engineering on delivery, reliability, and scale across multiple product lines
  • Contribute to published research at venues such as ACL, EMNLP, ICLR, NeurIPS, SIGIR, and KDD, and to intellectual property

Required Qualifications

  • PhD or Master's in Computer Science, AI, NLP, or a related field
  • 5+ years of post-degree industry experience shipping document understanding, information extraction, or knowledge graph systems into production — not research-only experience
  • Publications at ACL, EMNLP, ICLR, NeurIPS, SIGIR, KDD, or equivalent
  • Experience leading through influence in an applied research setting
  • Production Python and experience with PyTorch, Hugging Face Transformers, and DeepSpeed

Hands-on Production Depth Required In

  • Document layout analysis and semantic chunking beyond fixed-size or paragraph-based methods
  • Hierarchical, multi-label document classification with domain-specific and customer-defined schemas
  • Entity recognition and linking, relation extraction, citation parsing, and knowledge graph construction from unstructured text
  • LLM-based information extraction, few-shot and multi-task learning, and post-training
  • Knowledge distillation, model compression, and SLM deployment under latency constraints
  • Synthetic data generation for NLP: query-answer generation with verification and scalable data augmentation
  • Annotation workflow design and evaluation framework development for document understanding tasks

Preferred Qualifications

  • Legal document understanding, legal information extraction, or legal AI applications
  • Complex document structures common in legal content: nested hierarchies, cross-references, non-uniform formatting, and embedded elements
  • Retrieval, QA, or analysis systems over large document collections
  • Knowledge graph frameworks for legal or enterprise applications
  • RAG and agentic workflows for enterprise knowledge systems
  • AzureML or AWS SageMaker

What’s in it For You?

  • Flexibility & Work-Life Balance: Flex My Way is a set of supportive workplace policies designed to help manage personal and professional responsibilities, whether caring for family, giving back to the community, or finding time to refresh and reset. This builds upon our flexible work arrangements, including work from anywhere for up to 8 weeks per year, empowering employees to achieve a better work-life balance.
  • Career Development and Growth: By fostering a culture of continuous learning and skill development, we prepare our talent to tackle tomorrow’s challenges and deliver real-world solutions. Our Grow My Way programming and skills-first approach ensures you have the tools and knowledge to grow, lead, and thrive in an AI-enabled future.
  • Industry Competitive Benefits: We offer comprehensive benefit plans that include flexible vacation, two company-wide Mental Health Days off, access to the Headspace app, retirement savings, tuition reimbursement, employee incentive programs, and resources for

Skills & Requirements

Technical Skills

Python, PyTorch, Hugging Face Transformers, DeepSpeed, Document layout analysis, Semantic chunking, Document enrichment, Knowledge graph construction, Entity recognition, Relation extraction, Citation parsing, LLM-based information extraction, Few-shot and multi-task learning, Knowledge distillation, Model compression, SLM deployment, Synthetic data generation, Annotation workflow design, Evaluation framework development, AI, NLP, Document understanding, Information extraction, Knowledge graph, Legal content, Tax content, Accounting content

Domain Knowledge

Legal, Tax, Accounting

Salary

£150,000 - £250,000 per year

Employment Type

Full time

Level

Senior

Posted

3/28/2026
