Lead Applied Scientist, NLP/GenAI

Novaedge
Washington, US
Remote

Job Description

This is a fully remote job. The offer is available from: United States, Canada.

New Position: This position is open due to an existing vacancy to support our evolving business needs.

Lead Applied Scientist, Document Understanding

Document understanding is a foundational intelligence layer that powers every major capability across our legal AI platform, from search and information extraction to agentic reasoning in products like Westlaw, PracticalLaw, and CoCounsel. You'll build state-of-the-art semantic chunking, document enrichment, and knowledge graph construction systems that serve as the cognitive foundation multiple product teams depend on, working across authoritative legal, tax, and accounting content and extraordinarily diverse customer data. This is a rare opportunity to solve publishing-quality research problems with immediate production impact: your innovations will directly shape how millions of legal professionals research, analyze, and reason over complex legal documents while advancing the capabilities that enable the next generation of intelligent legal AI agents.
About the Role

As a Lead Applied Scientist, you will:

Innovate & Deliver at Scale

• Lead the design, build, test, and deployment of end-to-end AI solutions for complex document understanding tasks in the legal domain
• Direct the execution of large-scale projects including: advanced semantic chunking models for lengthy, non-uniformly structured legal documents with adjustable granularity; document enrichment systems with legal and customer-defined taxonomies; LLM-based knowledge graph construction pipelines that extract and link heterogeneous legal knowledge; and scalable synthetic data generation systems
• Serve as the technical lead and primary point of reference, ensuring full accountability for all research deliverables
• Partner with engineering to guarantee well-managed software delivery and reliability at scale across multiple product lines

Evaluate, Optimize & Advance Capabilities

• Design comprehensive evaluation strategies for both component-level and end-to-end quality, leveraging expert annotation and synthetic data
• Apply robust training methodologies that balance performance with latency requirements
• Lead knowledge distillation initiatives to compress large models into production-ready SLMs
• Maintain scientific and technical expertise through product deliverables, published research, and intellectual property contributions
• Inform Labs shared capabilities and research themes through novel approaches to challenging business problems

Drive Strategic Technical Direction

• Independently determine appropriate architectures for complex document understanding challenges, balancing accuracy, efficiency, and scalability
• Make critical technical decisions on semantic chunking strategies, document classification approaches, LLM-based knowledge extraction methods, and multi-document reasoning architectures
• Provide input to business stakeholders, mid-to-senior level leadership, and Labs leadership on long-term AI strategy
• Develop in-depth knowledge of TR customers and data infrastructure across multiple products to shape technical roadmaps

Align, Communicate & Lead

• Partner closely with Engineering and Product teams to translate complex legal document understanding challenges into scalable, production-ready solutions
• Engage stakeholders across multiple product lines to deeply understand use case requirements, shaping objectives that align document understanding capabilities with diverse business needs including next-generation search and deep legal research
• Mentor and coach team members with varied ML/NLP abilities, building technical capability across the organization

About You

Required Qualifications

• PhD in Computer Science, AI, NLP, or a related field, or a Master's degree with equivalent research/industry experience
• 7+ years of hands-on experience building and deploying document understanding systems, information extraction pipelines, or knowledge graph construction using deep learning, LLMs, and NLP methods
• Proven ability to translate complex document understanding problems into innovative AI applications that balance accuracy and efficiency
• Demonstrated ability to provide technical leadership, mentor team members, and influence without formal authority in an applied research setting
• Strong programming skills (e.g., Python) and experience with modern deep learning frameworks (e.g., PyTorch, Hugging Face Transformers, DeepSpeed)
• Publications at relevant venues such as ACL, EMNLP, ICLR, NeurIPS, SIGIR, or KDD

Technical Qualifications

• Deep understanding of document understanding fundamentals: document layout analysis, semantic chunking approaches beyond fixed-size or paragraph-based methods, document classification handling hierarchical taxonomies, imbalanced multi-label classification, and adapting to domain-specific schemas
• Expertise in knowledge extraction and knowledge graph construction: entity recog

Skills & Requirements

Technical Skills

AI solutions, Document understanding, Semantic chunking, Document enrichment, Knowledge graph construction, Synthetic data generation, Advanced semantic chunking models, Document enrichment systems, LLM-based knowledge graph construction pipelines, Scalable synthetic data generation systems, DeepSpeed, ACL, EMNLP, ICLR, NeurIPS, SIGIR, KDD, NLP, GenAI, Legal AI

Salary

$147,600+ per year

Employment Type

FULL TIME

Level

Lead

Posted

4/14/2026
