Senior AI Engineer

G2
Remote

Job Description

About G2 - The Company

G2 is the world's largest and most trusted software marketplace. When you join G2, you’re joining the industry’s leading team that helps businesses reach their peak potential by powering decisions and strategies with trusted insights from real software users.

Now, we have joined forces with Capterra (https://www.capterra.com/), SoftwareAdvice (https://www.softwareadvice.com/), and GetApp (https://www.getapp.com/) to create the largest source of online data and software insights to fuel intelligent buying in the age of AI. With 200M+ combined annual visitors and 6M verified reviews, we are now the centralized place to enable software buyers to make better and faster decisions with confidence.

And we are just getting started! We are setting out to transform the global B2B software industry and become the most trusted data foundation for buyers and sellers of software for the age of AI.

Does that sound exciting to you? Come join us as we try to reach our next PEAK!

About G2 - Our People

At G2, everything we are and what we do is grounded in our PEAK values (Performance + Entrepreneurship + Authenticity + Kindness). Working at G2 means you are part of a value-driven, growing global community that climbs PEAKs together. We cheer for each other’s successes, learn from our mistakes, and support and lean on one another during challenging times. With ambition and entrepreneurial spirit, we push each other to take on challenging work, which helps us all grow and learn.

You will be part of a global, diverse team of smart, dedicated, and kind individuals, each with unique talents, aspirations, and life experiences. At the heart of our community and culture are our people-led ERGs, which celebrate and highlight the diverse identities of our global team. As an organization, we are intentional about our DEI (https://company.g2.com/dei) and philanthropic work (like our G2 Gives program, https://company.g2.com/gives) because it encourages us all to be better people.

ABOUT THE ROLE

G2 is looking for a Senior Full-Stack AI Engineer with strong production experience across modern web/backend systems and hands-on exposure to LLMs, Voice AI, and AI data platforms. You’ll lead end-to-end execution across the stack—designing reliable services, building real-time conversational experiences, and owning the data and evaluation foundations that turn large volumes of interview interactions into structured insights and continuously improving models.

This role is ideal for someone who can balance product velocity with engineering rigor, and who enjoys working across voice pipelines, retrieval/agent workflows, and data + evaluation systems to deliver measurable quality improvements over time.

IN THIS ROLE, YOU WILL: 

LLM/AGENT DEVELOPMENT: PROMPTING, RAG & EVALUATION

  • Lead prompt design and iteration for summarisation, decision-making, multi-turn dialogue, agent behaviours, and tool/function calling.
  • Build and maintain evaluation harnesses (golden sets, rubrics, regression suites) to measure accuracy, consistency, safety, and usefulness across releases.
  • Implement and optimize RAG (Retrieval-Augmented Generation) workflows: chunking strategies, embeddings, retrieval/reranking, citations, and grounding techniques to reduce hallucinations.
  • Define strategies for knowledge freshness and context management across a project’s lifecycle (e.g., project-specific knowledge bases, interview-derived artifacts, evolving taxonomies).

VOICE AI: REAL-TIME CONVERSATIONAL SYSTEMS

  • Integrate and optimize components in AI-powered voice pipelines (STT, NLU, TTS, turn-taking, barge-in/interrupt handling, session state).
  • Improve multi-turn voice experience quality: latency, timing alignment, disfluency handling, and context retention.
  • Build voice simulation and test tooling to validate real-world and adversarial scenarios (noise, accents, interruptions, partial transcripts).
  • Partner with ML/Voice specialists to diagnose ASR misfires, timing mismatches, and agent/voice orchestration issues.

AI DATA PLATFORMS: ETL/ELT, INFORMATION EXTRACTION & REPORTING DATASETS

  • Design ingestion and transformation workflows for high-volume interview data (audio, transcripts, free-text responses, metadata, annotations).
  • Build ETL/ELT pipelines that validate/normalize inputs, run information extraction (entities, themes, taxonomy labeling, key moments), and produce curated, queryable reporting datasets.
  • Establish data models and schemas that preserve lineage from raw sources → intermediate artifacts → curated outputs → report-ready datasets.
  • Implement data quality practices: completeness/validity checks, sampling-based verification, reconciliation, and monitoring for drift.
  • Build mechanisms for traceability and auditability (e.g., linking report outputs back to transcript spans/timecodes, retrieval sources, and model/prompt versions).

CONTINUOUS IMPROVEMENT: FINE-TUNING, ADAPTATION & “LEARNING” OVER A PROJECT

  • Collaborate with ML/Data teams to support fine-tuning and/or model adaptation workflows (dataset curation, labeling guidelines, training/eval splits, offline evaluation, rollout validation).
  • Implement project-level feedback loops so the system improves as more interviews occur:
      • maintain evolving taxonomies and question strategies,
      • incorporate newly discovered concepts into retrieval stores,
      • update prompts/policies based on failure patterns,
      • expand evaluation sets automatically with new edge cases.
  • Build mechanisms for “real-time” or iterative learning without sacrificing safety (e.g., controlled updates to RAG indexes, prompt/version rollouts, gated releases, human review where needed).
  • Enable the agent to ask more intelligent follow-up questions by using accumulated project knowledge (grounded in retrieved evidence and governed by safety policies).

BACKEND ENGINEERING: ARCHITECTURE, RELIABILITY & OBSERVABILITY

  • Own architecture and implementation of backend services and workflows supporting LLM/voice/data experiences (APIs, orchestration, storage, queues).
  • Improve system resilience through observability, tracing, structured logging, rate limiting, fallbacks, and failure-mode design.
  • Lead debugging and resolution of complex issues across LLM pipelines, retrieval systems, data workflows, and conversational agent logic.
  • Build internal tools to accelerate diagnosis, QA, and safe experimentation.

AUTOMATED TESTING, SECURITY & QUALITY ENGINEERING

  • Design and maintain automated test suites for APIs, pipelines, RAG systems, and LLM outputs (regression, reliability, performance, load).
  • Use LLMs to generate synthetic datasets for robust coverage across realistic and adversarial conditions.
  • Establish quality gates in CI/CD (eval thresholds, golden tests, contract tests) to ensure safe deployments.
  • Proactively identify and mitigate threats such as prompt injection, data leakage, and abuse scenarios.

TECHNICAL LEADERSHIP & COLLABORATION

  • Lead projects end-to-end: requirements shaping, technical design, implementation, rollout, monitoring, and iteration.
  • Mentor engineers through code reviews, pairing, design guidance, and raising engineering standards.
  • Communicate tradeoffs clearly with stakeholders; influence roadmap decisions through technical insight.
  • Contribute to documentation, runbooks, and best practices for production-grade AI systems.

MINIMUM QUALIFICATIONS:

We realize applying for jobs can feel daunting at times. Even if you don’t check all the boxes in the job description, we encourage you to apply anyway. 

Required:
  • 5–8+ years of professional software engineering experience (full-stack, backend, platform, or data-adjacent systems).
  • Hands-on experience with LLMs (e.g., OpenAI, Anthropic/Claude, Mistral, etc.), including prompt design and evaluation.
  • Experience implementing or o

Skills & Requirements

Technical Skills

LLMs, Voice AI, AI data platforms, Prompt design, Evaluation, RAG, Retrieval/reranking, Citations, Grounding techniques, Knowledge freshness, Context management, Leadership, Collaboration, Communication, Problem-solving, Technical expertise, AI

Domain Knowledge

AI, Software

Employment Type

FULL TIME

Level

Senior

Posted

1/28/2026
