Speech Research Scientist

Techire Ai

San Francisco, US

Hybrid

Job Description

Want to build the speech and audio models that define how the next generation of voice AI actually sounds and listens?

A well-funded AI startup has developed new model architectures that make real-time conversational AI finally viable at scale. While most voice AI still suffers from delays and computational bottlenecks, they've solved the core efficiency problems that have held the field back.

The role

As their Senior Research Scientist, you'll build core speech foundation models that could define the next decade of voice interaction. You'll work on novel architectures that have immediate real-world impact for thousands of customers.

What you'll do

• Design and implement SOTA speech foundation models
• Develop efficient algorithms for speech processing and audio understanding
• Create scalable systems that handle massive audio workloads
• Build comprehensive evaluation methods to validate model performance
• Collaborate with engineering teams to transition research into production

What you'll bring

• Deep expertise in modern speech technologies (TTS, Speech LLMs, Voice Conversion/Cloning, Speech Translation, ASR, Audio Understanding)
• Strong background in generative modelling for audio and speech
• Publications at leading conferences
• Track record of implementing research ideas from concept to production

You'll join a solid research team, including technical founders whose published work has fundamentally shifted how the field thinks about efficient, large-scale foundation models. The company is well-funded and generating strong revenue. Comp is on par with top AI labs, with a base salary of $400k+ DOE plus a generous equity package.

The role is based in San Francisco, hybrid with 4 days a week in the office.

If you're excited about building the foundational models that will power the next generation of voice AI, we'd love to hear from you.

All applicants will receive a response.

Skills & Requirements

Technical Skills

Speech foundation models, SOTA speech technologies, TTS, Speech LLMs, Voice conversion/cloning, Speech translation, ASR, Audio understanding, Generative modelling, Audio processing, Speech processing, Scalable systems, Evaluation methods, Production transition, Research implementation, Leadership, Communication, Collaboration, Teamwork, Problem-solving, Customer-driven, Results-oriented, Organized, Persistent, Proactive, Team player, AI, Voice interaction

Salary

$400,000+ per year

Employment Type

Full-time

Level

Senior

Posted

5/2/2026
