Speech Research Scientist

Techire AI
San Francisco, US
Hybrid

Job Description

Want to build the speech and audio models that define how the next generation of voice AI actually sounds and listens?

A well-funded AI startup has developed new model architectures that make real-time conversational AI finally viable at scale. While most voice AI still suffers from delays and computational bottlenecks, they've solved the core efficiency problems that have held the field back.

The role

As their Senior Research Scientist, you'll build core speech foundation models that could define the next decade of voice interaction. You'll work on novel architectures that have immediate real-world impact for thousands of customers.

What you'll do

Design and implement SOTA speech foundation models

Develop efficient algorithms for speech processing and audio understanding

Create scalable systems that handle massive audio workloads

Build comprehensive evaluation methods to validate model performance

Collaborate with engineering teams to transition research into production

What you'll bring

Deep expertise in modern speech technologies (TTS, Speech LLMs, Voice Conversion/Cloning, Speech Translation, ASR, Audio Understanding)

Strong background in generative modelling for audio and speech

Publications at leading conferences

Track record of implementing research ideas from concept to production

You'll join a solid research team, including technical founders whose published work has fundamentally shifted how the field thinks about efficient, large-scale foundation models. The company is well funded and generating strong revenue. Comp is on par with top AI labs, with a base salary of $400k+ DOE plus a generous equity package.

The role is based in San Francisco and is hybrid, with 4 days a week in the office.

If you're excited about building the foundational models that will power the next generation of voice AI, we'd love to hear from you.

All applicants will receive a response.

Skills & Requirements

Technical Skills

Speech foundation models, SOTA speech technologies, TTS, Speech LLMs, Voice conversion/cloning, Speech translation, ASR, Audio understanding, Generative modelling, Audio processing, Speech processing, Scalable systems, Evaluation methods, Production transition, Research implementation, Leadership, Communication, Collaboration, Teamwork, Problem-solving, Customer-driven, Results-oriented, Organized, Persistent, Proactive, Team player, AI, Voice interaction

Salary

$400,000+ per year

Employment Type

Full-time

Level

Senior

Posted

5/2/2026
