Recruitment Consultant
Bliss Verna
Contact Details
bv@eu-recruit.com +44 (0) 3333 078 366
Posted
5 hours ago
We’re partnering with a venture-backed AI startup building next-generation visual conversational systems – enabling users to interact with AI through real-time video experiences that feel genuinely human. Their work sits at the intersection of multimodal ML, high-performance infrastructure, and real-time systems. They’re now hiring a Machine Learning Research Engineer to help bring cutting-edge models from research into highly optimized production environments.
This is a hands-on role focused on GPU performance, model acceleration, and scaling multimodal systems in real-world deployments.
Key Responsibilities:
• Collaborate with research teams to productionize experimental models and transition them into reliable, scalable systems.
• Own model performance optimization: profile inference and improve latency and throughput using techniques such as quantization, pruning, and distillation.
• Debug and optimize GPU workloads, including CUDA-level performance issues.
• Apply acceleration frameworks (e.g., TensorRT, ONNX, vLLM) to improve multimodal model efficiency across video, speech, and LLM systems.
• Design and implement high-throughput data pipelines for large-scale (petabyte-scale) video and multimedia datasets.
• Develop evaluation frameworks to measure model quality and support continuous iteration.
• Work closely with infrastructure teams to build scalable media processing and training workflows.
Key Qualifications:
• 2+ years of full-time experience in ML engineering, ideally in production ML environments.
• Proven experience operationalizing research models into production systems with a focus on inference optimization (latency and throughput improvements).
• Strong proficiency in PyTorch for training and deploying large-scale models.
• Experience running models across large datasets for feature extraction or batch inference workloads.
• Hands-on experience working with GPUs and debugging CUDA-related performance issues.
• Experience with video, audio, or multimodal models preferred.
• Background in a VC-backed startup or high-performing established company is advantageous.
• Willingness to work onsite 5 days per week in Seattle.
• Strong hands-on engineering focus (not a management-leaning or director-level profile).
• Candidates whose ML experience is limited to fine-tuning models without deeper systems or performance work are unlikely to be a fit.
Industry
AI & Machine Learning
Contract Type
Permanent
Location
United States
City
Seattle
Work Model
On-Site
FULL TIME
mid
4/22/2026