Research Engineer, Agentic Retrieval (North America)

Qdrant
San Francisco, US

Job Description

Qdrant is an open-source vector search engine powering the next generation of AI applications, from semantic search and retrieval-augmented generation (RAG) to AI agents and real-time recommendations.

Trusted by global leaders like Canva, HubSpot, Tripadvisor, Bosch, and Deutsche Telekom, we’re building the retrieval infrastructure layer for modern AI. Having recently raised $50M in Series B funding, we are growing rapidly and are committed to transforming how AI understands and interacts with data.

As a remote-first company, we believe diverse backgrounds, perspectives, and experiences fuel innovation. Here, you’ll own meaningful work, tackle challenges, and grow alongside passionate individuals dedicated to shaping the future of AI.

We are looking for a Research Engineer, Agentic Retrieval. You'll work at the seam between agent systems research and retrieval engineering, running a tight loop between hypothesis, experiment, and shipped artifact.

The questions you'll chase may not have settled answers yet: how agents should structure memory, when they should re-query versus reason, how skills and tools should be retrieved and composed, what retrieval primitives the agent loop actually needs, and what "good" even means when success is a multi-step trajectory rather than a ranked list.

You'll go deep on how real agent stacks use Qdrant today, where the abstractions around them help or hurt, and what we should build (or change) so they can do more with less. The agent ecosystem moves fast, and part of the job is staying current with it without getting captured by it.

You'll have a lot of latitude to choose what to investigate. The bar is the same either way: every cycle should produce something the field, our customers, or the rest of the company can act on.

What you will own

  • Define what good agentic retrieval looks like. Characterize the retrieval patterns inside real agent loops, name the failure modes, and turn that vocabulary into something the team and the field can build against.
  • Treat agent memory as a systems problem. Episodic, semantic, and procedural memory each need different write paths, decay, and consolidation. Figure out which architectures hold up at scale and turn the durable patterns into reference implementations.
  • Investigate skill and tool retrieval as a first-class problem. Work out how a skill registry should be indexed, how skills should be selected under tool budgets, and how retrieval should compose with planner decisions.
  • Design and run experiments on retrieval inside agent loops: query rewriting and decomposition, multi-hop retrieval, tool-conditioned filtering, retrieval-as-a-tool patterns, and the interplay between planner, retriever, and reranker.
  • Build evaluation infrastructure for agentic retrieval. Define metrics that correlate with end-to-end task success rather than recall@k, and build harnesses that catch regressions before they ship.
  • Profile agent retrieval traces end to end. Isolate where latency, cost, and quality losses come from across the fan-out of tool calls, and produce minimal reproductions when something looks like an engine-level issue.
  • Study how real agent stacks use Qdrant in production. Trace workloads, find where the surrounding abstractions leak performance or quality, and propose changes in Qdrant, in the stack, or in the recipe between them.
  • Pair with design-partner teams running serious agent workloads in production, and bring their real constraints back into research priorities.
  • Influence the roadmap. Translate evidence into product bets and argue for what should be a feature, a primitive, or a recipe.
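To give the "retrieval inside agent loops" work above some concrete shape, here is a minimal, purely illustrative sketch of the retrieval-as-a-tool pattern with a re-query-versus-reason loop. All names are invented for this sketch, and a toy in-memory cosine index stands in for a real vector store such as Qdrant:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class RetrievalTool:
    """Wraps an index so the planner can treat search as one tool call."""

    def __init__(self, corpus):
        self.corpus = corpus  # list of (embedding, text) pairs

    def __call__(self, query_vec, top_k=2):
        ranked = sorted(self.corpus,
                        key=lambda doc: cosine(query_vec, doc[0]),
                        reverse=True)
        return [text for _, text in ranked[:top_k]]

def agent_loop(question_vec, tool, max_hops=3,
               enough=lambda ctx: len(ctx) >= 3):
    """Minimal re-query-versus-reason loop: retrieve, check whether the
    accumulated context suffices, and either stop or hop again."""
    context = []
    query = question_vec
    for _ in range(max_hops):
        context.extend(tool(query))
        if enough(context):
            break
        # A real loop would rewrite or decompose the query here
        # (multi-hop retrieval, tool-conditioned filtering, ...);
        # this sketch simply re-issues the same query.
    return context
```

The point of the sketch is the seam it exposes: the planner sees retrieval as one tool call, and the stopping rule (`enough`) is exactly the kind of decision whose quality has to be measured against end-to-end task success rather than per-query recall.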

Who you are

  • You read and reason about LLM behavior directly. You can distinguish prompt issues from planning issues from retrieval issues from tool design issues, and you've internalized when models actually use retrieved content and when they ignore it.
  • You treat memory as a systems design problem. You distinguish episodic, semantic, and procedural memory, and you know that naive "store every turn as a vector" approaches collapse fast.
  • You understand tool and skill systems as retrieval problems. You see tool selection and skill matching as ranking problems with their own quirks: tiny corpora, heavy metadata, strong priors, sensitivity to descriptions.
  • You have a working theory of context engineering. You think carefully about what goes into the context window and why, and you understand that retrieval quality and context construction are the same problem from two angles.
  • You build evals before features. You know how to construct task suites that actually discriminate between approaches, and how to avoid leaning on recall@k alone.
  • You know vector search internals at a decent level. HNSW tradeoffs, quantization, filtered search, multi-vector, hybrid retrieval, payload indexing. Enough to design agent patterns that exploit Qdrant's primitives instead of treating the database as a black box.
  • You write precisely.
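To make the "tool selection and skill matching as ranking problems" point concrete, here is a toy sketch of scoring a tiny skill registry by embedding similarity blended with a metadata prior, under a fixed tool budget. Every skill name, vector, prior, and weight below is invented for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# A "registry" of three skills: tiny corpus, heavy metadata.
SKILLS = [
    {"name": "sql_query",  "vec": [0.9, 0.1], "prior": 0.6},
    {"name": "web_search", "vec": [0.1, 0.9], "prior": 0.9},
    {"name": "calculator", "vec": [0.7, 0.7], "prior": 0.2},
]

def select_skills(task_vec, skills, budget=2, prior_weight=0.3):
    """Rank skills by a blend of embedding similarity and a metadata
    prior, then keep only as many as the tool budget allows."""
    scored = [(s["name"],
               (1 - prior_weight) * cosine(task_vec, s["vec"])
               + prior_weight * s["prior"])
              for s in skills]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:budget]]
```

Even in this toy form, the quirks the bullet names show up: with only three items, a small change to a description embedding or a prior reorders the ranking, which is exactly why skill retrieval needs its own evaluation rather than borrowed document-ranking metrics.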

Skills & Requirements

Technical Skills

Vector search, AI agents, Retrieval-augmented generation (RAG), Real-time recommendations, Communication, AI, Retrieval infrastructure, Semantic search

Level

Senior

Posted

5/8/2026
