Senior AI Interaction Evaluator (Codex / Claude Code)
Contract | $100-$200/hour | 10-20 hrs/week | Start ASAP (through early May)
Check out this Loom video for more details!
We're looking for highly experienced software engineer (SR+) to help evaluate the quality of interactions with modern coding agents such as OpenAI Codex and Claude Code.
This is not a traditional engineering role.
You won't be writing production code.
You'll be evaluating something harder: whether the model thinks like a great engineer.
What This Role Actually IsYou will assess how AI coding agents behave in real-world scenarios - focusing on:
This role is about engineering taste - not syntax correctness.
What You'll Be Doing• Evaluate AI-generated coding interactions end-to-end
What We Mean by "Taste"We're specifically looking for engineers who can answer questions like:
You should be comfortable making subjective but rigorous judgments.
Who You Are• Staff / Principal-level engineer (or equivalent experience)
Nice to Have• Experience with tools like Cursor or similar AI-first IDEs
Engagement Details• Rate: $100-$200/hour
$100 - $200
hour
CONTRACT
senior
5/6/2026
You will be redirected to G2i Inc.'s application portal.
Sign in and we'll score your resume against this role.
Browse roles in the same category, level, and remote setup.
Sign in to open the target role workbench.