Senior AI Interaction Evaluator (Codex / Claude Code)
Contract | $100-$200/hour | 10-20 hrs/week | Start ASAP (through early May)
Check out this Loom video for more details!
We're looking for a highly experienced software engineer (senior level or above) to help evaluate the quality of interactions with modern coding agents such as OpenAI Codex and Claude Code.
This is not a traditional engineering role.
You won't be writing production code.
You'll be evaluating something harder: whether the model thinks like a great engineer.
What This Role Actually Is
You will assess how AI coding agents behave in real-world scenarios - focusing on:
This role is about engineering taste - not syntax correctness.
What You'll Be Doing
• Evaluate AI-generated coding interactions end-to-end
What We Mean by "Taste"
We're specifically looking for engineers who can answer questions like:
You should be comfortable making subjective but rigorous judgments.
Who You Are
• Staff / Principal-level engineer (or equivalent experience)
Nice to Have
• Experience with tools like Cursor or similar AI-first IDEs
Engagement Details
• Rate: $100-$200/hour
4/27/2026
You will be redirected to G2i Inc.'s application portal.