Prompt Engineering & AI Evaluation Engineer

Nexwave

Sunnyvale, US

Why this role

Pace: Steady

Collaboration: Medium

Autonomy: Medium

Decision Impact: Team

Role Level: Individual Contributor

Derived from job-description analysis by Serendipath's career intelligence engine.

Skills & requirements

Required

Python

Stack & domain

Python, LLMs, Prompt Engineering, AI Evaluation Methodologies, Version Control Systems, CI/CD Pipelines, Communication, Collaboration, Problem Solving, Mentorship, AI, Enterprise AI Products, Safety, Responsible AI Practices

About the role

Original posting from Nexwave via LinkedIn

Role: Prompt Engineering & AI Evaluation Engineer
Location: Sunnyvale, CA (Preferred) | Austin, TX | Raleigh, NC (Onsite)
Duration: 12 Months
About the Role
We are looking for experienced AI Engineers with strong expertise in Prompt Engineering, LLM evaluation, and AI system optimization.
This role involves collaborating with product, research, and safety teams to improve AI model quality, reliability, and deployment readiness across enterprise AI products.
Key Responsibilities
Design, test, and optimize system prompts and feature-specific prompts for large language models (LLMs).
Build and maintain evaluation frameworks and testing suites to ensure AI model quality and consistency.
Collaborate closely with Product, Research, and Safety teams to validate new AI features and releases.
Support AI model launches by identifying regressions and ensuring smooth production rollouts.
Contribute to internal tooling and infrastructure for prompt development, evaluation, and experimentation.
Mentor engineering teams on prompt engineering best practices and evaluation methodologies.
Rapidly iterate and adapt to evolving AI model capabilities in a fast-paced environment.
Required Qualifications
5+ years of software engineering experience with Python or similar programming languages.
Hands-on experience with LLMs and Prompt Engineering in enterprise, research, or significant personal projects.
Strong understanding of AI evaluation methodologies, benchmarking, and quality metrics.
Excellent written and verbal communication skills with the ability to explain complex AI behaviors to diverse audiences.
Experience managing multiple concurrent projects in agile environments.
Strong knowledge of version control systems, CI/CD pipelines, and modern development practices.
Preferred Qualifications
Experience working with frontier AI models such as Claude, GPT, or similar production-grade LLMs.
Background in Machine Learning, NLP, or related AI disciplines.
Experience with A/B testing and experimentation platforms such as Statsig.
Familiarity with AI safety, alignment, and responsible AI practices.
Experience building tools and infrastructure for ML/AI workflows.
Proven ability to improve AI system performance through iterative evaluation and optimization.

Source: Nexwave careers (LinkedIn)