This role is part of the Jobright Direct Hiring Network, where top companies like Cresta AI, Plaud, Mercor, OpenArt, and 1,500+ others hire top talent directly through our platform.
This is not a mass job posting. Only select, high-signal candidates are invited and recommended directly to hiring teams.
Hiring Company: Humoniq (YC S25)
One-liner: Humoniq is a YC-backed startup focused on building integrated AI systems to solve real-world problems in travel and transport.
Role Responsibilities
- Build a log ingestion pipeline
  - Ingest GCP Cloud Run / application logs into a central store (BigQuery / Postgres)
  - Parse logs into ticket-level and message-level records
  - Join in evaluator comments and metadata so we can analyze behavior end-to-end
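As a rough illustration of the parsing step, here is a minimal Python sketch that turns one Cloud Run structured-log line into a message-level record. The `jsonPayload` field names (`ticket_id`, `role`, `text`) are hypothetical and would need to match the real log schema.

```python
import json

def parse_log_entry(raw: str) -> dict:
    """Parse one Cloud Run structured-log line into a message-level record.

    Payload field names (ticket_id, role, text) are assumed, not real.
    """
    entry = json.loads(raw)
    payload = entry.get("jsonPayload", {})
    return {
        "ticket_id": payload.get("ticket_id"),
        "timestamp": entry.get("timestamp"),
        "severity": entry.get("severity", "DEFAULT"),
        "role": payload.get("role"),      # e.g. "user" or "agent"
        "text": payload.get("text", ""),
    }

# Illustrative input shaped like a Cloud Logging structured entry
raw = json.dumps({
    "timestamp": "2025-01-15T12:00:00Z",
    "severity": "INFO",
    "jsonPayload": {"ticket_id": "T-123", "role": "user", "text": "Where is my bus?"},
})
record = parse_log_entry(raw)
```

From here, ticket-level records are just an aggregation of these message rows keyed on `ticket_id`.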
- Ship an AI regression and evaluation suite
  - Re-run historical conversations through new prompts / models
  - Compare End-of-Conversation classification, Issue, and Task action-plan outputs over time
  - Generate clear reports that show regressions, hallucinations, and wins
  - Improve our AI agents through prompting and other changes
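The comparison step above reduces to a per-ticket diff between two runs. Here is a minimal sketch; the gold/baseline/candidate label maps are illustrative assumptions standing in for real eval outputs.

```python
def regression_report(gold, baseline, candidate):
    """Compare per-ticket labels from two model runs against gold labels.

    gold, baseline, candidate: dicts mapping ticket_id -> label
    (hypothetical stand-ins for real evaluation outputs).
    """
    report = {"regressions": [], "wins": [], "churn": [], "unchanged": []}
    for tid, truth in gold.items():
        old, new = baseline.get(tid), candidate.get(tid)
        if old == new:
            report["unchanged"].append(tid)
        elif new == truth:
            report["wins"].append(tid)         # candidate fixed a wrong label
        elif old == truth:
            report["regressions"].append(tid)  # candidate broke a right label
        else:
            report["churn"].append(tid)        # changed, but still wrong
    return report

gold = {"t1": "refund", "t2": "eta", "t3": "refund"}
baseline = {"t1": "refund", "t2": "refund", "t3": "eta"}
candidate = {"t1": "eta", "t2": "eta", "t3": "cancel"}
report = regression_report(gold, baseline, candidate)
```

Splitting changes into wins, regressions, and churn is what makes the resulting report readable at a glance.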
- Implement drift detection
  - Track distributions of intents, outcomes, and actions over time
  - Detect when user behavior or model outputs deviate from baseline
  - Surface drift in dashboards and alerts so we can act before customers are hurt
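One common way to quantify the distribution tracking described above is the Population Stability Index (PSI) over intent or outcome counts. This is a generic technique, not the team's stated method, and the 0.25 alert threshold used below is a conventional rule of thumb.

```python
import math

def psi(baseline_counts, current_counts, eps=1e-6):
    """Population Stability Index between two categorical count distributions.

    Higher values mean more drift; ~0.25 is a common alerting threshold.
    """
    cats = set(baseline_counts) | set(current_counts)
    b_total = sum(baseline_counts.values())
    c_total = sum(current_counts.values())
    score = 0.0
    for cat in cats:
        b = baseline_counts.get(cat, 0) / b_total + eps  # eps avoids log(0)
        c = current_counts.get(cat, 0) / c_total + eps
        score += (c - b) * math.log(c / b)
    return score

base = {"book": 50, "cancel": 50}        # illustrative intent counts
same = psi(base, base)                    # no drift
shifted = psi(base, {"book": 90, "cancel": 10})  # strong drift
```

The same scan run daily over intent, outcome, and action distributions is enough to back a drift dashboard and alert.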
- Build internal dashboards & tools
  - Let evaluators and product see problem tickets quickly
  - Make it trivial to search for 'all conversations where X went wrong'
  - Visualize trends so we stop arguing from anecdotes and start arguing from data
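The "all conversations where X went wrong" search above amounts to grouping message records by ticket and filtering on a predicate. A minimal in-memory sketch, with a purely illustrative record shape:

```python
def find_conversations(records, went_wrong):
    """Return ticket_id -> messages for tickets where any message matches
    the went_wrong predicate. Record fields (ticket_id, severity) are
    hypothetical examples."""
    by_ticket = {}
    for rec in records:
        by_ticket.setdefault(rec["ticket_id"], []).append(rec)
    return {tid: msgs for tid, msgs in by_ticket.items()
            if any(went_wrong(m) for m in msgs)}

records = [
    {"ticket_id": "T-1", "severity": "INFO"},
    {"ticket_id": "T-1", "severity": "ERROR"},
    {"ticket_id": "T-2", "severity": "INFO"},
]
bad = find_conversations(records, lambda m: m["severity"] == "ERROR")
```

In production the same shape would be a parameterized SQL query over the central store rather than an in-memory scan.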
- Own reliability + documentation
  - Add monitoring and alerting around your pipelines
  - Document your data models, assumptions, and runbooks
  - Make it possible for someone new to pick up your work and move forward
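A minimal monitoring hook for pipeline freshness might look like the sketch below; the one-hour lag budget is an assumed SLO, not a stated requirement.

```python
from datetime import datetime, timedelta, timezone

def check_freshness(last_ingested_at, max_lag=timedelta(hours=1), now=None):
    """Alert-style check: is the pipeline's newest record recent enough?

    max_lag (1 hour) is an assumed SLO for illustration only.
    """
    now = now or datetime.now(timezone.utc)
    lag = now - last_ingested_at
    return {"ok": lag <= max_lag, "lag_seconds": lag.total_seconds()}

now = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
fresh = check_freshness(now - timedelta(minutes=10), now=now)
stale = check_freshness(now - timedelta(hours=3), now=now)
```

Wired to a scheduler and an alert channel, a check like this catches a silently stalled ingestion job before evaluators notice stale dashboards.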
Qualifications
Required
- Demonstrable experience in backend, data engineering, or ML infrastructure (or equivalent real-world work)
- Strong Python skills for scripting and small services
- Experience with at least one cloud platform (GCP ideal)
- Experience building and operating ETL / data pipelines in production
- Comfort with SQL and analytical databases (BigQuery, Snowflake, Redshift, or similar)
- Clear written communication and willingness to document decisions
Preferred
- Experience with GCP Cloud Run / Cloud Logging / Pub/Sub / Cloud Scheduler
- Data orchestration tools (Airflow, Dagster, Prefect, dbt, etc.)
- Experience with observability stacks (Grafana, Prometheus, OpenTelemetry, etc.)
- Familiarity with LLMs, prompt evaluation, or ML monitoring