Department: AI Agent Research Center
Location: Hong Kong / Shenzhen
Experience: Graduate / Early Career
Openings: 10
About the Role:
You will help build learning-capable AI agents that interact with real-world business environments, learn decision policies for pricing and inventory, and improve their behavior through feedback. The work combines RL, LLMs, and multi-agent coordination in real industrial systems.
Key Focus:
• Design agent-environment interaction systems (observations, actions, rewards).
• Apply RL to pricing optimization, inventory allocation, and fulfillment scheduling.
• Build long-horizon planning and multi-step reasoning pipelines.
• Implement preference learning and feedback optimization (RLHF / RLAIF).
• Construct simulation environments and offline evaluation pipelines from real business data.
Ideal Experience:
• Background in RL, agents, or decision systems; strong Python and PyTorch skills.
• Ability to abstract real-world problems into states, actions, and rewards.
• Nice to have: multi-agent experience, game theory, or supply chain optimization.
• Tech stack: Python, PyTorch, distributed RL, agent frameworks.
Please send your CV in English to: [email protected]
junior
4/19/2026