Job Overview
We are building a financial data and AI research platform for institutional investors, systematically turning large language model (LLM) capabilities into deliverable data products and analytical tools. The core focus of this role is researching how to extract high-quality signals from unstructured financial information using LLMs, then turning those research findings into product features that directly serve institutional investment workflows.
Key Responsibilities
- Financial Text Signal Research & Data Product Development
  - Apply LLMs (open-source and proprietary) to extract structured information from large volumes of unstructured financial text, including corporate filings (10-K/10-Q/annual reports), earnings call transcripts, analyst reports, macro news, and regulatory announcements;
  - Design and validate LLM-derived quantitative metrics (sentiment scores, event classifications, changes in forward-looking statements, etc.), evaluating their statistical quality and practical value as deliverable datasets for institutional clients;
  - Package validated research outputs into standardized data products (structured datasets, scoring series, research reports) ready for client integration.
- LLM Application Development & Product Delivery
  - Design and iterate on prompt engineering solutions for financial use cases, including Chain-of-Thought, Structured Output, and Few-shot approaches, to improve extraction accuracy and consistency;
  - Build RAG (Retrieval-Augmented Generation) systems enabling intelligent Q&A and automated summarization grounded in historical research reports, regulatory filings, and market data;
  - Design LLM Agent workflows that automate research processes such as data retrieval, analysis, and report generation, improving internal research efficiency and accelerating product iteration;
  - Collaborate with product and engineering teams to move research prototypes into production, ensuring reliability and scalability.
- Evaluation Framework & Quality Assurance
  - Build LLM evaluation frameworks for financial scenarios, covering dimensions such as extraction accuracy, hallucination rate, and output consistency, to inform model selection and iteration decisions;
  - Design data quality monitoring pipelines to ensure the stability and auditability of datasets and signals delivered to clients;
  - Manage cloud-based LLM inference pipelines, controlling API costs and optimizing batch-processing throughput.
- Research Tracking & Client Insight Translation
  - Continuously track the latest developments from NLP/LLM conferences (ACL, EMNLP, NeurIPS, ICML) and the financial AI space, rapidly assessing productization potential;
  - Identify new data product opportunities based on client feedback and market demand, driving the full cycle from research idea to deliverable product.
Key Requirements
- Master's or PhD in Computer Science, Artificial Intelligence, Statistics, Mathematics, Physics, or a related field from a reputable university; exceptional undergraduate candidates will also be considered;
- Solid foundations in machine learning and NLP, with a clear understanding of the Transformer architecture and how mainstream LLMs work;
- Proficient in Python and experienced in working with LLMs, capable of independently building end-to-end data processing and model evaluation pipelines;
- Strong research discipline — from data cleaning and experiment design through to result validation — with an emphasis on reproducibility and statistical rigor;
- Strong English reading and writing skills; able to read academic papers, technical documentation, and financial texts fluently;
- Ability to define problems clearly and think in terms of productization — translating ambiguous client needs into concrete technical solutions;
- Must hold a valid Hong Kong work visa (IANG, Quality Migrant Admission Scheme, or Top Talent Pass Scheme) or Hong Kong permanent residency.
Bonus Points
- Hands-on experience building production-grade LLM applications (RAG, Agents, structured output, etc.);
- Familiarity with financial text data (SEC filings, Bloomberg/Wind news, earnings call transcripts) and their processing and analysis;
- Experience constructing NLP-derived quantitative factors and conducting backtests;
- Practical experience with LLM fine-tuning (SFT/DPO) or model evaluation (Evals);
- Basic understanding of institutional investor (fund/asset management) research workflows, with the ability to communicate technical findings clearly.