Responsibilities
International Content Safety Algorithm Research Team:
The International Content Safety Algorithm Research Team is dedicated to maintaining a safe and trustworthy environment for users of ByteDance's international products. We develop and iterate on machine learning models and information systems to identify risks earlier, respond to incidents faster, and monitor potential threats more effectively. The team also leads the development of foundational large models for products.
In the R&D process, we tackle key challenges such as data compliance, model reasoning capability, and multilingual performance optimization. Our goal is to build secure, compliant, and high-performance models that empower various business scenarios across the platform, including content moderation, search, and recommendation.
Research Project Background:
In recent years, Large Language Models (LLMs) have achieved remarkable progress across various domains of natural language processing (NLP) and artificial intelligence. These models have demonstrated impressive capabilities in tasks such as language generation, question answering, and text translation.
However, reasoning remains a key area for further improvement. Current approaches to enhancing reasoning abilities often rely on large amounts of Supervised Fine-Tuning (SFT) data, but acquiring such high-quality data is expensive and poses a significant barrier to scalable model development and deployment.
To address this, OpenAI's o1 series of models has made progress by increasing the length of the Chain-of-Thought (CoT) reasoning process. While this technique has proven effective, how to scale it efficiently at test time remains an open question. Recent research has explored alternative methods such as Process-based Reward Models (PRM), Reinforcement Learning (RL), and Monte Carlo Tree Search (MCTS) to improve reasoning.
These approaches, however, still fall short of the general reasoning performance achieved by OpenAI's o1 series of models. Notably, the recent DeepSeek R1 paper suggests that pure RL methods can enable LLMs to autonomously develop reasoning skills without relying on expensive SFT data, revealing the substantial potential of RL in advancing LLM capabilities.
Project Challenges:
Qualifications
Bonus Points
senior
5/1/2026