AI Research Scientist - Text Data

Meta
Seattle, Washington, US
On-site

Job Description

Summary:

Meta is on the lookout for talented AI research scientists who are passionate about shaping the future of Large Language Models. As part of our team, you'll have the opportunity to work with data at a grand scale while tackling challenges that redefine what’s possible in the field. You'll play a key role in our data curation efforts across all stages of LLM development, including pre-training, mid-training, and post-training, while engaging with various domains and modalities, such as web, code, agent, and multilingual data.

Responsibilities:

  • Collaborate with diverse teams to develop Meta's foundational models.
  • Enhance our understanding of data research, focusing on overcoming challenges related to data walls and synthetic data creation.
  • Boost data velocity throughout our workflows and projects by advancing our data tooling.
  • Design scalable and efficient data curation systems and pipelines.
  • Drive high-priority projects in data curation across the training lifecycle.
  • Leverage specialized expertise in areas such as synthetic data, reasoning data, web parsing, and data scaling.
  • Lead complex technical projects from start to finish.

Minimum Qualifications:

  • Bachelor's degree in Computer Science, Computer Engineering, or a relevant technical discipline, or equivalent practical experience.
  • PhD in Computer Science or a related technical field.
  • 2+ years of industry research experience in LLM, NLP, or related AI/ML models.
  • Proven experience as a technical lead, managing significant initiatives with cross-functional impact.
  • Hands-on experience with pre-training or mid-training data curation for large foundational models and familiarity with organic, synthetic, and agentic data.
  • Published research in notable peer-reviewed conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP) and/or significant industry influence in AI.

Preferred Qualifications:

  • Experience with state-of-the-art Large Language Models.
  • Multiple first-author publications in prestigious peer-reviewed conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP).
  • Hands-on experience with modeling frameworks like PyTorch.
  • Experience with SQL and large-scale data management, along with knowledge of tools like Spark and Hive.

Public Compensation:

$184,000/year to $257,000/year + bonus + equity + benefits

Industry: Internet

Equal Opportunity:

Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based on any protected characteristics. We also support applicants with criminal histories, as permitted by law. Meta participates in the E-Verify program where required.

Meta is committed to providing accommodations for candidates with disabilities during the recruitment process. If you require assistance or accommodation, please let us know.

Skills & Requirements

Technical Skills

PyTorch, SQL, Spark, Hive, AI, ML, NLP

Salary

$184,000 - $257,000 per year

Employment Type

Full-time

Level

Senior

Posted

4/20/2026
