Salesforce's Data & Analytics organization is looking for a Decision Scientist who thrives at the intersection of AI, automation, and rigorous quantitative analysis. In this role, you'll build the measurement frameworks that define how we evaluate agentic systems, run high-stakes experiments, and directly shape product and business strategy through data-driven storytelling. This is a senior individual contributor role with real executive visibility and the opportunity to pioneer how an enterprise thinks about AI reliability and impact.
What You'll Do
- Build evaluation frameworks for AI systems — Design and scale methodologies to assess the performance, reasoning quality, and reliability of agentic workflows, including LLM-as-a-judge metrics and approaches tailored to non-deterministic outputs.
- Solve complex attribution problems — Develop causal inference models that distinguish true incremental gains from organic trends, using the right modeling technique for each problem.
- Lead experimental design — Own the design and analysis of sophisticated experiments — from multivariate and switchback designs to quasi-experimental methods for environments where randomization isn't feasible.
- Set the statistical standard — Serve as the department's final reviewer for statistical methodology, ensuring rigor is appropriately calibrated to the stakes of each analysis.
- Influence strategy through storytelling — Translate complex quantitative findings into clear, actionable narratives for executive leadership, shaping both the product roadmap and long-term business direction.
What We're Looking For
Required Qualifications
- 8+ years of experience in a quantitative role, with a proven track record of deploying causal models or experimental frameworks in production environments
- Deep expertise in causal inference, high-dimensional regression, time-series analysis, and forecasting
- Strong proficiency in Python (PyData stack: pandas, NumPy, SciPy, statsmodels, scikit-learn) or R
- Expert-level SQL skills for complex data extraction, feature engineering, and query performance tuning in cloud data warehouses (e.g., Snowflake, BigQuery)
- Demonstrated ability to communicate statistical findings clearly to non-technical executive audiences
Preferred Qualifications
- Familiarity with LLM evaluation metrics and the unique statistical challenges of non-deterministic AI systems
- Experience working on or alongside agentic or AI product teams
- Background in experimental economics or operations research
Education
A related technical degree is required.