Data Scientist with Python expertise in New York

Capgemini
New York, US
On-site
Visa Sponsorship

Job Description

Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

Onsite: New York

Key Responsibilities

Attribution & Measurement Modeling

  • Build and maintain multi-touch attribution (MTA) models - touch-order aware, channel-weighted, with incremental lift quantification across owned, paid, and clean room channels
  • Develop cohort-level LTV/CAC scoring models using transaction signals, behavioral features (SHAP-ranked), and propensity scores - deployed at segment and micro-cohort resolution
  • Design holdout and matched-market test frameworks for measuring incrementality across CTV, display, paid search, and social channels
  • Build probabilistic identity linkage models for household graph construction and cross-device resolution where deterministic signals are absent

Audience Intelligence

  • Develop SHAP-based feature importance pipelines for audience signal ranking - surfacing top predictive signals per segment for AI-generated audience briefs
  • Build behavioral micro-cohort clustering using unsupervised and semi-supervised methods on transaction and lifestyle features - producing 10+ interpretable sub-cohorts per major audience segment
  • Design suppression, exclusion, and lookalike model pipelines that feed into DSP activation and clean room audience delivery

AI Integration & Insight Generation

  • Collaborate with engineering to design system prompts, structured output schemas, and evaluation frameworks for AI-powered audience authoring, measurement intelligence, and campaign brief generation
  • Build model evaluation pipelines comparing AI-generated audience segments against held-out conversion actuals, benchmarking performance vs. deterministic baselines
  • Develop geo-level DMA performance models: LTV/CAC opportunity mapping, state-vs-DMA benchmarking, and priority zone classification for campaign planning
  • Author AI-assisted insight narratives - translating model outputs into plain-language recommendations surfaced to client marketing teams through the platform UI

Required Qualifications

  • 5+ years applied data science experience
  • Expert Python proficiency: scikit-learn, XGBoost or LightGBM, SHAP, pandas, statsmodels, and at least one deep learning framework for production model development
  • Deep expertise in multi-touch attribution methodologies: MTA, media mix modeling (MMM), incrementality testing, and controlled experiment design
  • Experience building LTV, propensity, and CAC models on financial transaction or behavioral data at segment and sub-segment resolution
  • Comfort operating inside data clean rooms - designing models that run on privacy-preserving aggregates rather than individual-level raw data
  • Strong statistical foundations: causal inference, Bayesian methods, survival analysis, and experiment design
  • Fluent SQL across cloud data warehouses (Snowflake, BigQuery, Redshift, or equivalent) and experience working with ML platforms such as Vertex AI, SageMaker, or Databricks MLflow
  • Ability to translate complex model outputs into business narratives for VP- and C-level marketing stakeholders

Preferred Qualifications

  • Experience designing AI-augmented analytics workflows - using LLM APIs for structured output generation, signal summarization, or compliance pre-screening alongside traditional models
  • Familiarity with walled garden measurement environments: Google ADH, Meta Analytics API, Amazon Attribution
  • Graph-based modeling experience - using Neo4j, Amazon Neptune, or similar for identity linkage, co-purchase signals, or household relationship modeling
  • Demonstrated expertise in identity resolution, household modeling, or cross-device attribution at scale

The base compensation range for this role in the posted location is $100,000 to $130,000.

Capgemini provides compensation range information in accordance with applicable national, state, provincial, and local pay transparency laws. The base compensation range listed for this position reflects the minimum and maximum target compensation Capgemini, in good faith, believes it may pay for the role at the time of this posting. This range may be subject to change as permitted by law.

The actual compensation offered to any candidate may fall outside of the posted range and will be determined based on multiple factors legally permitted in the applicable jurisdiction.

These may include, but are not limited to: geographic location, education and qualifications, certifications and licenses, relevant experience and skills, seniority and performance, market and business considerations, and internal pay equity.

It is not typical for candidates to be hired at or near the top of the posted comp

Skills & Requirements

Technical Skills

Python, scikit-learn, XGBoost, LightGBM, SHAP, pandas, statsmodels, deep learning, multi-touch attribution, media mix modeling, incrementality testing, controlled experiment design, LTV, propensity, CAC models, transaction signals, behavioral features, propensity scores, holdout, matched-market test frameworks, CTV, display, paid search, social channels, SHAP-based feature importance pipelines, audience signal ranking, behavioral micro-cohort clustering, transaction and lifestyle features, suppression, exclusion, lookalike model pipelines, DSP activation, clean room audience delivery, system prompts, structured output schemas, evaluation frameworks, AI-powered audience authoring, measurement intelligence, campaign brief generation, model evaluation pipelines, AI-generated audience segments, held-out conversion actuals, benchmarking performance, geo-level DMA performance models, LTV/CAC opportunity mapping, state-vs-DMA benchmarking, priority zone classification, campaign planning, AI-assisted insight narratives, plain-language recommendations, platform UI, data science, machine learning, artificial intelligence, audience intelligence, attribution & measurement modeling

Level

mid

Posted

4/9/2026
