Senior Data Scientist – International eKYC, Identity Graph

Socure
Washington, US
Remote

Job Description

  • Lead the design, development, and deployment of ML and graph-based algorithms for international entity resolution, identity trust scoring, and anomaly detection across heterogeneous, country‑specific datasets.
  • Architect reusable matching and linking frameworks that work across multiple ID schemes (e.g., national ID numbers, passports, voter IDs, mobile accounts, bank accounts) and local name/address conventions.
  • Develop probabilistic and rule‑augmented models that handle noisy, sparse, or partially labeled international data while maintaining explainability and regulatory defensibility.
  • Define and evolve the international extension of Socure’s identity graph: schema design, linkage strategies, quality tiers, and confidence scoring that can be leveraged by multiple products (Verify, KYC, watchlists, fraud).
  • Design and implement robust data quality and monitoring frameworks for international identity data (coverage, stability, drift, regional bias, label quality) and integrate them into modeling and production monitoring workflows.
  • Own experimentation strategy for major international eKYC initiatives:
      • Design offline evaluations and online A/B tests that reflect local ground truth constraints and data sparsity.
      • Define success metrics that balance approval rates, fraud capture, and regulatory/operational constraints per market.
      • Analyze lift, stability, and fairness trade‑offs and drive go/no‑go decisions with Product and Engineering.
  • Contribute to model governance documentation and support responses to regulators and large enterprise customers regarding model logic, data provenance, fairness, and monitoring for international markets.

Requirements:

  • Master’s or Ph.D. in Computer Science, Data Science, Machine Learning, Statistics, Mathematics, or a related field, or equivalent practical experience.
  • 6+ years of hands-on applied ML / data science experience (4+ with Ph.D.), including owning production models and pipelines in high‑stakes domains (fraud, risk, identity, payments, credit, or similar).
  • Significant prior work on international or multi‑region products is strongly preferred (e.g., cross‑country KYC, credit risk, payments, or compliance systems).
  • Expert‑level proficiency in Python and SQL, with extensive experience in distributed data processing (Spark/PySpark, Databricks or similar) on very large datasets.
  • Deep experience designing, training, and deploying models for classification, ranking, anomaly detection, and/or graph learning, including:
      • Feature engineering for noisy/heterogeneous identity data.
      • Robust evaluation under label sparsity and feedback delays.
      • Calibration and thresholding tailored to regional risk and regulatory constraints.
  • Proven expertise with graph technologies (e.g., Neo4j, AWS Neptune, GraphFrames, DGL, PyTorch Geometric) and graph algorithms (entity resolution, link prediction, community detection, label propagation) at scale.

Benefits:

  • Offers Equity
  • Offers Bonus

Skills & Requirements

Technical Skills

Python, SQL, Spark, PySpark, Databricks, Neo4j, AWS Neptune, GraphFrames, DGL, PyTorch Geometric

Employment Type

Full-time

Level

Senior

Posted

4/24/2026
