Sr. Data Engineer

Starcom Mediavest Group Germany GmbH
Toronto, CA; US
On-site

Job Description

We are looking for a Senior Data Engineer to build and operate production-grade data pipelines and Lakehouse models that power a large-scale customer and marketing data ecosystem. You will help transform complex, multi-source data into trusted datasets and curated layers that support reporting, advanced analytics, and downstream data products.

You will partner with product, strategy, analytics, and engineering teams to define data contracts and SLAs, implement ingestion and transformation patterns, and ensure strong data quality, lineage, governance, and observability in production. The role is hands-on in a cloud lakehouse environment and requires an ownership mindset across the lifecycle, from design and build to operations, cost/performance optimization, and secure handling of sensitive and privacy-regulated data (including data residency constraints).

Responsibilities

Data Pipelines & Ingestion

  • Design, build, and operate reliable batch and streaming pipelines to onboard data from APIs, databases, event streams, and file drops into the lakehouse.
  • Implement scalable ETL/ELT transformations (Spark/SQL) and curate data through standardized layers (e.g., bronze/silver/gold) with clear ownership and SLAs.
  • Integrate multiple sources and implement consistent keys, business rules, and validation checks in partnership with platform and analytics teams.
  • Build automated data quality checks, tests, and reconciliation to ensure trusted, production-grade outputs.
  • Design for multi-environment delivery: implement strong access control patterns and secure handling of sensitive data.

Data Modeling & Serving

  • Design analytics-friendly models (dimensional and/or wide-table) and publish curated datasets for BI and self-service use cases.
  • Build and optimize lakehouse structures to support downstream analytics applications and advanced analytics workflows.
  • Define consistent metric logic and implement semantic modeling patterns to enable governance, reuse, and self-service analytics.
  • Optimize performance and cost via partitioning, clustering, file sizing, and workload tuning across storage and compute.
  • Maintain lineage and documentation so stakeholders can understand dataset meaning, freshness, and limitations.

Platform, Orchestration & Operations

  • Build and maintain orchestration workflows (e.g., ADF/Airflow) with robust retry, alerting, backfill, and dependency management.
  • Build and maintain modeling and retraining pipelines, including automated scheduling, monitoring, and safe promotion across environments.
  • Develop and operate in an Azure + Databricks ecosystem; implement lakehouse best practices (Delta formats, job design, cluster policies) for reliable production performance.
  • Implement CI/CD for data pipelines and infrastructure-as-code to enable safe, repeatable releases across environments.
  • Establish observability (logging, monitoring, data freshness/volume checks) and incident response practices; troubleshoot production issues end to end.
  • Apply governance and security controls (RBAC, secrets management, encryption, auditability) and support catalog/permission models (e.g., Unity Catalog or equivalents).

Collaboration & Client Enablement

  • Work closely with cross-functional teams (product, engineering, analytics) to deliver end-to-end data solutions for business-critical data products.
  • Drive alignment on data definitions, contracts, and operational expectations (SLAs/SLOs), including privacy, residency, and governance constraints.
  • Guide technical decisions through clear documentation, trade-off discussions, and pragmatic architecture recommendations.
  • Mentor engineers through pairing, code reviews, and best practices around testing, reliability, and operational excellence.

Qualifications

  • 6+ years of experience in data engineering (or equivalent), building and operating data pipelines and platforms in production.
  • Strong programming skills in Python and SQL; solid software engineering practices (clean code, automated testing, code reviews, Git).
  • Hands-on experience with distributed processing (Spark) and designing efficient transformations at scale.
  • Strong experience building data solutions on cloud platforms, preferably Azure, with practical knowledge of IAM, networking, storage, and cost optimization.
  • 3+ years of experience with Databricks (jobs, notebooks, workflows) and lakehouse concepts (Delta/data lakes/data warehouses) including data modeling for analytics.
  • Proficiency with orchestration and scheduling tools (e.g., Azure Data Factory, Airflow) and CI/CD for data workloads.
  • Strong understanding of data quality, observability, governance, and security (access control, encryption, PII handling).
  • Comfortable working in Agile delivery environments and collaborating with client stakeholders.

Nice to Have

  • Experience working with customer/marketing data, complex entity relationships, and privacy-preserving data collaboration approaches.

Skills & Requirements

Technical Skills

Data pipelines, Data modeling, Data quality, Data governance, Data observability, Data security, Data residency, ETL/ELT, Spark, SQL, Azure, Databricks, Delta formats, Jira, Agile, Customer/marketing data, Complex entity relationships, Privacy-preserving data collaboration, Communication, Collaboration, Data engineering, Cloud lakehouse, Advanced analytics, BI, Self-service analytics, Privacy

Employment Type

Full-time

Level

Senior

Posted

4/25/2026

Apply Now

You will be redirected to Starcom Mediavest Group Germany GmbH's application portal.