Derived from job-description analysis by Serendipath's career intelligence engine.
Original posting from Collabera
Title: Senior Data Engineer
Client: Investments Industry
# of Openings: 1
Type: 6-Month Contract (High likelihood of extension)
Location: Toronto, ON
Work Model: 4 days/week onsite, Friday WFH
PR: $80-100/hr
Role Overview
- We are seeking a Senior Data Engineer (8-10+ years experience) to support a large-scale data platform transformation within the Total Fund Management (TFM) team.
- This role will focus on migrating and modernizing existing Databricks-based pipelines to AWS (EMR Spark), with an initial lift-and-shift phase, followed by optimization and redesign into scalable, consumable data products.
- This is a highly autonomous, hands-on role requiring strong PySpark expertise, deep experience with distributed data systems, and the ability to navigate complex, multi-source datasets (including market and reference data vendors).
Day-to-Day Responsibilities
- Migrate existing Databricks-based Spark pipelines to AWS EMR (Spark)
- Perform lift-and-shift of ~50+ datasets, some with high complexity and multiple data sources
- Refactor and optimize data pipelines for performance, scalability, and reliability
- Structure and store data using Parquet and Iceberg formats
- Improve and clean up legacy data pipelines built over several years
- Design data with a consumption-first mindset (e.g., partitioning strategies, access patterns, data usability)
- Collaborate with stakeholders to understand data requirements and translate into scalable solutions
- Ensure production readiness including monitoring, orchestration, and deployment
- Work independently to drive delivery from design through implementation
Key Responsibilities
- Develop and optimize large-scale PySpark data pipelines
- Rebuild and enhance Spark workloads in AWS (EMR)
- Leverage tools such as Airflow, AWS Glue, and Lake Formation
- Handle parallel/distributed data processing workloads
- Improve system performance and data quality across pipelines
- Engage with business and technical stakeholders to align on data needs
- Own delivery with minimal oversight in a fast-paced environment
Must-Haves
- 8-10+ years of Data Engineering experience (senior-level profiles only)
- Strong hands-on expertise in Python and PySpark
- Deep experience with Apache Spark in distributed environments
- Proven experience working with large-scale, complex data pipelines
- Experience with Databricks (existing environment)
- Strong knowledge of Parquet and Iceberg data formats
- Experience with AWS data ecosystem (EMR preferred)
- Familiarity with Airflow, Glue, and Lake Formation
- Strong understanding of parallel/distributed data processing
- Ability to work independently with strong problem-solving skills
- Experience in ambiguous environments with evolving requirements
Nice-to-Haves
- Prior experience in capital markets or investment management
- Experience working with market data / reference data vendors
- Experience designing data products and consumption layers
- Exposure to large-scale data platform migrations or transformations
We may use AI-enabled and/or automated tools to support parts of our recruitment process, including application screening, interview scheduling, and candidate communications. These tools are used to enhance consistency and efficiency. All hiring decisions involve human review and are not based solely on automated processing.
The Company offers a total rewards package in accordance with all applicable federal, provincial, and local laws and requirements. Benefit eligibility and offerings vary based on role, employment status, and work location. For contractor positions, benefits are limited to those entitlements and protections required by applicable law, which may include (as applicable) vacation pay, public holidays, leaves of absence, and other legally mandated benefits or payments.
Source: Collabera careers