Local candidates only - in-person interview is a MUST
Duration - 6+ months
Location - Chicago Downtown - onsite Tue, Wed and Thu
Key Responsibilities
- Design, build, and support end-to-end data pipelines (ingestion, transformation, validation, publishing).
- Develop and optimize SQL and PySpark/Databricks transformations for large datasets.
- Build production-grade Python components (reusable modules, logging, error handling, testing).
- Create and maintain Azure Data Factory (ADF) pipelines (triggers, parameterization, monitoring, failure handling).
- Work within Azure environments (ADLS Gen2, Azure SQL, resource groups, portal operations).
- Provision and maintain Azure components using Pulumi (Infrastructure as Code).
- Participate in code reviews, documentation, and operational support.
Must-have Skills (Required)
- 7+ years of experience as a data engineer
- 3+ years working with ETL/ELT concepts: Strong understanding of pipeline patterns, incremental loads, data validation, and troubleshooting.
- 3+ years in SQL: Advanced querying (CTEs, views, joins, complex query logic) and performance tuning for transformations and validation.
- 2+ years using Python: Production-quality development (modular code, testing, logging, integration with APIs/files, CI/CD, unit/integration test automation, code coverage).
- PySpark: Distributed transformations and performance optimization, CI/CD, unit/integration test automation, code coverage.
- 2+ years using Azure Data Factory (ADF)
- 2+ years using Databricks
- Azure Fundamentals + Pulumi: Hands-on experience with ADLS Gen2, Azure Portal, Storage Explorer, Resource Groups, and Azure SQL, plus familiarity with Azure OpenAI integration. Able to use and maintain Pulumi scripts for provisioning and managing Azure resources across environments.
Nice-to-have Skills
- R: Ability to support/translate validation rules with SQL scripts and create data quality reports.
- TypeScript: Useful for Pulumi pipelines that create Azure components.
- Java: Useful for integration with existing services/components.
- .NET: Useful for integration with existing services/components.
- Angular / Spring Boot: Minor troubleshooting or coordination with app teams.
Domain: Clinical (healthcare) – Nice to have.
#TECH #Chicagohybrid