Dice is the leading career destination for tech experts at every stage of their careers. Our client, Stanley David and Associates, is seeking the following. Apply via Dice today!
Job Title: Big Data, Python and PySpark, Google Cloud Platform
Location: Phoenix, AZ
Job Type: Full-time
Job Description:
Roles and Responsibilities:
- Develop and maintain data pipelines using Big Data processes
- Focus on ingesting, storing, processing, and analyzing large datasets
Required skills and qualifications:
- Design, develop, and maintain scalable ETL/ELT pipelines using PySpark, Airflow, and Google Cloud Platform-native tools.
- Build and optimize data warehouses and analytics solutions in BigQuery.
- Implement and manage workflow orchestration with Airflow/Cloud Composer.
- Write complex SQL queries for data transformations, analytics, and performance optimization.
- Ensure data reliability, security, and governance across pipelines.
- Conduct performance tuning and cost optimization of BigQuery and PySpark workloads.
- Collaborate with analysts and product teams to deliver reliable data solutions.
- Troubleshoot, debug, and resolve production issues in large-scale data pipelines.
- Contribute to best practices, reusable frameworks, and automation for data engineering.
- 5+ years of experience in Data Engineering/Data Warehousing using Big Data technologies will be an add-on
- Expertise in the distributed ecosystem
- Hands-on experience with programming using Python
- Expertise in Hadoop and Spark architecture and their working principles
- Hands-on experience writing and understanding complex SQL (Hive/PySpark DataFrames) and optimizing joins when processing large amounts of data
- Experience in UNIX shell scripting
- Ability to design and develop optimized data pipelines for batch and real-time data processing
- Should have experience in analysis, design, development, testing, and implementation of system applications
- Demonstrated ability to develop and document technical and functional specifications and analyze software and system processing flows.