This position sits on the Data Analytics team within Technology Services. The Data Engineer will be responsible for designing, developing, and maintaining our data infrastructure. The role involves working closely with data scientists, analysts, and other stakeholders to ensure that data pipelines are efficient, reliable, and scalable. The ideal candidate will have a strong background in data engineering, a passion for technology, and a commitment to delivering high-quality solutions.
Education/Experience:
- Bachelor’s degree in computer science, engineering, or a related field preferred
- Experience implementing analytics solutions using the Microsoft analytics toolset and Microsoft Azure
- Experience working in a fast-paced, competitive information technology organization
- Exempt 05 requires a minimum of 3 years of experience
- Exempt 06 requires a minimum of 5 years of experience
Knowledge/Skills:
- Strong Programming Skills
  - Languages: Proficient in languages such as SQL, Python, or Java; these are essential for building data pipelines, scripting automation, and manipulating data.
  - Efficiency: Writing efficient, scalable, and clean code to handle large datasets and optimize performance.
- Deep Knowledge of Databases & Data Warehousing
  - SQL and NoSQL: Mastery of relational databases (e.g., MS SQL Server, MySQL, PostgreSQL, Oracle) and NoSQL databases (e.g., MongoDB, Cassandra, DynamoDB), knowing when and how to use each type.
  - Data Warehousing: Expertise in using platforms such as MS SQL Server, Oracle, Amazon Redshift, Google BigQuery, Snowflake, or traditional data warehouses to store, query, and manage large volumes of data.
  - Database Optimization: Skills in optimizing queries, indexing, partitioning, and designing efficient database schemas for high performance.
- Expertise in Data Pipeline Construction
  - ETL/ELT Processes: Proven ability to design and build robust ETL (Extract, Transform, Load) pipelines for collecting, cleaning, and moving data.
  - Real-time and Batch Processing: Experience working with both batch and real-time data processing frameworks (e.g., Apache Kafka, Apache Flink, Apache Spark, Databricks).
  - Data Orchestration: Familiarity with orchestration tools like Apache Airflow.
- Strong Knowledge of Medallion Architecture
  - Cloud Services: Proficiency in cloud platforms like Azure (SQL Database, Data Lake, Lakehouse), AWS (S3, Lambda, Redshift), or Google Cloud (BigQuery, Dataflow); Azure is preferred.
  - Scalability: Ability to design systems that scale efficiently in the cloud, handling big data and increasing demand without sacrificing performance.
- Data Transformation and Cleaning Skills
  - Data Quality Management: Experience in data cleansing, validation, and transformation, ensuring that data is accurate, complete, and in the right format for analysis.
  - Data Integration: Expertise in integrating data from diverse sources (internal and external) while resolving issues like inconsistency or format mismatches.
- Performance Optimization and Troubleshooting
  - Query Optimization: Ability to fine-tune queries, databases, and pipelines to reduce latency, optimize resource usage, and speed up data processing.
  - System Monitoring: Familiarity with monitoring systems and logging tools to detect, diagnose, and resolve performance or data issues.
  - Data Interpretation: Ability to translate business requirements into technical solutions, ensuring the correct data is collected and processed for reporting, analytics, and decision-making.
  - Problem Solving: A strong ability to troubleshoot and resolve complex data challenges or inconsistencies that can affect the integrity and availability of data.
- Knowledge of Big Data Tools
  - Big Data Frameworks: Familiarity with big data technologies such as Spark or Flink for processing large datasets across distributed systems.
  - Data Lakes and Data Pipelines: Experience with data lakes (e.g., Azure Lakehouse, AWS S3, HDFS) for storing raw and unstructured data and building pipelines to process it efficiently.
- Collaboration and Communication Skills
  - Cross-functional Collaboration: Proven ability to work closely with data scientists, analysts, and other stakeholders to understand data needs and deliver optimal solutions.
  - Clear Communication: Ability to explain technical concepts to non-technical stakeholders, ensuring that data infrastructure decisions align with business goals.
- Data Security & Governance Awareness
  - Data Privacy: Knowledge of data privacy regulations (e.g., GDPR, CCPA) and experience ensuring that systems comply with these laws while managing sensitive data.
  - Access Control: Experience implementing strong data access controls, encryption, and monitoring to secure data both at rest and in transit.
- Adaptability to New Tools and Technologies
  - Continuous Learning: A strong commitment to keeping up with the rapidly evolving tech landscape and experimenting with new tools and technologies.