What The Role Is
We are seeking a talented Senior Data Engineer to design, build, and maintain the data infrastructure that supports our mission-critical energy operations. You'll work at the intersection of renewable energy and data, developing pipelines that handle everything from real-time asset performance data to complex trading and risk analytics. This hybrid role offers the opportunity to make a direct impact on clean energy operations while working with a cutting-edge data stack including Snowflake, Dagster, dbt, Modal, and GitLab.
What You'll Be Doing
Analytics Infrastructure & Data Warehouse Management
- Design, deploy, and maintain scalable data infrastructure to support enterprise analytics and reporting needs
- Manage Snowflake instances, including performance tuning, security configuration, and capacity planning for growing data volumes
- Optimize query performance and resource utilization to control costs and improve processing speed (see the sketch below)
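To give a concrete flavor of this work, here is a minimal, illustrative sketch of warehouse tuning and cost attribution using the snowflake-connector-python client. The credentials, warehouse, and table names are placeholders, not references to any actual environment.

```python
# Illustrative sketch: scale a warehouse ahead of a heavy load window, cap
# idle spend with auto-suspend, and tag the session for cost attribution.
# All identifiers below are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(
    account="your_account",   # placeholder credentials
    user="your_user",
    password="your_password",
    role="SYSADMIN",
)
cur = conn.cursor()
try:
    # Scale up for the nightly batch, then limit idle runtime.
    cur.execute("ALTER WAREHOUSE ANALYTICS_WH SET WAREHOUSE_SIZE = 'LARGE'")
    cur.execute("ALTER WAREHOUSE ANALYTICS_WH SET AUTO_SUSPEND = 60")
    # Tag the session so this workload shows up in cost reporting.
    cur.execute("ALTER SESSION SET QUERY_TAG = 'nightly_asset_rollup'")
    cur.execute("SELECT COUNT(*) FROM RAW.ASSET_READINGS")
    print(cur.fetchone())
finally:
    cur.close()
    conn.close()
```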
Data Pipeline Development & Orchestration
- Build and orchestrate complex ETL/ELT workflows using Dagster to ensure reliable, automated data processing for asset management and energy trading (sketched below)
- Develop robust data pipelines that handle high-volume, time-sensitive energy market data as well as asset generation and performance metrics
- Implement workflow automation and dependency management for critical business operations
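As a concrete illustration, here is a minimal Dagster sketch: two dependent assets and a daily schedule. The asset names and the inline data are hypothetical stand-ins for real sources.

```python
# Illustrative Dagster sketch: two dependent assets plus a daily schedule.
from dagster import (
    AssetSelection,
    Definitions,
    ScheduleDefinition,
    asset,
    define_asset_job,
)

@asset
def raw_meter_readings() -> list[dict]:
    # Placeholder: in practice this would pull from a market data API
    # or a landing bucket.
    return [{"asset_id": "site_a", "mwh": 12.4}]

@asset
def daily_generation(raw_meter_readings: list[dict]) -> dict:
    # Dagster infers the dependency from the parameter name.
    total = sum(r["mwh"] for r in raw_meter_readings)
    return {"total_mwh": total}

daily_job = define_asset_job("daily_generation_job", selection=AssetSelection.all())
daily_schedule = ScheduleDefinition(job=daily_job, cron_schedule="0 5 * * *")

defs = Definitions(
    assets=[raw_meter_readings, daily_generation],
    jobs=[daily_job],
    schedules=[daily_schedule],
)
```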
Data Transformation & Analytics Support
- Develop and maintain dbt models to transform raw data into business-ready analytical datasets and dimensional models (see the example below)
- Create efficient SQL-based transformations for complex energy market calculations and asset performance metrics
- Support advanced analytics initiatives through proper data preparation and feature engineering
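dbt models themselves are SQL, but because this stack pairs dbt with Dagster, the sketch below shows one common way to wire the two together via the dagster-dbt integration, so dbt models run as assets alongside the rest of the pipeline. The project path is hypothetical, and the pattern assumes a compiled manifest.json exists.

```python
# Illustrative dagster-dbt sketch: expose a dbt project's models as assets.
from pathlib import Path

from dagster import AssetExecutionContext, Definitions
from dagster_dbt import DbtCliResource, dbt_assets

DBT_PROJECT_DIR = Path("analytics_dbt")  # hypothetical dbt project path

@dbt_assets(manifest=DBT_PROJECT_DIR / "target" / "manifest.json")
def analytics_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    # `dbt build` runs models and tests; events stream back as asset events.
    yield from dbt.cli(["build"], context=context).stream()

defs = Definitions(
    assets=[analytics_dbt_assets],
    resources={"dbt": DbtCliResource(project_dir=str(DBT_PROJECT_DIR))},
)
```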
Data Quality & Governance
- Implement comprehensive data validation, testing, and monitoring frameworks to ensure accuracy and consistency across all energy and financial data assets (illustrated below)
- Establish data lineage tracking and privacy controls to meet regulatory compliance requirements in the energy sector
- Develop alerting and monitoring systems for data pipelines, including error handling, SLA monitoring, and incident response
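One way this kind of validation shows up in practice is a Dagster asset check attached to a pipeline asset; a sketch follows, with hypothetical asset and column names.

```python
# Illustrative sketch: a Dagster asset check that flags bad market data.
import pandas as pd
from dagster import AssetCheckResult, asset, asset_check

@asset
def settlement_prices() -> pd.DataFrame:
    # Placeholder load; in practice this comes from an upstream pipeline.
    return pd.DataFrame({"node": ["HUB_A"], "price_usd_mwh": [42.5]})

@asset_check(asset=settlement_prices)
def prices_present_and_positive(settlement_prices: pd.DataFrame) -> AssetCheckResult:
    # Count nulls and negative prices; either indicates a broken feed.
    nulls = int(settlement_prices["price_usd_mwh"].isna().sum())
    negatives = int((settlement_prices["price_usd_mwh"] < 0).sum())
    return AssetCheckResult(
        passed=(nulls == 0 and negatives == 0),
        metadata={"null_rows": nulls, "negative_rows": negatives},
    )
```

Check results like these feed naturally into the alerting and SLA monitoring described above.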
CI/CD & DevOps
- Lead continuous integration and deployment for Dagster and dbt pipelines, as well as Streamlit/Gradio application deployments to Linux servers
- Implement automated testing and deployment for data pipelines and analytics applications (see the test sketch below)
- Manage version control and infrastructure as code practices
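For a flavor of the automated testing involved, here is a minimal pytest sketch of the kind of unit test a GitLab CI job would run on every merge request. The transformation function and its logic are purely illustrative.

```python
# Illustrative pytest sketch: unit-testing a transformation before deploy.
import pandas as pd

def normalize_readings(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate readings and convert kWh to MWh (hypothetical logic)."""
    out = df.drop_duplicates(subset=["asset_id", "ts"]).copy()
    out["mwh"] = out["kwh"] / 1000.0
    return out

def test_normalize_readings_dedupes_and_converts():
    raw = pd.DataFrame(
        {
            "asset_id": ["a", "a"],
            "ts": ["2024-01-01T00:00", "2024-01-01T00:00"],
            "kwh": [500.0, 500.0],
        }
    )
    result = normalize_readings(raw)
    assert len(result) == 1
    assert result["mwh"].iloc[0] == 0.5
```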
Cross-functional Collaboration
- Partner with Analytics Engineers, Data Scientists, and business stakeholders to understand requirements and deliver solutions
- Work closely with asset management and trading groups to ensure real-time data availability for market operations and risk calculations
- Collaborate with credit risk teams to develop data models supporting financial analysis and regulatory reporting
- Translate business requirements into technical solutions and communicate data insights to stakeholders
Documentation & Security
- Create and maintain technical documentation, data dictionaries, and onboarding materials for data assets
- Implement role-based access controls, data encryption, and security best practices across the data stack
- Monitor and optimize cloud infrastructure costs, implement resource allocation strategies, and provide cost forecasting (example below)
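As one example of cost monitoring, the sketch below pulls per-warehouse credit usage from Snowflake's ACCOUNT_USAGE views as an input to cost reporting and forecasting. Credentials are placeholders.

```python
# Illustrative sketch: 30-day credit usage per warehouse from ACCOUNT_USAGE.
import snowflake.connector

conn = snowflake.connector.connect(
    account="your_account",   # placeholder credentials
    user="your_user",
    password="your_password",
    role="ACCOUNTADMIN",      # ACCOUNT_USAGE requires elevated privileges
)
cur = conn.cursor()
try:
    cur.execute(
        """
        SELECT warehouse_name, SUM(credits_used) AS credits
        FROM snowflake.account_usage.warehouse_metering_history
        WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
        GROUP BY warehouse_name
        ORDER BY credits DESC
        """
    )
    for warehouse, credits in cur.fetchall():
        print(f"{warehouse}: {credits:.1f} credits in the last 30 days")
finally:
    cur.close()
    conn.close()
```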
What You'll Bring
- Experience: 4+ years of hands-on data engineering experience in production environments
- Education: Bachelor's degree in Computer Science, Engineering, or a related field
- Data Orchestration: Proficiency in Dagster for pipeline scheduling, dependency management, and workflow automation; Airflow experience a plus
- Cloud Data Warehousing: Advanced-level Snowflake administration, including virtual warehouses, clustering, security, and cost optimization
- Data Transformation: Proficiency in dbt for data modeling, testing, documentation, and version control of analytical transformations
- Programming: Strong Python and SQL skills for data processing and automation
- CI/CD: 3+ years of experience with continuous integration and continuous deployment practices and tools; proficiency in GitLab CI/CD required (GitHub Actions experience a plus)
- Database Technologies: Advanced SQL skills, database design principles, and experience with multiple database platforms
- Cloud Platforms: Proficiency in AWS/Azure/GCP data services, storage solutions (S3, Azure Blob, GCS), and infrastructure as code
- Data Integration: Experience with APIs and a variety of data connectors and file formats
- Data Security: Understanding of data security best practices, access controls, encryption, and role-based access management
- AI & LLM Proficiency: Practical experience integrating and leveraging large language models (e.g., OpenAI, Anthropic, or open-source models) within data workflows; ability to apply LLMs efficiently and securely, with awareness of data privacy boundaries, prompt injection risks, and responsible use (see the sketch below)
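To illustrate the last point, here is a minimal sketch of LLM use inside a data workflow with basic privacy and prompt-injection hygiene, using the OpenAI Python client as one possible provider. The model name, redaction rule, and task are all hypothetical.

```python
# Illustrative sketch: classify free-text outage notes with an LLM while
# keeping privacy boundaries and prompt-injection risk in mind.
import re

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def redact(text: str) -> str:
    """Strip obvious email addresses before text leaves the privacy boundary."""
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[REDACTED_EMAIL]", text)

def classify_note(note: str) -> str:
    # Untrusted text goes only in the user message, never in the system
    # prompt, which limits (but does not eliminate) prompt-injection risk.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {
                "role": "system",
                "content": "Classify the outage note as one of: EQUIPMENT, "
                           "WEATHER, GRID, OTHER. Reply with the label only.",
            },
            {"role": "user", "content": redact(note)},
        ],
    )
    return response.choices[0].message.content.strip()
```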