Role: Data Engineer / Databricks
Location: Hybrid - Austin, TX
Role Summary
The Enterprise Data Engineer will design, build, and operate scalable data pipelines in a Databricks Lakehouse architecture on Azure, with a primary focus on delivering and sustaining a software-driven data model for analytics and data consumption. This role is hands-on and execution-focused: it supports the project by engineering reliable ingestion from a diverse set of data producers, transformations, data quality checks, and datasets integrated with ServiceNow (ITSM/ITSLM) and ApptioOne (ITFM).
The Data Engineer partners closely with data architects, platform teams, providers, and stakeholders to translate architectural designs into implemented, performant, governed, and production-ready data solutions, and expands the platform using Agile software engineering practices (e.g., GitHub and a CI/CD-based SDLC).
Key Responsibilities
- Build and maintain data models to support data use and consumption, data integration with key systems, semantic analytics, reporting, and executive dashboards.
- Develop scalable data ingestion and transformation pipelines using a combination of Azure PaaS services, Databricks, Delta Lake, Python, and Spark SQL.
- Engineer integrations for ServiceNow operational and SLA datasets and ApptioOne financial and cost allocation data.
- Implement data quality checks, validation rules, and monitoring for end-to-end pipeline reliability.
- Apply Unity Catalog governance controls, including data access, lineage, and schema enforcement, as defined by architectural standards.
- Optimize pipeline performance, storage layouts, and query efficiency within the Databricks Lakehouse.
- Support CI/CD pipelines and DevOps automation for data engineering workflows using Azure DevOps and GitHub Actions.
- Collaborate with architects, client stakeholders, Capgemini teams, and service providers to deliver agreed reporting and analytics outcomes.
- Troubleshoot production data issues and support operational stability of analytics and reporting solutions.
- Contribute to documentation, runbooks, and operational standards for Databricks data pipelines.
Required Skills & Experience
- 5 years of experience in Data Engineering or Analytics Engineering roles.
- Hands-on experience with Databricks, Delta Lake, and Spark-based data pipelines.
- Strong understanding of Medallion Architecture, particularly Gold/Platinum layer implementation.
- Proficiency in Python, SQL, and Spark (PySpark or Spark SQL).
- Experience integrating enterprise systems such as ServiceNow (SLA, incident, CMDB data).
- Experience working with financial or cost management data (e.g., ApptioOne or equivalent ITFM tools).
- Experience with data modeling methodologies and tools.
- Familiarity with Unity Catalog concepts for data governance and access control.
- Experience with Power BI or similar BI tools consuming curated Lakehouse datasets.
- Experience with Azure data platform services (e.g., ADLS Gen2, Azure-native orchestration, and integration patterns), Azure DevOps, and GitHub-based CI/CD pipelines.
Preferred Qualifications
- Experience supporting public sector data initiatives.
- Familiarity with ITIL 4 / ITIL 5 concepts and SLA-based reporting.
- Experience supporting financial systems, SLA analytics, operational KPIs, or cost transparency dashboards.
- Exposure to MLflow, Feature Store, or AI/ML enablement pipelines (implementation support rather than architecture ownership).