Data Platform Engineer (Databricks/Pyspark) - W2

Aroha Technologies
Houston, US
Hybrid

Job Description

Job Title: IT Data Engineering Platform Engineer

Location: Dairy Ashford, Houston, TX (Hybrid)

Employment Type: Contract (Flex), 8 months with potential for extension

Job Summary

Shell is seeking an experienced Platform Engineer (Data Engineering) to support and govern a shared Databricks analytics platform. This role focuses on platform administration, data governance, security, cost management, and operational stability, enabling multiple analytics and AI teams to build solutions safely, efficiently, and consistently.

The ideal candidate will act as the platform owner for guardrails and standards, ensuring scalable, compliant, and cost-effective use of Databricks across business teams.

Key Responsibilities

Platform Administration & Governance

  • Administer Databricks workspaces, clusters, jobs, compute policies, Unity Catalog, and environment configurations.
  • Define, implement, and enforce platform guardrails and engineering standards.
  • Establish and maintain RBAC and ABAC access controls to ensure secure and compliant data access.
  • Enforce data ingestion standards, naming conventions, schema rules, Delta Lake design patterns, and data quality expectations.
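To give a concrete sense of the guardrails this role administers, a Databricks cluster policy can pin autotermination and cap cluster size. The sketch below shows the policy-definition JSON shape as a Python dict; the specific values and node types are illustrative, not Shell's actual standards.

```python
# Illustrative Databricks cluster-policy definition (hypothetical values).
# Policy definitions are JSON documents: "fixed" locks a setting,
# "range" bounds it, and "allowlist" restricts it to approved choices.
import json

guardrail_policy = {
    "autotermination_minutes": {"type": "fixed", "value": 30},
    "num_workers": {"type": "range", "maxValue": 8, "defaultValue": 2},
    "node_type_id": {
        "type": "allowlist",
        "values": ["Standard_DS3_v2", "Standard_DS4_v2"],
    },
}

# Serialized form, as it would be submitted when creating the policy.
policy_json = json.dumps(guardrail_policy, indent=2)
```

Teams attaching clusters under this policy inherit the caps automatically, which is what makes policies a guardrail rather than a guideline.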

Data Quality & Ingestion Standards

  • Define platform-wide standards for ingestion pipelines, Delta architecture, lineage, versioning, and validation.
  • Review and approve data pipelines to ensure compliance with platform requirements.
  • Partner with data engineering teams to promote best practices and consistency.
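A pipeline review of the kind described above often starts with mechanical checks, such as table naming. A minimal validator sketch follows; the `<domain>_<entity>_<layer>` convention and medallion-tier suffixes are assumptions for illustration, not the platform's actual rules.

```python
import re

# Assumed convention: <domain>_<entity>_<layer>, lower snake case,
# where the layer suffix is one of the medallion tiers.
TABLE_NAME_RE = re.compile(r"^[a-z][a-z0-9]*_[a-z][a-z0-9]*_(bronze|silver|gold)$")

def is_valid_table_name(name: str) -> bool:
    """Return True when a table name follows the assumed platform convention."""
    return TABLE_NAME_RE.fullmatch(name) is not None
```

Checks like this are cheap to run in CI before a pipeline ever reaches review, which keeps the human review focused on design rather than naming.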

Security, Compliance & Access Controls

  • Manage workspace and catalog permissions, including row- and column-level security and attribute-based access policies.
  • Collaborate with security and compliance teams to enforce enterprise data protection standards.
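Unity Catalog permissions are managed through SQL GRANT statements. A sketch of generating RBAC grants programmatically is below; the catalog, schema, and group names are hypothetical.

```python
def schema_grants(catalog: str, schema: str, group: str,
                  privileges: list[str]) -> list[str]:
    """Build Unity Catalog GRANT statements for one group on one schema."""
    return [
        f"GRANT {p} ON SCHEMA {catalog}.{schema} TO `{group}`"
        for p in privileges
    ]

# Hypothetical example: read-only access for an analyst group.
stmts = schema_grants("main", "finance", "analysts", ["USE SCHEMA", "SELECT"])
```

Generating grants from a declared mapping, rather than issuing them ad hoc, makes the access model reviewable and reproducible across workspaces.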

Cost Management & Monitoring

  • Implement cost controls, thresholds, alerts, and compute policies to prevent overspend.
  • Monitor job and cluster usage, identify anomalies, and recommend optimization strategies.
  • Provide transparency into workspace-level and SKU-level cost trends.
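Usage-anomaly detection can start as simply as a z-score over daily DBU counts. The sketch below flags days that sit well above the recent mean; the sample data and threshold are illustrative.

```python
from statistics import mean, stdev

def flag_cost_anomalies(daily_dbus: list[float],
                        z_threshold: float = 2.0) -> list[int]:
    """Return indices of days whose DBU usage sits more than
    z_threshold standard deviations above the mean."""
    mu, sigma = mean(daily_dbus), stdev(daily_dbus)
    if sigma == 0:
        return []  # perfectly flat usage: nothing to flag
    return [i for i, v in enumerate(daily_dbus) if (v - mu) / sigma > z_threshold]
```

In practice the input would come from Databricks system billing tables, and flagged days would feed the alerting and threshold controls mentioned above.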

Operational Stability & Observability

  • Ensure platform reliability using automated testing, CI/CD templates, and code governance practices.
  • Build dashboards to monitor data access, pipeline health, schema drift, compliance, and cost thresholds.
  • Resolve platform incidents and implement preventative controls.
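Schema-drift monitoring reduces to diffing an expected column contract against what a table actually exposes. A minimal sketch, with hypothetical column names and types:

```python
def schema_drift(expected: dict[str, str],
                 actual: dict[str, str]) -> dict[str, list[str]]:
    """Compare an expected column->type contract with an observed schema,
    reporting columns that were added, removed, or retyped."""
    return {
        "added": sorted(set(actual) - set(expected)),
        "removed": sorted(set(expected) - set(actual)),
        "changed": sorted(
            col for col in set(expected) & set(actual)
            if expected[col] != actual[col]
        ),
    }
```

A non-empty result from a check like this is what a drift dashboard would surface, distinguishing additive changes (usually safe) from removals and type changes (usually breaking).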

Enablement & Best Practices

  • Define and document standardized patterns ("handrails") for ingestion, Delta Lake usage, CI/CD, observability, and AI/ML workloads.
  • Support and coach analytics and data engineering teams on compliant onboarding and optimal platform usage.
  • Maintain internal documentation, reusable templates, and engineering guidelines.

Required Qualifications

  • 5+ years of experience in data engineering or platform engineering, including 2–3+ years administering Databricks.
  • Strong expertise in Databricks platform administration, including Unity Catalog, cluster policies, jobs, Spark, and Delta Lake.
  • Solid understanding of data governance, ingestion frameworks, schema enforcement, versioning, and data lineage.
  • Proven experience implementing RBAC and ABAC access controls.
  • Experience with cost optimization, usage monitoring, and compute governance.
  • Proficiency in Python/PySpark and SQL.
  • Familiarity with Databricks Workflows, Delta Live Tables (DLT), Airflow, or similar orchestration tools.
  • Strong communication skills with the ability to influence and guide cross-functional teams.

Preferred Qualifications

  • Experience supporting large-scale enterprise data platforms in Azure, AWS, or Google Cloud Platform.
  • Exposure to trading, supply chain, or other high-impact analytical environments.
  • Experience building dashboards for governance, cost management, compliance, and pipeline health.
  • Experience with CI/CD tools such as GitHub Actions, Azure DevOps, or similar.

Success Metrics

  • Consistent adoption of ingestion and data standards across teams.
  • Predictable and controlled platform costs.
  • Strong utilization of platform guardrails, templates, and dashboards.
  • Reduced incidents related to access control, data quality, or cost overruns.

Skills & Requirements

Technical Skills

Databricks, Unity Catalog, cluster policies, jobs, Spark, Delta Lake, Delta architecture, data governance, data ingestion, data lineage, versioning, data quality, data protection, security, RBAC, ABAC, compliance, cost management, cost controls, thresholds, alerts, compute policies, operational stability, observability, automated testing, CI/CD templates, code governance, dashboards, schema drift, Python, PySpark, SQL, Databricks Workflows, Delta Live Tables, Airflow, GitHub Actions, Azure DevOps, data engineering, platform engineering, leadership, communication, problem-solving, teamwork, influence, guidance

Employment Type

CONTRACT

Level

mid

Posted

5/4/2026
