GCP Data Engineer

TechDigital Group
Seattle, Washington, US
On-site

Job Description

The GCP Data Engineer will design and build large-scale cloud data processing systems on Google Cloud Platform (GCP). This role involves curating a comprehensive dataset of users, groups, and their permissions to various datasets. The engineer will redesign and implement a scalable data pipeline to ensure timely updates and transparency in data access.

Required Skills

  • 5+ years of experience in an engineering role using Python, Java, Spark, and SQL.
  • 5+ years of experience working as a Data Engineer in GCP.
  • Demonstrated proficiency with Google's Identity and Access Management (IAM) API.
  • Demonstrated proficiency with Airflow.

Key Responsibilities

  • Design, develop, and implement scalable, high-performance data solutions on GCP.
  • Ensure that changes to data access permissions are reflected in the Tableau dashboard within 24 hours.
  • Collaborate with technical and business users to share and manage data sets across multiple projects.
  • Utilize GCP tools and technologies to optimize data processing and storage.
  • Re-architect the data pipeline that builds the BigQuery dataset used for GCP IAM dashboards to make it more scalable.
  • Run and customize DLP scans.
  • Build bidirectional integrations between GCP and Collibra.
  • Explore and potentially implement Dataplex and custom format-preserving encryption for de-identifying data for developers in lower environments.
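As a rough illustration only (the posting does not specify a schema or implementation), the 24-hour freshness target for permission changes mentioned above could be checked with logic along these lines; all field and function names here are hypothetical:

```python
from datetime import datetime, timedelta, timezone

def stale_grants(grants, now=None, max_age=timedelta(hours=24)):
    """Return grant records whose last sync exceeds the 24-hour freshness target.

    `grants` is a list of dicts with a hypothetical shape:
    {"principal": ..., "dataset": ..., "role": ..., "synced_at": datetime}.
    """
    now = now or datetime.now(timezone.utc)
    return [g for g in grants if now - g["synced_at"] > max_age]

# Two illustrative records: one synced 30 hours ago (overdue), one 2 hours ago.
grants = [
    {"principal": "user:a@example.com", "dataset": "sales", "role": "reader",
     "synced_at": datetime.now(timezone.utc) - timedelta(hours=30)},
    {"principal": "group:eng@example.com", "dataset": "logs", "role": "writer",
     "synced_at": datetime.now(timezone.utc) - timedelta(hours=2)},
]

overdue = stale_grants(grants)
```

In practice such a check would run against the BigQuery dataset behind the IAM dashboards rather than in-memory records; this sketch only shows the freshness comparison itself.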

Qualifications – Required

  • Bachelor's degree in Computer Engineering or a related field.
  • 5+ years of experience in an engineering role using Python, Java, Spark, and SQL.
  • 5+ years of experience working as a Data Engineer in GCP.
  • Proficiency with Google's Identity and Access Management (IAM) API.
  • Strong Linux/Unix background and hands-on knowledge.
  • Experience with big data technologies such as HDFS, Spark, Impala, and Hive.
  • Experience with Shell scripting and bash.
  • Experience with version control platforms like GitHub.
  • Experience with unit testing code.
  • Experience with development ecosystems including Jenkins, Artifactory, CI/CD, and Terraform.
  • Demonstrated proficiency with Airflow.
  • Ability to advise management on approaches that optimize data platform success.
  • Ability to effectively communicate highly technical information to various audiences, including management, the user community, and less-experienced staff.
  • Proficiency in multiple programming languages, frameworks, domains, and tools.
  • Coding skills in Scala.
  • Experience with GCP platform development tools such as Pub/Sub, Cloud Storage, Bigtable, BigQuery, Dataflow, Dataproc, and Composer.
  • Knowledge of Hadoop, cloud platforms, and their surrounding ecosystems.
  • Experience with web services and APIs (RESTful and SOAP).
  • Ability to document designs and concepts.
  • Experience with API orchestration and choreography for consumer apps.
  • Well-rounded technical expertise in Apache packages and hybrid cloud architectures.
  • Pipeline creation and automation for data acquisition.
  • Metadata extraction pipeline design and creation between raw and transformed datasets.
  • Quality control metrics data collection on data acquisition pipelines.
  • Experience contributing to and leveraging Jira and Confluence.
  • Strong experience working with real-time streaming applications and batch-style large-scale distributed computing applications using tools like Spark, Kafka, Flume, Pub/Sub, and Airflow.
  • Ability to work with different file formats like Avro, Parquet, and JSON.
  • Hands-on experience in the Analysis, Design, Coding, and Testing phases of the Software Development Life Cycle (SDLC).


Skills & Requirements

Technical Skills

Python, Java, Spark, SQL, Google's Identity and Access Management (IAM) API, Airflow, Linux/Unix, HDFS, Impala, Hive, Shell scripting, Bash, GitHub, unit testing, Jenkins, Artifactory, CI/CD, Terraform, Google Cloud Platform (GCP), Pub/Sub, Cloud Storage, Bigtable, BigQuery, Dataflow, Dataproc, Composer, Hadoop, web services and APIs (RESTful and SOAP), Jira, Confluence, real-time streaming applications, batch-style large-scale distributed computing, Kafka, Flume, Avro, Parquet, JSON, communication, advising management, documenting designs and concepts, cloud data processing, data access permissions, Tableau, Collibra, Dataplex, custom format-preserving encryption, de-identifying data in lower environments, API orchestration and choreography for consumer apps, Apache packages, hybrid cloud architectures, pipeline creation and automation for data acquisition, metadata extraction, quality control metrics, file formats

Employment Type

FULL TIME

Level

Senior

Posted

4/30/2026

Apply Now
