What is EverOps?
Some of the world's most advanced and innovative enterprise software and technology companies struggle to find engineering partners capable of executing highly complex deliveries and services to their rigorous standards. These teams need a partner that can co-own problems from within their own development environment. Enter EverOps, the premier Embedded Service Provider. We partner directly with our customers' engineering and operations teams to help them assess and address a variety of delivery- and service-related challenges in the DevOps space.
The Challenge
EverOps is looking for a Senior DevOps Data Engineer with deep expertise in data platform architecture, disaster recovery design, and infrastructure-level data operations. This role is not about data analytics or content; it's about building and operating the infrastructure that makes data systems reliable, resilient, and scalable. You'll own the architectural decisions around data platform availability, cutover workflows, replication topologies, and backup/restore strategies across enterprise cloud environments.
The Mission
As a Senior DevOps Data Engineer at EverOps, you will join our U.S.-based Virtual Operating Center (your home office), working with a team of dynamic engineers to architect and operate data infrastructure across multiple customers' production cloud environments. You'll bring a data architect's lens to DevOps: designing DR strategies, planning database migrations and cutovers, and ensuring data platform resilience at scale. Our existing engineers have a deep understanding of our customer environments and are eager to empower, ramp up, and mentor each new hire to set them up for success.
What You'll Do
Design, implement, and validate disaster recovery architectures for relational, NoSQL, and managed data services across AWS, Azure, or GCP
Plan and execute database migration cutovers including blue-green database swaps, read-replica promotion, and zero-downtime schema migration workflows
Architect replication topologies (cross-region, cross-account, active-passive, active-active) and validate RPO/RTO targets through runbook-driven DR drills
Build and maintain Infrastructure as Code for data platform provisioning (RDS, Aurora, DynamoDB, ElastiCache, Redshift, managed Kafka/MSK, etc.) using Terraform, Atlantis, and/or CloudFormation
Design backup, snapshot, and point-in-time recovery strategies with automated validation and alerting
Develop automation tooling for data platform operations: failover orchestration, health checks, capacity scaling, and credential rotation
Implement observability for data infrastructure: replication lag monitoring, connection pool health, query performance baselines, and storage growth forecasting
Support production workload migrations including data tier cutovers with rollback plans and data integrity verification
Contribute to multi-tenant Kubernetes platform operations where data services intersect (e.g., External Secrets Operator for DB credentials, sidecar patterns for connection pooling)
Participate in regular customer and internal EverOps scrums, providing data architecture guidance and operational status
Document runbooks, architecture decision records (ADRs), and operational playbooks for data platform operations
You Have
5+ years of professional experience as a DevOps Engineer, Data Platform Engineer, Database Reliability Engineer, or Site Reliability Engineer with a data infrastructure focus
Deep hands-on experience designing and operating disaster recovery architectures for production databases (failover, replication, backup/restore, cross-region DR)
Production experience planning and executing database cutover workflows: blue-green database swaps, read-replica promotions, DMS-based migrations, and zero-downtime schema changes
Strong experience with AWS managed data services: RDS/Aurora (Multi-AZ, Global Database, cross-region replicas), DynamoDB (Global Tables, PITR, on-demand backup), ElastiCache, Redshift, and/or MSK
Hands-on experience with Infrastructure as Code (Terraform + Atlantis and/or CloudFormation) for data platform provisioning and lifecycle management
Hands-on experience and deep understanding of Linux
Strong professional experience with at least one of: Python, Golang, Bash, or Rust for automation and tooling
Production experience with Amazon EKS including understanding of how data workloads intersect with Kubernetes (StatefulSets, PVCs, External Secrets Operator, connection pooling)
Experience with HashiCorp Vault for secrets management, particularly database credential rotation and dynamic secrets
Understanding of GitOps workflows, repository structures, and governance patterns
Experience with CI/CD tools like Jenkins, GitHub Actions, ArgoCD, etc.
Experience with monitoring tools such as Datadog, Splunk, ELK, or Prometheus/Grafana, specifically for data infrastructure observability (replication lag, connection health)
Full Time
Senior
4/16/2026