System Debug Engineer, Cloud AI Infrastructure

Google
Austin, US
On-site

Job Description

In accordance with Washington state law, we are highlighting our comprehensive benefits package, which is available to all eligible US-based employees. Benefits for this role include:

  • Health, dental, vision, life, disability insurance
  • Retirement Benefits: 401(k) with company match
  • Paid Time Off: 20 days of vacation per year, accruing at a rate of 6.15 hours per pay period for the first five years of employment
  • Sick Time: 40 hours/year (increased to 69 hours/year for Seattle) including 5 discretionary sick days per instance
  • Maternity Leave (Short-Term Disability + Baby Bonding): 28-30 weeks
  • Baby Bonding Leave: 18 weeks
  • Holidays: 13 paid days per year

Note: By applying to this position you will have an opportunity to share your preferred working location from the following: Kirkland, WA, USA; Austin, TX, USA.

Minimum qualifications:

  • Bachelor's degree in Computer Science or IT-related field, or equivalent practical experience.
  • 5 years of experience with systems automation, and with systems design and implementation.
  • 5 years of experience with technical infrastructure (e.g., deployment, maintenance, troubleshooting), and with reliability of technical infrastructure.

Preferred qualifications:

  • 5 years of experience working with vendors or customers.
  • Ability to work as a subject matter expert for stakeholders to resolve complex AI infrastructure obstacles.

About The Job

Systems Development Engineering (SDE) at Google is a role where you manage services and systems at scale. SDEs creatively put their engineering discipline to use automating the mundane and reducing toil. We don’t just write code to fix bugs, but emphasize the development of tools and solutions that fix classes of problems. We know it’s hard to control what you can’t measure – so we focus on observability: instrumenting first, then turning data into knowledge, and finally knowledge into action. We know that the operational efficiency of Google systems, services, virtual compute environments and the operating systems that power them impacts the environment, not just the bottom line. We know that working together we can do more, and that community matters.

Google brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.

Together we engineer and build the infrastructure, tools, access and telemetry for systems that enable orchestration of Google-scale services. Come build things that matter.

The Google Cloud Support team ensures customers maximize their investment. As a Systems Debug Engineer, you will be a trusted advisor driving hardware understanding and issue resolution. You will troubleshoot complex platform challenges, providing expert solutions that enable innovation. By representing the customer, you will collaborate with engineering and product teams to drive continuous improvement across our global cloud products and services.

Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.

The US base salary range for this full-time position is $163,000-$237,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Responsibilities

  • Participate in on-call activities and manage domain systems, collaborating with responders to resolve issues.
  • Resolve customer issues and troubleshoot AI/ML workloads by developing effective diagnostic and investigation tools.
  • Partner with Product, Quality, and SRE teams to improve product quality and production standards.

Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.

Skills & Requirements

Technical Skills

  • Systems automation
  • Systems design and implementation
  • Technical infrastructure
  • Reliability of technical infrastructure
  • AI infrastructure
  • Observability
  • Instrumenting
  • Telemetry
  • Google-scale services
  • Cloud AI infrastructure

Salary

$163,000 - $237,000 per year

Employment Type

FULL TIME

Level

senior

Posted

5/6/2026
