{"schemaVersion":"jobsearcher.job.v1","id":"0a2ec0e79e364d769b738eb3","url":"https://jobsearcher.com/jobs/0a2ec0e79e364d769b738eb3","canonicalUrl":"https://jobsearcher.com/jobs/0a2ec0e79e364d769b738eb3","title":"Failure Analysis Engineering Manager, GPU ASIC and PCBA Debug","description":"AMD GPU ASIC AND PCBA DEBUG AND FAILURE ANALYSIS ENGINEERING MANAGER At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career. The Role: The Quality Engineering team is looking for an experienced GPU ASIC and PCBA Debug and Failure Analysis Engineering Manager to lead and develop a team of FA engineers. This role is intended for a proven people manager with prior experience building, mentoring, and guiding high-performing engineering teams, while also serving as a strong technical lead in GPU ASIC and board-level (PCBA) failure analysis. The individual will oversee customer and factory failure investigations for GPU accelerators, help drive failure reproduction and isolation, and work closely with cross-functional teams including design, validation, FW, and manufacturing to accelerate root cause analysis and corrective actions. 
Your contributions will directly impact team effectiveness, product quality, reliability, and customer satisfaction. The Person: The ideal candidate is a strong people leader and technical expert who leads by example and is passionate about building, teaching, and mentoring a growing team of high-performing FA engineers. They bring prior experience managing, hiring, and developing engineers, creating an environment of accountability, collaboration, and continuous learning, while remaining hands-on enough to guide complex debug and failure analysis efforts in a fast-paced, time-to-market environment. This person is a clear communicator and a trusted technical leader who can elevate team capability, help others grow in their careers, and drive strong execution in a fast-paced environment. They combine deep analytical problem-solving skills with a practical, hands-on approach, and continuously look for ways to improve team effectiveness, technical depth, and overall quality outcomes. Key Responsibilities: Provide technical leadership for triage and debug of complex GPU and PCBA failures across power, ASIC, firmware, and thermals, guiding the FA team to root cause. Lead failure reproduction and triage by defining debug plans, directing investigations, and guiding experiments and escalation paths for complex issues. Drive debug automation, diagnostic tools, and data analysis methods that improve triage efficiency and consistency across failure domains. Lead cross-functional triage with manufacturing partners and AMD teams to align on failure hypotheses, reproduction, and root cause. Guide board-level debug using schematics, layouts, and design documentation to direct analysis and mentor engineers through the process. Ensure clear documentation of failure analysis results, root cause findings, and corrective actions for customer and internal use. Present technical findings, triage updates, risks, and recovery plans to stakeholders and senior leadership. Drive continuous improvement 
of FA methods, triage processes, and best practices across power, ASIC, firmware, and thermal debug. Manage and develop a team of FA engineers by setting priorities, providing technical guidance, and coaching through complex investigations. Preferred Experience: Experience leading and developing engineering teams, with a strong track record of hiring, coaching, mentoring, and growing FA engineers. Deep expertise in GPU ASIC debug, validation, and functional or stress test development. Strong background in PCBA diagnostics, failure analysis, and board-level debug from NPI through production. Experience leading triage across power, ASIC, firmware, and thermal failure domains. Strong hands-on lab experience with oscilloscopes, logic analyzers, and custom debug tools. Solid understanding of firmware, drivers, and hardware interactions in complex system debug. Extensive experience in hardware verification, system integration, and failure reproduction. Proficient in Python, shell scripting, and working across Windows and Linux environments. Strong leadership, communication, and presentation skills, with the ability to teach, mentor, and lead by example. Able to read schematics, interpret datasheets, identify components, and support board-level debug and rework. Knowledge of high-speed digital design, HBM or GDDR memory, PCIe, and GPU data center systems is a plus. Academic Credentials: Bachelor's degree in Electrical Engineering, Computer Engineering, or a related field. 3+ years of management experience. Location: Secaucus, NJ. This role is not eligible for visa sponsorship.","company":"Advanced Micro Devices","rawCompany":"advanced micro devices","city":"Secaucus","state":"NJ","isRemote":false,"isActive":false,"createdAt":"2026-06-08T01:04:18.737Z","occupations":[{"code":"11-9041.00","title":"Architectural and Engineering Managers","slug":"architectural-and-engineering-managers"},{"code":"11-3051.01","title":"Quality Control Systems 
Managers","slug":"quality-control-systems-managers"},{"code":"11-3021.00","title":"Computer and Information Systems Managers","slug":"computer-and-information-systems-managers"}],"industries":[{"code":"334111","title":"Electronic Computer Manufacturing","slug":"electronic-computer-manufacturing"},{"code":"334413","title":"Semiconductor and Related Device Manufacturing","slug":"semiconductor-and-related-device-manufacturing"},{"code":"334118","title":"Computer Terminal and Other Computer Peripheral Equipment Manufacturing","slug":"computer-terminal-and-other-computer-peripheral-equipment-manufacturing"}],"jobPosting":{"@context":"https://schema.org","@type":"JobPosting","title":"Failure Analysis Engineering Manager, GPU ASIC and PCBA Debug","description":"AMD GPU ASIC AND PCBA DEBUG AND FAILURE ANALYSIS ENGINEERING MANAGER At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career. The Role: The Quality Engineering team is looking for an experienced GPU ASIC and PCBA Debug and Failure Analysis Engineering Manager to lead and develop a team of FA engineers. This role is intended for a proven people manager with prior experience building, mentoring, and guiding high-performing engineering teams, while also serving as a strong technical lead in GPU ASIC and board-level (PCBA) failure analysis. 
The individual will oversee customer and factory failure investigations for GPU accelerators, help drive failure reproduction and isolation, and work closely with cross-functional teams including design, validation, FW, and manufacturing to accelerate root cause analysis and corrective actions. Your contributions will directly impact team effectiveness, product quality, reliability, and customer satisfaction. The Person: The ideal candidate is a strong people leader and technical expert who leads by example and is passionate about building, teaching, and mentoring a growing team of high-performing FA engineers. They bring prior experience managing, hiring, and developing engineers, creating an environment of accountability, collaboration, and continuous learning, while remaining hands-on enough to guide complex debug and failure analysis efforts in a fast-paced, time-to-market environment. This person is a clear communicator and a trusted technical leader who can elevate team capability, help others grow in their careers, and drive strong execution in a fast-paced environment. 
They combine deep analytical problem-solving skills with a practical, hands-on approach, and continuously look for ways to improve team effectiveness, technical depth, and overall quality outcomes. Key Responsibilities: Provide technical leadership for triage and debug of complex GPU and PCBA failures across power, ASIC, firmware, and thermals, guiding the FA team to root cause. Lead failure reproduction and triage by defining debug plans, directing investigations, and guiding experiments and escalation paths for complex issues. Drive debug automation, diagnostic tools, and data analysis methods that improve triage efficiency and consistency across failure domains. Lead cross-functional triage with manufacturing partners and AMD teams to align on failure hypotheses, reproduction, and root cause. Guide board-level debug using schematics, layouts, and design documentation to direct analysis and mentor engineers through the process. Ensure clear documentation of failure analysis results, root cause findings, and corrective actions for customer and internal use. Present technical findings, triage updates, risks, and recovery plans to stakeholders and senior leadership. Drive continuous improvement of FA methods, triage processes, and best practices across power, ASIC, firmware, and thermal debug. Manage and develop a team of FA engineers by setting priorities, providing technical guidance, and coaching through complex investigations. Preferred Experience: Experience leading and developing engineering teams, with a strong track record of hiring, coaching, mentoring, and growing FA engineers. Deep expertise in GPU ASIC debug, validation, and functional or stress test development. Strong background in PCBA diagnostics, failure analysis, and board-level debug from NPI through production. Experience leading triage across power, ASIC, firmware, and thermal failure domains. Strong hands-on lab experience with oscilloscopes, logic analyzers, and custom debug tools. Solid understanding of 
firmware, drivers, and hardware interactions in complex system debug. Extensive experience in hardware verification, system integration, and failure reproduction. Proficient in Python, shell scripting, and working across Windows and Linux environments. Strong leadership, communication, and presentation skills, with the ability to teach, mentor, and lead by example. Able to read schematics, interpret datasheets, identify components, and support board-level debug and rework. Knowledge of high-speed digital design, HBM or GDDR memory, PCIe, and GPU data center systems is a plus. Academic Credentials: Bachelor's degree in Electrical Engineering, Computer Engineering, or a related field. 3+ years of management experience. Location: Secaucus, NJ. This role is not eligible for visa sponsorship.","datePosted":"2026-06-08T01:04:18.737Z","dateModified":"2026-06-08T01:04:18.737Z","hiringOrganization":{"@type":"Organization","name":"Advanced Micro Devices","sameAs":"https://jobsearcher.com"},"jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Secaucus","addressRegion":"NJ","addressCountry":"US"}},"identifier":{"@type":"PropertyValue","name":"JobSearcher","value":"0a2ec0e79e364d769b738eb3"},"url":"https://jobsearcher.com/jobs/0a2ec0e79e364d769b738eb3"}}