HPC Support Engineer
Location: San Jose, CA
Employment Type: 6-12 Months Contract

Key Responsibilities

Customer Support & Engagement
- Act as the primary technical point of contact for customers using cloud-based platforms
- Provide timely, high-quality support and ensure clear communication throughout issue resolution
- Build strong customer relationships through professionalism, responsiveness, and accountability
- Minimize downtime and proactively address customer needs

Platform & System Support
- Troubleshoot and maintain Linux-based cloud environments
- Ensure system availability, performance, and reliability across multi-tenant platforms
- Diagnose issues across compute, storage, networking, and identity layers

HPC & Performance Management
- Monitor HPC clusters, job queues, and system performance
- Resolve performance bottlenecks including scheduling, memory, and I/O issues
- Manage and troubleshoot license usage (FlexNet/FLEXlm or similar)
- Support job schedulers such as Slurm, LSF, or SGE

Automation & Optimization
- Develop automation scripts to streamline operations (monitoring, provisioning, alerting)
- Use Python, shell scripting, Perl, or similar tools to improve efficiency
- Leverage AI-driven automation to reduce manual effort and improve resolution times

Security & Compliance
- Support systems handling ITAR and CUI data with strict compliance
- Follow security, access control, and change management processes
- Participate in incident response and root cause analysis
- Maintain documentation including runbooks and knowledge base articles

Must-Have Skills
- Strong Linux system administration and troubleshooting
- Experience with HPC environments (Slurm, LSF, SGE, or similar)
- Proficiency in Python, Bash/Shell scripting, or Perl
- Experience with monitoring, logging, and alerting tools
- Knowledge of license management systems (FlexNet/FLEXlm)
- Experience supporting cloud environments (AWS, Azure, or GCP)
- Strong customer support and communication skills

Required Qualifications
- Proven experience in technical support, cloud operations, or infrastructure roles
- Ability to manage multiple issues while maintaining high service quality
- Strong troubleshooting and problem-solving skills
- Eligibility to work with export-controlled environments (ITAR/CUI)

Preferred Qualifications
- Experience in EDA, semiconductor, or silicon design environments
- Familiarity with cloud-based HPC platforms
- Experience with AI-driven or automated operations
- Knowledge of Infrastructure as Code tools (Ansible, Terraform)

Education
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience)