Company Overview
KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA invents systems and solutions for the manufacturing of wafers and reticles, integrated circuits, packaging, printed circuit boards and flat panel displays. The innovative ideas and devices that are advancing humanity all begin with inspiration, research and development. KLA focuses more than average on innovation, investing 15% of sales back into R&D. Our expert teams of physicists, engineers, data scientists and problem-solvers work together with the world's leading technology providers to accelerate the delivery of tomorrow's electronic devices. Life here is exciting and our teams thrive on tackling really hard problems. There is never a dull moment with us.
Job Description/Preferred Qualifications
HPC/AI Software Infrastructure Leads are core to KLA's technology. While we do not currently have an opening, we are always building our HPC/AI Software Infrastructure Lead engineering talent community, and we are interested in learning about your background.
Apply to this posting for Future Opportunities with KLA.
At KLA, we're pushing the boundaries of semiconductor inspection through advanced AI and high-performance computing. We are looking for a hands-on technical leader to architect and scale the next generation of AI/HPC infrastructure powering our most critical imaging and data platforms. This role is ideal for someone who thrives at the intersection of distributed systems, GPU computing, and real-world AI workloads, and who enjoys building and mentoring high-performing engineering teams while driving technical excellence.
What You'll Do
Lead the architecture and development of large-scale HPC and AI infrastructure supporting cutting-edge image processing and machine learning workloads
Design scalable, high-performance distributed systems that unify traditional image processing with modern AI/Deep Learning pipelines
Drive GPU-accelerated computing strategies, optimizing performance across compute, storage, and networking layers
Partner cross-functionally with hardware, algorithms, and product teams to deliver robust, production-ready platforms
Establish engineering best practices (code quality, CI/CD, observability, performance tuning) for mission-critical systems
Mentor and develop engineers, providing technical guidance, coaching, and growth opportunities for junior team members
Serve as a technical leader and decision-maker, influencing architecture and long-term platform strategy
What You Bring
Experience
10+ years in software engineering, including leading and scaling technical teams
Proven success building distributed systems in HPC, AI/ML, or cloud-native environments
Track record of delivering performance-critical infrastructure at scale
Experience mentoring and growing early- and mid-career engineers
Technical Expertise
Deep understanding of distributed systems, parallel computing, and Linux systems programming
Strong programming skills in C++, Python, or similar systems-level languages
Experience with GPU computing (CUDA, ROCm) and modern AI frameworks (PyTorch, TensorFlow, etc.)
Familiarity with high-performance storage systems, networking, and data pipelines
Strong foundation in CI/CD, DevOps, and production system reliability
Bonus Experience
Background in image processing, computer vision, or scientific computing
Experience supporting hybrid HPC + AI workloads in production environments
Leadership & Impact
Passion for developing talent and building inclusive, high-performing teams
Ability to operate as both a hands-on engineer and strategic technical leader
Strong communication skills with the ability to influence across engineering and product stakeholders
Why KLA / Why Ann Arbor
Work on real-world AI systems at scale, not just experiments
Collaborate across hardware, software, and algorithm teams in a deeply technical environment
Join a growing engineering presence in Ann Arbor, with access to top talent and a strong technical community
Opportunity to shape the direction of AI infrastructure in a core product domain
Minimum Qualifications
Doctorate (Academic) Degree and related work experience of 5 years; Master's Level Degree and related work experience of 8 years; Bachelor's Level Degree and related work experience of 12 years
Base Pay Range: $151,100.00 - $256,900.00
Primary Location: USA-MI-Ann Arbor-KLA
KLA's total rewards package for employees may also include participation in performance incentive programs and eligibility for additional benefits including but not limited to: medical, dental, vision, life, and other voluntary benefits, 401(K) including company matching, employee stock purchase program (ESPP), student debt assistance, tuition reimbursement program, development and career growth opportunities and programs, financial planning benefits, wellness benefits including an employee assistance program (EAP), paid time off and paid company holidays, and family care and bonding leave.
Interns are eligible for some of the benefits listed. Our pay ranges are determined by role, level, and location. The range displayed reflects the pay for this position in the primary location identified in this posting. Actual pay depends on several factors, including state minimum wage rates, location, job-related skills,... For full info follow application link.
KLA-Tencor is an Equal Opportunity Employer. Applicants will be considered for employment without regard to age, race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability, or any other characteristics protected by applicable law.