Description
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
THE ROLE:
AMD AI Framework is seeking a Senior Software Developer to work on Transformer Engine (TE), a high-performance library designed to accelerate Transformer model training using low-precision arithmetic and custom GPU kernels on MI GPUs. You will also play a key role in enhancing the Megatron-LM (ROCm) framework through fused operations that enable scalable LLM training. As part of a highly skilled team, you'll contribute to cutting-edge deep learning infrastructure and integrate performance-critical components into both client products and the open-source ecosystem.
THE PERSON:
The ideal candidate is passionate about software engineering and has the leadership skills to drive complex issues to resolution. They communicate effectively and work well with different teams across AMD.
KEY RESPONSIBILITIES:
- Library Optimization: Optimize open-source deep learning libraries, including Megatron and Transformer Engine, for peak performance on AMD GPUs.
- Model Performance Scaling: Analyze and optimize deep learning models for AMD GPUs across both multi-GPU (scale-up) and multi-node (scale-out) systems.
- Engineering Best Practices: Apply modern software engineering practices while staying current with advancements in hardware, algorithms, and system architecture.
- Hardware Enablement: Contribute to the bring-up and development of new AMD ASICs and GPU hardware platforms.
- Data-Driven Optimization: Use performance data and profiling insights to drive optimizations and influence AMD's deep learning technology roadmap.
- Debugging and Innovation: Debug existing systems and explore more efficient alternatives to improve performance and maintainability.
- Collaboration and Partnerships: Work closely with internal GPU library teams and external partners to optimize training workloads through technical collaboration.
PREFERRED EXPERIENCE:
- Programming Languages & Software Development: Proficient in C/C++ and Python, with experience in software design, debugging, performance analysis, and test development.
- Object-Oriented Design: Solid foundation in object-oriented programming with a focus on writing clean, efficient, and maintainable code.
- Concurrent and Multithreaded Programming: Experience in modern concurrency models and threading APIs for high-performance computing.
- GPU and Parallel Computing: Familiar with GPU programming using HIP, CUDA, and OpenCL, with a foundational understanding of deep learning.
- Deep Learning Optimization: Experience analyzing deep learning workloads with an emphasis on maximizing throughput and performance. (a plus)
- Numerical Computing: Understanding of floating-point arithmetic and its impact on accuracy and precision in scientific computations. (a plus)
- Development Tools and Processes: Experienced with GitHub, CI/CD workflows, and debugging/profiling tools in Linux-based development environments.
- Team Collaboration and Communication: Strong problem-solving and communication skills, with a proven ability to work effectively in collaborative team settings.
ACADEMIC CREDENTIALS:
- Master's and/or PhD degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
Benefits offered are described in AMD benefits at a glance.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD's “Responsible AI Policy” is available here.
This posting is for an existing vacancy.