2023-12-09

AMD CEO Dr. Lisa Su on stage at the company’s Advancing AI event in San Jose (Credit: Dave Altavilla)

AMD provided an in-depth look at its latest AI accelerator arsenal for data centers and supercomputers, as well as consumer client devices, but software support, optimization, and developer adoption will be key.

Advanced Micro Devices held its Advancing AI event in San Jose this week, and in addition to launching new AI accelerators for data centers, supercomputing, and client laptops, the company also presented its software and ecosystem enablement strategy, with an emphasis on open source accessibility. Market demand for AI compute resources currently exceeds the supply from incumbents like Nvidia, so AMD is racing to provide compelling alternatives. Underscoring this, AMD CEO Dr. Lisa Su said the company is raising its TAM forecast for AI accelerators from the $150 billion it estimated a year ago to $400 billion by 2027, with a 70% compound annual growth rate. Artificial intelligence is clearly a huge opportunity for the major chip players, but no one can really predict the actual market demand. AI will be so transformative that it will impact almost every industry in one way or another. Regardless, the market will likely welcome these new AI silicon engines and devices from AMD.
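As a rough back-of-the-envelope check on that forecast, the sketch below works out what starting market size a 70% compound annual growth rate implies for a $400 billion market in 2027. The base year (2023) and the four compounding periods are my assumptions, not figures AMD stated in this article, and the function names are purely illustrative.

```python
# Back-of-the-envelope CAGR check.
# ASSUMPTION: growth compounds over four years, 2023 -> 2027.
# Only the $400B target and the 70% CAGR come from the article.

def implied_base(target_billion, cagr, years):
    """Starting market size that grows to target_billion at the given CAGR."""
    return target_billion / (1.0 + cagr) ** years


def project(base_billion, cagr, years):
    """Market size after compounding base_billion for the given number of years."""
    return base_billion * (1.0 + cagr) ** years


TARGET_2027 = 400.0   # $ billions, from the article
CAGR = 0.70           # 70% per year, from the article
YEARS = 4             # assumed: 2023 through 2027

base = implied_base(TARGET_2027, CAGR, YEARS)
print(f"Implied starting market: ${base:.1f}B")
for year in range(YEARS + 1):
    print(2023 + year, f"${project(base, CAGR, year):.1f}B")
```

Under those assumptions the implied starting point is a market of a little under $50 billion, which shows how aggressive a 70% annual growth rate is once it compounds.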

Instinct MI300X and MI300A: Tip of the AMD AI Spear

AMD Instinct MI300A AI Accelerator for HPC (Credit: AMD)

AMD’s Data Center Group this week formally launched two major product families, the MI300X and MI300A, for the enterprise and cloud AI market and the supercomputing market, respectively. Both products are purpose-built for their respective applications, but they are based on the same chiplet-enabled architecture, with a blend of advanced 3D packaging technologies and optimized 5nm and 6nm semiconductor fab processes. AMD’s high-performance computing AI accelerator is the Instinct MI300A, which combines the company’s CDNA 3 data center GPU architecture with Zen 4 CPU core chiplets (24 EPYC Genoa cores) and 128GB of shared, unified HBM3 memory that both the GPU accelerators and the CPU cores can access, along with 256MB of Infinity Cache. The chip contains 146 billion transistors and offers peak memory bandwidth of up to 5.3TB/s, with its CPU, GPU, and IO interconnected through AMD’s high-speed serial Infinity Fabric.

AMD claims a 4X performance gain over Nvidia H100 in OpenFOAM (Credit: AMD)

This AMD accelerator can also run as both a PCIe-connected add-in device and a root complex host CPU. Overall, the company is making bold claims for the MI300A in HPC, including a 4X performance uplift versus Nvidia’s H100 accelerator in applications like OpenFOAM for computational fluid dynamics, and a 2X performance-per-watt uplift versus Nvidia’s GH200 Grace Hopper Superchip. The AMD MI300A will also power HPE’s El Capitan at Lawrence Livermore National Laboratory, where it is set to displace Frontier (also powered by AMD) and become the world’s first two-exaflop supercomputer, reportedly making it the fastest, most powerful supercomputer in the world.

AMD Instinct MI300X GPU AI Accelerator in hand: it’s huge (Credit: Dave Altavilla)

The MI300X, however, is a different kind of beast, targeted at cloud data centers and enterprise AI workloads like large language models, natural language recognition, and generative AI. The MI300X has no Zen 4 CPU chiplets (what AMD calls CCDs) on board; instead, it accommodates more AMD CDNA 3 accelerator complex die chiplets (XCDs) in an all-GPU design. There are a total of 8 XCDs on the MI300X, comprising 304 GPU compute units. The MI300X also has a larger memory capacity, with 192GB of HBM3. Like the MI300A, the MI300X offers total memory bandwidth of approximately 5.3TB/s, plus a massive 17TB/s of peak bandwidth from its 256MB AMD Infinity Cache.
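To put the MI300X’s memory figures in context, here is a minimal sizing sketch. It estimates whether a model’s weights fit in 192GB of HBM3 and what floor the 5.3TB/s of memory bandwidth puts on per-token latency if every weight is read once per generated token. The 70-billion-parameter model size and 16-bit weights are my illustrative assumptions, not AMD figures, and the estimate ignores the KV cache, activations, batching, and the Infinity Cache.

```python
# Rough LLM sizing against the MI300X memory figures quoted in the article.
# ASSUMPTIONS (not from AMD): a 70B-parameter model stored at 2 bytes/param,
# single-stream decoding that streams all weights once per token.

HBM3_CAPACITY_GB = 192.0      # from the article
HBM3_BANDWIDTH_TBPS = 5.3     # from the article, approximate peak

ASSUMED_PARAMS = 70e9         # hypothetical model size
ASSUMED_BYTES_PER_PARAM = 2   # FP16/BF16 weights


def weight_footprint_gb(num_params, bytes_per_param):
    """Size of the model weights in (decimal) gigabytes."""
    return num_params * bytes_per_param / 1e9


def min_ms_per_token(footprint_gb, bandwidth_tbps):
    """Lower bound on per-token time if all weights are read once per token."""
    seconds = footprint_gb / (bandwidth_tbps * 1000.0)  # TB/s -> GB/s
    return seconds * 1000.0


footprint = weight_footprint_gb(ASSUMED_PARAMS, ASSUMED_BYTES_PER_PARAM)
fits = footprint <= HBM3_CAPACITY_GB
floor_ms = min_ms_per_token(footprint, HBM3_BANDWIDTH_TBPS)

print(f"Weights: {footprint:.0f} GB, fits on one device: {fits}")
print(f"Bandwidth-bound floor: {floor_ms:.1f} ms per token")
```

Real-world throughput depends heavily on software, batching, and quantization, so treat this as an illustration of why capacity and bandwidth matter for LLM inference rather than as a performance prediction.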

AMD Instinct MI300X inference performance claims vs. Nvidia H100 (Credit: AMD)

Once again, the performance claims made by AMD are bold, with Su announcing a 1.4X performance lift (latency reduction) in Llama 2 (Meta’s assistant-like large language model) and a 1.6X uplift in BLOOM, a transformer-based LLM alternative to GPT-3, versus Nvidia’s competing offering. In inference workloads like these, AMD is claiming performance leadership over Nvidia, although the MI300X will deliver roughly performance parity with the H100 in AI training workloads. Of course, Nvidia has just released an update to its optimized software for Llama 2, so it’s possible that AMD hasn’t included it in the above benchmark results. Additionally, Nvidia’s H200 Hopper GPU is waiting in the wings, and Nvidia estimates it should bring even more performance gains.

AMD Ryzen 8040 series will bring AI lift to laptops

AMD Ryzen 8040 Series Mobile Processor (Credit: AMD)

From a hardware standpoint, the remaining offerings at AMD’s Advancing AI Day were Ryzen AI and a new line of Ryzen 8040 series mobile processors for laptops. Code-named Hawk Point, these APUs are similar to AMD’s current-generation Ryzen 7040 series, with eight Zen 4 CPU cores and twelve RDNA 3 compute units for graphics, though clock speeds have increased. However, Hawk Point’s neural processing unit has been optimized in both hardware and firmware, and AMD says its new XDNA NPU delivers throughput of up to 16 trillion operations per second (TOPS) for AI workloads, a 60% performance increase over the previous-generation 7040 series.
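For a sense of scale, the two NPU figures AMD quoted (16 TOPS and a 60% uplift) imply a previous-generation throughput that can be worked out directly. The sketch below does only that arithmetic; the function name is illustrative, and the result is derived from the article’s numbers rather than taken from an AMD spec sheet.

```python
# Derive the implied previous-generation NPU throughput from the two
# figures quoted in the article: 16 TOPS and a 60% generational uplift.

def previous_gen_tops(new_tops, uplift_fraction):
    """Throughput of the older part, given the new figure and fractional uplift."""
    return new_tops / (1.0 + uplift_fraction)


NEW_TOPS = 16.0   # from the article
UPLIFT = 0.60     # from the article

implied_old = previous_gen_tops(NEW_TOPS, UPLIFT)
print(f"Implied Ryzen 7040 NPU throughput: about {implied_old:.0f} TOPS")
```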

AMD Ryzen 8040 Series performance lift claimed for generative AI (Credit: AMD)

AMD claims this will lead to up to a 40% increase in real-world AI application performance in this new line of laptops, including AI models like Llama 2 and machine vision-related applications. Since the Ryzen 8040’s XDNA NPU is essentially a slice of Xilinx FPGA fabric, AMD likely optimized this block of circuitry by reconfiguring it for better performance and efficiency. AMD says Ryzen 8040 series AI-enabled PCs will be available in the first quarter of 2024, and it is now sampling chips to OEM partners.

Software enablement is the key: enter ROCm 6 and Ryzen AI software

All this powerful new silicon will require a much heavier software enablement effort from AMD, and in that regard the company announced two new installments of its software suite for developers: ROCm 6, which will work in conjunction with its Xilinx Vitis AI development and deployment tools, and Ryzen AI software for client machines. AMD says a second installment of ROCm 6 for training workloads is also coming. ROCm is AMD’s open-source software development platform, and it supports many major AI frameworks such as ONNX, TensorFlow, and PyTorch. AMD also notes that data center AI developers coming from Nvidia’s CUDA platform can easily port and optimize their existing models and applications with ROCm. AMD CEO Dr. Lisa Su emphasized that Lamini has reached feature and performance parity with CUDA.
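One practical reason porting can be light for PyTorch users is that ROCm builds of PyTorch expose AMD GPUs through the same torch.cuda interface that CUDA code already uses. The sketch below is a minimal, defensive check of which backend a given PyTorch install targets. The describe_backend helper is my own illustrative name, and it simply reports that PyTorch is missing if it is not installed.

```python
# Minimal check of which GPU backend the local PyTorch build targets.
# describe_backend() is a hypothetical helper written for this example.

def describe_backend():
    """Return a short description of the PyTorch GPU backend, if any."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"

    # ROCm builds set torch.version.hip to a version string; otherwise it is None.
    hip_version = getattr(torch.version, "hip", None)
    backend = f"ROCm/HIP {hip_version}" if hip_version else "CUDA or CPU-only build"

    # On ROCm builds, AMD GPUs are reported through the torch.cuda namespace.
    if torch.cuda.is_available():
        return f"{backend}, GPU: {torch.cuda.get_device_name(0)}"
    return f"{backend}, no GPU visible"


print(describe_backend())
```

Existing model code that selects the device with torch.device("cuda") generally runs unchanged on a ROCm build, although custom CUDA kernels still need to be ported separately.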

AMD Ryzen AI software deployment (Credit: AMD)

On client machines, Ryzen AI software will take pre-trained models, then quantize and optimize them to run on AMD’s silicon for easier deployment. In speaking with AMD, I was told the goal is to create a simple one-click interface for developers, with support for ONNX, TensorFlow, and PyTorch live right now in the first installment of Ryzen AI software. The folks in Redmond are also said to be open to Windows support, but AMD will ultimately be at the mercy of Microsoft in this regard.
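For readers unfamiliar with quantization, the sketch below shows the general idea using simple symmetric, per-tensor INT8 quantization in plain Python. This is a generic textbook technique that I am using for illustration, not AMD’s actual Ryzen AI tool flow, and the function names are hypothetical.

```python
# Generic symmetric per-tensor INT8 quantization, for illustration only.
# This is NOT the Ryzen AI software's actual algorithm or API.

def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print("int8 values:", q)
print("max round-trip error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

Storing 8-bit integers instead of 32-bit floats cuts weight storage to a quarter at the cost of a small rounding error, which is the kind of trade-off that makes models practical on a laptop NPU.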

Concluding this quick-take Advancing AI Day digest, I would offer that AMD’s success will depend largely on its software enablement effort, which will require long-term, sustained investment in ease of use, performance, and efficiency optimizations, and ultimately developer adoption. The company appears to have the hardware muscle ready to take on its primary rivals, Nvidia and Intel. With AMD President Victor Peng leading its AI strategy, and having driven a long series of software enablement efforts at Xilinx prior to that company’s acquisition, it appears that AMD has the leadership and resources to execute on this side of the equation as well. It’s going to be a dogfight with Nvidia, no doubt about that. With heavy optimization and tuning of models underway now, the AI performance landscape can and will change suddenly. And let’s face it, AI is still in its infancy.