🖥️ AI & GPU Industry Weekly Recap: March 9–15, 2026


🔑 Key Highlights

  • NVIDIA and Thinking Machines Lab (founded by former OpenAI CTO Mira Murati) announce a landmark gigawatt-scale strategic partnership, committing to deploy at least 1 GW of Vera Rubin systems for frontier model training — with NVIDIA also making a significant equity investment in the startup.
  • AMD Ryzen AI NPUs finally become useful on Linux for LLM inference: the open-source Lemonade 10.0 server and FastFlowLM 0.9.35 runtime deliver the first practical NPU-accelerated LLM workloads on Linux for Ryzen AI 300/400 series SoCs, requiring Linux kernel 7.0.
  • NVIDIA launches Nemotron 3 Super, a 120B-parameter (12B active) MoE model targeting agentic AI workloads, claiming 5x higher throughput and a 1-million-token context window, already adopted by Perplexity, Palantir, Cadence, Siemens, and more.
  • NVIDIA’s R595 Linux driver series sees rapid Vulkan iteration, shipping three beta releases in one week (595.44.02, 595.45.04, 595.44.03) with major new extensions including VK_KHR_device_address_commands and AV1 encoding fixes for Blackwell GPUs.
  • China’s Lisuan Tech officially announces its G100 series 6nm GPUs with a June 18 launch date, featuring the gaming-focused LX 7G106 (12GB GDDR6, ~RTX 4060-class) built on a fully in-house “TrueGPU” architecture — a notable milestone in China’s GPU self-sufficiency push.

🤖 AI & Machine Learning

NVIDIA Nemotron 3 Super: Agentic AI at Scale

NVIDIA launched Nemotron 3 Super, a 120B-parameter open-weight model with only 12B active parameters at inference thanks to a hybrid Mixture-of-Experts (MoE) architecture. Key technical innovations include:

  • Mamba layers delivering 4x memory/compute efficiency alongside transformer reasoning layers
  • Latent MoE: activates four specialist experts per token at roughly the compute cost of one
  • Multi-Token Prediction: 3x faster inference by predicting multiple tokens simultaneously
  • Runs in NVFP4 precision on Blackwell, achieving 4x faster inference than FP8 on Hopper
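
As a toy illustration of the top-k routing idea behind a Mixture-of-Experts layer (the expert count, router logits, and k=4 below are made-up numbers for the sketch, not Nemotron's actual router):

```python
import math

# Toy sketch of top-k expert routing in a Mixture-of-Experts layer.
# Illustrative only -- not NVIDIA's Nemotron implementation.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(router_logits, k=4):
    """Select the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return {i: probs[i] / total for i in top}

# One token's router scores over 8 hypothetical experts:
weights = route([2.0, 0.1, 1.5, -0.3, 0.9, 2.2, -1.0, 0.4], k=4)
# Only 4 of the 8 experts run for this token; their weights sum to 1.
assert abs(sum(weights.values()) - 1.0) < 1e-9
```

The unselected experts are simply never evaluated, which is where the "cost of one" efficiency comes from at scale.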

The model tops DeepResearch Bench and DeepResearch Bench II leaderboards and is available via NVIDIA NIM microservices, Google Cloud Vertex AI, Oracle Cloud, AWS Bedrock (coming soon), Azure, and partners including CoreWeave, Together AI, and Fireworks AI. Enterprise adopters include Palantir, Cadence, Dassault Systèmes, Siemens, and Amdocs.

NVIDIA–Thinking Machines Lab Partnership

NVIDIA announced a multi-year deal with Thinking Machines Lab (led by Mira Murati) to deploy ≥1 gigawatt of Vera Rubin compute systems starting in early 2027, targeting frontier model training. NVIDIA also made a direct equity investment, signaling a deepening relationship beyond hardware supply.

NVIDIA State of AI 2026 Report

NVIDIA’s annual survey of 3,200+ enterprise respondents across financial services, retail, healthcare, telecom, and manufacturing revealed:

  • 64% of organizations are actively using AI in operations
  • 88% report AI has increased annual revenue; 87% report cost reductions
  • 86% plan to increase AI budgets in 2026, with 40% planning increases of 10% or more
  • Agentic AI is moving from pilot to production, led by telecom (48% adoption) and retail/CPG (47%)
  • Open source is critical to AI strategy for 85% of respondents

AMD Ryzen AI NPUs: Linux LLM Breakthrough

After two years of mainline kernel driver development (AMDXDNA), AMD’s Ryzen AI NPUs are finally practical on Linux. Lemonade Server 10.0 + FastFlowLM 0.9.35 deliver:

  • NPU-accelerated LLM inference and Whisper speech recognition
  • Support for 256k token context lengths
  • Native Claude Code integration
  • Compatibility with Ryzen AI 300 and 400 series SoCs (Strix Point and newer)
  • Requires Linux 7.0 kernel or AMDXDNA backports

The timing is particularly significant given the launch of the Ryzen AI Embedded P100 and Ryzen AI PRO 400 series, both targeting enterprise and industrial Linux deployments.

AMD HDR Linux Driver Co-Developed with Claude AI

AMD engineer Harry Wentland disclosed that new DRM color pipeline (drm_colorop CSC) patches for the AMDGPU driver and KDE KWin compositor were substantially co-developed using Claude Sonnet 4.5. The work adds color-space conversion support to the DRM Color Pipeline API introduced in Linux 6.19, with the patches being prepared for upstream submission.


⚡ GPU & Hardware

NVIDIA R595 Linux Driver: Rapid Vulkan Progress

Three Vulkan-focused driver betas shipped within one week:

| Driver | Key Features |
|--------|--------------|
| 595.45.04 | First R595 public beta; VK_EXT_descriptor_heap, HDR, DRI3 v1.2 |
| 595.44.02 | descriptorHeapCaptureReplay, YCbCr compression, AV1 encoding fix on Blackwell |
| 595.44.03 | VK_KHR_device_address_commands (Vulkan 1.4.346), depth/stencil host image copy for Blackwell |

Benchmark testing of the RTX 5090 on a Dell UltraSharp U5226KW 52-inch 6K display showed measurable incremental performance gains over the stable 590 series across OpenGL, Vulkan, and compute workloads.

NVIDIA Path Tracing: 10,000x and Counting

At GDC 2026, NVIDIA Dev & Performance VP John Spitzer presented a roadmap claiming:

  • Current Blackwell (RTX 50) GPUs are already 10,000x faster at path tracing vs. Pascal (GTX 10), driven by dedicated RT and Tensor cores and AI-based neural rendering
  • Future Rubin-generation GPUs (targeted 2027–2028) could reach 1,000,000x improvement via continued AI/neural rendering advances
  • A demo showcased The Witcher 4 rendering over 2 trillion triangles using RTX Mega Geometry and ReSTIR algorithms
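
As a back-of-envelope sanity check on the scale of that claim: a 10,000x gain over roughly a decade (assuming Pascal's 2016 debut and a 2026 Blackwell reference point — the dates are this sketch's assumption, not figures from the talk) implies path-tracing performance doubling roughly every nine months:

```python
import math

# How many doublings is 10,000x, and how often must performance have
# doubled over ~120 months (2016 Pascal -> 2026 Blackwell, assumed dates)?
doublings = math.log2(10_000)          # ~13.3 doublings
months_per_doubling = 120 / doublings  # ~9 months per doubling
print(round(doublings, 1), round(months_per_doubling, 1))  # prints: 13.3 9.0
```

A further jump to 1,000,000x by the Rubin generation would require that cadence to accelerate sharply, which is why NVIDIA attributes it to neural rendering rather than raw silicon alone.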

AMD Ryzen AI Embedded P100 Series (8–12 Core Launch)

AMD formally launched the higher-tier Ryzen AI Embedded P100 models:

  • Zen 5 cores (8–12) + RDNA 3.5 iGPU + XDNA2 NPU
  • TDP: 15–54W; PCIe Gen 4 x16
  • Rated for 24/7 operation, 10-year lifetime (BGA)
  • ROCm certified for the RDNA 3.5 iGPU
  • Models: P164, P174, P185 (and industrial-temp P164i, P174i, P185i)
  • Silicon production: Q3 2026; Reference boards: H2 2026
  • Ryzen AI Embedded X100 series (up to 16 Zen 5 cores) arrives H2 2026

Linux 7.1: AMD Ryzen AI NPU Power Reporting

Patches queued in drm-misc-next for Linux 7.1 will add power estimate reporting to the AMDXDNA accelerator driver, giving users visibility into NPU power consumption — an important addition for thermal management in embedded and industrial deployments.

China’s Lisuan G100 GPUs: June 18 Launch

Lisuan Tech set an official launch date of June 18, 2026 for its G100 series, with preorders opening March 17. The gaming flagship LX 7G106 features:

  • 6nm process node, fully in-house “TrueGPU” architecture
  • 12GB GDDR6, 192 TUs, 96 ROPs, 24 TFLOP/s FP32
  • Benchmarks suggest RTX 4060-class performance
  • DirectX 12, Vulkan, OpenCL, OpenGL, Windows-on-Arm, and Linux support
  • Professional variants (LX Max/Pro/Ultra) with up to 24GB ECC GDDR6

🏭 Industry & Market

ABB Robotics + NVIDIA Omniverse: Industrial Physical AI

ABB Robotics and NVIDIA announced a major partnership integrating NVIDIA Omniverse libraries directly into ABB’s RobotStudio platform. The new RobotStudio HyperReality product (launching H2 2026) promises:

  • 99% sim-to-real correlation between virtual and physical robot behavior
  • Up to 40% reduction in deployment costs; 50% faster time to market
  • 80% reduction in setup/commissioning time
  • Absolute Accuracy technology reducing positioning error from 8–15mm to ~0.5mm
  • Early pilots with Foxconn (consumer electronics assembly) and Workr (SME robotics)
  • ABB exploring NVIDIA Jetson integration into its Omnicore controller

ABB’s RobotStudio is used by over 60,000 robotics engineers worldwide, making this one of the largest industrial AI deployments to date.

NVIDIA + Dassault Systèmes Industrial AI

NVIDIA’s partnership with Dassault Systèmes continued to gain traction, with SIMULIA software now leveraging NVIDIA CUDA-X and AI physics libraries for virtual twin physics simulation. Notable deployments:

  • Lucid Motors: EV design and powertrain engineering digital twins
  • Bel Group: Food protein simulation (Babybel, non-dairy options)
  • Omron: Industrial automation digital twins
  • Wichita State NIAR: Aircraft design and certification using NVIDIA Nemotron models
  • PepsiCo: 20% throughput increase, 10–15% CapEx reduction from AI-driven digital twins

NVIDIA CUDA 13.2 + AlmaLinux Official Support

NVIDIA officially extended CUDA 13.2 support to RHEL-compatible distributions including AlmaLinux, with a formal distribution agreement allowing NVIDIA packages to be shipped directly from AlmaLinux repositories. Key benefits include synchronized driver/CUDA update cadence and official NVIDIA support for enterprise Linux downstreams.

NVIDIA GTC 2026: Pre-Conference Momentum

With NVIDIA GTC 2026 running March 16–19 in San Jose (30,000 attendees from 190 countries), the week saw major pre-conference announcements cascade — Nemotron 3 Super, the Thinking Machines partnership, the ABB Robotics deal, and industrial AI showcases — all building toward Jensen Huang’s keynote on March 16 at the SAP Center.


🛠️ Developer Ecosystem

CUDA 13.2 New Features

Beyond the AlmaLinux support, CUDA 13.2 delivers:

  • Spin-wait dispatch mode for host tasks (reduced execution latency)
  • New PTX features and host compiler support
  • Improved C++20 standards conformance in NVCC

AMD ZenDNN 5.2: Major Redesign

AMD ZenDNN 5.2 introduces a fully re-engineered runtime architecture with:

  • Multiple backend support: native ZenDNN, AOCL-DLP, oneDNN, FBGEMM, libxsmm
  • Improved performance, scalability, and full backward compatibility
  • Significant extensibility improvements over previous versions

AOCC 5.1 (AMD Optimizing C/C++ Compiler) also shipped quietly in January with the Zen 5-tuned AOCL-LibM 5.2 math library. It remains based on the aging LLVM/Clang 17 (September 2023), however, raising concerns about AMD's investment in its proprietary compiler toolchain versus contributing upstream to GCC/LLVM.

Lemonade 10.0 + FastFlowLM 0.9.35

The open-source Lemonade server is now the primary vehicle for AMD Ryzen AI NPU LLM inference on Linux. Key developer features:

  • FastFlowLM NPU-first runtime with up to 256k token context support
  • Native Claude Code integration
  • Whisper speech recognition support
  • Documentation guide available for Ryzen AI 300/400 series setup
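
A minimal sketch of what driving the server looks like from a client, assuming Lemonade exposes an OpenAI-compatible chat-completions endpoint; the URL, port, and model name below are placeholders for illustration and should be checked against the Lemonade documentation for your install:

```python
import json

# Build a standard OpenAI-style chat-completions request for Lemonade Server.
# The endpoint and model name are assumptions for this sketch, not values
# confirmed by the article -- consult the Lemonade docs for your setup.

LEMONADE_URL = "http://localhost:8000/api/v1/chat/completions"  # assumed default

payload = {
    "model": "Llama-3.2-3B-Instruct-Hybrid",  # hypothetical NPU-offloaded model
    "messages": [
        {"role": "user", "content": "Summarize this week's GPU news."}
    ],
    "max_tokens": 256,
}

body = json.dumps(payload)
# POST `body` to LEMONADE_URL with any HTTP client (requests, curl, etc.).
```

Because the API shape is OpenAI-compatible, existing client libraries and tools (including the Claude Code integration mentioned above) can point at the local server without code changes.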

EndeavourOS Titan

EndeavourOS Titan shipped with:

  • Linux 6.19 kernel and Mesa 26.0 graphics drivers
  • NVIDIA R590 drivers; GNOME-based improvements
  • New eos-hwtool GPU driver management utility
  • Automatic GPU/VM hardware detection and Vulkan + media acceleration package installation

Ubuntu 26.04 + GNOME 50: Linux Gaming Performance

Early benchmarks of Ubuntu 26.04 (with GNOME 50 / Mutter 50) on RTX 5080 and RTX 5090 hardware showed incremental gaming performance improvements over Ubuntu 25.10. Both releases used the same NVIDIA 590.48.01 driver, pointing to NVIDIA-specific Mutter optimizations and the updated kernel stack as the source of the gains.

OpenClaw: NVIDIA’s Agentic AI Open Source Push

NVIDIA is highlighting OpenClaw at GTC 2026 — a project it describes as the fastest-growing in open source — enabling always-on AI agents (“claws”) that run locally on DGX Spark or GeForce hardware and integrate with local files, apps, and calendars without cloud dependency.


📊 Key Takeaways

The week was dominated by the gravitational pull of NVIDIA GTC 2026, with NVIDIA orchestrating a cascade of partnerships (Thinking Machines Lab, ABB Robotics, Dassault Systèmes), model launches (Nemotron 3 Super), and driver releases that collectively paint a picture of NVIDIA cementing its position across every layer of the AI stack — from silicon to software to enterprise deployment. Meanwhile, AMD made a quiet but meaningful breakthrough with Ryzen AI NPU Linux LLM support finally becoming practical via Lemonade 10.0 and FastFlowLM, arriving at a strategically important moment as the Ryzen AI Embedded P100 and PRO 400 series target industrial Linux markets. On the geopolitical hardware front, Lisuan Tech’s G100 GPU launch date announcement is a landmark moment in China’s GPU self-sufficiency narrative, even as its RTX 4060-class performance remains well behind the frontier — the broader trajectory of indigenous Chinese GPU development warrants close monitoring in the months ahead.