Panther Lake’s Arc B‑Series iGPU: 12 Xe Cores, Upgraded Ray Tracing, 50%+ Faster Than Xe2, with Xe3P Powering Next‑Gen Arc

Intel has officially unveiled its third‑generation Xe graphics architecture, Xe3, and it’s rolling out first with the Core Ultra 300 “Panther Lake” integrated GPU. Intel is promising big gains generation over generation, with up to a 50% performance uplift in peak scenarios, stronger efficiency, and meaningful advances in ray tracing, AI throughput, and media features. Intel also teased a follow‑on, Xe3P, as the next step in its roadmap.

A quick look back explains the momentum. Xe2 powered two successful launches: the Lunar Lake Core Ultra 200V iGPU and the Arc Battlemage B‑Series discrete GPUs. Alongside the hardware, Intel’s rapid driver and software progress has improved gaming, content creation, and AI performance, with Arc Pro supported under the same driver branch. That foundation sets the stage for Xe3.

Branding and roadmap
– Xe3 iGPUs in Panther Lake will carry the Arc B‑Series branding to unify naming across integrated and discrete offerings, even though the B‑Series discrete cards are based on Xe2.
– Xe3P, an enhanced evolution of Xe3, is already in the pipeline. It’s positioned as a significant step forward and will debut with the next Arc family rather than the current B‑Series. Intel hints Xe3P could target either a discrete GPU or a higher‑end iGPU configuration in future platforms.

Bigger, faster, and more scalable
The hallmark change in Xe3 is a larger, more throughput‑oriented design. Each render slice can now hold up to 6 Xe cores and 6 ray tracing units, up from 4 of each, a 50% per‑slice increase. That scalability lets Intel fit different GPU tile configurations to different Panther Lake SoCs.

Panther Lake iGPU configurations
4 Xe core configuration (available on 8C and 16C dies)
– Process: 8C on Intel 3, 16C on TSMC N3E
– 4 Xe cores (Xe3)
– 1 render slice
– 32 XMX engines
– 4 MB L2 cache
– 1 geometry pipeline
– 4 samplers
– 4 ray tracing units
– 2 pixel backends

12 Xe core configuration (top 16C die on TSMC N3E)
– 12 Xe cores (Xe3)
– 2 render slices
– 96 XMX engines
– 16 MB L2 cache
– 2 geometry pipelines
– 12 samplers
– 12 ray tracing units
– 4 pixel backends

Cache is a key differentiator. The 4 Xe configuration halves the L2 from Lunar Lake’s 8 MB to 4 MB, while the 12 Xe tier doubles it to 16 MB. Intel says the larger cache cuts SoC fabric traffic in gaming by 25% on average and by up to 36%, which should help both latency and power.
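
To put the cache figures in proportion, L2 capacity can be normalized per Xe core. The sketch below uses only the capacities quoted above, with Lunar Lake taken as an 8 Xe core part. The `GpuConfig` type and its field names are illustrative and do not come from any Intel tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuConfig:
    """Illustrative record of one iGPU configuration (names are hypothetical)."""
    name: str
    xe_cores: int
    l2_mb: int

    @property
    def l2_mb_per_core(self) -> float:
        # L2 capacity divided evenly across Xe cores, for comparison only.
        return self.l2_mb / self.xe_cores


CONFIGS = [
    GpuConfig("Lunar Lake (Xe2)", xe_cores=8, l2_mb=8),
    GpuConfig("Panther Lake 4 Xe (Xe3)", xe_cores=4, l2_mb=4),
    GpuConfig("Panther Lake 12 Xe (Xe3)", xe_cores=12, l2_mb=16),
]

for cfg in CONFIGS:
    print(f"{cfg.name}: {cfg.l2_mb} MB L2, {cfg.l2_mb_per_core:.2f} MB per Xe core")
```

On these numbers the 4 Xe configuration keeps the same 1 MB of L2 per Xe core as Lunar Lake, while the 12 Xe configuration rises to about 1.33 MB per core.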

Inside the Xe3 core
– Eight 512‑bit Vector Engines (XVE)
– Eight 2048‑bit XMX engines for AI
– Approximately 33% more shared L1/SLM
– Vector Engine improvements: up to 25% more threads, variable register allocation, native SIMD16 ALUs, 3‑way co‑issue, extended math and FP64, plus FP8 dequantization support and Xe matrix extensions
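
The Vector Engine figures can be cross-checked against each other. Assuming 32-bit lanes (an assumption for this sketch, since the lane width is not stated above), a 512-bit engine holds 16 lanes, which lines up with the native SIMD16 ALUs. The function names are illustrative.

```python
VECTOR_ENGINE_BITS = 512      # per-engine width listed above
VECTOR_ENGINES_PER_CORE = 8   # engines per Xe core listed above
LANE_BITS = 32                # assumption: one lane per 32-bit value


def lanes_per_engine() -> int:
    """Number of 32-bit lanes that fit in one vector engine."""
    return VECTOR_ENGINE_BITS // LANE_BITS


def lanes_per_gpu(xe_cores: int) -> int:
    """Total 32-bit lanes across all vector engines in a configuration."""
    return xe_cores * VECTOR_ENGINES_PER_CORE * lanes_per_engine()


print(f"lanes per engine: {lanes_per_engine()}")
for cores in (4, 12):
    print(f"{cores} Xe cores: {lanes_per_gpu(cores)} lanes")
```

Under that assumption the 4 Xe and 12 Xe configurations expose 512 and 1536 32-bit lanes respectively.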

AI performance scales with XMX count. Intel quotes up to 120 TOPS for the 12 Xe configuration and up to 40 TOPS for the 4 Xe variant, which works out to 10 TOPS per Xe core in both cases. Extrapolating linearly, an 8 Xe Xe3 design would land around 80 TOPS, roughly 20% above the 67 TOPS of the 8 Xe Xe2 GPU in Lunar Lake.
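
Both quoted figures imply the same per-core rate, so a linear estimate for other core counts is easy to check. This is a minimal sketch, assuming peak TOPS scales linearly with Xe core count at the same clock. The 8-core input is hypothetical and does not correspond to an announced Xe3 part.

```python
# Quoted peak TOPS keyed by Xe core count (figures from the article).
QUOTED_TOPS = {12: 120.0, 4: 40.0}


def tops_per_core(quoted: dict[int, float]) -> float:
    """Return the per-core rate, verifying all quoted points agree."""
    rates = [tops / cores for cores, tops in quoted.items()]
    if max(rates) - min(rates) > 1e-9:
        raise ValueError("quoted figures do not scale linearly")
    return rates[0]


def estimate_tops(xe_cores: int) -> float:
    """Linear estimate of peak TOPS for a given Xe core count."""
    return xe_cores * tops_per_core(QUOTED_TOPS)


print(f"per-core rate: {tops_per_core(QUOTED_TOPS):.1f} TOPS")
print(f"hypothetical 8 Xe estimate: {estimate_tops(8):.1f} TOPS")
```

The estimate says nothing about clocks or power limits, which could move a real part away from the straight line.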

Per Xe‑core ops/clock (XMX)
– TF32: 1024 ops/clk
– FP16: 2048 ops/clk
– BF16: 2048 ops/clk
– INT8: 4096 ops/clk
– INT4: 8192 ops/clk
– INT2: 8192 ops/clk
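
These per-core rates can be reproduced from the XMX engine width with a simple lane-count model. The model is an assumption for illustration, not Intel's stated breakdown: each 2048-bit engine is split into lanes of the operand width, each lane performs one multiply-accumulate per clock, and that counts as two ops. TF32 is assumed to occupy a 32-bit slot. The second function backs out the clock implied by the quoted TOPS, assuming those are dense INT8 figures, so the result is an inference rather than a published specification.

```python
XMX_ENGINES_PER_XE_CORE = 8
XMX_WIDTH_BITS = 2048
OPS_PER_MAC = 2  # assumption: multiply and add counted as two ops

# Assumed operand storage width per format, in bits.
OPERAND_BITS = {"TF32": 32, "FP16": 16, "BF16": 16, "INT8": 8, "INT4": 4, "INT2": 2}

# Per Xe-core ops/clk exactly as listed in the article.
LISTED_OPS_PER_CLK = {
    "TF32": 1024, "FP16": 2048, "BF16": 2048,
    "INT8": 4096, "INT4": 8192, "INT2": 8192,
}


def modeled_ops_per_clk(fmt: str) -> int:
    """Ops/clk per Xe core under the lane-count model."""
    lanes = XMX_WIDTH_BITS // OPERAND_BITS[fmt]
    return XMX_ENGINES_PER_XE_CORE * lanes * OPS_PER_MAC


def implied_clock_ghz(quoted_tops: float, xe_cores: int, fmt: str = "INT8") -> float:
    """Clock (GHz) at which the listed ops/clk would reach the quoted TOPS."""
    ops_per_clk = xe_cores * LISTED_OPS_PER_CLK[fmt]
    return quoted_tops * 1e12 / ops_per_clk / 1e9


for fmt, listed in LISTED_OPS_PER_CLK.items():
    modeled = modeled_ops_per_clk(fmt)
    note = "matches" if modeled == listed else "differs"
    print(f"{fmt}: modeled {modeled}, listed {listed} ({note})")

print(f"implied clock, 12 Xe at 120 TOPS: {implied_clock_ghz(120, 12):.2f} GHz")
```

The model matches the listed rates from TF32 through INT4. INT2 is listed at the same rate as INT4 rather than double, so the simple width-based scaling does not carry down to that format.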

Ray tracing and graphics pipeline upgrades
Xe3 introduces an enhanced RT unit with dynamic ray management for asynchronous ray tracing. Multiple traversal pipelines, two triangle intersection units, and a BVH cache work in concert to keep rays moving smoothly through the pipeline. By more carefully pacing the dispatch of new rays, Xe3 reduces pipeline back‑ups at the thread sorting stage.

A new URB manager allows partial updates rather than full flushes, improving efficiency in complex scenes. Intel also cites up to 2x higher anisotropic filtering rates and up to 2x higher stencil test rates.

Media and display features
– AV1 encode/decode
– VVC (H.266) decode
– AVC 10‑bit support
– Sony XAVC‑H, XAVC‑HS, and XAVC‑S support
– eDP 1.5 support

Performance and efficiency
Early metrics suggest:
– FP16 GEMM up 50%, in line with the 50% architectural scale‑up
– 2x to 2.7x gains in anisotropic rate, mesh render rate, scattered reads, and ray/triangle intersection throughput
– Up to 7x improvements in depth testing and in register‑heavy scenarios
– More than 50% performance over Lunar Lake at peak power on like‑for‑like tests
– More than 40% higher performance per watt versus Arrow Lake‑H
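
A performance-per-watt figure can be read two ways: more performance at the same power, or the same performance at lower power. The sketch below works through both readings for a 40% gain. It assumes the gain holds uniformly across the operating range, which real efficiency curves do not, so treat the numbers as illustrative only.

```python
def perf_at_same_power(perf_per_watt_gain: float) -> float:
    """Relative performance when power is held constant."""
    return 1.0 + perf_per_watt_gain


def power_at_same_perf(perf_per_watt_gain: float) -> float:
    """Relative power when performance is held constant."""
    return 1.0 / (1.0 + perf_per_watt_gain)


gain = 0.40  # the "more than 40%" figure, taken at its lower bound
print(f"same power: {perf_at_same_power(gain):.2f}x performance")
print(f"same performance: {power_at_same_perf(gain):.0%} of the power")
```

At the lower bound of the claim, that is 1.4x the performance in the same power budget, or roughly 29% less power for the same performance.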

Software stack enhancements
Intel is layering in compiler and runtime features to extract more from the hardware:
– IGC compiler updates with improved variable register allocation
– Faster scheduling with direct preemption, enabling context switches without flushing
– Support for DirectX Cooperative Vectors, with demos including Neural Radiance Field workloads

Where this lands in the market
Xe2 already put Intel in contention with today’s fastest integrated GPUs in mainstream notebooks. Xe3 pushes further, especially in AI and ray tracing, while improving power efficiency. It won’t replace the large, high‑power, discrete‑class iGPUs found in specialized chips, but Intel’s broader strategy, including custom partnerships, suggests those upper tiers are being addressed separately.

Bottom line
Xe3 is a substantial step forward for Intel’s integrated graphics: larger slices, stronger AI engines, smarter ray tracing, bigger caches, and a tighter software stack. Paired with Panther Lake, it targets higher frame rates, better efficiency, and expanded media capabilities for next‑gen laptops. With Xe3P on the horizon for the next Arc family, Intel’s GPU roadmap looks primed for steady, iterative gains across both integrated and discrete segments.
