High-Bandwidth Flash (HBF), sometimes referred to as High-Bandwidth Flash DRAM (HBFD), is shaping up as one of the more intriguing memory ideas to emerge from the AI boom. It promises something HBM can’t easily deliver at scale: dramatically higher capacity per stack. But despite the hype, current industry chatter suggests NVIDIA isn’t preparing to adopt HBF anytime soon. Instead, Google is expected to become one of the first major customers as it expands its AI infrastructure.
HBF is being co-developed by SanDisk and SK hynix, and its design borrows a familiar concept from High Bandwidth Memory: stacking dies vertically to boost density and bandwidth in a compact footprint. The difference is what gets stacked. Where HBM stacks DRAM dies, HBF stacks multiple layers of NAND flash, connected by through-silicon vias (TSVs) so that the stack behaves like a single memory device rather than a collection of separate chips.
That architecture is what enables the headline-grabbing capacity claims. Shipping HBM3E stacks hold 24GB to 36GB, and the HBM4 standard allows up to 64GB per stack. HBF is being positioned far above that. SanDisk’s public materials have described roughly 512GB per stack, which puts the widely quoted 4TB figure within reach of an accelerator carrying eight stacks, a massive jump that could meaningfully change how AI system designers plan around memory limits.
On performance, HBM remains the speed king. However, HBF is being positioned as “fast enough” for certain workloads thanks to architectural optimizations designed to push throughput well beyond traditional flash-based storage. The sweet spot being discussed is AI inference, especially as interest grows in agentic AI systems that rely on fast access to context. With more capacity close to the compute, HBF could also ease KV cache limits. The KV cache holds the attention keys and values for every token in context, so it grows with context length and with the number of concurrent requests, and it can become a bottleneck in large model deployments.
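To put rough numbers on that KV cache pressure, here is a minimal back-of-the-envelope sketch in Python. It applies the usual sizing rule: two tensors (keys and values) per layer, per token. The model shape (80 layers, 8 KV heads, head dimension 128, 16-bit values) and the 64 GiB and 4 TiB capacity points are hypothetical figures chosen purely for illustration. They do not describe any announced product, and the sketch ignores model weights and activations, which compete for the same memory.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical transformer shape; every value here is an assumption."""
    layers: int
    kv_heads: int
    head_dim: int
    bytes_per_value: int = 2  # 16-bit (FP16/BF16) cache entries


def kv_cache_bytes(shape: ModelShape, context_tokens: int, sequences: int = 1) -> int:
    """Bytes needed to cache keys and values for `sequences` concurrent requests."""
    # Factor of 2: one key tensor and one value tensor per layer, per token.
    per_token = 2 * shape.layers * shape.kv_heads * shape.head_dim * shape.bytes_per_value
    return per_token * context_tokens * sequences


def sequences_that_fit(shape: ModelShape, context_tokens: int, capacity_bytes: int) -> int:
    """Whole sequences whose KV cache fits in `capacity_bytes` (weights ignored)."""
    return capacity_bytes // kv_cache_bytes(shape, context_tokens)


shape = ModelShape(layers=80, kv_heads=8, head_dim=128)
context = 128 * 1024  # a 128K-token context window

per_sequence = kv_cache_bytes(shape, context)
print(f"{per_sequence / GIB:.0f} GiB per sequence")        # 40 GiB per sequence
print(sequences_that_fit(shape, context, 64 * GIB))         # 1
print(sequences_that_fit(shape, context, 4 * 1024 * GIB))   # 102
```

Under these assumptions, one 128K-token request needs about 40 GiB of cache, so a 64 GiB pool holds a single such request while a 4 TiB pool holds roughly a hundred. The point is the ratio, not the absolute numbers, which shift with model shape, precision, and cache compression.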
Even with those advantages, reports indicate NVIDIA is not currently planning to move to HBF. The apparent reasoning is practical: NVIDIA believes its bandwidth and capacity challenges can be met through other routes, particularly enterprise SSDs. There’s also talk that NVIDIA is working with Kioxia on next-generation PCIe Gen7 SSDs, aiming for performance well beyond that of today’s standard enterprise drives. In that view, SSD advances could cover much of what HBF is trying to solve without introducing a new memory standard into NVIDIA’s platform roadmap.
Meanwhile, development of HBF continues to accelerate. SK hynix is said to be leading the push, with first samples expected in the second half of this year. Sampling is a key milestone because it signals real-world validation is approaching, not just early-stage concept work.
If NVIDIA is sitting this one out for now, Google may be stepping in as the early anchor customer. Google’s AI ambitions keep expanding, and its TPU ecosystem is scaling quickly with next-generation TPU platforms reportedly in the pipeline. That combination—aggressive AI buildout plus custom silicon—often creates the perfect environment to adopt emerging memory technologies earlier than the rest of the market.
Beyond the “HBM alternative” narrative, HBF may also find an even broader opportunity: supplementing or replacing traditional server memory in certain designs. Data centers are already experimenting with different memory approaches as system bottlenecks shift. With AI workloads, CPUs can become constraints in surprising ways, and that has increased demand for LPDDR5 and LPDDR5X in servers, including newer form factors like SOCAMM2. HBF’s stacked approach could help reduce PCB space, increase memory capacity, and keep power consumption in check while still offering high bandwidth—an appealing mix for dense AI servers.
For now, the near-term story is clear: HBF is moving toward sampling, Google is expected to be a major early adopter, and NVIDIA appears content to stick with HBM while pushing storage performance forward through next-gen enterprise SSDs. Whether HBF becomes a mainstream pillar of AI hardware—or remains a specialized solution for particular workloads—will likely depend on how quickly it proves itself in real deployments and how urgently the industry needs multi-terabyte, high-bandwidth memory stacks close to compute.