As AI technology moves from the early "training" stage of large language models (LLMs) to the "inference" stage that provides practical services to millions of end users, the underlying hardware architecture is also undergoing a major paradigm shift.
SanDisk and SK Hynix have officially announced that they will establish a dedicated working group through the Open Compute Project (OCP) to jointly promote global standardization of a next-generation memory solution: High Bandwidth Flash (HBF). This move is not only a technological alliance between two major companies, but also a proposed answer to the current bottlenecks in AI hardware development.
AI-driven market pain points: memory shortages amid competing demands for speed and energy efficiency
The explosive growth of the AI industry in recent years has directly led to a severe global shortage of high-end memory. During the AI training phase, systems need to process massive amounts of data, making HBM (High Bandwidth Memory), with its extremely high data transfer bandwidth, a hot commodity in the market. However, as the market's focus gradually shifts towards "inference," the rules of the game change.
The defining characteristics of inference workloads are "persistence" and "ubiquity". When AI models run around the clock on millions of devices, continuously generating answers and predictions, the system not only needs very high data-transfer bandwidth to stay responsive, but must also contend with harsh real-world constraints: power limits and thermal management.
Traditional volatile memory (DRAM) consumes enormous power and runs hot when scaled to the massive capacities that inference applications demand. This not only raises data center operating costs but also becomes a significant hurdle for edge computing devices. The industry urgently needs a new type of memory that provides high bandwidth while also delivering large capacity and low power consumption.
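To make the capacity and bandwidth pressures concrete, here is a back-of-envelope sketch. The model size, precision, and token-rate target below are illustrative assumptions, not figures from the announcement; the point is only that autoregressive decoding must stream roughly the full weight set per generated token.

```python
# Rough estimate of the memory capacity and sustained bandwidth needed to
# serve an LLM in inference. All numbers below are illustrative assumptions.

def model_footprint_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in GB."""
    return num_params * bytes_per_param / 1e9

def min_bandwidth_gbs(footprint_gb: float, tokens_per_s: float) -> float:
    """Autoregressive decoding reads roughly every weight once per token,
    so sustained bandwidth must be at least footprint * token rate."""
    return footprint_gb * tokens_per_s

params = 70e9                                                # hypothetical 70B-parameter model
weights_gb = model_footprint_gb(params, bytes_per_param=2)   # 16-bit weights
bw = min_bandwidth_gbs(weights_gb, tokens_per_s=20)          # assumed 20 tok/s target

print(f"weights: {weights_gb:.0f} GB, min bandwidth: {bw:.0f} GB/s")
```

Even under these modest assumptions, the capacity comfortably exceeds a typical DRAM budget while the bandwidth requirement far exceeds what conventional flash interfaces provide, which is exactly the gap HBF targets.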
From HBM to HBF: A Strategic Layout to Complete the Last Mile of AI Inference
If HBM was designed to overcome the computational limits of AI "brains" during intensive training, then HBF emerged to sustain the endurance of AI "neural networks" during widespread inference. From the shift in design focus from HBM to HBF, we can see several clear advantages and the underlying technological logic:
• The inherent advantages of non-volatility and large capacity: HBF is built on a NAND flash architecture, which inherently offers large storage capacity. Compared with DRAM-based HBM, HBF can deliver industry-leading memory capacity at the same or even lower cost, which is crucial for inference tasks that must load very large AI models.
• Low power consumption and thermal stability: Because flash memory is non-volatile, it does not need continuous power to retain data the way DRAM does, significantly reducing overall power consumption. Its better thermal stability also suits large-scale, high-density AI server deployments, easing the burden on cooling systems.
• High bandwidth designed specifically for inference: NAND flash's historical pain point was transfer speeds far below DRAM's. As its name suggests, HBF breaks through the bandwidth ceiling of traditional flash via new interface standards and packaging technologies (such as SoC integration and advanced packaging), making it sufficient for the real-time data throughput that AI inference demands.
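The low-power point above can be illustrated with a toy standby-energy comparison. The wattage figure is a placeholder assumption chosen purely to show the mechanism the article describes: DRAM draws power continuously just to retain data, while non-volatile flash retains it unpowered.

```python
# Toy comparison of data-retention energy for volatile vs non-volatile memory
# over a year of always-on inference service. The 3 W standby figure is an
# illustrative assumption, not a measured value.

def standby_energy_kwh(standby_watts: float, hours: float) -> float:
    """Energy spent purely on keeping data resident, in kWh."""
    return standby_watts * hours / 1000

HOURS_PER_YEAR = 24 * 365

dram_kwh = standby_energy_kwh(standby_watts=3.0, hours=HOURS_PER_YEAR)   # must refresh to retain data
flash_kwh = standby_energy_kwh(standby_watts=0.0, hours=HOURS_PER_YEAR)  # retains data unpowered

print(f"DRAM retention: {dram_kwh:.2f} kWh/yr vs flash retention: {flash_kwh:.2f} kWh/yr")
```

Multiplied across thousands of high-density modules in an always-on inference fleet, this retention overhead (plus the cooling needed to remove it as heat) is the operating-cost gap the article points to.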
Market competition considerations behind the alliance
From a market competition perspective, SK Hynix already holds a leading position in the HBM market, while SanDisk possesses deep expertise in NAND flash design. The alliance between the two companies is clearly aimed at securing a voice in the next wave of AI-driven inference opportunities.
As SanDisk CTO Alper Ilkbahar stated, they are not merely setting a new standard, but setting the bar for the next era of AI computing. Promoting HBF standardization through the Open Compute Project (OCP) is meant to quickly draw server vendors and chip designers (such as NVIDIA, AMD, and Intel) into the ecosystem. Once HBF becomes the industry-recognized "inference-dedicated memory standard," it will make it harder for competitors (such as Samsung and Micron) to establish an independent presence in this field, helping ensure both companies capture the enormous opportunities in AI edge computing and large-scale inference servers in the coming years.
In summary, the birth and standardization of HBF signal that AI hardware development has officially moved from an extensive stage of "pursuing ultimate computing power at any cost" into a refined implementation stage that balances energy efficiency, capacity, and cost.



