As competition among generative AI models intensifies, high inference costs and compute bottlenecks have become major pain points that AI giants are eager to address. According to the latest report from The Information, AI startup Anthropic is planning to purchase chips from Fractile, a UK-based AI chip startup, in order to reduce the computational cost of AI model inference.
Fractile, seen as a potential dark horse, claims its chips can run large language model (LLM) inference up to 100 times faster than NVIDIA products at only one-tenth the cost. Its chip is expected to be officially launched in 2027.
A game-changer that combines computation and memory
Fractile was founded in 2022 by Dr. Walter Goodwin of Oxford University. Although it is a relatively young company, its development team is strong and includes industry veterans who have worked at Graphcore, NVIDIA, and Imagination Technologies.
According to Tom's Hardware, Fractile completed a $15 million seed round in 2024 and is currently seeking a new $200 million funding round, aiming to push the company's valuation to the $1 billion unicorn threshold.
Its core advantage lies in its architectural design.
Today's mainstream NVIDIA GPUs keep their computing units separate from their memory (such as HBM or other DRAM). During AI inference, data must be moved back and forth between the two constantly, which adds latency and consumes a great deal of power. This bottleneck is widely known in the industry as the "Memory Wall".
To break down this "wall," Fractile uses a chip architecture based on the RISC-V instruction set together with SRAM (static random-access memory), integrating the processing units and the memory onto a single die. The company claims that this "near-memory" design eliminates the data movement bottleneck, allowing it to run large language model inference up to 100 times faster than NVIDIA GPUs at only 10% of the cost.
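To see why memory bandwidth matters so much, consider a rough back-of-envelope sketch. It assumes a simplified model in which generating each token requires reading every model weight from memory once, so the token rate is capped by bandwidth divided by model size. The parameter count, precision, and both bandwidth figures below are hypothetical round numbers chosen for illustration; they are not specifications of any NVIDIA or Fractile product.

```python
def decode_tokens_per_second(param_count, bytes_per_param, bandwidth_bytes_per_s):
    """Upper bound on single-stream token rate when each token
    requires reading every weight from memory exactly once."""
    bytes_per_token = param_count * bytes_per_param
    return bandwidth_bytes_per_s / bytes_per_token


# Hypothetical values for illustration only (not vendor specifications).
PARAMS = 70e9            # assumed model size: 70 billion parameters
BYTES_PER_PARAM = 2      # assumed 16-bit weights
OFF_CHIP_BW = 3e12       # assumed off-chip memory bandwidth, bytes/s
ON_CHIP_BW = 30e12       # assumed on-die SRAM bandwidth, bytes/s

off_chip = decode_tokens_per_second(PARAMS, BYTES_PER_PARAM, OFF_CHIP_BW)
on_chip = decode_tokens_per_second(PARAMS, BYTES_PER_PARAM, ON_CHIP_BW)
speedup = on_chip / off_chip

print(f"off-chip bound: {off_chip:.1f} tokens/s")
print(f"on-chip bound:  {on_chip:.1f} tokens/s")
print(f"speedup:        {speedup:.1f}x")
```

Under this simplified model the speedup equals the bandwidth ratio, which is why designs that keep weights next to the compute units target the memory path rather than raw arithmetic throughput. Real systems also depend on batching, caching, and on-chip capacity, which this sketch ignores.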
Anthropic's computing power anxiety and risk diversification strategy
Why would Anthropic be interested in this early-stage chip startup? The answer lies in the computing power anxiety behind its explosive growth.
By the end of 2025, Anthropic's annual revenue had surged to $30 billion. Unlike OpenAI or xAI, which are actively building their own data centers, Anthropic's strategy leans more toward "cloud neutrality," making broad use of third-party computing platforms including NVIDIA GPUs, Amazon's Trainium, and Google TPUs.
As its business expands rapidly, inference costs have become a heavy burden. Analysts point out that Anthropic's move is intended not only to secure more efficient and cheaper inference capacity, but also to "diversify supply chain risks" and avoid over-reliance on NVIDIA's architecture alone in the future compute arms race.
Analysis of viewpoints
Anthropic's reported purchase of Fractile chips reflects a key trend in the current AI market: the hardware demands of "training" and "inference" are rapidly diverging. For example, this year Google for the first time split its 8th-generation TPU into the training-oriented TPU 8t and the inference-oriented TPU 8i.
In the model training phase, NVIDIA's position in the AI market remains unshakable in the short term, thanks to its vast and mature CUDA software ecosystem and excellent hardware integration capabilities. However, once a model is trained and enters the "inference" phase for real-world deployment, industry players care more about the "latency" and "cost-effectiveness" of each individual response.
This creates an excellent entry point for Groq, which was acquired by NVIDIA, as well as for startups such as Cerebras and Fractile, the protagonist of this story, that focus on SRAM-based or near-memory architectures.



