UK company Fractile has today raised a $220M Series B as it continues to build next-generation inference hardware for AI.
The round was led by Accel, Factorial Funds, and Founders Fund, with participation from Conviction, Gigascale, O1A, Felicis, Buckley Ventures and 8VC.
Founded in 2022, Fractile works from the thesis that the next major limit on AI progress is the time and cost required to produce useful outputs at scale.
As advanced AI systems take on harder, longer-running tasks that can require generating tens of millions of tokens, Fractile is developing chips and systems designed to make faster inference economically viable. Its work spans AI research, chip microarchitecture, and foundry process innovation.
According to a post by Walter Goodwin, CEO and Founder of Fractile, the company was founded on the bet that, eventually, the world’s most capable AI systems would be limited in their impact by the amount of time they take to produce useful outputs.
“We bet everything on the logical conclusion: that the only way to truly unlock this latent value, to make speed viable at scale, was to radically re-invent the hardware that we run our frontier AI models on.
Ever since, we have been building chips and systems that tackle this problem.”
He contends that raw AI capability has since reached the point where the time from query to output is the key limit on frontier capabilities. As models have improved, so has their ability to be orchestrated over increasingly long output sequences.
However, the unit economics of inference have become a brutal constraint.
“Inference is both the revenue engine of the AI industry and the rate-limiting factor on expanding it.”
He further contends that today’s LLMs already produce up to 100 million tokens when tackling hard problems:
“At the ~40 tokens per second or so at which these models tend to run on existing chips, a single output of this length takes a month to complete.
The technical and economic limits on inference speed, above all from memory bandwidth that has failed to scale on current architectures, are what is constraining progress.
This is exactly the problem Fractile has been building from the ground up to tackle.”
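The arithmetic behind the "month" figure can be checked with a short sketch. It uses only the two numbers Goodwin quotes (100 million tokens, roughly 40 tokens per second) and assumes, for simplicity, that the tokens are generated strictly one after another at a constant rate. The function name `generation_days` is illustrative and does not come from Fractile.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def generation_days(total_tokens: int, tokens_per_second: float) -> float:
    """Days needed to emit total_tokens sequentially at a fixed decode rate.

    Assumption: a single constant-rate stream with no parallelism or pauses.
    """
    return total_tokens / tokens_per_second / SECONDS_PER_DAY


# Figures quoted in the article: 100 million tokens at ~40 tokens per second.
days = generation_days(100_000_000, 40)
print(f"{days:.1f} days")  # 28.9 days
```

Under those assumptions the result is just under 29 days, which matches the quoted estimate of about a month for a single output.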
Looking ahead, Goodwin sees the value not in accelerating today’s workloads but in the entirely new workloads that hardware like Fractile’s will enable.
The company is hiring across London, Bristol, San Francisco, and Taipei.