When it comes to the neural networks that power today’s artificial intelligence, sometimes the bigger they are, the smarter they are too. Recent leaps in machine understanding of language, for example, have hinged on building some of the most enormous AI models ever and stuffing them with huge gobs of text. A new cluster of computer chips could now help these networks grow to almost unimaginable size—and show whether going ever larger may unlock further AI advances, not only in language understanding, but perhaps also in areas like robotics and computer vision.
Cerebras Systems, a startup that has already built the world’s largest computer chip, has now developed technology that lets a cluster of those chips run AI models that are more than a hundred times bigger than the most gargantuan ones around today.
Cerebras says it can now run a neural network with 120 trillion connections, mathematical simulations of the interplay between biological neurons and synapses. The largest AI models in existence today have about a trillion connections, and they cost many millions of dollars to build and train. But Cerebras says its hardware will run calculations in about a 50th of the time of existing hardware. Its chip cluster, along with power and cooling requirements, presumably still won’t come cheap, but Cerberas at least claims its tech will be substantially more efficient.
“We built it with synthetic parameters,” says Andrew Feldman, founder and CEO of Cerebras, who will present details of the tech at a chip conference this week. “So we know we can, but we haven’t trained a model, because we’re infrastructure builders, and, well, there is no model yet” of that size, he adds.
Today, most AI programs are trained using GPUs, a type of chip originally designed for generating computer graphics but also well suited for the parallel processing that neural networks require. Large AI models are essentially divided up across dozens or hundreds of GPUs, connected using high-speed wiring.
GPUs still make sense for AI, but as models get larger and companies look for an edge, more specialized designs may find their niches. Recent advances and commercial interest have sparked a Cambrian explosion in new chip designs specialized for AI. The Cerebras chip is an intriguing part of that evolution. While normal semiconductor designers split a wafer into pieces to make individual chips, Cerebras packs in much more computational power by using the entire thing, having its many computational units, or cores, talk to each other more efficiently. A GPU typically has a few hundred cores, but Cerebras’s latest chip, called the Wafer Scale Engine Two (WSE-2), has 850,000 of them.
The design can run a big neural network more efficiently than banks of GPUs wired together. But manufacturing and running the chip is a challenge, requiring new methods for etching silicon features, a design that includes redundancies to account for manufacturing flaws, and a novel water system to keep the giant chip chilled.