A quantum trick helps trim bloated AI models
Tensor networks grapple with the complexities of both quantum particles and machine learning
A hunk of material bustles with electrons, one tickling another as they bop around. Quantifying how one particle jostles others in that scrum is so complicated that, beginning in the 1990s, physicists developed an esoteric mathematical structure called a tensor network just to describe it. A decade or so later, when quantum physicist Román Orús began studying tensor networks, he didn’t envision applying them to the seemingly unrelated concepts of artificial intelligence.
But with the advent of enormous, energy-hogging large language models like those behind ChatGPT, “we realized that by using tensor networks we could address some of the bottlenecks,” says Orús, of Donostia International Physics Center in San Sebastián, Spain. Tensor networks can help squish bloated AI models down to a more manageable size, cutting energy use and improving efficiency without sacrificing accuracy. That’s Orús’ aim in his work at Multiverse Computing, a startup he cofounded. It’s an appealing prospect: AI currently gobbles so much energy that tech companies are hatching plans for a future generation of small nuclear power plants. And the need to power AI data centers may already be helping to drive up electricity costs in some areas.
Smaller models also boast the potential to be crammed onto personal devices like cell phones or household appliances. The ability to put AI on the devices themselves — rather than running it through the cloud — means users wouldn’t need an internet connection to use the AI.
There are other ways to compress AI models. But tensor network proponents argue that the technique’s basis in physics and math can provide more of a guarantee that the compressed model will perform as well as — or even better than — its big sibling. “It seems like kind of a slam dunk every time people try it,” says physicist and tensor network enthusiast Miles Stoudenmire of the Flatiron Institute in New York City.

But Stoudenmire wants to push tensor networks even further.
Most popular AI models are based on a framework called an artificial neural network that is inspired by the neurons of the human brain. Whereas Orús and colleagues are recasting those existing models as tensor networks, Stoudenmire and others aim to make AI models that bypass neural networks entirely, basing them on tensor networks from the get-go. Neural networks are powerful and flexible tools. But training them demands lots of energy and computer time. And they produce AI models with inner workings that are difficult to comprehend. Starting from a tensor network foundation, instead, could make AI faster and easier to train and understand.
“Let the tensors breathe,” Stoudenmire says. “I want to free them from the neural network and let them do their own thing … because I think they have a lot of latent power to offer.”
How the tensor network sausage is made
Tensor networks are physicists’ answer to a hair-raising concept called the “curse of dimensionality.” It’s the idea that, as data become more complex and involve more variables, their size explodes so dramatically that they quickly become impossible to store on a computer.
The building blocks of tensor networks are mathematical objects known as tensors. If you’ve ever used a spreadsheet, you may understand how powerful tensors can be. A spreadsheet is, effectively, a matrix, an array of numbers in two dimensions. Tensors generalize this idea to multiple dimensions.

Say you want to describe 10 people and their rankings for 10 possible pizza toppings. Jared gives pepperoni a 10, and Kate gives it a 3, and so on. You’d fill out a 10 by 10 spreadsheet.
But what if you wanted to describe not just toppings, but different sauce types, too: white sauce, marinara, pesto. What you’d need is an order-3 tensor, with one number for every combination of person, topping and sauce. One number gives Kate’s ranking for a pizza with red sauce and pepperoni, another gives Jared’s ranking of a pizza with white sauce and mushrooms.
When dealing with a small number of variables — people, pizza toppings, sauces — such tensors are manageable. If you did a massive pizza survey, polling 100,000 people with 100 choices for toppings and 100 sauces, that would result in a tensor with 1 billion numbers, easily storable on a computer. But once you start dealing with many variables — if on top of people, pizza and sauce you add crust, cheese and many other options — the size of a tensor quickly balloons.
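The scaling described above can be made concrete in a few lines of NumPy. This is an illustrative sketch, not anything from the research described here; the rankings are made up, and the point is only how fast storage grows as variables are added.

```python
# Sketch of how tensor size explodes with added variables (hypothetical data).
import numpy as np

# 10 people x 10 toppings: an ordinary matrix (an order-2 tensor)
rankings = np.zeros((10, 10))
rankings[0, 0] = 10  # say, Jared gives pepperoni a 10

# Add a sauce axis (white, marinara, pesto): an order-3 tensor
rankings3 = np.zeros((10, 10, 3))
rankings3[1, 0, 0] = 3  # Kate's ranking for pepperoni with white sauce

# Each new variable multiplies the entry count. With 10 options per
# variable and n variables, storage grows as 10**n:
entries = [10**n for n in range(1, 7)]
print(entries)  # [10, 100, 1000, 10000, 100000, 1000000]
```

By the time a tensor has a few dozen such axes, the entry count outstrips any conceivable memory.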
Once a tensor has more than a few tens of variables, “it would take … as much memory as has ever been produced in the history of computing to store,” Stoudenmire says. That’s the curse of dimensionality. For computer scientists — who tend to huck around huge clods of data — it’s a vexing problem. And for quantum physicists, the curse rears its head when describing many particles interacting with one another in complex ways.
Enter tensor networks. Harnessed by physicists in the 1990s and 2000s, they represent one colossal tensor by breaking it up into smaller, more manageable tensors. Those smaller tensors are linked by contractions, operations that combine two tensors into one.
Stoudenmire compares it to taking a giant sausage — too much for one person to cook, let alone eat — and twisting it in places to make perfectly portioned hot dogs, sized for the grill.
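The sausage-twisting can be sketched in code. The example below (an illustration assuming NumPy, not any group's production method) splits one order-4 tensor into a chain of small "cores" — a so-called tensor train — using repeated singular value decompositions, then contracts the chain back together to confirm the small pieces still hold the same information.

```python
# Tensor-train sketch: break one big tensor into a chain of small cores,
# then contract the chain to verify it reproduces the original.
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((4, 4, 4, 4))  # one big tensor: 256 numbers

cores, mat, r = [], T.reshape(4, -1), 1
for _ in range(3):
    U, S, Vt = np.linalg.svd(mat, full_matrices=False)
    rank = len(S)                        # keep full rank: exact split
    cores.append(U.reshape(r, 4, rank))  # one small core of the chain
    mat = (np.diag(S) @ Vt).reshape(rank * 4, -1)
    r = rank
cores.append(mat.reshape(r, 4, 1))       # final core

# Contractions: combine neighboring cores two at a time
rebuilt = cores[0]
for core in cores[1:]:
    rebuilt = np.tensordot(rebuilt, core, axes=([-1], [0]))
rebuilt = rebuilt.reshape(4, 4, 4, 4)
print(np.allclose(rebuilt, T))  # True: the cores encode the same tensor
```

In practice, the payoff comes from truncating small singular values at each split, which shrinks the cores at a modest, controllable cost in accuracy.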

Here’s how that sausage is made. Tensor networks are adept at representing correlations, connections between the results of different measurements. In a pizza survey, for example, people who like white mushrooms on their pizza probably also like cremini mushrooms — the two survey responses are correlated. Tensor networks are an efficient way of representing data that have correlations.
In AI, there are correlations between the billions of numbers called parameters that determine, for example, how a chatbot processes users’ prompts. And correlations in data can signal redundancy. By eliminating that redundancy, tensor networks can compress a model without weakening its power.
The synergy between AI and quantum physics also comes down to correlations. Quantum particles are paragons of correlation through the effect of quantum entanglement, which links the fates of two seemingly distinct particles.
“We are just finding here, in AI models, what we learned in physics, that correlations matter, period,” Orús says. “Everything is about correlations.”
A llama gets littler
Models based on neural networks typically already contain simple tensors, such as those spreadsheet-like matrices. Matrices or other tensors hold the parameters that tell the model how to process data via nodes, individual components of the model that are inspired by neurons. In a deep learning model, there are multiple layers of nodes, with associated tensors that contain parameters. But tensor networks have the power to represent data more efficiently than those individual tensors on their own.
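The redundancy-squeezing idea can be illustrated with the simplest possible case: a single weight matrix whose rows are highly correlated. This is a hedged toy example, not Multiverse's actual CompactifAI algorithm — a truncated SVD stands in for the more elaborate tensor network decompositions used in practice.

```python
# Toy compression: a correlated (low-rank) weight matrix factors into two
# far smaller matrices with no loss. Not any company's actual method.
import numpy as np

rng = np.random.default_rng(1)
# A 512 x 512 layer that is secretly rank 16 -- its entries are highly
# correlated, our stand-in for redundancy in a trained model.
W = rng.standard_normal((512, 16)) @ rng.standard_normal((16, 512))

U, S, Vt = np.linalg.svd(W, full_matrices=False)
k = 16
A = U[:, :k] * S[:k]   # 512 x 16
B = Vt[:k]             # 16 x 512

original = W.size              # 262,144 parameters
compressed = A.size + B.size   # 16,384 parameters, about 94 percent fewer
print(np.allclose(A @ B, W))   # True: same layer, far fewer numbers
```

Real models aren't exactly low rank, so actual compression trades a small accuracy loss for the size reduction, as in the Llama results below.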
Orús’ startup has commercialized a tensor network–based compression technique for AI models, called CompactifAI. When applied to the large language model Llama 2 7B, CompactifAI reduces the memory required to store the model by more than 90 percent, going from about 27 gigabytes to about 2 gigabytes. It shrinks the number of parameters by 70 percent, taking it from 7 billion parameters to about 2 billion with an accuracy drop of just a few percent, Orús and colleagues reported in a paper presented last April at the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning in Bruges, Belgium.
Total energy consumption of a compressed and uncompressed AI model

Multiverse’s compressed version of the large language model Llama 3.1 8B produced responses to 104 questions using less energy than the full-sized model. The energy saved was more dramatic for longer responses (bottom) than shorter responses (top).
A report by the European consulting firm Sopra Steria found that Multiverse’s compressed version of a different model, Llama 3.1 8B, used about 30 to 40 percent less energy than the original version, depending on the length of the response.
Other methods for shrinking AI models can also improve energy efficiency. A technique called pruning removes the least important parameters or nodes from the model, and a method called quantization reduces the precision of the parameters, for example by going from decimal numbers to integers. But machine intelligence researcher Danilo Mandic of Imperial College London says those techniques are reliant on trial and error. “There is no guarantee of good or improved performance.”
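Quantization, for instance, can be sketched in a few lines. The scheme below is an assumed, deliberately simple symmetric one (not any specific library's): round each floating-point parameter to an 8-bit integer plus one shared scale factor, cutting storage fourfold at the cost of a little precision.

```python
# Minimal symmetric quantization sketch: float32 weights -> int8 + scale.
import numpy as np

rng = np.random.default_rng(2)
weights = rng.standard_normal(1000).astype(np.float32)

scale = np.abs(weights).max() / 127           # map the value range onto int8
q = np.round(weights / scale).astype(np.int8)

restored = q.astype(np.float32) * scale
max_err = float(np.abs(restored - weights).max())

print(q.nbytes, weights.nbytes)  # 1000 bytes vs 4000 bytes
# Rounding error is bounded by half a quantization step:
print(max_err <= scale / 2 + 1e-6)
```

The rub, as Mandic notes, is that nothing in this recipe guarantees the model still answers well afterward; that has to be checked empirically.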
Tensor networks, designed to tease out the hidden structure in data, allow the model to be compressed and still perform well. Compressed models can even surpass the big ones in accuracy, Mandic says. That’s because big models are trained on large swaths of data from the internet, containing plenty of redundancies and irrelevances that get filtered out by the tensor network compression.
A tensor network–compressed version of OpenAI’s GPT-2 large language model performed similarly to or even better than full-size GPT-2, Mandic and colleagues reported in a paper published in 2023 at arXiv.org. And the mini model ran on a Raspberry Pi — a cheap, credit card–size computer often used for computer science education.
Multiverse has continued developing smaller models. Two released in August 2025 are named after animals that Multiverse says have similarly simple neural architecture in their brains. The company is marketing the models — SuperFly and ChickenBrain — for personal devices and appliances such as refrigerators and washing machines. For example, a clueless teenager could ask a washing machine which type of cycle to run.
Letting tensors breathe
To compress an AI model, you have to have one to start with. Creating and training that original model is itself an energy-sapping saga. Using tensor networks from the beginning could ease energy needs in that stage, too.
Training the neural network in an AI model like that of ChatGPT requires a lengthy process of optimization, tweaking parameters and checking the resulting performance, in order to find the best values for the parameters. This step typically relies on a process called gradient descent, originally devised in the 19th century. Stoudenmire likens it to looking for a plate of food in your house by wandering around hoping you can catch a whiff. “It’s not stupid, but it’s pretty basic.”

Right now, deep neural networks are the basis for the most successful AI models out there. But some researchers are working to create an alternative that could complement that technology. Models based on tensor networks would eliminate the neural network entirely. They could also eliminate that process of optimization, that groping about the house searching for your forgotten leftovers. “We don’t want to use optimization at all,” says applied mathematician Yuehaw Khoo of the University of Chicago. “This is the main selling point of using tensor networks over deep learning architecture, the possibility of completely bypassing the use of optimization.”
To avoid the need for optimization, tensor network methods can use a “divide and conquer” strategy. Parts of the tensor network are held fixed while others are adjusted, piece by piece, toward a solution.
A related tensor network technique involves zooming in and out to help find a solution. For example, imagine that rather than roaming around searching for a plate of food, you could isolate individual floors. Maybe you sample the air of the entire first floor of the house all at once for the scent of food. If present, you zoom in, searching each room, then the different surfaces in the room.
In these ways, tensor networks can settle, not on the location of the food, but on values of parameters in the tensor network. The techniques mean models can be trained within seconds. In a scientific flex, Siyao Yang, an applied mathematician in Khoo’s group, demoed training a tensor network–based model in the middle of a scientific talk. It took four seconds. A similar model based on neural networks took about six minutes, nearly 100 times as long.
But the divide and conquer strategy also means that tensor networks have a limitation. They work best if the structure of the problem is well understood, in order to know how to divvy the problem up. For example, when searching for that plate of food, perhaps you know the layout of the house with its floors and rooms.
That makes the technique work best on problems that have some known structure, like those that are described by laws of physics. For example, an AI based on tensor networks can evaluate complex equations related to the properties of materials such as copper, argon and tin, researchers reported last August in Physical Review Materials.
AI based on tensor networks is useful in robotics too. In a paper published last July at arXiv.org, researchers at Idiap Research Institute in Martigny, Switzerland, used tensor networks to teach two robotic arms to manipulate a box.

Explaining the inner workings of AI
Tensor networks can also make for more understandable AI. Deep learning models, with their myriad parameters, are infamous for being black boxes, with little possibility to extract the reason behind a model’s response.
“There’s very little theoretical understanding about what’s actually happening with deep learning,” says computer scientist Rose Yu of the University of California, San Diego.
That obscurity holds neural networks back from tasks where a slipup would be disastrous. “You cannot employ a neural net to run your nuclear power plant if you don’t understand how it works,” Mandic says.
Yu has used tensor network methods to analyze information such as climate data and the shooting success of basketball players from different places on the court. The mathematically well understood tensor networks, she argues, lend themselves to results that are easier to grasp.
“Tensors, because they’re tools that are very well understood from a theoretical perspective … may offer a new type of platform to study the behavior of deep networks, to understand the science behind deep learning,” Yu says.
Meanwhile, tech companies continue to release bigger, more complex models, trained on more data. “The current trend in AI seems to be [that] the ultimate answer to everything is just scaling,” Yu says.
But the era of improving performance simply by going bigger may be petering out. Tensor networks provide an alternative paradigm to explore. “Can we derive new insights from tensor networks that can help guide a new wave of development for AI?” Yu asks.
Neural networks still outperform tensor networks on most tasks. But perhaps, Khoo says, that’s partly due to the intense focus on neural networks over the past decade, and the relative neglect of tensor networks.
Putting more effort into tensor network research could mean we eventually get more out of them, Khoo says. “With enough tuning, I’m pretty sure tensor networks can win.”