# Here’s how we could begin decoding an alien message using math

A new mathematical approach looks for order in strings of bits

One of the most famous messages ever beamed to space was a string of 1,679 bits sent by the Arecibo radio telescope in 1974. But if E.T. sent us such a string, how could we Earthlings even begin to decode it? A new mathematical approach proposes a way.

Anyone trying to interpret the Arecibo message (a drawing depicting a person, the DNA double helix, the solar system and the telescope itself, among other information) would first have to understand that it was an image at all, and that the image was 23 pixels wide and 73 pixels tall.

As it sent the signal, the radio antenna encoded the 1,679 bits by flipping between two different frequencies, representing one and zero respectively. If you line the bits up differently — placing more or fewer than 23 pixels per row — the image looks like a random mess.
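To see why the row length matters, here is a minimal Python sketch that wraps the same bits into rows of different widths. The helper name `to_rows` and the 15-bit toy message are made up for illustration; this is not the Arecibo data.

```python
def to_rows(bits: str, width: int) -> list[str]:
    """Split a bit string into consecutive rows of `width` characters."""
    if width <= 0 or len(bits) % width != 0:
        raise ValueError("width must evenly divide the message length")
    return [bits[i:i + width] for i in range(0, len(bits), width)]


def show(bits: str, width: int) -> None:
    """Print the bits as a crude picture: '#' for 1, '.' for 0."""
    for row in to_rows(bits, width):
        print(row.replace("0", ".").replace("1", "#"))
    print()


# Toy message: a vertical bar drawn on a grid 3 pixels wide and 5 tall.
message = "010" * 5

show(message, 3)  # intended width: the bar lines up
show(message, 5)  # wrong width: the same bits scatter
```

At the intended width the marks stack into a straight vertical line. At width 5 the identical bits land in different columns on every row, a small-scale version of the "random mess" described above.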

We’d face a similar challenge if aliens sent us a message. How would we know the number and size of its dimensions?

The Arecibo scientists built a clue into the transmission: 23 and 73 are prime numbers — a scheme other intelligent life might recognize, if they too find primes to be interesting. But alien messages could come in many forms and have many dimensions, says Brian McConnell, a computer scientist at Notion Labs in San Francisco, and author of The Alien Communication Handbook. A message might be a database in which each element is not just a value but a list of values, or a list of lists. A message in the form of a physics simulation could include a series of measures for each point in spacetime.
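The prime-number clue can be checked with a few lines of Python. The function name `factor_pairs` is an illustrative choice; the arithmetic itself is not in doubt.

```python
def factor_pairs(n: int) -> list[tuple[int, int]]:
    """Return every (a, b) with a <= b and a * b == n."""
    pairs = []
    d = 1
    while d * d <= n:
        if n % d == 0:
            pairs.append((d, n // d))
        d += 1
    return pairs


print(factor_pairs(1679))  # [(1, 1679), (23, 73)]
```

Because 1,679 is the product of two primes, the only rectangle that is not a single line of bits is 23 by 73 (in one orientation or the other). A recipient who thinks to factor the length has very few grids to try.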

The new decoding method, developed by Hector Zenil, a computer scientist at the University of Cambridge and the founder of Oxford Immune Algorithmics, and colleagues, takes a string of bits (an incoming message) and looks at every possible combination of dimension number and size. A message of 100 bits, for example, might be 1×100 or 10×10 (two dimensions), 4×5×5 (three dimensions), 2×2×5×5 (four dimensions) and so on.
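As a rough illustration of how that list of candidate shapes could be generated, the sketch below enumerates every way to write a message length as a product of whole numbers. The function name `shapes` is hypothetical, and the sketch makes two simplifying assumptions: it skips axes of length 1, and it lists each set of factors only once, in non-decreasing order. The researchers' own enumeration may differ.

```python
def shapes(n: int, min_factor: int = 2) -> list[tuple[int, ...]]:
    """All ways to write n as a product of factors >= min_factor,
    with the factors in non-decreasing order."""
    result = [(n,)] if n >= min_factor else []
    f = min_factor
    while f * f <= n:
        if n % f == 0:
            # Fix f as the smallest remaining factor, then split the rest.
            for rest in shapes(n // f, f):
                result.append((f,) + rest)
        f += 1
    return result


for shape in shapes(100):
    print("×".join(str(side) for side in shape))
```

For 100 bits this prints nine shapes, including 10×10, 4×5×5 and 2×2×5×5. Since the order of the axes changes how the bits are laid out, a full search would also try the reorderings of each shape, such as 5×4×5.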

It then looks at each possible configuration’s orderliness in two ways. To get a measure of local order, it breaks the message into patches. For each patch, it searches a catalog of trillions of tiny computer programs the researchers had previously created to explore algorithmic space, and counts how many programs generate an identical patch. (The programs’ outputs were precomputed and saved, making searches fast.) The more programs that create an identical patch, the higher the patch’s score for local order. The patch scores are averaged to get an overall local order score for the entire configuration.

The researchers also measure each possible configuration’s global order by seeing how much an image compression algorithm can shrink it without losing information. Mathematically, randomness is less compressible than regular patterns. By combining the local and global scores, the researchers have a sense of how likely each configuration is to be the correct one.
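The compression idea behind the global score can be tried with Python's standard library. This sketch substitutes the general-purpose `zlib` compressor for the image compression algorithm the researchers used, and it makes no attempt to reproduce the local score, which depends on their precomputed catalog of programs. The names `pack_bits` and `compression_ratio` are made up for illustration.

```python
import random
import zlib


def pack_bits(bits: str) -> bytes:
    """Pack a string of '0'/'1' characters into bytes, zero-padding the tail."""
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def compression_ratio(bits: str) -> float:
    """Compressed size divided by packed size; lower means more regular."""
    if not bits:
        raise ValueError("need at least one bit")
    raw = pack_bits(bits)
    return len(zlib.compress(raw, 9)) / len(raw)


rng = random.Random(0)  # fixed seed so the run is repeatable
regular = "01" * 2048
scrambled = "".join(rng.choice("01") for _ in range(4096))

print(f"regular pattern: {compression_ratio(regular):.2f}")
print(f"random bits:     {compression_ratio(scrambled):.2f}")
```

The alternating pattern shrinks to a small fraction of its packed size, while the random bits do not shrink at all once the compressor's overhead is counted.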

The team tested the method on a version of the Arecibo message that had been expanded to six times its size, so the width was now 138 pixels. In one analysis, the researchers arranged the sequence of bits into images ranging from 0 to 200 pixels across, a subset of possible configurations. When the team graphed image width on the x axis and the likelihood score on the y axis for each configuration, the plot showed a few sharp spikes, the most prominent at 138. The method showed similar success when parsing other messages encoded as bits, including several other images, an audio file and a 3-D MRI scan.
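A width sweep of this kind is easy to imitate on a toy image. The sketch below does not use the researchers' likelihood score. As a much cruder stand-in, it scores each candidate width by the fraction of pixels that disagree with the pixel directly above them, on the assumption that a real picture has rows that resemble their neighbors. The names `row_mismatch` and `best_width` and the 12-pixel-wide test image are all invented for illustration.

```python
def row_mismatch(bits: str, width: int) -> float:
    """Fraction of bits that differ from the bit one row up when the
    stream is wrapped into rows of `width` pixels."""
    if not 0 < width < len(bits):
        raise ValueError("width must be between 1 and len(bits) - 1")
    differing = sum(bits[i] != bits[i - width] for i in range(width, len(bits)))
    return differing / (len(bits) - width)


def best_width(bits: str, widths) -> int:
    """Candidate width with the lowest row-to-row mismatch."""
    return min(widths, key=lambda w: row_mismatch(bits, w))


# Toy image: 12 pixels wide, 14 rows, containing a solid block of ones
# that is 6 pixels wide and 10 rows tall.
WIDTH, HEIGHT = 12, 14
rows = ["0" * WIDTH for _ in range(HEIGHT)]
for r in range(2, 12):
    rows[r] = "000111111000"
stream = "".join(rows)

for w in (6, 11, 12, 13, 24):
    print(w, round(row_mismatch(stream, w), 3))
print("best width:", best_width(stream, range(1, 40)))
```

On this toy image the mismatch drops sharply at the true width of 12, which plays the role of the spike in the researchers' graph. A plain mismatch count would be far less reliable than their combined score on real data.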

The new approach could also handle the kind of noise that might be introduced as a message travels through space. In another analysis, the original Arecibo message’s width of 23 pixels stood out even when a quarter of the bits had been flipped from 1 to 0 or vice versa.
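Readers who want to run a similar noise test on their own data could corrupt a message along these lines. The function name `flip_bits` and its parameters are hypothetical, and the researchers' noise model may differ in detail; this sketch flips an exact fraction of randomly chosen positions.

```python
import random


def flip_bits(bits: str, fraction: float, seed: int = 0) -> str:
    """Return a copy of `bits` with the given fraction of positions flipped."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the corruption is repeatable
    n_flip = round(len(bits) * fraction)
    chosen = set(rng.sample(range(len(bits)), n_flip))
    return "".join(
        ("1" if b == "0" else "0") if i in chosen else b
        for i, b in enumerate(bits)
    )


clean = "0" * 1000
noisy = flip_bits(clean, 0.25)
print(noisy.count("1"))  # 250 of the 1,000 bits were flipped
```

Feeding the corrupted string into a width search would show how much noise a given scoring method can tolerate.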

“This paper is quite exciting, because what we have shown is that if you have a piece of information that is not completely random, then it actually encodes the original space in which it was intended,” Zenil says. In other words, the message tells you its own geometry. He notes that in Carl Sagan’s sci-fi novel Contact, and the movie based on it, the characters spend a lot of time figuring out that a message received from aliens is in three dimensions (specifically a video). “If you have our tools, you would solve that problem in seconds and with no human intervention.”

Even if aliens send a continuous signal rather than bits, he says, the method could help find the right sampling frequency for digitizing it. It would just add more configurations to try.

“What I like about it is that it’s a mathematically rigorous approach to characterizing a transmission,” McConnell says of the technique, which has not yet been peer reviewed. What’s more, “most of the people in the SETI community” — referring to the search for extraterrestrial intelligence — “focus on signal detection. They don’t tend to give a lot of thought to what would come after that.”

SETI researcher Douglas Vakoch, the president of METI International, a nonprofit that studies how we might message extraterrestrial intelligence, notes that the new approach frees prime numbers to serve a secondary purpose in parsing a message. “Instead of being a guide to discover the format, they can now be used to confirm that the decoders found the correct solution,” Vakoch wrote via email.

(“Primes are somehow very special in a mathematical sense,” Zenil notes, “because they can be thought of as a compressed version of the natural numbers.” But there are also other types of interesting numbers to choose from, many listed in the On-Line Encyclopedia of Integer Sequences.)

Of course, even if we could detect and format the message, we’d still need to interpret it correctly. Might a shape indicate an alien body, a spacecraft, an equation or a smudge?

Zenil notes that the approach has potential terrestrial applications, for instance in deciphering intercellular signaling. He’s also already used conceptually similar methods to identify important components in gene regulatory networks — if you perturb one part, does it make the overall system less intelligible? An algorithm that pieces together smaller algorithmic components in order to explain or predict data — this new method is just one way to do it — may also help us one day achieve artificial general intelligence, Zenil says. Such automated approaches don’t depend on human assumptions about the signal. That opens the door to discovering forms of intelligence that might think differently from our own.