Crowdsourcing peer review

A claimed proof that P≠NP spurs a massive collaborative research effort

September 9, 2010 at 2:22 pm - More than 2 years ago

It is the greatest question in computer science. A negative answer would likely give a fundamentally deeper understanding of the nature of computation. And a positive answer would transform our world: Computers would acquire mind-boggling powers such as near-perfect translation, speech recognition and object identification; the hardest questions in mathematics would melt like butter under computation’s power; and current computer security methods would be as easy to crack as a TSA-approved suitcase lock.

So when Vinay Deolalikar, a computer scientist at Hewlett Packard labs in India, sent an email on August 7 to a few top researchers claiming that P doesn’t equal NP — thereby answering this question in the negative and staking a claim on the million-dollar Millennium Prize offered by the Clay Mathematics Institute — it sent shock waves through the community. Usually, computer scientists groan when they find such a claim in their Inbox, expecting the typical amateurish “proof” with the same hoary errors. But Deolalikar is a recognized and published scientist, and his paper had novel ideas from promising areas of research.

The paper spurred an intense, open, Internet-based effort to understand it and pursue its ideas, attracting such luminaries as Fields Medalists Terry Tao and Timothy Gowers. The examination uncovered deep flaws that are probably irremediable — but has also helped spur on a new model of research.

The question of P versus NP is both profound and basic: If the solution to a problem can be verified quickly, can it also be found quickly? The difference between the two will seem obvious to anyone who has spent hours searching for mislaid keys and then recognized them in an instant — which is why the vast majority of computer scientists believe that P does not equal NP. But so far, no one has been able to prove it.

More precisely, P versus NP concerns the number of computational steps needed to perform a calculation. For example, the standard algorithm to multiply two numbers, each having n digits, requires on the order of n² computations. Since n² is a nice, tidy polynomial, this means that multiplication can be calculated in “polynomial time,” i.e., it is in the class of problems called “P.” Roughly speaking, problems in P are the ones that can practically be solved with computers.

But not all problems are so nice. Consider the “Willy Loman problem”: Loman, a traveling salesman, wants to visit n cities, and his boss will reimburse him for only m miles. Is there a route that would allow him to visit n cities by traveling only m miles? The fastest algorithms currently known that answer this question require more than 2ⁿ calculations. Since n is in the exponent, this means the calculation will require “exponential time,” which for large n is impractically long. To find a route for 100 cities, for example, would require more than 2¹⁰⁰ (or about 10³⁰) calculations, compared with only 10,000 calculations to multiply two 100-digit numbers. Once the right route is revealed, though, verifying it is a snap: Just add up the miles. Such problems, which may be hard to solve but are easy to check, are said to belong to “NP,” which stands for “nondeterministic polynomial time.”

“NP is a very interesting class philosophically, because it includes almost anything you’d hope to do,” says Neil Immerman of the University of Massachusetts, Amherst. The Willy Loman problem, for instance, has vast applications in everything from airline scheduling to computer chip design to reconstructing genomes. Even the questions of whether manageable proofs exist for the five other remaining Millennium Prize problems are NP problems — so if someone manages to prove that P equals NP, he or she could imaginably walk away with not just one million dollars, but six.

Although NP problems certainly appear to be vastly more difficult than P problems, there’s a niggling uncertainty. Computer scientists sometimes find extraordinarily clever, quick ways of performing calculations. A different way of multiplying, for example, requires only n^1.59 calculations rather than n². Who knows? Someday, someone might find a way of computing the Willy Loman problem in polynomial time. In that case, that NP problem would actually belong to P — and by a remarkable result from a few decades ago, so would all other NP problems.

Computer scientists are nearly certain this can’t happen (so that P really doesn’t equal NP). But proving it is exceptionally difficult because doing so requires pinning down all possible polynomial algorithms — including all those no one has thought of yet — and showing that none are sufficient to do the job. But algorithms are wily, Protean objects, able to shift and change in surprising ways. Richard Lipton, a computational complexity expert at the Georgia Institute of Technology, describes a Far Side cartoon in which Indians are shooting burning arrows at a fort, and one cowboy inside says to another, “Are they allowed to do that?” Lipton laughs: “To me, that describes in a nutshell what’s so hard about understanding algorithms. They can do all kinds of crazy things.”

Deolalikar claimed that he had tamed the wildness of algorithms and shown that P indeed doesn’t equal NP. Within a few hours of his e-mail, the paper got an impressive endorsement: “This appears to be a relatively serious claim to have solved P versus NP,” emailed Stephen Cook of the University of Toronto, the scientist who had initially formulated the question. That evening, a blogger posted Deolalikar’s paper. And the next day, long before researchers had had time to examine the 103-page paper in detail, the recommendation site Slashdot picked it up, sending a fire hose of tens of thousands of readers and dozens of journalists to the paper.

Researchers and fascinated laypeople alike scrutinized it. They swarmed to Lipton’s blog to share their observations. For the next week, a dozen or so top researchers abandoned their other projects to examine the paper. An informal peer review process emerged — collaboratively, in full public view, no credentials required.

Deolalikar combined two different approaches, one from logic and one from statistical physics. He translated the problem of P versus NP out of computer science entirely and into the world of formal logic, using an area known as “finite model theory” that has produced remarkable results.

Then he focused on one particular NP problem called “Boolean satisfiability,” which asks whether any binary numbers can satisfy a given set of logical requirements all at once. Though no efficient algorithm to compute Boolean satisfiability is known (the problem being in NP), computer scientists do know a lot about it. For example, choose k logical requirements at random and ask: What is the probability that there is some binary number of n digits that will satisfy them all? If the number of requirements is huge (i.e., k is large) and the number of possible solutions is tiny (i.e., n is small), then of course there will never be correct solutions, no matter how long the problem is calculated. Similarly, if the requirements are few and the number of possible solutions is vast (i.e. k is small and n is large), there will always be solutions. At some critical threshold in between, there’s a transition point when solutions become possible. Statistical physicists have studied this kind of “phase transition” deeply and found a rich structure in the probability distributions found there.

Deolalikar claimed that the restrictions on polynomial-time algorithms made evident by finite model theory were too great to produce this rich statistical structure. That would mean that no polynomial-time algorithm would be able to solve the Boolean satisfiability problem, and hence that P and NP are different.

Experts were intrigued by the novelty of combining logic and statistical physics — but many were also skeptical. Scott Aaronson of MIT bet $200,000 that the proof wouldn’t stand up, even before he read it in detail. “This problem won’t be solved without a titanic advance in human knowledge,” he wrote on his blog, and he saw little evidence that Deolalikar’s paper provided that. He acknowledged, though, that the paper “seems to introduce some thought-provoking new ideas.”

As researchers began to delve into the paper, they rapidly found problems. Ominously, Deolalikar’s claims seemed to contradict things that were already known. Cris Moore of the University of New Mexico pointed to a problem in P that has a similarly complex structure to that of Boolean satisfiability, contrary to Deolalikar’s central claim. Then, Immerman, one of the originators of finite model theory, found that Deolalikar had overstated the restrictions imposed by the finite model theory translation.

In a normal peer review process, this probably would have been the end of it. But the crowd that had convened on Lipton’s blog — an extraordinary array of powerful mathematicians, computer scientists, physicists and logicians — were still intrigued. Could anything come from Deolalikar’s new approach? Perhaps, they thought, his methods could be valuable in showing that some set of algorithms smaller than P is not equal to NP.

Over the course of a week, the crew posted over 1,000 comments on Lipton’s blog. They eventually identified another key error: Tao pointed out that while analyzing the distributions, Deolalikar makes the intuitive assumption that a projection of a set, which acts much like a shadow, will be smaller and simpler than the set itself. But just as shadows can be much larger than the original objects (for example, when the light source is nearby), projections can be far more complex than their originals. This mistake was so deadly, the researchers concluded, that nothing substantive was likely to come from the approach.

Deolalikar hasn’t withdrawn his article. He claims to have fixed the errors and will be submitting the revised article to a journal to go through an ordinary peer review process.

Even though the blog commenters believe that the paper is unlikely to yield new results, most participants found the discussion galvanizing. The process was enormously addictive, sucking logicians into learning statistical physics and mathematicians into grappling with computer science. This interdisciplinary engagement with the problem is perhaps the most significant outcome of the effort. “Even at a conference you don’t get this kind of interaction happening,” says Suresh Venkatasubramanian of the University of Utah. “It was like the Nerd Superbowl.”