Predicting Oscar

Brokeback Mountain or Good Night, and Good Luck? Felicity Huffman or Reese Witherspoon? Philip Seymour Hoffman or Joaquin Phoenix?

With great fanfare, the Academy of Motion Picture Arts and Sciences recently announced the 2005 nominees for its prestigious film awards, commonly known as Oscars. The announcement set off the usual flurry of complaints about worthy candidates that were somehow overlooked and rampant speculation about who would win in each of the categories.

How easy is it to predict the winners? That’s the question that decision scientist Iain Pardoe of the Lundquist College of Business at the University of Oregon tackles in the current issue of Chance. He focuses on predicting the winners of the four major awards—picture, director, actor in a leading role, and actress in a leading role—from those nominated each year.

“Although many in the media (as well as movie-loving members of the public) make their own annual predictions,” Pardoe notes, “it appears that very few researchers have conducted a formal statistical analysis for this purpose.”

A wide variety of factors could serve as predictors, including other Oscar category nominations, previous nominations and wins, and other (earlier) movie awards. To tease out which ones are most significant and create a model for making predictions, Pardoe turned to a technique known as discrete choice modeling.

In a discrete choice model, an outcome is the result of several decisions (a sequence of choices) made among a finite set of alternatives by individuals in the population under consideration. The probability that each alternative is chosen is calculated using a so-called multinomial logit model.
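As a rough illustration of how a multinomial logit turns nominee attributes into win probabilities, here is a minimal Python sketch. The predictor names echo variables this article lists for best picture, but the coefficient values, the three nominees, and the function name are invented for illustration; they are not Pardoe's estimates or code.

```python
import math

# Hypothetical coefficients (NOT Pardoe's estimates): one weight per predictor.
COEFFS = {"total_nominations": 0.2, "director_nominated": 1.0, "won_dga_or_pga": 2.0}

def win_probabilities(nominees):
    """Multinomial logit: P(i) = exp(u_i) / sum_j exp(u_j), where u_i is a
    weighted sum of nominee i's predictor values."""
    utilities = {
        name: sum(COEFFS[k] * x[k] for k in COEFFS)
        for name, x in nominees.items()
    }
    top = max(utilities.values())  # subtract the max for numerical stability
    exps = {name: math.exp(u - top) for name, u in utilities.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}

# Three made-up nominees competing in one category in one year.
nominees = {
    "Film A": {"total_nominations": 8, "director_nominated": 1, "won_dga_or_pga": 1},
    "Film B": {"total_nominations": 6, "director_nominated": 1, "won_dga_or_pga": 0},
    "Film C": {"total_nominations": 3, "director_nominated": 0, "won_dga_or_pga": 0},
}
probs = win_probabilities(nominees)
predicted = max(probs, key=probs.get)  # the nominee with the highest probability
print(predicted, {name: round(p, 3) for name, p in probs.items()})
```

The probabilities within a category always sum to 1, so raising one nominee's score necessarily lowers the others' chances. That is what makes this kind of model a natural fit for a contest with exactly one winner.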

The Oscars have been awarded every year since 1928. Pardoe used data from years up to 1938 to make predictions for 1939, then cumulative data for each succeeding year.
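The year-by-year scheme described above (fit on everything before a given year, predict that year, then fold it into the history) can be sketched as follows. The `fit` and `predict` functions here are deliberately crude stand-ins, and the data are made up with abstract year labels; none of this reproduces Pardoe's actual model or data.

```python
def fit(history):
    """Toy stand-in for model fitting: estimate how often nominees with and
    without a precursor award won in past years."""
    wins = {True: 0, False: 0}
    seen = {True: 0, False: 0}
    for nominees in history.values():
        for n in nominees:
            seen[n["precursor"]] += 1
            wins[n["precursor"]] += n["won"]
    return {k: wins[k] / seen[k] if seen[k] else 0.0 for k in seen}

def predict(model, nominees):
    """Pick the nominee whose attribute has the highest historical win rate."""
    return max(nominees, key=lambda n: model[n["precursor"]])["name"]

def expanding_window(data, first_target):
    """For each target year, fit on all earlier years only, then predict."""
    results = {}
    for year in sorted(y for y in data if y >= first_target):
        history = {y: rows for y, rows in data.items() if y < year}
        model = fit(history)
        winner = next(n["name"] for n in data[year] if n["won"])
        results[year] = predict(model, data[year]) == winner
    return results

# Made-up data: abstract years 1 to 4, two nominees per year.
data = {
    1: [{"name": "A", "precursor": True, "won": True},
        {"name": "B", "precursor": False, "won": False}],
    2: [{"name": "C", "precursor": True, "won": True},
        {"name": "D", "precursor": False, "won": False}],
    3: [{"name": "E", "precursor": True, "won": False},
        {"name": "F", "precursor": False, "won": True}],  # an upset
    4: [{"name": "G", "precursor": True, "won": True},
        {"name": "H", "precursor": False, "won": False}],
}
results = expanding_window(data, first_target=3)
print(results)  # {3: False, 4: True}
```

Because each prediction uses only earlier years, the accuracy figures measure genuine out-of-sample performance rather than how well the model fits data it has already seen.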

Here are the variables that helped most in making accurate predictions, by category.

Best Picture

Total number of Oscar nominations.
Best director Oscar nomination.
Winner of a Golden Globe award for best picture or best picture (drama).
Winner of a Golden Globe award for best picture (musical or comedy).
Winner of a Directors Guild of America award for best director or a Producers Guild of America award for best producer.

Best Director

Total number of Oscar nominations.
Best picture Oscar nomination.
Number of previous best director Oscar nominations.
Winner of a Golden Globe award (1945–1950) or a Directors Guild of America award (from 1951) for best director.

Best Actor in a Leading Role

Best picture Oscar nomination.
Number of previous best actor Oscar nominations.
Number of previous best actor Oscar wins.
Winner of a Golden Globe award for best actor (drama).
Winner of a Golden Globe award for best actor (musical or comedy).
Winner of a Screen Actors Guild award for best actor.

Best Actress in a Leading Role

Best picture Oscar nomination.
Number of previous best actress Oscar wins.
Winner of a Golden Globe award for best actress (drama).
Winner of a Golden Globe award for best actress (musical or comedy).
Winner of a Screen Actors Guild award for best actress.

Curiously, in the best director category, it turns out that the number of previous best director wins tends to worsen the model’s predictions, even though the number of previous nominations helps. Conversely, in the best actress category, including a variable for the number of previous wins helps and including a variable for the number of previous nominations worsens predictions.

At the same time, including a variable for the total number of nominations improves predictions of the best picture and best director winners, but it worsens predictions of the acting Oscars.

Actor and actress nominee ages don’t appear to factor into predicting the winner. Neither do supporting actor nominations and wins, nominated movie genre (drama, musical, comedy, and so on), Motion Picture Association of America rating, release date, or movie awards other than the Golden Globes presented before the Oscars.

Pardoe’s model correctly predicted 186 of the 268 picture, director, actor, and actress wins from 1938 to 2004. This prediction accuracy of 69 percent is far above the 20 percent (assuming five nominees per category) that you might expect if the choices were random.

With the accumulation of data, the model’s overall prediction accuracy has improved over time. It was 81 percent for the span from 1975 to 2004. During this period, the proportions of correct predictions were 93 percent for director and 77 percent each for picture, actor, and actress.
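The quoted figures can be cross-checked with a few lines of arithmetic. The sketch below assumes one award per category per year and treats the four category percentages as equally weighted; both are simplifying assumptions, not details confirmed by Pardoe's analysis.

```python
from fractions import Fraction

overall = Fraction(186, 268)   # correct predictions / total, 1938 to 2004
chance = Fraction(1, 5)        # random pick among five nominees

years = 2004 - 1938 + 1        # 67 award years
categories = 4                 # picture, director, actor, actress
total_awards = years * categories

print(f"total awards: {total_awards}")                      # 268
print(f"overall accuracy: {float(overall):.1%}")            # 69.4%
print(f"lift over chance: {float(overall / chance):.2f}x")  # 3.47x

# Category accuracies reported for 1975 to 2004.
recent = {"director": 0.93, "picture": 0.77, "actor": 0.77, "actress": 0.77}
recent_mean = sum(recent.values()) / len(recent)
print(f"mean of category accuracies: {recent_mean:.0%}")    # 81%
```

The unweighted mean of the four category figures reproduces the 81 percent overall accuracy for 1975 to 2004, and the 268 predictions match four categories over 67 years.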

“Each of the categories has become more predictable over time, particularly Best Actress, which was very hard to predict up until the early 1970s,” Pardoe says.

Pardoe’s analysis also allows you to pinpoint truly astonishing upsets. In the picture category: Hamlet over Johnny Belinda (1948), Chariots of Fire over Reds (1981), and Million Dollar Baby over The Aviator (2004).

In the director category: Steven Soderbergh over Ang Lee (2000), Roman Polanski over Rob Marshall (2002), and Carol Reed over Anthony Harvey (1968).

In the actor category: Denzel Washington over Russell Crowe (2001), Cliff Robertson over Peter O’Toole (1968), and Art Carney over Jack Nicholson (1974).

In the actress category: Nicole Kidman over Renée Zellweger (2002), Katharine Hepburn over Faye Dunaway (1967), and Elizabeth Taylor over Anouk Aimée (1966).

We now have a fresh take on the “who’s best?” controversies that have often roiled the world of film. Pardoe’s predictions for 2005 can be found at http://lcb1.uoregon.edu/ipardoe/oscars/predict2005.htm.
