Web edition: October 3, 2008
The single best school in the country is
Each of these schools could legitimately claim to be on top, according to a mathematical analysis, posted recently on ArXiv.org, of the data U.S. News & World Report uses to generate its influential and controversial rankings of American undergraduate institutions. It all depends, the researchers say, on what your priorities are.
The magazine uses seven key factors in its ratings, including things like percentage of alumni who donate, acceptance rates for admission, and spending per student. Lior Pachter of the
Techniques they’d developed for a completely different problem, aligning gene sequences to understand evolutionary changes, could be adapted to do just that, they realized. Biologists commonly analyze the differences between the DNA of two closely related creatures in order to understand how they evolved. To do that, researchers first have to decide how to line the two gene sequences up, identifying the segments that are identical and the places where DNA has mutated, moved around or been deleted. But this alignment requires some guesswork: How likely, for example, is it that a gene will have mutated, and how likely is it that it simply will have been deleted? Biologists have little basis for deciding that, Pachter says, just as U.S. News has little basis for deciding how important one of its factors is for a particular person.
Huggins and Pachter had attacked this biological question using high-dimensional geometry, so they did the same for the educational data. They imagined each university as a point in seven-dimensional space, with one dimension for each factor that U.S. News considers. Although seven-dimensional space is hard to visualize, it’s easy to perform calculations on: Each point is represented by a sequence of seven numbers, just as a point in two dimensions can be represented by a pair of numbers. A university’s scores in the seven factors provide its particular sequence of seven numbers, and the universities thus form a cloud of points in seven-dimensional space. The researchers could then examine the “space” formed by all the universities by looking at the smallest flat-sided object (called a polytope) that contains them.
A particular set of priorities among the seven factors could also be represented in this same geometric space. Each of the seven numbers of the sequence this time would represent the relative importance of each factor. So, for example, a student who cares enormously about the research funding available at a university might consider that factor to be 70 percent of the decision and all the others to each be 5 percent. If research funding were the first factor in the list, that student’s priorities could be represented by the point (70, 5, 5, 5, 5, 5, 5). A student who cared especially about alumni satisfaction, as shown by their donation rates, might have priorities represented by the point (5, 70, 5, 5, 5, 5, 5).
Now imagine an arrow from the origin (the point whose coordinates are all zero) to the point that represents a particular student’s priorities. The researchers found something neat: If you extend that line until it hits the polytope, the university whose point is closest to where the line hits will represent the school that, according to that student’s priorities, is the best.
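To make the idea concrete, here is a minimal Python sketch of how a set of priorities can pick a winner. It is not the researchers' polytope computation. It assumes, for illustration only, that a school's overall score is a plain weighted sum of its seven factor scores. The three schools and all of their numbers are made up, and the names `SCHOOLS`, `weighted_score` and `rank_schools` are hypothetical.

```python
# Hypothetical factor scores (0-100) for three made-up schools.
# Factor 1 stands in for research funding, factor 2 for alumni giving.
SCHOOLS = {
    "School A": [90, 40, 60, 60, 60, 60, 60],
    "School B": [50, 95, 60, 60, 60, 60, 60],
    "School C": [70, 70, 70, 70, 70, 70, 70],
}


def weighted_score(scores, priorities):
    """Weighted average of the factor scores (assumed scoring rule)."""
    total = sum(priorities)
    return sum(s * p for s, p in zip(scores, priorities)) / total


def rank_schools(schools, priorities):
    """Return school names ordered from best to worst for these priorities."""
    return sorted(
        schools,
        key=lambda name: weighted_score(schools[name], priorities),
        reverse=True,
    )


research_first = [70, 5, 5, 5, 5, 5, 5]   # the article's first example student
alumni_first = [5, 70, 5, 5, 5, 5, 5]     # the article's second example student
equal_weights = [1, 1, 1, 1, 1, 1, 1]     # a student with no strong preference

for label, priorities in [
    ("research first", research_first),
    ("alumni first", alumni_first),
    ("equal weights", equal_weights),
]:
    print(label, "->", rank_schools(SCHOOLS, priorities))
```

With these invented numbers, each of the three schools comes out on top under one of the three sets of priorities, which is the kind of disagreement the analysis is about.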
Finding the second or third best school, according to a particular set of priorities, required a bit more mathematical maneuvering but the same basic technique applied. The researchers then calculated the range of rankings a particular school could have according to all possible sets of priorities, excluding fluke rankings a school achieved only rarely.
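The range of rankings a school can reach can be approximated by brute force. The sketch below tries every priority vector on a coarse grid and records each school's best and worst rank. This is an assumption-laden stand-in for the researchers' geometric method: it uses the same made-up schools and weighted-sum scoring as before, the helper names `weight_grid` and `rank_range` are hypothetical, and it does not exclude rarely achieved fluke rankings the way the study did.

```python
# Hypothetical factor scores (0-100) for three made-up schools.
SCHOOLS = {
    "School A": [90, 40, 60, 60, 60, 60, 60],
    "School B": [50, 95, 60, 60, 60, 60, 60],
    "School C": [70, 70, 70, 70, 70, 70, 70],
}


def weight_grid(n_factors, total):
    """Yield every tuple of n_factors non-negative integers summing to total."""
    if n_factors == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in weight_grid(n_factors - 1, total - first):
            yield (first,) + rest


def rank_range(schools, n_factors=7, total=4):
    """Best and worst rank of each school over all priority vectors on the grid."""
    best = {name: len(schools) for name in schools}
    worst = {name: 1 for name in schools}
    for weights in weight_grid(n_factors, total):
        scores = {
            name: sum(s * w for s, w in zip(values, weights))
            for name, values in schools.items()
        }
        for name, score in scores.items():
            # Rank 1 is best; tied schools share the better rank.
            rank = 1 + sum(other > score for other in scores.values())
            best[name] = min(best[name], rank)
            worst[name] = max(worst[name], rank)
    return {name: (best[name], worst[name]) for name in schools}


for name, (best, worst) in rank_range(SCHOOLS).items():
    print(f"{name}: best rank {best}, worst rank {worst}")
```

In this toy data the evenly balanced school never drops below second place, while the two lopsided schools swing between first and last, echoing the pattern described for real universities.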
The top schools, they found, were top pretty much regardless of one’s priorities. Harvard and
Schools that were a bit more uneven could vary wildly,
“What we found is that these rankings are kind of arbitrary,” Pachter says. “If you care more about student-faculty ratios than alumni giving, you’re going to get a different ranking. It’s very biased to give only one view.” The pair argue that the magazine should release several different rankings, based on choices of a few representative sets of priorities.
“But that doesn’t sell magazines,” says Kevin Rask, an
One stumbling block Huggins and Pachter had to overcome is that U.S. News is secretive about some of its data. The magazine releases the precise values for the total score for each university and for three of the criteria, but the values of four of the criteria remain secret. So the researchers had to reverse-engineer what the individual scores for the secret criteria were likely to have been for each of the universities.
The pair point out that their methods can’t address another fundamental criticism of the U.S. News evaluations: that the magazine chooses the wrong factors to base its evaluations on in the first place.
These techniques can be applied to any situation that requires ranking according to varying priorities, the researchers say. Similar lists are made, for example, of the best cities to live in, or the best graduate schools. Huggins and Pachter are now applying their methods to voting in elections with more than two candidates.