The 1997/98 season was noteworthy for many reasons, and infamous for charges of national bias. With two major competitions, the Olympics and Worlds, there is plenty of data with which to test how far judges were biased in the marks they gave their countrymen at these competitions.
The calculations we describe here make use of all the placements handed out at the 1998 Winter Olympics and 1998 World Championships. Each part of an event was counted separately, and the qualifying rounds at Worlds were included in the calculations.
We began by calculating the deviations for each skater and each judge. The deviation is calculated by subtracting the full panel's placement of a skater from the ordinals for the individual judges. For example, if a skater places 4th in the short program and judges 1 through 9 give placements of 3, 4, 4, 5, 7, 2, 4, 5, and 6, then the deviations for the judges are -1, 0, 0, +1, +3, -2, 0, +1 and +2.
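As a minimal sketch of this step, the deviation calculation can be written in a few lines of Python. The function name `deviations` is ours, chosen for illustration; the numbers are the short program example given above.

```python
def deviations(judge_ordinals, panel_placement):
    """Return each judge's ordinal minus the full panel's placement.

    Negative values mean the judge placed the skater higher (better)
    than the panel did; positive values mean lower.
    """
    return [ordinal - panel_placement for ordinal in judge_ordinals]


# The skater placed 4th by the panel, with ordinals from judges 1 through 9.
example = deviations([3, 4, 4, 5, 7, 2, 4, 5, 6], 4)
print(example)  # [-1, 0, 0, 1, 3, -2, 0, 1, 2]
```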
For any one judge in one part of one event, the average of that judge's deviations will always be zero (in the absence of ties), since for every skater the judge places higher than the panel he must place another lower. This is true regardless of whether or not there are biases in the judge's marks. Thus, one cannot learn anything about biases by looking at a judge's average deviation for all skaters in one part of an event (or group of events). Looking at a judge's deviations for a subset of skaters in a competition, however, is another matter.
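The zero-sum property is easy to check numerically. The sketch below assumes a field of N skaters with no ties, so that both the panel's placements and one judge's ordinals are orderings of 1 through N. The function name and the use of a random ordering for the judge are our own illustration, not part of the original calculation.

```python
import random


def judge_deviation_sum(n_skaters, seed=0):
    """Sum one judge's deviations over a whole field with no ties."""
    rng = random.Random(seed)
    panel = list(range(1, n_skaters + 1))  # panel placements 1..N
    judge = panel[:]
    rng.shuffle(judge)                     # judge's ordinals: some ordering of 1..N
    return sum(j - p for j, p in zip(judge, panel))


# However the judge orders the field, the deviations cancel out.
print(judge_deviation_sum(24, seed=1))
```

Because both lists contain the same numbers 1 through N, their totals are equal and the sum of the differences is zero for any ordering, biased or not.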
For any sufficiently large subset of skaters in a competition, the average deviations for one or more judges will be zero (or close to it) only if the judges' placements are free of biases; i.e., are determined solely by random errors. Thus, for example, if the judges' deviations for skaters from their own country are averaged, the result should be close to zero if the judges' placements for their own skaters differ from the panels' only because of random errors of observation or judgement. If judges' placements are biased in favor of their countrymen, the average deviation for those skaters will be significantly less than zero (i.e., a negative number). If the judges' placements are biased against their countrymen, the average will be significantly greater than zero (i.e., a positive number). In the U.S., for example, Canadian judges are frequently accused of holding up their skaters, while U.S. judges are accused of pushing their skaters down. If these accusations were true, then the average deviation for Canadian judges marking Canadian skaters should be significantly negative, while the average deviation for U.S. judges marking American skaters should be significantly positive.
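The own-country average described above might be computed along the following lines. This is only a sketch: the record layout (judge country, skater country, judge's ordinal, panel placement), the function name, and the sample rows with placeholder country codes "AAA" and "BBB" are all hypothetical and are not taken from the actual results.

```python
from collections import defaultdict


def own_country_average(records):
    """Average deviation per country, using only judges marking their own skaters.

    Each record is (judge_country, skater_country, ordinal, panel_placement).
    Returns {country: (average_deviation, number_of_cases)}.
    """
    totals = defaultdict(lambda: [0, 0])  # country -> [sum of deviations, count]
    for judge_country, skater_country, ordinal, panel_placement in records:
        if judge_country == skater_country:
            totals[judge_country][0] += ordinal - panel_placement
            totals[judge_country][1] += 1
    return {c: (total / count, count) for c, (total, count) in totals.items()}


# Made-up rows for illustration only.
sample = [
    ("AAA", "AAA", 2, 4),  # judge has own skater 2 places higher than the panel
    ("AAA", "AAA", 5, 5),  # on-panel
    ("AAA", "BBB", 3, 2),  # not an own-country case, so ignored
    ("BBB", "BBB", 6, 5),  # judge has own skater 1 place lower than the panel
]
averages = own_country_average(sample)
print(averages)
```

Returning the number of cases alongside each average is a deliberate choice, since an average built on only a handful of placements carries little statistical weight.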
To test for national bias at the Olympics and Worlds, the average deviations for all the judges marking their own countrymen in all events were calculated. The following table gives the results of this calculation. The more negative the number, the greater the national bias. The table is ordered from worst at the top to best at the bottom.
Country | Average deviation for all judges from each country in all events marking their own countrymen. |
DEN | -2.50 |
CHN | -2.00 |
JPN | -2.00 |
CAN | -1.91 |
HUN | -1.81 |
POL | -1.73 |
BUL | -1.67 |
SWI | -1.60 |
FIN | -1.33 |
FRA | -1.28 |
ROM | -1.25 |
AUS | -1.11 |
GER | -1.05 |
AZE | -1.00 |
SWE | -1.00 |
UKR | -0.79 |
CZE | -0.67 |
SVK | -0.67 |
LTU | -0.60 |
USA | -0.59 |
ITA | -0.56 |
GBR | -0.50 |
AUT | -0.40 |
BLR | -0.33 |
GRE | -0.33 |
RUS | -0.21 |
Based on results from the 1998 Olympics and World Championships, the worst offender was the judge from Denmark, who placed her skater on average 2.5 places higher than the panel. This result, however, does not carry much weight statistically, since it involves only two placements.
Among the worst offenders are the judges from Canada, who in 21 cases placed their skaters on average approximately 2 places higher than the full panels. Equally damning, at the Olympics the Canadian judges placed their skaters higher than the full panels in 16 out of 16 cases, while at Worlds they did so in 3 out of 5 cases. In total, Canadian judges placed their skaters lower than the panel in only 1 case out of 21 at the Olympics and Worlds.
The best judges (in terms of minimal national bias) turn out to be the Russian judges, who in 38 cases had their skaters only one fifth of a place higher than the panels on average. The U.S. judges are among the best (with judges from 19 out of 26 countries worse). In 27 cases, U.S. judges had American skaters higher than the full panels by 0.6 places on average. Of those 27 cases, the U.S. judges had American skaters higher than the panels 12 times, were on-panel 10 times, and had them lower than the panels 5 times. The accusation that U.S. judges tend to mark down American skaters is not supported by these results. In fact, none of the countries had their skaters marked lower than the full panels on average. Judges from all countries to some extent placed their own skaters higher on average than the full panels did.
Based on the typical spread intrinsic to judges' marks, we would claim, somewhat arbitrarily, that an average deviation in the range 0 to -0.67 is reasonable performance with respect to national bias (blue in the table), -0.68 through -1.33 is poor performance (black in the table), and an average worse than -1.33 (red in the table) indicates a serious problem with national bias.
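To make these cut-offs concrete, the short sketch below applies them to the averages listed in the table. The function name `rate` and the labels are ours. The thresholds are the somewhat arbitrary ones stated above, and the code assumes averages rounded to two decimal places. Positive averages do not occur in the table; here they would simply fall into the "reasonable" group, which is our assumption rather than something the ranges above address.

```python
# Average own-country deviations, copied from the table.
AVERAGES = {
    "DEN": -2.50, "CHN": -2.00, "JPN": -2.00, "CAN": -1.91, "HUN": -1.81,
    "POL": -1.73, "BUL": -1.67, "SWI": -1.60, "FIN": -1.33, "FRA": -1.28,
    "ROM": -1.25, "AUS": -1.11, "GER": -1.05, "AZE": -1.00, "SWE": -1.00,
    "UKR": -0.79, "CZE": -0.67, "SVK": -0.67, "LTU": -0.60, "USA": -0.59,
    "ITA": -0.56, "GBR": -0.50, "AUT": -0.40, "BLR": -0.33, "GRE": -0.33,
    "RUS": -0.21,
}


def rate(avg):
    """Label one average deviation using the cut-offs given in the text."""
    if avg < -1.33:
        return "serious"     # worse than -1.33
    if avg < -0.67:
        return "poor"        # -0.68 through -1.33
    return "reasonable"      # 0 to -0.67 (positive values also land here)


counts = {}
for country, avg in AVERAGES.items():
    label = rate(avg)
    counts[label] = counts.get(label, 0) + 1
print(counts)
```

Applied to the 26 countries in the table, this puts 8 in the serious group, 8 in the poor group, and 10 in the reasonable group.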
Other forms of bias can be investigated through a similar analysis. Further, correlations in deviations can be used to identify block judging and other forms of collusion statistically. We save those entertaining topics, however, for another day.