The Application of Technology to the Judging Process

Part 4 - Implementation Strategies

This is the fourth of four articles on the use of computer technology in the judging of figure skating competitions. Part 1 discusses a rigorous approach to investigating the potential role of computer technology in the process of judging competitions, and describes the development of a computer model of the judging process as it currently exists. Part 2 describes many of the specific details of this computer model for the various figure skating events. Part 3 describes a software implementation of the model and includes a downloadable version of the annotation software developed with the model. Part 4 discusses implementation issues for a point model based scoring system.

Introduction

Much of the discussion about the merits of a computer based scoring system has centered on the development of the point model, the impact it has on the nature and identity of skating, and the ability of humans to judge competitions on an absolute point scale accurately, consistently and reliably.

On top of all these issues there is yet another level of complexity that has received scant attention from the ISU and only slightly more attention in general discussions: how does one actually implement the intended system in hardware and software so that it operates reliably and can be used with confidence? In an article on her web site, Sandra Loosemore has discussed some of the documented cases in the last year where "glitches" in computer systems have created problems at USFSA and ISU competitions. These cases illustrate the potential seriousness of the problem, and suggest that the ISU has yet to address implementation strategies in their full complexity and with the sophistication the project requires.

As any longtime programmer can attest, it is one thing to throw together a program that works adequately for the person who developed it and knows all its ins, outs and quirks. It is quite another to write a bulletproof application that any person can use without it hanging or breaking, and which anticipates every possible thing the user may do. When it comes to developing a complex system, such as what is being proposed by the ISU, there are a number of tried and true approaches that improve system reliability and so maximize the chance of success (and minimize the amount of grief). The following is a discussion of some implementation issues and strategies that are relevant to developing a complex computer based scoring system.

And as one might infer, we wouldn't be writing about these issues if we thought the ISU was doing it right.

Software Testing and Validation

In other articles I have written about testing and validating the point model. Those discussions were limited only to testing and verifying the mathematics of the point model. Having decided on the point model, it must next be implemented in software. This software must also be tested and validated to prove that it actually computes the point model the way it was intended; i.e., that the software is bug free.

Just like proofreading your own writing, testing and validating your own software always tends to let some fraction of bugs slip through. Some of the examples Sandra Loosemore has discussed are clearly the result of inadequately tested software, others the result of a lack of flexibility built into the software.

For complex software used in critical applications where bugs cannot be tolerated, the standard approach is to set up a formalized, standardized testing and validation program for the software; and in general it is better to have the software tested and validated by a group separate from the development team. The ISU is not doing this. In fact, the ISU isn't letting any outside experts review its activities related to its project. This is a short sighted and dangerous approach.

First, no one for a moment believes that whatever system the ISU will have when it declares victory, will be the final word on the point model or the system. One can expect the point model and other aspects of the system will continuously be tweaked and modified over and over again. History shows that the skating world is incapable of leaving the rules of skating alone for more than about three months at a time. Every time the rules are changed in a way that changes the point model, the software will have to be retested and revalidated. This means there will be a constant chance of errors propagating into the software. To minimize that problem a permanent testing and validation process must be established, and a certification procedure established and used every time the software is revised.

Second, virtually no one trusts the ISU anymore. Conspiracy theorists will always be concerned that the software really doesn't do what the ISU says it is supposed to do. In the interest of credibility, it is essential that the ISU publish a complete description of the judging system point model and all related algorithms, and then turn over the testing, validation and certification to an impartial outside group. If the software is not certified by a credible outside source, one can expect there will always be substantial mistrust and suspicion of the results it generates.
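
To make the idea of recertifying after every revision concrete, here is a minimal sketch of what an outside group's check might look like: a published set of reference cases is run through the scoring routine and any disagreement is reported. Everything named here is an assumption for illustration. The function certify, the shape of the reference cases, and the toy scoring routine (a plain sum) are hypothetical, and none of it is the ISU's point model or software.

```python
def certify(score_fn, reference_cases, tolerance=1e-9):
    """Run score_fn on every reference case and collect the disagreements.

    reference_cases is a list of (name, inputs, expected_score) tuples that
    would be published alongside the point model (a hypothetical format).
    An empty result means the software reproduced every reference score.
    """
    failures = []
    for name, inputs, expected in reference_cases:
        actual = score_fn(inputs)
        if abs(actual - expected) > tolerance:
            failures.append((name, expected, actual))
    return failures


def toy_score(points):
    # Stand-in for the real point model, which is NOT specified here.
    return sum(points)


if __name__ == "__main__":
    cases = [
        ("case A", [1.0, 2.0, 3.5], 6.5),
        ("case B", [4.0, 4.0], 8.5),  # deliberately wrong expected value
    ]
    for name, expected, actual in certify(toy_score, cases):
        print(f"{name}: expected {expected}, software produced {actual}")
```

The point of keeping the reference cases separate from the code is that the same cases can be rerun, unchanged, by anyone, every time the software is revised.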

Interface Design

In discussing implementation of the proposed system, many concerns have been raised questioning how one guarantees that the entries the judges intend to make are the entries actually recorded and used by the system. There are two levels to this question. In terms of public credibility, this again points to the need for an independent certification process that will verify there are no back doors or hanky-panky written into the software.

In terms of using the system, it is important that the user interface be designed in such a way that the judges have complete confidence in the system.

Have you ever used a computer application where you accidentally pushed the wrong button and all hell broke loose? Or you weren't sure the program actually did what you commanded it to do? If you haven't then you have never used a computer. It happens all the time, and it's the last thing you want to happen while judging a competition.

Another situation we all encounter in our modern world is the need to contact technical support. No matter how well the judges are trained, for a computer technology based system to work, technical advisors will have to be available continuously throughout an event to answer judges' questions in real time, even perhaps as a skater is performing. How that would work, if it is even practical, is unclear, but it needs to be thought out in advance. You don't want to have the situation in which a judge is inputting garbage to the system because they are not sure how to use the interface and no one is available to answer questions when needed.

In designing the user interface for a judging system it is essential that the interface provide positive feedback for every input so the judge knows exactly what the computer thinks the judge entered. The judge must have the ability to inspect and verify every entry and to change and correct any entry at any time before the marks are submitted at the end of a performance. The interface must have the maximum flexibility possible for order of entry so the judges are not distracted by concerns they are using the program incorrectly. The user interface for the CAJ™ annotation software was set up with these design requirements in mind.
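
The three requirements above (echo every input, allow correction until submission, accept entries in any order) can be sketched in a few lines. This is only an illustration under my own assumptions. The class MarkSheet and its method names are hypothetical, and it is not a description of how the CAJ software or the ISU interface is actually written.

```python
class MarkSheet:
    """One judge's entries for one performance (hypothetical structure)."""

    def __init__(self):
        self._entries = {}        # slot name -> value, filled in any order
        self._submitted = False

    def enter(self, slot, value):
        """Record or overwrite an entry; return the text echoed to the judge."""
        if self._submitted:
            raise RuntimeError("marks already submitted; no further changes")
        self._entries[slot] = value
        return f"{slot}: {value}"

    def review(self):
        """Return a copy of everything entered so far, for inspection."""
        return dict(self._entries)

    def submit(self):
        """Lock the sheet and return the final entries."""
        self._submitted = True
        return self.review()


if __name__ == "__main__":
    sheet = MarkSheet()
    print(sheet.enter("spin 1", 2))   # entries may arrive in any order
    print(sheet.enter("jump 1", 1))
    print(sheet.enter("spin 1", 3))   # correcting an earlier entry
    print(sheet.submit())
```

Keying the entries by slot rather than by sequence is what frees the judge from any fixed order of entry, and the echoed string is the positive feedback: the judge sees what the computer recorded, not what the judge meant to press.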

One additional aspect of designing the user interface is the amount of information the judges will have access to during a competition. Currently judges have a record of their individual marks and order of finish as an event progresses, since the judges currently are evaluating performances on a relative basis.

Under the proposed system, the judges are supposed to be marking on an absolute scale with each performance evaluated independent of any memory of, or information about, prior performances in the event. Strictly speaking, the only way to do that is to erase the memory of each judge after each performance, withhold access to all prior assessments the judges entered into the system, and withhold the display of any marks in the arena during the competition. The first, of course, is impossible and the last would not be considered acceptable. Currently, the ISU interface is set up to deny the user access to the previous assessments.

One major question about the feasibility of the proposed ISU system, or any similarly constructed system, is whether humans can judge on an absolute scale with the accuracy and consistency required. If the answer proves to be "no", as many suspect is the case, then either a point model based system would have to be abandoned as a failure, or some change to a semi-absolute system would have to be considered. In a semi-absolute system, the users would try to enter assessments on an absolute scale as best they could, but would have access through the user interface to a summary of prior assessments and order of finish to help maintain consistency of judgement.

Hardware Issues

After software concerns there is the most terrifying fear, perhaps, of all -- that the system will crash in the middle of a performance and all assessments will be lost. ISU rules currently require that two computers be used for recording marks and running the accounting program (the USFSA has an identical requirement). It appears the ISU is not planning to have redundant hardware for its proposed system; one assumes due to the great expense involved.

It is estimated that a single installation of the hardware required for the new system will cost at least $500,000 and perhaps as much as $1,000,000. In addition, the current replay system takes up a huge amount of space. The amount of space required under the proposed system will be even greater, and doubling that with redundant hardware makes a difficult situation even worse.

Since there is no manual backup approach with the proposed system, a hardware failure would be a disaster. To reduce the risk of that, there are a number of strategies that can be followed in addition to requiring redundant computers.

A large capacity uninterruptible power supply (UPS) will be essential. The units that are capable of powering all the equipment involved for a reasonable period of time are huge, heavy and expensive. Don't leave home without one.

The use of some form of non-volatile memory would help ensure information is not lost even if the power is lost or the computers crash for some other reason. Of course, this memory can itself still fail.
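
One common software counterpart to non-volatile memory is an append-only journal: each input is written to disk, and forced out of the operating system's buffers, before the program moves on. The sketch below assumes a simple format of one JSON record per line. The function names, the file name and the record fields are hypothetical, chosen only for illustration.

```python
import json
import os


def append_entry(path, record):
    """Append one record as a JSON line and force it to disk before returning."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()               # push Python's buffer to the operating system
        os.fsync(f.fileno())    # ask the OS to push its buffer to the device


def recover_entries(path):
    """Read back every complete record written before a failure."""
    records = []
    if not os.path.exists(path):
        return records
    with open(path, encoding="utf-8") as f:
        for line in f:
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                break           # a torn final line from an interrupted write
    return records


if __name__ == "__main__":
    journal = "judge_inputs.log"   # hypothetical file name
    append_entry(journal, {"judge": 3, "slot": "jump 1", "value": 1})
    append_entry(journal, {"judge": 3, "slot": "spin 1", "value": 2})
    print(recover_entries(journal))
```

After a crash, everything up to the last completed write can be read back, so at worst the entry in progress is lost rather than the whole performance. It does nothing, of course, if the disk itself is what failed.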

A RAID system with redundancy would reduce the risk of system failure due to disk drive problems. You know those video clips take up a lot of space, and they all will have to be saved.

Then there is the security factor of last resort: creating a real-time printout of each judge's inputs, to produce a hard copy of everything that has transpired up to the point of a potential failure.

If one followed all these strategies, the chance of a system failure requiring a skater to reskate a performance would be very small. But it will never be zero. Stuff always breaks, and it frequently breaks at the most inopportune time. If a judge's touch screen craps out, if the judges' communication link to the computer is lost, if there is a memory failure, etc., a skater is going to be screwed. Their assessments will be lost and the only way of handling that situation seems to be to have them start over -- which doesn't seem fair, especially if it happens at 4:15 into a 4:30 program. The level of risk to be accepted is something that needs to be decided, and since the skaters will suffer the most in the event of a failure, the skaters and coaches should have the greatest say in deciding the degree of acceptable risk.

Procedural Issues

Other ways of addressing some operational concerns about the proposed system are procedural in nature. These procedural approaches would, in some cases, require a major shift away from the approach the ISU traditionally has taken in dealing with skaters and the public (secrecy and nonresponsiveness).

No matter how carefully the user interface is designed, the judges will sometimes enter erroneous information. One way to deal with this is to build into the system error checking software that is run on the judges' assessments at the time they commit their assessments to the system for all posterity. The CAJ software, for example, has an error checking routine that checks that all required information has been entered, looks for inconsistent entries, and identifies various inputs that might be considered odd. This was done to ensure the scenarios being studied were being run on the correct inputs. Such a routine could also be used to automatically query the judge to confirm those entries when errors or suspicious items are identified by the error checker.
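
As a rough illustration of the kind of check described above, the following sketch separates hard errors (which block submission) from merely odd entries (which trigger a query to the judge). It is not the CAJ routine. The grade range of -3 to +3, the Entry structure, and the rule that a grade at either extreme counts as "odd" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical grade scale; the actual scale is not specified in this article.
MIN_GRADE, MAX_GRADE = -3, 3


@dataclass
class Entry:
    element: str              # identification of the element
    grade: Optional[int]      # the judge's assessment, None if not entered


def check_entries(entries, required_count):
    """Return (errors, queries) for one judge's entries on one performance.

    errors  -- problems that must be fixed before the marks are committed
    queries -- entries the judge should be asked to confirm
    """
    errors, queries = [], []
    if len(entries) < required_count:
        errors.append(
            f"only {len(entries)} of {required_count} required entries present"
        )
    for i, e in enumerate(entries, start=1):
        if e.grade is None:
            errors.append(f"entry {i} ({e.element}): no assessment entered")
        elif not MIN_GRADE <= e.grade <= MAX_GRADE:
            errors.append(f"entry {i} ({e.element}): grade {e.grade} out of range")
        elif e.grade in (MIN_GRADE, MAX_GRADE):
            queries.append(f"entry {i} ({e.element}): confirm grade {e.grade}")
    return errors, queries


if __name__ == "__main__":
    sheet = [Entry("jump 1", 1), Entry("spin 1", None), Entry("steps", 3)]
    errors, queries = check_entries(sheet, required_count=3)
    for message in errors + queries:
        print(message)
```

Run at the moment the judge commits the marks, a routine like this costs a few seconds when something looks wrong and nothing at all when the entries are clean.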

At the end of a competition, a listing of all the data entries for all skaters from all the judges should be published and a copy given to each judge and skater, as well as released to the public. Skaters should have the ability to protest the identification of any element or any deduction (but not the assessments), since the identifications and deductions are matters of fact as opposed to matters of opinion. Protests would be resolved based on the video clip for each element captured during the competition. If the video clip supports the skater's protest, the official results would be recalculated. The video clips would become a permanent part of the competition record along with all the judges' inputs, and would be archived for a minimum of one year -- though the historian in me says they should be archived forever.

The scoring software must be made publicly available to gain public credibility. Skaters and coaches, of course, will need a copy so they can game the system to maximize the number of points they will attempt -- and make no mistake about it, game the system they will, as the experience in gymnastics has shown.

Time's a Wasting

The ISU continues to talk about using the proposed system during the Grand Prix next season. That is just seven months away. It is inconceivable that the ISU can come up with a fully developed implementation strategy in the time available, plus train all the judges to the extent needed. It is a major concern that, to meet the unrealistic deadline it has set itself, the ISU will cut corners and follow risky strategies that will result in a half-baked implementation with hidden bugs and flaws. Sadly, these bugs and flaws will only be discovered at the expense of the skaters competing in the events where the system is used, and they will be the ones who pay the price for the problems. Questions concerning hardware and software implementation are yet another reason the ISU needs to slow down, and not implement the proposed system in any event next season as the scoring system of record.
