In this Issue: Garbage In, Garbage Out
The controversy surrounding the 2000 presidential election and the Florida recount is not just a once-in-a-lifetime civics lesson. It also provides an excellent opportunity to explore the profound business effects of questionable information quality. At least six data quality issues are evident from the Florida election:
Poor data presentation. The "butterfly" ballot used in Palm Beach County is an example of information presented in a way that did not match users' expectations, leading to many errors. The top two punch-holes in the ballot did not correspond to the top two listed candidates, which apparently led a large number of people to vote for Buchanan instead of Gore.
Validation of data before it was "used." The use of punch cards and the butterfly ballot raised questions about the accuracy of vote counting. Improperly punched ballot cards produced questionable tallies in automated tabulation; in effect, the validity of a vote depended primarily on whether the tabulation machine could read the card. Allegedly, approximately 19,000 ballots were disqualified because of "over-votes": more than one vote for president had been made. With no built-in mechanism to validate the data before it entered the system, there was no way to flag erroneously punched cards before they were submitted.
Invalid analytic models. On election night, invalid analytic models led to invalid conclusions about the winner. News organizations base their predictions on exit polls and election results provided by a jointly owned data-collection consortium called the Voter News Service (VNS). Typically, the VNS feeds both vote counts and winner predictions to the news media, which is why the different broadcasters seem to predict the winners at around the same time. In the case of last year's election in Florida, the networks twice predicted the winner of Florida incorrectly.
The first incorrect prediction occurred because election predictions rest on statistical models based on past voting behavior that (1) were designed to account for vote swings an order of magnitude greater than the actual (almost final) tallies, and (2) did not take changes in demographics into account.
Conflicting data sources. Different sources of data may conflict, leading to incorrect assumptions. By 2:00 a.m., the VNS (and consequently, the reporting organizations) had switched their allocation of Florida's electoral votes from Gore to Bush. However, a second retracted prediction occurred when there was a dispute about the actual vote tallies. Although the VNS report indicated that Bush led in Florida by 29,000 votes, information posted on the Florida Board of Elections Web site indicated that Bush's lead was closer to 500 votes, with the gap narrowing quickly. In addition, a computer glitch in Volusia County led to an overestimation of Bush's total by 25,000 votes.
Built-in margin of error. Florida law carries an a priori expectation that errors will occur in how votes are counted. According to Title IX, Chapter 102 of Florida law, "If the returns for any office reflect that a candidate was defeated or eliminated by one-half of a percent or less of the votes cast for such office ... the board responsible for certifying the results of the vote ... shall order a recount of the votes." Inherent in this statute is an assumption of a margin of error of one-half of one percent of the votes. Thus, unless there is a true, validated count of all votes, the "real" winner is unknowable.
Lack of timely certification. In the Florida election, timeliness constraints on the reporting and certification of votes had a significant effect on the final results, mostly because the requested hand recounts in Palm Beach and Broward counties were not completed by the certification deadline (one week after the election).
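The one-half of one percent rule quoted above is, in effect, a data quality threshold written into law, and it reduces to simple arithmetic. The sketch below is only an illustration of that arithmetic, not the statutory procedure: the function name `recount_required` and its parameters are hypothetical, and the vote totals in the usage lines are invented round numbers, not actual Florida returns.

```python
def recount_required(winner_votes: int, runner_up_votes: int,
                     total_votes_cast: int) -> bool:
    """Return True when the margin is one-half of one percent or less
    of the votes cast for the office (a simplified reading of the
    statute quoted in the text; hypothetical helper)."""
    if total_votes_cast <= 0:
        raise ValueError("total_votes_cast must be positive")
    margin = winner_votes - runner_up_votes
    if margin < 0:
        raise ValueError("winner_votes must not be less than runner_up_votes")
    # margin / total <= 0.005 is equivalent to margin * 200 <= total;
    # integer arithmetic avoids floating-point rounding at the boundary.
    return margin * 200 <= total_votes_cast


# Hypothetical totals, chosen only to exercise the threshold.
print(recount_required(500_250, 499_750, 1_000_000))  # margin of 500 votes
print(recount_required(503_000, 497_000, 1_000_000))  # margin of 6,000 votes
```

With one million votes cast, any margin of 5,000 votes or fewer falls inside the assumed margin of error, so the first call reports that a recount is required and the second does not.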
In the knowledge management world, information policy serves as our body of "law." This policy delineates the guidelines and rules for the definition and use of data, as well as the consequences when these rules are violated. Thus, one lesson that the IT community can learn from this election is that a well-defined information policy offers a way to manage the risks associated with low data quality.
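To make the idea of policy as enforceable rules concrete, here is a minimal sketch of validating a record before it enters the system, using the over-vote problem described earlier. Everything in it is an assumption made for illustration: the `Ballot` type, the `validate_ballot` function, and the two rules shown are hypothetical and do not describe how any real tabulation system works.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Ballot:
    """Hypothetical record: the presidential choices read from one card."""
    ballot_id: str
    presidential_marks: tuple[str, ...]


def validate_ballot(ballot: Ballot) -> list[str]:
    """Return the rule violations found; an empty list means the ballot
    can be accepted. The rules here are illustrative assumptions."""
    problems: list[str] = []
    count = len(ballot.presidential_marks)
    if count > 1:
        # The "over-vote" case: more than one vote for president.
        problems.append(f"over-vote: {count} marks for president")
    elif count == 0:
        # Assumed rule: a card with no readable mark is also flagged.
        problems.append("no readable mark for president")
    return problems


# Check at submission time, while the voter could still correct the card.
for ballot in (Ballot("A-001", ("Gore",)),
               Ballot("A-002", ("Gore", "Buchanan"))):
    issues = validate_ballot(ballot)
    print(ballot.ballot_id, "accepted" if not issues else issues)
```

The design point is where the check runs rather than what it checks: flagging a bad record at the point of entry gives its owner a chance to fix it, whereas a check that runs only during tabulation can do nothing but discard it.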
If we can learn from the errors that have taken place during the election and thereby improve the election process, then the nation as a whole will win.