Identifiers and Barbarians"Pardon him. Theodotus: he is a barbarian, and thinks that the customs of his tribe and island are the laws of nature." G.B. Shaw's Julius Caesarby Joe Celko Some readers expressed confusion about the points I made regarding artificial and natural keys in "Keys and History" (March 20, 2003). Let me clarify some of my examples, specifically vehicle identification numbers (VINs) and serial numbers on checks, which were discussed in detail online at IntelligentEnterprise.com One reader argued that a VIN was an artificial key, not a natural one. Wrong. A VIN is an intelligent key a very intelligent key. Each number in a VIN works to identify that particular vehicle. For example, the first three characters represent the manufacturer, make, and type of vehicle (with the exception of manufacturers that produce fewer than 500 vehicles). The next five characters identify vehicular attributes (such as passenger car, multipassenger vehicle, truck, trailer, and so forth). I won't explain every position in detail, but know that each character in a VIN gives you information specific to that particular vehicle. When I look at my car's VIN, and I look at the physical automobile, and the VIN tells me that my Ford Escort is a Yugo, I know that the VIN is wrong. When I see a production date in the VIN for the year 2004, I know that the VIN isn't just wrong; it's impossible. This is not at all the same thing as a sequential number that exists only in my table. At What Cost?The cost of data verification is inseparable from the cost of gathering the data and is, in fact, more vital because false data can be far worse than no data at all. That's my synopsis of a long, rambling quote from Charles Babbage on the navigation and trig tables of his day being of poor quality and therefore dangerous to the British Navy. The whole reason he wanted to build the Difference Engine was to improve data quality. At least one reader also thought that a check number is nothing more than the three- or four-digit serial number in the upper right corner of a paper check. It ain't! Check numbers carry a lot of intelligence in the standards and structure built into them. A check number has an ABA transit number that identifies the related bank or institutes; it's written in a standard MICR font; a MICR code symbol differentiates between debits and credits; and the retail banking package defines the upper limits on the serial numbers. The account number tells the bank what kind of customer holds the account. There are retail banking packages, so if you print your checks yourself, your home package has to follow some conventions that match the package from your bank. A bank customer is a mere user; the checking account system belongs to the bank. The serial number is so unimportant that if you switch banks, you can tell the new bank to start the your account with the next serial number from the old account. Think about what that says about the serial number as a key. State of MindWhen I was a kid, my grandparents used a party-line phone system. You called the operator and asked for a connection by a person's name a great system when fewer than 100 people lived in a county with even fewer telephones. The phone company used named exchanges in those days: the first two letters of a name followed by a digit. For example, KLondike 5 (or 555) is the dummy exchange used in movies. Once, my grandfather tried to call New York City. He had no concept of a "telephone number." Granddad assumed that the exchange name was the part of town that the phone was in. The major problem was conceptual. Granddad hadn't made the jump from the "roll your own local identifier, everyone knows what I mean!" mindset to the "there's an industry standard, everyone better learn it" mindset. It makes perfect sense to many newbie programmers to create some local identifier without any reference to any global system of which it is a part. Next time, we'll deal with commodity identifiers. Joe Celko [celko@northface.edu] is vice president of RDBMS at North Face Learning in Salt Lake City and author of five books on SQL.
|
Most Popular This Week
IE Weekly Newsletter
Subscribe to the newsletter
|
|
|











