Keys and HistoryLet's ditch the punch cards once and for allby Joe Celko Let me take you back to those thrilling days of sequential file systems, magnetic tape drives, and punch cards. Magnetic tapes were essentially the same as the sequential model, but when you dropped a tape it wasn't a disaster like dropping a deck of punch cards. This physical model led to a system design style antithetical to the relational approach. Data, in the pre-relational approach, was located by its position within a file. Records were physically contiguous within files; and fields within records were contiguous to each other. If one record was an exact copy of another, it was different from the first because of its separate location. (I call it the evil twin.) Placement and orderSuccess depended on files with sorted order. In a procedural language, In SQL, When the relational model first appeared, the only mental model we had was the sequential
file system. This is why SQL has a Files always had keys in those days. And the first SQL products were implemented on top of existing file systems: Just map the table concept to a file, rows to records, and columns to fields! No need to learn anything new. Since most SQL programmers never worked with cards or tapes, it amazes me that they
rediscovered the underlying file implementation. Programmers often grab onto A relational key is a subset of attributes that identify a row in a table. Thus, an autonumbering scheme can never be a key by definition: It's not an attribute of anything except the machinery's internal state. Pointers and other physical implementation details fail as identifiers on the same principle. Table 1 shows a taxonomy of the methods used to make a row unique within its internal representation in a database. Other keysA natural key is a subset of attributes that occur in a table and act as a unique identifier the classic relational key. Keys are visible, and you can verify them in the external reality. Examples include UPC codes, geographical coordinates, and DNA. An artificial key is a visible attribute added to the table. It doesn't exist in the external reality but can be verified for syntax or check digits inside itself. For example, open codes in the UPC scheme can be assigned to a user's own stuff. The check digits still work, but you have to verify them inside your own enterprise. A "uniqueifier" (please, dear reader, suggest a better word), isn't based on attributes
in the data model and is exposed to the user. There's no way to predict or verify it. The
system obtains a value through some physical process totally unrelated to the logical data
model. Example: A surrogate key is a system generated to replace the actual key behind the covers where the user never sees it. It may or may not be based on attributes in the table. Examples: hashing algorithms or pointers from many columns back to a common set of domain values. Note: The terms "artificial key" and "surrogate key" are used interchangeably, but I think there's a big difference, as shown on the chart. This grid is a good place to start to talk about classifying types of keys. If anyone wants to add to it, please do. Joe Celko [celko@northfacelearning.com] is the vice president of RDBMS at North Face Learning in Salt Lake City and author of five books on SQL.
|
Most Popular This Week
IE Weekly Newsletter
Subscribe to the newsletter
|
|
|











