October 05, 1999, Volume 2 - Number 14
Data webhouse security is best applied to the web server
Remove Security from Your Database Tables
Ralph Kimball
Security is a topic most data webhouse managers would probably like to avoid. In a way, this statement is
surprising, because security is an intricate, fascinating puzzle, and data webhouse professionals
generally like to work on puzzles.
Security also involves a lot of interesting hardware and software technology, ranging from biometric
scanning devices for identifying users, to virtual private networks based on advanced encryption, to new
network servers with interesting names like LDAP. Security has a compelling people angle, because the
perpetrators of security problems (the hackers, crackers, and industrial spooks) have varying kinds of
pathological personality flaws that make them threats to computer systems and data webhouses. Finally,
security also is immensely important commercially, and management is probably willing to spend a lot of
money to get security right. So what's the reluctance to make security one of the main design topics
in the data webhouse, and in data warehousing in general?
Maybe the problem with security is that it is fundamentally an issue of control and avoidance of
problems. It is not the upside-opportunity kind of topic typical of data warehousing.
Good data warehouse managers are really drawn to the business issues illuminated by the data warehouse.
They don't like to talk about controlling problems. And maybe the security problem seems so
complicated and so diffuse that no one knows where to start attacking it.
It is certainly true that the Web has made security problems more acute. All of us are rapidly joining a
fully connected world, where our networks are separated from the chaos of the public Web by just one or
two machines. Most of us have no real choice but to connect to the Web, to listen to the Web and to
deploy our services (and our webhouses) across the Web. One approach is to say that security is a
necessary part of the webhouse experience, so take your medicine, and keep reading.
Rather than providing a complete, broad tutorial on all the components of data webhouse security, in this
column I present a single recommended, coherent approach to security that addresses all of the main
issues confronting the data webhouse manager. This approach is highly consistent with current product
offerings from major vendors, such as Hewlett-Packard's Praesidium product line. Think of this
article as describing one complete solution that you can either implement verbatim or contrast with
another solution you prefer. Either way, you will have a solid data point for comparison and
discussion.
A Recommended Security Framework
The security framework described here is meant for any user accessing sensitive data in your
organization. The framework is meant to cover employees, business partners, and customers with a single
security solution, regardless of their physical location. They can be inside your secure company intranet
or outside on the Web.
I chose this framework because it is simple yet powerful. Security is a continuous process, not a
one-time solution. The security manager in your data webhouse team will need to continuously visit and
revisit the elements of this framework, always adapting and strengthening the security protection. The
content of the data changes. The legitimate roles of the users change. Users themselves come and go from
the organization, and their personal roles change. Technology changes, and the threats posed by various
kinds of hackers and other disaffected individuals change. So, above all, you must treat security as a
dynamic process that is an ongoing part of your webhouse experience.
Our webhouse security framework is based on the following four elements (see Figure 1):
Two-factor authentication
A secure connection
Strong definition of user roles
Access to all webhouse objects controlled by the roles
Two-Factor Authentication
The traditional text passwords we all use are the biggest single source of security breaches. It is not
an exaggeration to say that if we could decisively solve the password problem, most of our security
problems would go away. Passwords are easily guessed. Any password that is directly typed by a human
being is too short. Even encrypted lists of these short passwords are easily cracked. Users are notorious
for managing their personal passwords poorly. And you can hardly blame them. Passwords are a pain in the
neck.
In security parlance, a password represents one-factor authentication, because it is
what-you-know. Unfortunately, with one-factor authentication, if someone else knows your
password, they become you, with all your rights and privileges.
Security is immensely improved with
two-factor authentication. The second factor can be what-you-have or maybe
what-you-are. With what-you-have authentication, if you possess a unique piece of plastic (a
token) with an encoded magnetic strip, then combined with your password, you have a familiar
two-factor authentication system. We all use a two-factor authentication system when we present our ATM
card to the automated bank teller and withdraw cash. The two factors are the plastic token and our
personal identification number (PIN).
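
As a rough illustration of the two-factor idea, the following Python sketch accepts a login only when the user both presents the enrolled token (what-you-have) and supplies the matching PIN (what-you-know). The class name, method names, and storage layout are assumptions made for this example; they are not taken from any product mentioned in this column, and a real card reader would supply the token identifier.

```python
import hashlib
import hmac
import os


def hash_pin(pin, salt):
    """Derive a salted, deliberately slow hash so a stolen PIN list is harder to crack."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode("utf-8"), salt, 100_000)


class TwoFactorAuthenticator:
    """Hypothetical enrollment store: user_id -> (token_id, salt, pin_hash)."""

    def __init__(self):
        self._enrolled = {}

    def enroll(self, user_id, token_id, pin):
        salt = os.urandom(16)
        self._enrolled[user_id] = (token_id, salt, hash_pin(pin, salt))

    def authenticate(self, user_id, presented_token_id, pin):
        record = self._enrolled.get(user_id)
        if record is None:
            return False
        token_id, salt, pin_hash = record
        # Factor 1: what-you-have (the identifier read from the swiped token).
        has_token = hmac.compare_digest(
            token_id.encode("utf-8"), presented_token_id.encode("utf-8")
        )
        # Factor 2: what-you-know (the PIN typed by the user).
        knows_pin = hmac.compare_digest(pin_hash, hash_pin(pin, salt))
        # Both factors must pass; either one alone is not enough.
        return has_token and knows_pin
```

With this sketch, a stolen PIN without the token, or a stolen token without the PIN, fails authentication.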
Although compromising a two-factor system is certainly possible,
it is immensely more difficult than just compromising a password. It takes a determined high-technology
attack to create the special plastic cards and generate the PINs. It seems likely that ATM-style
two-factor security is plenty good enough for data warehouse purposes, and we encourage this pragmatic
perspective. When evaluating security solutions, it is very important to keep the cost and complexity of
the solution in perspective. If you are dealing with an attacker who has access to the technology
required to make false security tokens or is willing to use violence or blackmail, most technical
solutions cannot provide very much protection.
If the data webhouse end user employs a two-factor
authentication system based on a plastic token and a password, every PC needs to be equipped with a card
swipe device. Figure 1 assumes the user at a PC anywhere in the extended webhouse system actively swipes
a plastic token and supplies a PIN in order to be authenticated. Card swipe devices for PCs are readily
available and declining in price. They can be external units connected through one of the I/O ports or
can be PCMCIA cards. PCMCIA cards are natural for portable PCs, but may not be very convenient for
desktop systems. The plastic tokens can contain an embedded microchip, and the more sophisticated
smart cards are essentially impossible to forge.
The other kind of two-factor authentication combines what-you-know with what-you-are. What-you-are is
determined by a biometric scanning device like a fingerprint detector, a signature analyzer, or a retinal
scanner. Theoretically this kind of authentication is better than a plastic-card system, because in this
case you can be pretty confident your intended user really is sitting at the PC. The only objection to
the what-you-are systems is the cost of the biometric input device and the reliability and hassle of
using it in real life. It would be infuriating to not be able to use your PC because for some reason your
thumbprint isn't recognized today. Maybe you're having a bad-thumb day. Some of the more
sophisticated what-you-are authentication systems are good choices for very expensive terminals like bank
teller machines, but probably not appropriate for cheap personal computers.
My bottom-line recommendation is to implement two-factor authentication based on passwords and plastic
tokens for all users of your webhouse, regardless of their location. Choose a technology that allows the
card swipe device to be attached to portable and fixed PCs. Make sure that the administration of the
cards and the passwords is practical. Understand what it takes to create a new card or invalidate an
old one, whether the cards are valid forever, and what the renewal scenario is.
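
The administrative questions above (creating a card, invalidating a card, expiration, and renewal) can be pictured as a small registry. This is only a sketch under stated assumptions: the `Card` and `CardRegistry` names, the fixed validity period, and the rule that a revoked card cannot be renewed are all hypothetical choices for illustration, not features of any particular vendor's product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Card:
    card_id: str
    user_id: str
    expires_on: date
    revoked: bool = False

    def is_valid(self, today):
        # A card is usable only if it has not been revoked and has not expired.
        return not self.revoked and today <= self.expires_on


class CardRegistry:
    """Hypothetical registry tracking the lifecycle of issued cards."""

    def __init__(self, validity_days=365):
        self._validity = timedelta(days=validity_days)
        self._cards = {}

    def issue(self, card_id, user_id, today):
        card = Card(card_id, user_id, today + self._validity)
        self._cards[card_id] = card
        return card

    def revoke(self, card_id):
        self._cards[card_id].revoked = True

    def renew(self, card_id, today):
        card = self._cards[card_id]
        if card.revoked:
            # Assumed policy: a revoked card must be replaced, never renewed.
            raise ValueError("revoked cards must be reissued, not renewed")
        card.expires_on = today + self._validity

    def is_valid(self, card_id, today):
        card = self._cards.get(card_id)
        return card is not None and card.is_valid(today)
```

Writing the lifecycle down this way forces the answers to the questions in the paragraph above: cards are not valid forever, and renewal and invalidation are separate, explicit operations.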
Access All Webhouse Objects Through the Roles
Now that we have an authenticated user attached to one or more proper roles, all that is left is to use
these roles whenever information is requested. This is the step where the growing use of Web interfaces
potentially makes life simpler. If you take the view that every screen showing remote information is
delivered through a Web browser, you know that behind every Web browser is a Web server, which in turn is
controlled by an application server to determine what is painted on the screen.
The key to our webhouse security framework is to require that all access to remote information be
controlled by role-enabled application servers. In other words, every application server is modified so
that a page image is associated with a role, and the application server will deliver the page only if the
connected user possesses that role.
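
A minimal sketch of role-enabled delivery, written in Python, might look like the following. The page paths, role names, and the `deliver_page` function are hypothetical and exist only to illustrate the idea; a real application server would hook an equivalent check into its own request handling. The sketch also assumes a deny-by-default policy for pages that have no role assigned, which is a design choice of this example rather than something prescribed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset  # roles granted to the user after authentication


class AccessDenied(Exception):
    """Raised when the connected user lacks the role tied to a page."""


# Hypothetical catalog associating each page image with one required role.
PAGE_ROLES = {
    "/reports/regional-sales": "sales_analyst",
    "/reports/board-presentation": "executive",
}


def deliver_page(user, page, render):
    """Render the page only if the user possesses the page's role."""
    required_role = PAGE_ROLES.get(page)
    # Assumed policy: a page with no registered role is never delivered.
    if required_role is None or required_role not in user.roles:
        raise AccessDenied(f"{user.name} may not view {page}")
    return render(page)
```

Note that the check is attached to the page, not to the underlying tables, so a multimedia report and a simple table listing are protected in exactly the same way.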
The role-enabled application server is the only place where you won't be able to connect off-the-shelf
products together to build your security system. Each application server must potentially be modified to
support role-enabled delivery. Although the task sounds daunting, this is still the best place to control
security. Application servers all have a roughly similar architecture. They all are responsible for
defining a page image to be rendered by the Web server. Most application servers can access and combine
information from many sources and many formats. This is why we don't dare try to apply a uniform security
solution below the application server. The individual sources of data are too granular and too varied in
format to control in a single comprehensive security solution.
A big bonus of the role-enabled application server approach is that it is just as easy to define access
rights to a complex multimedia report, such as a PowerPoint presentation, as it is to define access
rights to a traditional, low-level database table. This higher level view of security control is much
more appropriate in today's modern, multimedia information environment. Implementing security at only the
low level of data objects doesn't work.
A corollary to using role-enabled application servers is disallowing direct user connections to the
database machine. If any users on the company intranet can connect directly to the database and gain
access by only supplying a password, the whole system is defeated. At least two showstoppers arise.
First, these same users will argue persuasively that they need the same kind of direct access to the data
regardless of their location. They will claim they need it from a remote location within the company and
from home. Of course, their connection from these remote locations is likely to be over the Web. Second,
a direct connection to the database using only a password is likely to be implemented in clear text on
the local area network, making the entire interaction vulnerable to packet sniffers.
Administrative users, including DBAs and system administrators, who obviously need to interact directly
with the primary data sources, must conduct their business from PCs connected to special isolated
networks that sit behind packet filtering gateways. These isolated networks cannot be sniffed from the
regular company intranet. Additionally, the PCs used by the DBAs and system administrators must be
physically secure.
Manage a Security Process, Not a Solution
I have described a framework for implementing security that can provide a very high degree of control.
But no security framework and no technology can be a solution by itself. The security challenge is
dynamic and continuously evolving. The data webhouse security manager must manage an ongoing process that
never ends.
A good security program continuously educates and motivates users to be aware of security, to be proud of
security, and to guard security. It is essential to have executive involvement in promoting security and
creating an example where the impositions created by security are understood to be part of doing business
in the real world. Most people are quite tolerant of airport security because they can see that their
safety is enhanced by the security procedures. So in the same way, company employees and other webhouse
users should be glad that the security mechanisms make themselves obvious, to a point.
A good security program is updated continuously. I have already talked about working closely with human
resources to update employee and contractor access rights. The security manager also keeps abreast of
security threats and new kinds of exploits aimed at Web sites and data warehouses. Virus definitions
should be updated very frequently.
A good security program constantly scans the entire system for possible vulnerabilities and actual
intrusions. There are many utilities that can profile a Web site for vulnerabilities to known attacks.
Consulting firms can periodically analyze a company's information infrastructure for weak points. This
may not need to be done frequently, but at least one such analysis is sure to be eye opening.
With this kind of security framework in place, the only other ingredient needed for the data webhouse
team is a dedicated security manager. This person should be assigned to the team and take directions from
the overall data webhouse manager. Only the data webhouse team has the combined knowledge of what is in
the data and who should be allowed to see it.
This article is an excerpt from a chapter in Ralph's newest book, The Data Webhouse
Toolkit, from Wiley Computer books, forthcoming in January 2000. Permission from Wiley to
publish this in Intelligent Enterprise, for the first time anywhere, is gratefully acknowledged.
Ralph Kimball, Ph.D., co-inventor of the Star Workstation at Xerox and founder of Red Brick
Systems, works as an independent consultant designing large data warehouses. He is the author of The Data Warehouse Toolkit (Wiley, 1996) and the newly published The Data Warehouse Lifecycle Toolkit (Wiley, 1998). You can reach him
through his Web page at www.ralphkimball.com.