CMP -- United Business Media

Intelligent Enterprise

Better Insight for Business Decisions






January 30, 2001



Get the Complete Picture

The data architect plays one of the most critical -- yet misunderstood -- roles on the e-business development team

By Rajan Chandras

The key to a successful Web site is solid infrastructure and data, not catchy interfaces, cutting-edge design, or flashy graphics. You need strong content and complete data to reach new customers and keep them happy. And without a data architect in charge of this information, you may be seriously jeopardizing your e-commerce venture.

Unfortunately, many project teams are completely unaware of this crucial position and fail to realize that every dot-com project needs a designated and experienced data architect on board. This omission usually stems from a lack of understanding of what a data architect actually does, compounded by an erroneous definition of the DBA role. Consequently, the data architect is not hired. The result? Poor product quality, delays, cost overruns, or even project failure.

All too often, the dot-com venture's CTO is given an urgent mission: "Here's $3 million for Web site development. You have 90 days to get it up and running." And from this point on, the pressure is on the CTO to get the site operational within a specified (and usually aggressive) timeframe. The CTO typically bears the ultimate responsibility for the Web site, and in a sense, the entire business. Given this mandate, the seasoned technology manager brings aboard (as either employees or consultants) the best available technical architect, application architect, business analysts, creative designers, Java developers, testing team, systems administrator, DBA, and a project manager to pull it all together ... and the project is all set to go, right? Wrong. The project team is missing a crucial role: the data architect.

Sadly, you are not alone; I suspect that many others are following the same logic: "If you have a DBA, you've got data architecture well covered." In most cases, this approach is shortsighted, if not an outright mistake, and it invites serious future risk to the e-commerce project. Sometimes the assumption is that data architecture is the responsibility of the application architect. But this assumption is misguided as well, and it puts an unfair burden on the application architect, whose primary role is to figure out how to design the Web application to provide the required functionality and performance.

Although application architecture has broad scope, the application architect concentrates on the end-to-end business transaction and Web site behavior, with particular attention to the middle tier, such as the application server. The technical architect, on the other hand, typically works on infrastructure issues: Web servers, firewalls, hubs, switches, partially protected networks (DMZs), secure connections, heartbeat interconnects, network- and system-level fault tolerance, and more. Then, there is the DBA. There is much confusion in the marketplace about the DBA's role, but suffice it to say here that the DBA will be very busy administering the various databases for the project.

So where do data architects fit in this scenario? They take care of the gaping hole created by the lack of formalized data architecture, which includes data modeling, design and development of various data components, internal data flows and external data interfaces, supervisory database management, point of support to development and test teams, and overall organization and management of data- and database-related issues. That is what data architecture is all about. Without a designated and experienced data architect, your project is going nowhere fast in 90 days and counting.

The Data Architect's Role

Data architects work closely with the client or business, other team members, and outside agencies to ensure that the data requirements of your Web site are effectively addressed. In particular, the data architect must work in close cooperation with other members of the architecture team -- the application architect, the business architect, and the technical architect. Among these four roles, the foundation and structure of the Web site -- both business and technical -- will largely be determined. At the beginning and intermediate stages of the project, the data architect must work on designing internal and external data interfaces and interact with other project teams and external data providers. As Web site deployment looms ahead, the data architect will be busy working with the DBA and the Web hosting service to determine the best database deployment strategy and parameters. Throughout the project, the data architect supports the development and test teams and shepherds the data. (See Figure 1.)

Managing Data

Where exactly is the data? Frankly, it's everywhere. It's not static information that sits in a database. Data is what the Web site consumer is seeing and doing. It flows in and out of the database server. It is read into, written from, and manipulated in the application server; passed back and forth through the Web server; and presented to or received from the consumer through the browser. When you say that a Web site is 24x7, you really mean that the data must be constantly available for the consumer's use; without data, the Web site is meaningless. When you say that a Web site is slow, you really mean that it's taking a long time for the data to travel from the servers to the browser, or the other way around.

The typical business-to-consumer (B2C) Web application is likely to have many different kinds of data. This data may come in from Web site users; it may be generated by the middleware, such as Web behavior logs; or it may come from internal or external back-end sources, usually the database server. Back-end data may be internal to the enterprise, such as data associated with other line-of-business software or, in the case of brick-and-mortar companies setting up a Web presence, data from legacy or enterprise resource planning (ERP) systems. Alternatively, this data may be acquired from external data providers. (See Figure 2.)

Understanding and integrating disparate data is complex and time-consuming, but it is critical to the success of your dot-com project. Two main aspects to this complexity are:

  • Data structures. All data repositories have their own static data representation. If the repository is relational, it includes table structures, indexes, views, column definitions, entity/column constraints, referential integrity dependencies, and so on. But not all data repositories are relational: Web behavior data (statistics) may be held in log files; object-oriented data may be held in object or object/relational databases; and Extensible Markup Language (XML) data needs a hierarchical representation. You need to reconcile and integrate these different data repositories so that they work together in a coherent and complementary manner.

  • Data movement. Once the data structures and data dependencies have been identified, you need to define and then design the data flows in the system. The movement of data from one system to another is typically a complex, multistep process and may involve the design and development of data extraction, transformation, and loading (ETL) processes, similar (but not identical) to those used for loading data into data marts and data warehouses. Or it may require building real-time interfaces to external data providers.

    The data architect must understand all the data requirements and design appropriate data repositories and data movement in order to fulfill all the business's needs.
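
To make the two aspects above concrete, here is a minimal sketch in Python of a small ETL flow that reconciles a Web behavior log and an XML product feed into one relational store. Everything specific in it is an assumption made for illustration and does not come from any real system: the log line format, the XML element names, the `product` and `product_view` tables, and the function names. It uses only the standard library, with an in-memory SQLite database standing in for the database server.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample inputs; the formats and field names are assumptions.
WEB_LOG = """\
2001-01-30 10:15:02 u1001 /catalog/item/42
2001-01-30 10:15:40 u1002 /catalog/item/7
2001-01-30 10:16:11 u1001 /checkout
"""

PARTNER_XML = """\
<products>
  <product id="42"><name>Desk Lamp</name><price>19.99</price></product>
  <product id="7"><name>Bookshelf</name><price>89.50</price></product>
</products>
"""


def extract_page_views(log_text):
    """Extract: parse space-delimited log lines into (timestamp, user, path)."""
    rows = []
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines rather than guess at their meaning
        date, time, user_id, path = parts
        rows.append((f"{date} {time}", user_id, path))
    return rows


def extract_products(xml_text):
    """Extract: flatten the hierarchical XML feed into (id, name, price) rows."""
    root = ET.fromstring(xml_text)
    return [
        (int(node.get("id")), node.findtext("name"), float(node.findtext("price")))
        for node in root.findall("product")
    ]


def transform_page_views(rows):
    """Transform: keep only product-page views and derive the product id."""
    prefix = "/catalog/item/"
    views = []
    for viewed_at, user_id, path in rows:
        suffix = path[len(prefix):]
        if path.startswith(prefix) and suffix.isdigit():
            views.append((viewed_at, user_id, int(suffix)))
    return views


def load(conn, products, views):
    """Load: write both data sets into relational tables linked by product id."""
    conn.execute(
        "CREATE TABLE product ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE product_view ("
        "viewed_at TEXT NOT NULL, user_id TEXT NOT NULL, "
        "product_id INTEGER NOT NULL REFERENCES product(id))"
    )
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)", products)
    conn.executemany("INSERT INTO product_view VALUES (?, ?, ?)", views)
    conn.commit()


def run_etl():
    """Run the whole flow against an in-memory database and return it."""
    conn = sqlite3.connect(":memory:")
    products = extract_products(PARTNER_XML)
    views = transform_page_views(extract_page_views(WEB_LOG))
    load(conn, products, views)
    return conn


def views_per_product(conn):
    """Query the integrated store: how many times was each product viewed?"""
    return conn.execute(
        "SELECT p.name, COUNT(v.product_id) FROM product p "
        "LEFT JOIN product_view v ON v.product_id = p.id "
        "GROUP BY p.id ORDER BY p.id"
    ).fetchall()
```

Calling `views_per_product(run_etl())` joins behavior data that began life in a log file with catalog data that arrived as XML, which is the kind of reconciliation described above. A production flow would also need error handling, incremental loads, and agreement with each data provider on formats, none of which this sketch attempts.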




