CMP -- United Business Media

Intelligent Enterprise

Better Insight for Business Decisions




April 10, 2000 Volume 3 - Number 6



E-Commerce X-Factor

E-business survival can depend heavily on leveraging existing data sources with speed and reliability. XML data servers offer an intriguing option for meeting that goal.

By Douglas Barry



The jury is still out on the result, but some of the early reports are not too cheery. From the e-commerce holiday customer’s perspective, all is not well. As summarized in the Dec. 31, 1999, issue of Time magazine: “The final days of the Internet’s first really big Christmas were punctuated by a mountain of undelivered packages and a blizzard of complaints: computers that crashed, orders that vanished, items suddenly out of stock or stuck in the warehouse.”

Obviously, no company can afford to be without an e-commerce presence. But just having it is not enough; it has to work, and work well. Customers will not depend on a “service” that doesn’t meet their needs, whether in consumer sales or business-to-business situations.

What can you do to improve your company’s odds in the e-commerce arena? One answer to that question may be Extensible Markup Language (XML). Your organization can use XML in several ways: as a universal data communication format, as a more intelligent presentation format than HTML, and as a format to store data.

XML is also probably the most hyped acronym in the technology lexicon right now, causing great confusion. It seems that nearly every day we hear an announcement about a new product that takes advantage of XML, but it can be difficult to see how these various products compare.

In this article, I’ll clarify the business and technical value of one important aspect of XML technology — the XML data server — and explain how your organization can use it as underlying infrastructure for XML-based e-commerce applications.

XML Data Server Architecture

Using an XML data server is a strategic decision. It enables faster development of your Web site and e-commerce application, as well as faster response time on the site itself. Indeed, the key word here is speed. Customers will no longer tolerate a slow, unresponsive e-commerce experience, so those companies willing to settle for providing one should prepare to eat their competitors’ dust.

The obvious way to speed server response is to add more and faster hardware. But if you’ve looked into that option, you can see that it gets expensive and may not solve the problem entirely. The alternative is faster software. That’s where XML data servers come in: they provide a software architecture that speeds up the response time of your site while providing XML interfaces as they are needed.

The XML data server architecture provides data storage, an application layer for data manipulation, and presentation of the XML data in the middle tier. (See Figure 1.) Using an XML data server is much like publishing a catalog. Instead of publishing to print, however, you publish to the XML data server. The data server, in turn, sees to it that the “catalog” data is presented in the customer’s desired data format. Several variations of that format are possible, each described by a document type definition (DTD). XML data servers can also provide the data in plain old HTML if that’s what the customer needs.

Figure 1 XML data server architecture.
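
As a rough illustration of the catalog analogy, the following Python sketch publishes one product record and presents it either as XML or as plain HTML. It uses only the standard library. The record fields and the function names to_xml, to_html, and present are hypothetical and are not taken from any particular XML data server product.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical catalog record, as it might be loaded from an existing data source.
PRODUCT = {"sku": "CS-100", "name": "Cardigan Sweater", "price": "49.00"}

def to_xml(record):
    """Present the record as an XML fragment, one child element per field."""
    root = ET.Element("product")
    for field, value in record.items():
        ET.SubElement(root, field).text = value
    return ET.tostring(root, encoding="unicode")

def to_html(record):
    """Present the same record as a plain HTML table row."""
    cells = "".join(f"<td>{escape(value)}</td>" for value in record.values())
    return f"<tr>{cells}</tr>"

def present(record, fmt):
    """Pick the presentation the customer asked for ("xml" or "html")."""
    renderers = {"xml": to_xml, "html": to_html}
    return renderers[fmt](record)
```

The point of the sketch is that the record is published once and the choice of output format is made at request time, not when the data is stored.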


XML data server architectures fall into three basic categories: standalone, those using data from existing sources, and multisite. I’ll discuss the second category because XML data servers are most commonly applied to that situation.

Your organization should use this architecture if it has any data sources that contain information about products or services. The data in these sources is probably stored in a file system or DBMS of some kind. Adding an XML data server to the middle tier integrates such existing systems, along with other data such as images and graphics, in much the same way a catalog is published. (See Figure 1.) In these cases, it doesn’t make sense to convert the existing product data to an XML format, because other applications may rely on the current data format. The images and graphics may be stored on completely different systems, such as a designer’s PC. Also, it may be undesirable to expose existing data sources to the ad hoc nature of Internet queries, which can wreak havoc on data processing operations.

Keep in mind that the XML data server sits underneath a Web server. If your company supports online payment, the Web server usually provides the connection to a merchant system.

All in the Presentation

In this software architecture, a middle-tier XML interface is responsible for data presentation. This interface presents information, including DTDs and Extensible Stylesheet Language (XSL) style sheets, to the Web server. The DTDs define the tags that “mark up” the document, providing a kind of grammar for a class of documents, and you use XSL to write instructions that specify how a document should look when displayed. XSL often contains information such as font size and type, spacing, and other design details. Thus, the final presentation is determined by the interplay of the Web server with XML, DTDs, and XSL. (See Figure 2.)

Figure 2 Using a locally defined DTD and XSL.


It is certainly an option with XML to create a “closed system” in which you control all the specifications. This approach affords total control of the DTD along with the XML data and the XSL used to specify the presentation. The downside to creating a closed system is that the growing number of automated tools on the Internet may not understand the meaning of the data. These tools include search engines and various “bots” for shopping or comparisons. XML is structured as a series of tagged name-value pairs, such as <product_description>Cardigan Sweater</product_description>, where product_description is a tag and “Cardigan Sweater” is a value. Systems using standard names, for example, may expect a product tag instead of the product_description tag in this example. Several industry groups are working on agreements as to the name and meaning of XML tags.
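
To make the naming problem concrete, here is a minimal Python sketch that renames locally defined tags to the names an agreed-on vocabulary might expect. The mapping table, the tag names (including product_description), and the function to_standard_tags are all assumptions made for illustration. No real industry vocabulary is implied.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from locally defined tag names to agreed-on standard names.
LOCAL_TO_STANDARD = {"product_description": "product", "colour": "color"}

def to_standard_tags(xml_text, mapping=LOCAL_TO_STANDARD):
    """Return the document with every mapped tag renamed; other tags are unchanged."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = mapping.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")

local_doc = "<item><product_description>Cardigan Sweater</product_description></item>"
standard_doc = to_standard_tags(local_doc)
```

A shopping bot that only understands the standard names could read standard_doc but would miss the product in local_doc, which is the cost of a closed system.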

Keep in mind that if you are concerned about agreement on the tags used, you’ll need to use a DTD and XSL that are either industry standards or agreed on by all interested parties. When you use such standard specifications, information presentation requires only the XML file. (See Figure 3.) Comparing Figure 2 to Figure 3, you can see that the process is simpler when an agreed-on DTD exists. A standard DTD would look much like the one at the bottom of Figure 2; the difference is that the interested parties have agreed on the name-value tagged pairs.

Figure 3 Using a standard DTD and XSL.


Obviously, adopting an industry-standard DTD would be the most elegant approach. But sometimes a single DTD is not the answer. What happens if more than one standard DTD exists? What if you need a custom DTD in addition to the standard DTD? For example, in Figure 3, the tag <color_swatch> may not appear in a standard DTD, but you could use the tag <color> instead. Furthermore, the standard tag <color> might not have an “image” attribute that can reference JPEG files.

One approach is to use one DTD for the industry standard and another for custom presentation. Using our example, this technique would be helpful when you want to display the images of color swatches even though the industry-standard DTD doesn’t support that capability. One way to do so is to separate presentation format from storage format, which you can do using a DBMS. (More on that later.)
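
The idea of separating presentation format from storage format can be sketched in Python as follows. The stored record, the two presentation functions, and the exact tag layouts are hypothetical. They only assume, as in the example above, that the standard DTD offers a plain <color> tag while a custom DTD adds <color_swatch> with an image attribute.

```python
import xml.etree.ElementTree as ET

# Stored form: neutral fields, independent of any one DTD (hypothetical record).
STORED = {"name": "Cardigan Sweater", "color": "Navy", "swatch_file": "navy.jpg"}

def present_standard(record):
    """Presentation for the assumed industry-standard DTD: a plain <color> tag."""
    product = ET.Element("product", name=record["name"])
    ET.SubElement(product, "color").text = record["color"]
    return ET.tostring(product, encoding="unicode")

def present_custom(record):
    """Presentation for the assumed custom DTD: <color_swatch> with an image attribute."""
    product = ET.Element("product", name=record["name"])
    swatch = ET.SubElement(product, "color_swatch", image=record["swatch_file"])
    swatch.text = record["color"]
    return ET.tostring(product, encoding="unicode")
```

Because both presentations are generated from the same stored record, adding a third DTD later would mean adding one more presentation function, not converting the stored data.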

Guts of the Application

For most uses of XML data, an application layer will enable processing in addition to presentation. Most applications using XML are written in Java or C++, which are object-oriented programming languages. So it is important to consider how you use objects in the application layer.

You have several options for objects at the application level. They range from simple individual programs to complex applications that incorporate CORBA, COM+/MTS, or Enterprise JavaBeans (EJBs). For the purposes of this article, however, I’ll consider all of them “application objects”; comparing the relative merits of various object models is beyond our discussion.

One option for using XML in application objects is to store it directly in its native format. This approach does not convert the XML in any way. The advantage of using XML directly is that no mapping occurs; the disadvantage is that it “breaks” the object model because the XML data is stored separately. There may, however, be good reasons for doing so; for example, if you have no need to encapsulate the XML data in the objects, separating XML from them makes sense.

However, if you have a reason to encapsulate XML data in application objects, this approach is not a good one. Rather, encapsulation brings the XML data into an object model and makes manipulating the data much easier. In this case, XML is used only for presentation and communication. The advantage here is that the object model is kept intact; the disadvantage is that the XML must be mapped between the application and interface layers. This mapping issue is insignificant, however, because of the similarities between XML and the object models in programming languages such as Java and C++.
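
A small sketch may help show what encapsulation looks like in practice. It is written in Python for brevity, although the same shape applies in Java or C++. The Product class, its fields, and the to_xml and from_xml methods are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class Product:
    """Hypothetical application object; XML is used only at the interface."""
    name: str
    price: float

    def discounted(self, percent):
        """Business logic works on typed fields, not on XML text."""
        return round(self.price * (1 - percent / 100), 2)

    def to_xml(self):
        """Map the object to XML for presentation or communication."""
        element = ET.Element("product")
        ET.SubElement(element, "name").text = self.name
        ET.SubElement(element, "price").text = f"{self.price:.2f}"
        return ET.tostring(element, encoding="unicode")

    @classmethod
    def from_xml(cls, xml_text):
        """Map incoming XML back to an object."""
        element = ET.fromstring(xml_text)
        return cls(name=element.findtext("name"),
                   price=float(element.findtext("price")))
```

The two mapping methods are the whole cost of keeping the object model intact in this sketch: each field corresponds to one child element, which is why the mapping stays small when the XML structure and the object structure resemble each other.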

Data Storage

You can store XML data in three ways: in the file system, in a DBMS, or in native storage. I’ll focus on storage in DBMSs because of their ubiquity; furthermore, DBMSs provide the ACID properties (atomicity, consistency, isolation, and durability) needed to ensure safe and secure multiuser access and updates. Three DBMS options exist: relational (RDBMSs), object (ODBMSs), and object/relational (OR) mapping-based products.

RDBMS-based data servers. An RDBMS-based data server will provide significant improvement over a file-based system when high-performance queries and update activity are involved, particularly if the site is large. In this approach, the data server uses an RDBMS in the middle tier. (See Figure 1.) Data from the existing databases or file systems is translated into the RDBMS’s format. All query and update activity goes directly against the RDBMS, not the existing data sources. If update activity occurs, one option is to write it back to the existing data sources at some interval. One example would be writing the updates at the end of a business day.

An important consideration when deciding to use an RDBMS with XML data is the difference between the XML model and the relational model in the middle tier. The XML model is essentially an object model, whereas the RDBMS obviously uses a relational model. This fact gives rise to impedance mismatch and can necessitate a significant mapping layer, which involves extra development time and can slow performance down.
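
The mapping layer can be illustrated with a short Python sketch that uses the standard sqlite3 module as a stand-in for the middle-tier RDBMS. The table names, columns, and the store and load functions are assumptions made for this example. The nested XML product has to be flattened into a parent row plus child rows, and then reassembled on the way out.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = """<product sku="CS-100"><name>Cardigan Sweater</name>
<color>Navy</color><color>Gray</color></product>"""

def store(conn, xml_text):
    """Flatten one nested XML product into a parent row and child rows."""
    product = ET.fromstring(xml_text)
    sku = product.get("sku")
    conn.execute("INSERT INTO product VALUES (?, ?)",
                 (sku, product.findtext("name")))
    conn.executemany(
        "INSERT INTO product_color VALUES (?, ?)",
        [(sku, color.text) for color in product.findall("color")],
    )

def load(conn, sku):
    """Rebuild the nested XML from the two tables; this is the mapping layer."""
    (name,) = conn.execute(
        "SELECT name FROM product WHERE sku = ?", (sku,)).fetchone()
    product = ET.Element("product", sku=sku)
    ET.SubElement(product, "name").text = name
    rows = conn.execute(
        "SELECT color FROM product_color WHERE sku = ? ORDER BY rowid", (sku,))
    for (color,) in rows:
        ET.SubElement(product, "color").text = color
    return ET.tostring(product, encoding="unicode")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (sku TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE product_color (sku TEXT, color TEXT)")
store(conn, DOC)
```

Even this two-level document needs two tables, two queries, and hand-written reassembly code. Deeper nesting adds a table and a query per level, which is where the extra development time and run-time cost come from.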

ODBMS-based data servers. Many of the same database management principles apply to RDBMS-based and ODBMS-based XML data servers. The crucial difference between the two systems is that the XML model is similar to the object model, whereas the relational model is not. The ODBMS-based data server architecture is particularly interesting for those scenarios involving a large or complex site where performance is an important requirement or you expect a high volume of activity. In those situations, the closer match between the XML model and the object model will provide better performance than an RDBMS-based data server.

In this approach, the data server uses an ODBMS in the middle tier. Data from the existing databases or file systems translates into the format used by the ODBMS. There is no need for a mapping layer between the data storage in the middle tier and the application layer, because objects are used in both layers — the objects the application needs are stored directly in the ODBMS. Thus, the absence of a mapping layer between the application layer and data storage greatly improves performance over an RDBMS-based data server.

Furthermore, all ODBMS-based data servers support object caching, which can greatly speed up response times for sites that repetitively use the same objects. In this architecture, query and update activity goes directly against the ODBMS and not the existing data sources. If updates occur, one option is to write that activity back to the existing data sources at some interval. Again, one example would be writing the update activity at the end of a business day.
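
The write-back pattern described here, which applies to both the RDBMS-based and ODBMS-based approaches, can be sketched in a few lines of Python. The MiddleTierStore class is hypothetical, and a plain dictionary stands in for the existing data source. A real product would add transactions and error handling.

```python
class MiddleTierStore:
    """Hypothetical middle-tier store that answers reads and queues updates."""

    def __init__(self, source):
        self.source = source      # stands in for the existing data source
        self.data = dict(source)  # middle-tier copy, loaded once
        self.pending = {}         # updates not yet written back

    def read(self, key):
        """Reads are served from the middle tier, never from the source."""
        return self.data[key]

    def update(self, key, value):
        """Updates change the middle-tier copy and are queued for write-back."""
        self.data[key] = value
        self.pending[key] = value

    def flush(self):
        """Write queued updates back, for example at the end of the business day."""
        written = len(self.pending)
        self.source.update(self.pending)
        self.pending.clear()
        return written
```

Until flush runs, the existing data source does not see the changes, which is the trade-off this architecture accepts in exchange for keeping Internet traffic away from the existing systems.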

OR mapping-based data servers. This approach differs from the other methods in that no data is stored in the middle tier. Consequently, any change will go directly to the underlying database or files. In the other two approaches, transformations occur on a batch basis.

This approach offers a major advantage: Any changes to the underlying files or database are immediately available to customers on the Internet. Of course, the flip side is that doing the transformations to the XML structure in real time is expensive in time and resources. It will result in lower performance.

The impedance mismatch I mentioned earlier occurs in this architecture as well. The caching supported in OR mapping-based data servers, however, can lessen this problem if cached data is read relatively frequently. As a rule of thumb, if the same data item is read repetitively 10 to 100 times, this architecture's speed can approach that of ODBMS-based data servers.

Nearly all OR mapping-based data servers support caching that is similar to the object caching in ODBMSs. A new twist to consider, however, is that of cache synchronization. Cached data can become “old” if the data is changed in the underlying RDBMS. Cache synchronization comes into play by keeping the cached data up-to-date with any changes in the underlying RDBMS.
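
One simple way to picture cache synchronization is a cache that compares a version number before trusting an entry. The Python sketch below is only one possible scheme, and it is an assumption: real products may rely on notifications or database triggers instead. The SynchronizedCache class and its load and current_version callbacks are hypothetical.

```python
class SynchronizedCache:
    """Hypothetical cache that checks a version number before trusting an entry."""

    def __init__(self, load, current_version):
        self.load = load                        # expensive mapping step
        self.current_version = current_version  # cheap version lookup
        self.entries = {}                       # key -> (version, value)
        self.loads = 0                          # how often the mapping step ran

    def get(self, key):
        version = self.current_version(key)
        cached = self.entries.get(key)
        if cached is not None and cached[0] == version:
            return cached[1]                    # entry is still up to date
        self.loads += 1
        value = self.load(key)                  # stale or missing: reload
        self.entries[key] = (version, value)
        return value
```

Repeated reads of an unchanged item pay for the expensive mapping step only once, which is the effect behind the rule of thumb above. When the underlying row changes, its version changes too, and the next read reloads it.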

Be Prepared

With the many options available, it's difficult to know what the correct choice should be for your e-commerce needs. As far as XML is concerned, however, you should consider the features I've described here when looking for a way to provide a highly responsive Web site that leverages existing data sources.

Douglas K. Barry (doug@barryandassociates.com), principal of Barry & Associates, has worked in database technology for over 20 years, with an exclusive focus on the application of database technology for objects since 1987.



 




