Guide to the TechWeb Network

Intelligent Enterprise

Better Insight for Business Decisions

Intelligent Enterprise - Better Insight for Business Decisions
search Intelligent Enterprise
Advanced Search
RSS
Webcasts
Whitepapers
Subscribe
Home




May 28, 2002

Prime Time for XML

Open up new application types with XML embedded in your relational databases

By Steve Tracy

Continued from Page 1

This XML text is stored directly in a single field in the customer dimension in exactly the format Listing 1 shows. It is extensible because you simply update the Customer Behavior Element for each period. By adding "cluster change," you can embed a comparison of the last added code to the current one. This DTD is reasonably straightforward and compact; you can store it in one cell in a row, as character data.

Obviously, in this form, in order to pose queries you need some assistance either from the relational database system or from the requesting query tool. All the major database vendors are moving aggressively to support XML data structures in their existing database systems. For example in Oracle 9i, the example column in the customer dimension is declared as an "XMLType," and Oracle's SQL is extended to allow querying arbitrary XML structures stored in this way. Microsoft approaches the problem in SQL Server by extending SQL-92 and by introducing an OPENXML row set. Access 2002 has an XML native data type — albeit a Microsoft version not quite in line with W3C standards. Finally, IBM's DB2 offers an "XML Extender," which allows storing XML documents either as BLOB-like objects or as decomposed into a set of tables.

You can gain somewhat more functionality and precision using the XML schema approach. XML schemas require a separate declaration of the database structure from the XML content. You could store the XML schema declaration offline as ordinary metadata. Or, for clarity, you might want to physically store the XML schema in a column adjacent to the data. You're using a few more characters, but you're still using only two columns in the customer dimension.

Regardless of the specific XML approach you use, you gain a lot of functionality. You can:

  • Explicitly state the number of periods, and easily change the value in the schema
  • Accommodate multiple periods without adding database table columns
  • Explicitly identify cluster changes and add new comparison metrics
  • Query with XML-aware SQL extensions directly from existing query tools
  • Query with SQL, as a character field, parsing the results to open text for further use
  • Combine our query results into other XML information external to the data warehouse
  • Present the information in a browser in a variety of formats with XSLT style sheets
  • Present complex information, without excessive proprietary semantic translation.

Text Goes to Graduate School

The foregoing changes make some improvements to how we handle text in the customer dimension, but to realize the full potential of XML in our DBMS-oriented world, we need XML to act like any other data type. If we query plain character data stored in XML format, we need a slightly different query language, and an extra step for parsing, or a special feature in the database to help us out.

Fortunately, as I described, the DBMS vendors are accommodating our requirement with support for XML as a native database data type. Leveraging this feature, we can open our information up to a wider audience:

  • Many new applications involving complex structured text can now be brought into the existing data warehouse world dominated by relational databases.
  • DBAs without extensive XML knowledge can work with their familiar SQL syntax and tools.
  • Application developers can access relational data through non-SQL methods, using XML query techniques such as XPath or XML-Query, designed to process hierarchical data by interpreting the relationships implicitly by position in the XML "document."
  • Using ETL tools with the ability to interpret XML files with definitions and turn them into segmented files for data mining, we derive even more value out of the structure.

But are We Ready?

Major new technology innovations seem to go though at least three phases. The XML revolution is a good example. The three phases are:

1) Breathless descriptions of the promised advantages but no vendor support for anything (yet)

2) Considerable vendor support for a stable standard, but each vendor's offering isn't quite there (yet)

3) Highly standardized industrywide support with many mature third- and fourth-generation products.



Rate This Article

Comments:

Optional e-mail address:

In my opinion, we're solidly in phase 2. That's why I'm comfortable recommending that you dive in to an XML application, maybe like Ralph's customer behavior example. The application benefits and the acquired learning you will gain today should easily be worth the investment.

The author wishes to thank his coworker, Bruce Casey, for his assistance with this column.


Guest columnist Steve Tracy (steve.tracy@hartfordlife.com) is an assistant director of information delivery for The Hartford Life Insurance Company and has built production data warehouses in the healthcare, retail, and environmental engineering sectors.


RESOURCES

For a good overview of XML terms relevant to the data warehouse, see the April 16, 2001 Data Webhouse column, "XML Will Make It Easier": www.intelligententerprise.com/010416/webhouse1_1.jhtml








IE Weekly Newsletter
Subscribe to the newsletter
    Email Address







InformationWeek Business Technology Network
InformationWeekInformationWeek 500InformationWeek 500 ConferenceInformationWeek AnalyticsInformationWeek CIO
InformationWeek EventsInformationWeek ReportsInformationWeek MagazinebMightyByte and SwitchDark Reading
Digital LibraryIntelligent EnterpriseInternet EvolutionNetwork ComputingNo Jitter
space
Techweb Events Network
InteropVoiceConWeb 2.0 ExpoWeb 2.0 SummitEnterprise 2.0 ConferenceMobile Business ExpoSoftware ConferenceCSI - Computer Security Institute
Black HatGTECEnergy CampMashup CampStartup Camp
space
Light Reading Communications Network
Light ReadingLight Reading EuropeUnstrungLight Reading's Cable Digital NewsConstantinopleInternet Evolution
Heavy ReadingLight Reading Live!Light Reading InsiderEthernet ExpoOptical ExpoTeleco TVTower Technology Summit
space
Financial Technology Network
Advanced TradingBank Systems & TechnologyInsurance & TechnologyWall Street & TechnologyAccelerating Wall StreetBank Systems & Technology Executive SummitBuyside Trading SummitInsurance & Technology Executive Summit
space
Microsoft Technology Network
MSDN MagazineTechNetThe Architecture Journal
space