CMP -- United Business Media

Intelligent Enterprise

Better Insight for Business Decisions

Part of the TechWeb Network
October 30, 2003

In this Issue:

  • Very Large Data Warehouses
  • What's the Utility?
  • In Brief

    Very Large Data Warehouses

    Teradata And Oracle Execs Weigh In

    In separate, ostensibly unrelated conversations with senior editor Jeanette Burriesci, executive technologists from two major database companies addressed some of the same forces in the data warehouse marketplace. The topics most on their minds overlapped, including the Oracle 10g announcements at the just-concluded OracleWorld conference, but their opinions of Oracle's technology certainly did not. Depending on your perspective, you might say that Teradata feels defensive (protective of its VLDB territory) or that Oracle is overconfident.

    As the typical scale of data warehouses rapidly increases, in terms of the volume and variety of data and the number and variety of users, database vendors are eager to give customers ways to save on incremental costs.

    The statements that follow come from John Entenmann, VP of Business Intelligence at Oracle, and Stephen Brobst, CTO of Teradata.

    On Grid Computing and BI

    Brobst and Entenmann agree that grid computing is a sound concept for using computing capacity more efficiently, but they disagree about its readiness.

    Entenmann: The grid technology [in version 10g of the Oracle Database, due out this year] applies to BI as much as it does anywhere else. To give you an example of what typically happens when people deal with BI and a transactional system: A lot of times, people will start with a transactional system and they'll enter all their orders there — building up their information. Then they'll want to drill in... so they'll run a BI system against it. The problem is, when they start running those queries in the BI system, it loads the system that they're also using to enter orders. This is not good. So what you've got to do is break this information out into another environment — like a data warehouse. Both [systems] are their own computer. That can work, but typically you'll find you're getting variable loads on both — yet you've configured for the maximum load on each. And so BI would be a classic example of where you can use the grid to get the power to do BI when you need to do BI — when there's a query running, or at night when you're doing your build — and not have to firewall or reserve the resources strictly for that function. So it's one of the better examples of how the grid can be used — to do things on a timed basis [saving money on hardware costs].
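    Entenmann's sizing argument can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: the hourly load figures are invented, the units are arbitrary, and the function names are hypothetical. It compares sizing each system for its own peak against sizing one shared pool for the combined peak.

```python
# Hypothetical hourly load profiles (arbitrary CPU units), not measured data.
# Order entry peaks in business hours; BI peaks during the nightly build.
oltp_load = [2, 2, 2, 2, 2, 3, 5, 8, 10, 10, 9, 9,
             10, 10, 9, 8, 6, 4, 3, 2, 2, 2, 2, 2]
bi_load = [9, 10, 10, 9, 6, 3, 2, 2, 3, 4, 4, 3,
           3, 4, 4, 3, 2, 2, 2, 3, 5, 7, 8, 9]


def dedicated_capacity(*profiles):
    """Capacity needed when each system is sized for its own peak."""
    return sum(max(profile) for profile in profiles)


def pooled_capacity(*profiles):
    """Capacity needed when all systems draw from one shared pool."""
    return max(sum(hour) for hour in zip(*profiles))


dedicated = dedicated_capacity(oltp_load, bi_load)
pooled = pooled_capacity(oltp_load, bi_load)
print(f"dedicated: {dedicated} units, pooled: {pooled} units")
```

    With these made-up profiles the pooled figure is lower because the two peaks fall at different hours. The sketch counts compute capacity only and says nothing about I/O or data movement.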

    Consolidation Corner

    Rivals Ascential and Informatica Expand. Ascential Software Corp. has acquired enterprise application integrator Mercator Software Inc. "The acquisition of Mercator further expands Ascential's Enterprise Data Integration Suite and creates the industry's first and most comprehensive data integration product set for transactional, operational, and analytical requirements, regardless of data volumes or latency. This uniquely enables our customers to apply data integration solutions pervasively throughout the enterprise," said Peter Gyenes, Chairman and CEO of Ascential.

    Ascential rival Informatica, meanwhile, will acquire Striva Corp. Striva provides patented mainframe integration solutions that Informatica has sold under an OEM license for more than two years. Informatica states that market reports show 90 percent of Fortune 1000 companies still use mainframes with mission-critical applications, and more than 70 percent of corporate data is stored on mainframes.

    Informatica and Ascential are both major players in the extract, transform, load (ETL) market. Informatica backed away this year from a strategy to expand into the higher-level market of analytic applications, and its expansion into the mainframe space represents a more pragmatic strategy. Ascential's acquisition of Mercator moves only slightly above ETL, into the application integration space.

    Vignette Gets Intraspect. Vignette Corp. entered into a definitive agreement to acquire privately held Intraspect Software Inc. Vignette states that combining its content management and portal products with Intraspect's collaboration offerings (such as workflow and content life-cycle management) will provide customers with the industry's most advanced enterprise software foundation. Many vendors, such as BEA and IBM, would probably disagree.

    Brobst: We have R&D efforts looking at the grid computing technologies. But to date, grid computing has been more marketing hype than how we [Teradata] would define good computing, [which] relates to the ability to farm out the resources to a collection of computing agents. And in a large-scale database environment, the issue isn't farming out the computing cycles. The issue is providing access to the data. I mean, in the Teradata environment, it's not unusual for people, from an I/O throughput perspective, to move hundreds of terabytes per day across their I/O subsystem. In a grid computing environment, the infrastructure isn't yet mature enough to handle that kind of data movement. We don't believe that the technology is mature enough for robust products at the scale that our customers demand. When I say scale I also mean availability and reliability and those kinds of things.

    The I/O subsystems and the availability to guarantee service levels for computing in the grid [lie outside the DBMS vendor's control]. In a parallel computing environment, you've got many, many agents — what we call AMPs, or access module processors. These are computing agents. We've got thousands of them all working together. Whoever is the slowest, slows down everybody. And so it's very important to have guaranteed service levels for delivering the computing cycles as well as the I/O capability and underlying system. Those are not ready for prime time quite yet...

    It's hard to predict [when grid computing technology will accommodate large-scale BI efforts], but [it will be] years, not months [from now].
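    Brobst's remark that the slowest agent slows down everybody can be made concrete with a toy model. This is a minimal sketch, not a description of how Teradata's AMPs work: it assumes the data is split evenly, each agent's speed is a single number, and a parallel step finishes only when every agent has finished. All names and figures are hypothetical.

```python
def query_time(work_per_agent, agent_speeds):
    """Elapsed time of a parallel step: the slowest agent sets the pace."""
    return max(work_per_agent / speed for speed in agent_speeds)


agents = 1000
uniform = [1.0] * agents                  # every agent at full speed
one_slow = [1.0] * (agents - 1) + [0.5]   # a single agent at half speed

baseline = query_time(100.0, uniform)
degraded = query_time(100.0, one_slow)
capacity_lost = 1 - sum(one_slow) / sum(uniform)

print(f"baseline: {baseline}, degraded: {degraded}")
print(f"aggregate capacity lost: {capacity_lost:.2%}")
```

    In this model a single half-speed agent removes a tiny fraction of total capacity yet doubles the elapsed time, which is one way to read the emphasis on guaranteed service levels for every agent.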

    On Reducing Incremental Costs of Ever-Expanding Data Warehouses

    Both Brobst and Entenmann recognize maintenance as the most expensive component of the total cost of a database system. Brobst criticizes Oracle's state of development with respect to automated management. Entenmann describes Oracle's efforts to address this need.

    Brobst: ... We believe that "database management system" means the database should manage the system, not the human. If you look at databases historically, the number of DBAs necessary to manage the database, particularly for storage management and so on, increased nearly linearly with the size of the database. Now imagine what that means if you've got petabyte-scale databases. That's pretty unacceptable, because that means you have an army of DBAs to manage the environment. So from a scalable database technology perspective, if you focus on this area of management, you have to eliminate tablespaces. You have to eliminate extents, you have to eliminate defrag commands and reorg commands. Those things just have to be gone.

    To get to the other dimensions of scalability, and get to specifics, I talked about this notion of building petabyte-scale databases. Well, you need to provide access not to just one query at a time, running batch queries. You need to be able to have load-management capabilities that allow thousands of queries to be running.

    Ultimately, the databases have to be smart enough to manage those resources. And to date, three databases have done it. The others either need to do it or relegate themselves to second-tier status. The Tandem NonStop SQL database has had pretty sophisticated workload management for quite a while. DB2 on the mainframe has had workload management for quite a while, as has Teradata. But, for example, DB2 on Unix doesn't have this capability. Nor does Oracle in any meaningful way. And when I say "meaningful," I mean you're doing dynamic resource allocation, so readjusting resources allocated for queries, dynamically, based on what the current workload is [that's] running on the system. In Oracle (the new version that was just announced last week) they started introducing some of these concepts, but they're quite a bit behind. And DB2 on Unix seems to almost be ignoring it.
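    The idea of dynamic resource allocation Brobst describes, readjusting what each query gets as the workload changes, can be sketched in a few lines. This is a toy illustration only: the `Query` class, the priority weights, and the `allocate` function are hypothetical and do not represent any vendor's workload manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    name: str
    weight: int  # hypothetical priority weight; higher means a larger share


def allocate(total_units, running):
    """Split a fixed pool of resource units across running queries by weight."""
    total_weight = sum(query.weight for query in running)
    if total_weight == 0:
        return {}
    return {q.name: total_units * q.weight / total_weight for q in running}


running = [Query("nightly_build", 1)]
alone = allocate(100, running)       # a lone query holds the whole pool

running.append(Query("tactical_lookup", 3))
shared = allocate(100, running)      # shares are recomputed for the new workload

print(alone)
print(shared)
```

    The point of the sketch is that shares are recomputed from the current set of running queries rather than fixed in advance. A real workload manager would also have to handle admission control, I/O, and memory, which this example leaves out.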

    Entenmann: In my conversations with BI customers ... It's a very fast-growing space and there's a lot of success there, but the biggest concern or the biggest problem they face isn't setting up the system. It isn't the features of the system. It's keeping it running.... It's not only common; it's universal that the BI system gets destabilized constantly. We've done a lot with Warehouse Builder [an ETL design and management tool] to ... automatically reconcile changes [in source systems] and keep transformations working. I haven't seen [all ETL products], but I believe that we're farthest ahead in that space.

    With 10g, the database itself also has a great wealth of auto-configuration and auto-tuning features. So the amount of attention that you have to spend on the warehouse has gone down because the database requires less management.

    It's part of this overall message of keeping things easier to manage (no management, if possible) and reducing what is probably the biggest cost in any deployment. It isn't the software license; it isn't the hardware; it's the people needed to run the system. So we're making changes to make that possible without reducing the functionality and power. Usually in the industry you get this kind of tradeoff: Some people can make things very simple, but in the process they greatly reduce the functionality, to the point where you can't do much with it. On the other side of it, people give you great power, but it's so complex you need to hire a team of people to get it going. Oracle has great power, among the best in the industry, but now we're making it simple across the board.
