Contents Under Pressure
An examination of the world's largest and most heavily used databases unveils a wealth of information about current scalability challenges.
By Richard Winter and Kathy Auerbach
April 28, 2004
When it comes to databases, popularity breeds challenges. Even as they amass more data, both OLTP and decision-support systems must get faster and more friendly to intense query activity. In a report from the only survey of its kind, we learn how key trends are shaping choices in technology.
It's a miracle the lights don't dim, the earth doesn't shake, and the sun isn't occasionally darkened by data clouds erupting from a system that just can't take it anymore. Across the globe, every minute of every day, enormous databases are running thousands, if not millions, of simultaneous operations to support online transaction processing (OLTP). Meanwhile, other databases take on massive data warehouses as they respond to the complex number-crunching needs of strategic business applications and their users.
Especially for the largest database-management systems, scalability is forever a work in progress. Boundaries are moving up and out constantly. For some, scalability is about the number of users, simultaneous queries or transactions, and query speed; for others, it is about the sheer amount of raw data the system must store and manage. And for many, it is about all of the above.
For the past 10 years, Winter Corp. has been tracking the world's largest and most heavily used databases. We have developed quantitative insights concerning the demographics, operating characteristics, and practices used to develop and manage large databases. In this article, we discuss trends and directions in large databases, as revealed by our most recent survey (the 2003 TopTen Program, conducted from May to October 2003) as well as by other surveys and interactions with user organizations.
The TopTen Program received more than 300 survey results, which originated from 23 countries. Qualifying Windows-based databases had to contain more than 500 GB of data, while databases on all other platforms had to exceed 1 TB. Note that we define database size as the sum of user data, summaries, aggregates, and indexes, not the complement of attached disk. We required respondents to validate their entries by running queries developed by Winter Corp. and associated industry experts. We sorted the entry databases by usage and operating system, and assessed them according to four metrics: size, uncompressed data volume, number of rows/records, and workload. Based on these metrics, we developed category lists of the world's leading databases, along with data points that we are presenting in this article.
Steady Growth
The most compelling finding from the data is the growth in database size and workload. This growth has several dimensions, the most visible and perhaps most daunting of which is size. Large databases may contain thousands of tables and support as many users, who are producing and demanding access to more data than ever before. These users are scattered across the enterprise in geographically diverse locations, and their numbers are rising steadily. Database management software is expected to process data volumes that dwarf what it handled previously.
Our research shows that growth in database size is accelerating. Figure 1 shows the range in size of the 10 largest decision-support and transaction-processing databases in the 2003 and 2001 Winter Corp. surveys. In the two years since the 2001 program, the largest transaction-processing database almost doubled in size, from 10.5 to 18.3 TB. On the decision-support system (DSS) side, the 10th largest database in 2003 is nearly the size of the biggest database in the 2001 program. And the largest decision-support database in 2003, at 29.2 TB, is almost triple the size of the 2001 leader.
FIGURE 1 In two years, the largest databases have increased in size two- and threefold.
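To put the transaction-processing figures in perspective, the short Python sketch below works through the arithmetic for the 10.5 TB and 18.3 TB sizes quoted above. The compound annual rate is our own illustrative framing, not a metric reported by the survey, and it assumes exactly two years elapsed between the 2001 and 2003 measurements. The function names are hypothetical.

```python
def growth_factor(old_size, new_size):
    """Return how many times larger new_size is than old_size."""
    return new_size / old_size


def annualized_growth_rate(old_size, new_size, years):
    """Return the compound annual growth rate implied by the two sizes.

    Assumes growth compounds evenly over the period, which is a
    simplification made for illustration only.
    """
    return (new_size / old_size) ** (1.0 / years) - 1.0


# Sizes of the largest transaction-processing database, as quoted in the text.
oltp_2001_tb = 10.5
oltp_2003_tb = 18.3

factor = growth_factor(oltp_2001_tb, oltp_2003_tb)
# Assumption: exactly two years between the two survey measurements.
rate = annualized_growth_rate(oltp_2001_tb, oltp_2003_tb, years=2)

print(f"Growth factor over the period: {factor:.2f}x")
print(f"Implied annual growth rate: {rate:.1%}")
```

Under these assumptions, the largest transaction-processing database grew by a factor of roughly 1.74, which corresponds to growth of about 32 percent per year.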
Increases in row count provide another example of the growth in database scale. Figure 2 shows the growth in number of rows over the past two years. Row counts on every platform, for both decision support and transaction processing, have grown significantly since 2001. Unix DSS databases experienced the greatest growth. Their average number of rows grew sixfold, propelling them past transaction-processing z/OS systems to become the new champions of row count.
FIGURE 2 The Unix DSS database average is now almost 40 billion rows.