Data Frontiers, by Curt Monash
Curt Monash runs Monash Research, which provides strategic, analysis-based advice to users and vendors of advanced information technology. He also writes the blogs DBMS2, Text Technologies, and Strategic Messaging.
Tradeoffs In Splitting DBMS Work Among MPP Nodes
I talk with lots of vendors of MPP data warehouse DBMS. I've now heard enough different approaches to MPP architecture that I think it might be interesting to contrast some of the alternatives.

The base-case MPP DBMS architecture is one in which there are two kinds of nodes: A boss node, whose jobs include:

In primitive forms of this architecture, there's a "fat head" that does altogether too much aggregation and query resolution. In more mature versions, data is shipped intelligently from worker nodes to their peers, reducing or eliminating "fat head" bottlenecks.

Exceptions to the base case include Vertica and Exasol. In their systems, all nodes run identical software. At the other extreme, some vendors use dedicated nodes for particular purposes. For example, Aster Data famously has special nodes for bulk data loading and export. Greenplum has a logical split between nodes that execute queries and nodes that talk to storage, and is considering offering the option of physically separating them in a future release.

The basic tradeoffs between these schemes go something like this:

• If there are more kinds of dedicated nodes, real-time load-balancing is harder; you're more likely to have idle capacity.

Calpont, which hasn't actually shipped a DBMS yet, has an interesting twist. They're building a columnar DBMS in which the querying work is split between a kind of worker node, which does the query processing, and a storage node, which talks to disk. These nodes are not in any kind of one-to-one correspondence; any worker node can talk with any storage node. Calpont believes that in the future some of the storage node logic can migrate into storage systems themselves, in almost a Netezza-like strategy, but on more standard equipment.

The Calpont story may actually make more sense in a shared-disk storage-area-network implementation than for a fully shared-nothing MPP, but that's a subject for a different post.
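To make the "fat head" point above concrete, here is a minimal sketch in Python, not any vendor's actual implementation. It compares a boss node that aggregates every raw row itself with a scheme in which workers pre-aggregate locally and ship partial results to the peer that owns each group key. The sample data, the function names (`fat_head`, `owner`, `peer_shuffle`), and the use of CRC32 to assign keys to peers are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict

# Hypothetical data: three worker nodes, each holding a shard of
# (region, amount) rows. The query is SUM(amount) GROUP BY region.
SHARDS = [
    [("east", 10), ("west", 5), ("east", 7)],
    [("west", 3), ("east", 1)],
    [("north", 4), ("west", 6), ("north", 2)],
]


def fat_head(shards):
    """Workers ship every raw row; the boss does all the aggregation."""
    totals = defaultdict(int)
    rows_at_boss = 0
    for shard in shards:
        for key, amount in shard:
            totals[key] += amount
            rows_at_boss += 1
    return dict(totals), rows_at_boss


def owner(key, n_workers):
    """Pick the peer that owns a group key (illustrative assignment rule).

    CRC32 is used only because it gives the same answer on every run.
    """
    return zlib.crc32(key.encode("utf-8")) % n_workers


def peer_shuffle(shards):
    """Workers pre-aggregate, then ship partial sums to the owning peer."""
    n = len(shards)
    inboxes = [defaultdict(int) for _ in range(n)]  # one inbox per worker
    for shard in shards:
        # Step 1: each worker aggregates its own shard locally.
        local = defaultdict(int)
        for key, amount in shard:
            local[key] += amount
        # Step 2: each partial sum goes to the peer that owns its key.
        for key, subtotal in local.items():
            inboxes[owner(key, n)][key] += subtotal
    # Step 3: the boss only concatenates finished rows; it never adds.
    totals = {}
    rows_at_boss = 0
    for inbox in inboxes:
        for key, total in inbox.items():
            totals[key] = total
            rows_at_boss += 1
    return totals, rows_at_boss


if __name__ == "__main__":
    print(fat_head(SHARDS))
    print(peer_shuffle(SHARDS))
```

Both functions return the same totals. On this toy data the boss handles 8 raw rows in the fat-head version and only 3 finished rows (one per region) in the peer-shuffle version. In a real system the gap grows with data volume, since the boss's work then depends on the number of distinct groups rather than the number of rows.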