Data Frontiers, by Curt Monash

Curt Monash runs Monash Research, which provides strategic, analysis-based advice to users and vendors of advanced information technology. He also writes the blogs DBMS2, Text Technologies, and Strategic Messaging. Write him at contact@monash.com

Google Announces Fusion Tables
Google has announced an experimental cloud-based data management system called Fusion Tables. A press article and Slashdot thread ensued, based on some bizarre-sounding analyst quotes that I will not attempt to parse.
>>Continue reading "Google Announces Fusion Tables"
Posted Monday, June 15, 2009 9:10 AM
>>Comments

Greenplum's Announcement and the Future of Data Marts
Greenplum is announcing today a long-term vision, under the name Enterprise Data Cloud (EDC). Key observations around the concept -- mixing mine and Greenplum's together -- include:
In essence, Greenplum is pitching this story:
>>Continue reading "Greenplum's Announcement and the Future of Data Marts"
Posted Monday, June 8, 2009 9:20 AM
>>Comments

Reinventing Business Intelligence
I've felt for quite a while that business intelligence tools are due for a revolution. But I've found the subject daunting to write about because -- well, because it's so multifaceted and big. So to break that logjam, here are some thoughts on the reinvention of business intelligence technology, with no pretense of being in any way comprehensive.

Natural language and classic science fiction
Actually, there's a pretty well-known example of BI near-perfection -- the Star Trek computers, usually voiced by the late Majel Barrett Roddenberry. They didn't have a big role in the recent movie, which was so fast-paced nobody had time to analyze very much, but were a big part of the Star Trek universe overall. Star Trek's computers integrated analytics, operations, and authentication, all with a great natural language/voice interface and visual displays. That example is at the heart of a 1998 article on natural language recognition I just re-posted.
>>Continue reading "Reinventing Business Intelligence"
Posted Tuesday, June 2, 2009 9:41 AM
>>Comments

More on MySQL Forks and Storage Engines
The issue of MySQL forks and their possible effect on closed-source storage engine vendors continues to get attention. The underlying question is: Suppose Oracle wants to make life difficult for third-party storage engine vendors via its incipient control of MySQL? Can the storage engine vendors insulate themselves from this risk by working with a MySQL fork? As laid out most clearly in a comment thread to a previous post*, Mike Hogan (CEO of ScaleDB) believes closed-source storage engine vendors can use a MySQL fork without running afoul of the GPL.
In a nutshell, what he proposes is an in-between layer of software, itself open-sourced, that on one side interfaces with MySQL, and on the other side talks cleanly enough to storage engines that it doesn't infect them with the GPL.
>>Continue reading "More on MySQL Forks and Storage Engines"
Posted Tuesday, May 26, 2009 9:30 AM
>>Comments

The Real Story on IBM's System S Release
IBM hastily announced System S Streams this week, a product that was supposed to be called InfoSphere Streams and introduced only in 2010. Apparently, the rush is because senior management wanted to talk about it later this week, and perhaps also because it was implicitly baked into some of IBM's advertising already. Scrambling ensued. Even so, Jeff Jones and team got to me fast, and briefed me -- fairly non-technically, unfortunately, but otherwise how I like it, namely on a harmless embargo and without any NDAs.

Microsoft also introduced CEP this week. Perhaps it is more than coincidence that IBM rushed out its own announcement of an immature CEP technology immediately after Microsoft revealed its plans. Taken together, these announcements support my theory that the small independent CEP/stream processing vendors are more or less ceding broad parts of the potential stream processing market.
>>Continue reading "The Real Story on IBM's System S Release"
Posted Friday, May 15, 2009 9:53 AM
>>Comments

eBay's Enormous Data Warehouses Detailed
A few weeks ago, I had the chance to visit eBay, meet briefly with Oliver Ratzesberger and his team, and then catch up later with Oliver for dinner. I've already alluded to those discussions in a couple of posts, specifically on MapReduce (which eBay doesn't like) and the astonishingly great difference between high- and low-end disk drives (to which eBay clued me in). Now I'm finally getting around to writing about the core of what we discussed, which is two of the very largest data warehouses in the world.
>>Continue reading "eBay's Enormous Data Warehouses Detailed"
Posted Friday, May 1, 2009 9:38 AM
>>Comments

It's Time to Strengthen MySQL Forkers
As my first three posts on the Oracle/Sun merger suggested, I think Oracle will do a better job with MySQL product development than Sun has. But of course that's a low hurdle. And so it leaves open the questions: What should and/or will be the most widely adopted code lines of MySQL (or other open source DBMS), especially for the types of users and vendors who are engaged with MySQL (as opposed to principal alternative PostgreSQL) today?
>>Continue reading "It's Time to Strengthen MySQL Forkers"
Posted Tuesday, April 21, 2009 11:51 AM
>>Comments

First Thoughts on Oracle Acquiring Sun
>>Continue reading "First Thoughts on Oracle Acquiring Sun"
Posted Monday, April 20, 2009 11:21 AM
>>Comments

Notes On Inforsense, Tableau, Jaspersoft and More
I keep not finding the time to write as much about business intelligence as I'd like to. So I'm going to do one omnibus post here covering a lot of companies and trends, then circle back in more detail when I can. Top-level highlights include:
>>Continue reading "Notes On Inforsense, Tableau, Jaspersoft and More"
Posted Monday, April 6, 2009 10:27 AM
>>Comments

SAS Enters Its Own Cloud
The Register has a fairly detailed article about SAS expanding its cloud/SaaS offerings. I disagree with one part, namely:

SAS may not have a choice but to build its own cloud. Given the sensitive nature of the data its customers analyze, moving that data out to a public cloud such as the Amazon EC2 and S3 combo is just not going to happen. And even if rugged security could make customers comfortable with that idea, moving large data sets into clouds (as Sun Microsystems discovered with the Sun Grid) is problematic. Even if you can parallelize the uploads of large data sets, it takes time. But if you run the applications locally in the SAS cloud, then doing further analysis on that data is no big deal. It's all on the same SAN anyway, locked down locally just as you would do in your own data center.

I fail to see why SAS's campus would be better than leading hosting companies' data centers for either data privacy/security or data upload speed. Rather, I think major reasons for SAS building its own data center for cloud computing probably focus on:
>>Continue reading "SAS Enters Its Own Cloud"
Posted Wednesday, March 25, 2009 9:48 AM
>>Comments

Database Implications if IBM Acquires Sun
Reported or rumored merger discussions between IBM and Sun are generating huge amounts of discussion (some links below). Here are some quick thoughts on how the IBM/Sun deal, if it happens, might affect the database management system industry.
>>Continue reading "Database Implications if IBM Acquires Sun"
Posted Thursday, March 19, 2009 10:14 AM
>>Comments

Complex Event Processing Vendors Flounder
Independent CEP (Complex/Event Processing) vendors continue to flounder, at least outside the financial services and national intelligence markets.
>>Continue reading "Complex Event Processing Vendors Flounder"
Posted Wednesday, March 18, 2009 10:11 AM
>>Comments

Quick Take on Microsoft SQL Server Fast Track
Stuart Frost of Microsoft (né DATAllegro) checked in, with Microsoft's TDWI-timed announcements. The news part was something called "SQL Server Fast Track," which is the Microsoft SQL Server equivalent to Oracle's "recommended configurations" or IBM's "BCUs." SQL Server Fast Track is further being portrayed as an incremental step toward Madison, Microsoft's future high-end data warehousing offering.
>>Continue reading "Quick Take on Microsoft SQL Server Fast Track"
Posted Monday, February 23, 2009 3:31 PM
>>Comments

Analytics' Role in a Frightening Economy
I chatted the other day with an executive on the general business side (as opposed to the trading operation) of a household-name brokerage firm, one that's in no immediate financial peril. It seems their #1 analytic-technology priority right now is changing planning from an annual to a monthly cycle.* That's a smart idea. While it's especially important in their business, larger enterprises of all kinds should consider following suit.

*By the way, they seem to want to use Applix technology, now owned by IBM/Cognos, to do it, more for the planning tools than for the cool in-memory OLAP engine itself. Your mileage may vary.
>>Continue reading "Analytics' Role in a Frightening Economy"
Posted Monday, February 9, 2009 2:41 PM
>>Comments

Why BI is in a Funk
I wrote recently that BI is in a "funk." Let me now offer a few ideas as to why that is so.

1. At its heart, BI is an application development technology, and making money from innovating in development is hard. To quote myself:

Products are obsolete before they [are] mature. Products commonly do only part of what is necessary. Generally, a new tool will be developed to help with a new need... But these tools will often be weak at what came before...
By the time the shiny new tools mature to do a good job at the older requirements, some other... shift comes along, with yet newer and shinier tools to handle the latest twists.
>>Continue reading "Why BI is in a Funk"
Posted Friday, January 30, 2009 11:42 AM
>>Comments

Don't Let Gartner's Data Warehouse Magic Quadrant Confuse You
Gartner's latest Magic Quadrant for data warehouse DBMSs was published late last year. Thankfully, vendors don't seem to be taking it as seriously as usual, so I didn't immediately hear about it. (I finally noticed it in a Greenplum pay-per-click ad.) Links to Gartner MQs tend to come and go, but as of now here are two working links to the 2008 Gartner Data Warehouse Database Management System MQ. My posts on the 2007 and 2006 MQs have also been updated with working links.

Highlights of this year's data warehouse DBMS Magic Quadrant include:
>>Continue reading "Don't Let Gartner's Data Warehouse Magic Quadrant Confuse You"
Posted Wednesday, January 21, 2009 10:50 AM
>>Comments

How to Buy an Analytic DBMS
I went to London for a couple of days recently, at the behest of Kognitio. Since I was in the neighborhood anyway, I visited their offices for a briefing. But the main driver for the trip was a seminar Thursday at which I was the featured speaker. As promised, the slides have been uploaded here.

The material covered on the first 13 slides should be very familiar to readers of this blog. I touched on database diversity and the disk-speed barrier, after which I zoomed through a quick survey of the data warehouse DBMS market. But then I turned to material I've been working on more recently -- practical advice directly on the subject of how to buy an analytic DBMS.
I started by proposing a seven-part segmentation self-assessment:
>>Continue reading "How to Buy an Analytic DBMS"
Posted Monday, December 22, 2008 5:54 AM
>>Comments

Hot Topics in High-Performance Analytics
For the past few months, I've collected a lot of data points to the effect that high-performance analytics -- i.e., beyond straightforward query -- is becoming increasingly important. And I've written about some of them at length. For example:
Ack. I can't decide whether "analytics" should be a singular or plural noun. Thoughts?

Another area that's come up which I haven't blogged about so much is data mining in the database. Data mining accounts for a large part of data warehouse use. The traditional way to do data mining is to extract data from the database and dump it into SAS. But there are problems with this scenario, including:
>>Continue reading "Hot Topics in High-Performance Analytics"
Posted Monday, November 17, 2008 10:03 AM
>>Comments

Getting to Answers on Oracle's New Hardware
I spent about six hours at Oracle last week talking with Andy Mendelsohn, Ray Roccaforte, Juan Loaiza, Cetin Ozbutun, et al. and plan to write more later. For now, let me pass along a few quick comments.

The key philosophical point that I had perhaps been missing is that Oracle thinks there is and should be a storage (server) tier, just as there also are database (server), application (server), and web (server) tiers. Exadata cells are designed to never talk with each other. Instead, they talk to a set of InfiniBand switches, which then talk to a grid of servers on the database tier. Oracle thinks this has solved its I/O bandwidth problem once and for all. It's hard to see why that wouldn't be the case.

What Exadata does on the storage tier in query execution is throw stuff away. Mainly, this is projection and restriction/SELECT. But if a join has been resolved on a small fact table, and Oracle is now filtering a fact table to match a value or set of values, the storage tier can do that too.
>>Continue reading "Getting to Answers on Oracle's New Hardware"
Posted Wednesday, October 22, 2008 12:27 PM
>>Comments

A Quick Guide to Teradata's Latest News
The Teradata Partners (i.e., user) conference is this week. So there have been lots of press releases, some presentations, lots of meetings, and so on.
A lot of Teradata's messaging is in flux, as it moves fairly rapidly to correct what I believe have been some deficiencies in the past. One confusing result is that there was very little prebriefing about the actual announcement details, and we're all scrambling to figure out what's up.

Teradata does a good job of collecting its press releases at one URL. So without linking to most of them individually, let me jump in to an overview of Teradata news this week (whether or not in actual press release format):
>>Continue reading "A Quick Guide to Teradata's Latest News"
Posted Tuesday, October 14, 2008 11:51 AM
>>Comments

HP-Oracle Appliance Prices Estimated
I've been trying to figure out how much the HP-Oracle Database Machine and HP-Oracle Exadata Storage Server actually cost. My first estimate was $58-190K/TB (user data), but I've since updated my pricing spreadsheet. Specifically:

The first page of these estimates has been modestly altered to reflect more chargeable software options, as per the discussion below.
>>Continue reading "HP-Oracle Appliance Prices Estimated"
Posted Friday, October 3, 2008 1:11 PM
>>Comments

HP-Oracle Hardware Parallelization Clarified
Some kind Oracle development managers have reached out and helped me better understand where Oracle does or doesn't stand in query and analytic parallelization. Let's start with the part everybody pretty much knows already:

There are two parts to a parallelization story -- how you get data off of disk, and what you do with it once you have it.
>>Continue reading "HP-Oracle Hardware Parallelization Clarified"
Posted Wednesday, October 1, 2008 11:49 AM
>>Comments

Oracle Finally Answers Data Warehouse Challengers
Oracle, in partnership with HP, has announced a new data warehouse appliance product line, cleverly branded "Exadata."
The basic idea seems to be that database processing is split among two sets of servers:

(The new stuff) A set of back-end servers -- the Oracle Exadata Storage Servers -- that gets data off of disk and does some preliminary query processing.

Numbers are being thrown around suggesting that, unlike prior Oracle offerings, the Exadata-based appliance at least has scalability and price/performance worth comparing to Teradata -- hey, Exa is bigger than Tera! -- Netezza, et al.
>>Continue reading "Oracle Finally Answers Data Warehouse Challengers"
Posted Thursday, September 25, 2008 1:49 AM
>>Comments

Vertica Spells Out Compression Claims
Omer Trajman of column-store DBMS vendor Vertica put up a must-read blog spelling out detailed compression numbers, based on actual field experience (which I'd guess is from a combination of production systems and POCs):
>>Continue reading "Vertica Spells Out Compression Claims"
Posted Wednesday, September 24, 2008 12:54 PM
>>Comments

Infobright Open Source Move Packs Potential
Infobright announced today that it's going full-bore into open source -- specifically in the MySQL ecosystem -- with the licensing approach, pricing, distribution strategy, and VC money from Sun that such a move naturally entails. I think this is a great idea, for a number of reasons:
>>Continue reading "Infobright Open Source Move Packs Potential"
Posted Monday, September 15, 2008 11:16 AM
>>Comments

Tradeoffs In Splitting DBMS Work Among MPP Nodes
I talk with lots of vendors of MPP data warehouse DBMS. I've now heard enough different approaches to MPP architecture that I think it might be interesting to contrast some of the alternatives.
The base-case MPP DBMS architecture is one in which there are two kinds of nodes:

A boss node, whose jobs include:
>>Continue reading "Tradeoffs In Splitting DBMS Work Among MPP Nodes"
Posted Tuesday, September 9, 2008 12:16 PM
>>Comments

Why MapReduce Matters to SQL Data Warehousing
Greenplum and Aster Data have both just announced the integration of MapReduce into their SQL MPP data warehouse products. So why do I think this could be a big deal? The short answer is "Because MapReduce offers dramatic performance gains in analytic application areas that still need great performance speed-up." The long answer goes something like this.

The core ideas of MapReduce are:
>>Continue reading "Why MapReduce Matters to SQL Data Warehousing"
Posted Thursday, August 28, 2008 8:53 AM
>>Comments

David Raab Offers Kudos for QlikView
David Raab is a great fan and former reseller of QlikTech's QlikView. His recent lengthy post about the product (I hesitate to call it "detailed" only because he rightly observes that QlikTech is in fact stingy with technical detail) is positive enough to have been recommended by the company itself. Specifically, it was cited in the comment thread to my recent post on QlikTech, where David himself also addressed some of my questions.

But of course, no technology is perfect, not even one as great as David thinks QlikView is.
>>Continue reading "David Raab Offers Kudos for QlikView"
Posted Monday, August 25, 2008 8:18 AM
>>Comments

When to Use Modern DBMS Alternatives
If there's one central theme in my DBMS2 blog, it's that modern database management system alternatives should in many cases be used instead of the traditional market leaders. So it was only a matter of time before somebody sponsored a white paper on that subject. The paper, sponsored by EnterpriseDB (disclosure noted), is now posted along with my other recent white papers.
Its conclusion -- summarizing what kinds of database management system you should use in which circumstances -- is reproduced below.

Many new applications are built on existing databases, adding new features to already-operating systems. But others are built in connection with truly new databases. And in the latter cases, it's rare that a market-leading product is the best choice. Mid-range DBMS (for OLTP) or specialty data warehousing systems (for analytics) are usually just as capable, and much more cost-effective. Exceptions arise mainly in three kinds of cases:
>>Continue reading "When to Use Modern DBMS Alternatives"
Posted Thursday, August 21, 2008 8:13 AM
>>Comments

Comparing Vertica, ParAccel and Exasol
I talked with executives at Nuremberg, Germany-based Exasol last week -- at 5:00 am ET! -- and of course want to blog about it. For clarity, I'd like to start by comparing/contrasting the fundamental data structures at Vertica, ParAccel, and Exasol. And it feels like that should be a separate post. So here goes.
>>Continue reading "Comparing Vertica, ParAccel and Exasol"
Posted Tuesday, August 19, 2008 9:00 AM
>>Comments

Patent Nonsense in the Data Warehouse DBMS Market
There are two recent patent lawsuits in the data warehouse DBMS market. In one, Sybase is suing Vertica. In another, an individual named Cary Jardin (techie founder of XPrime, a sort of predecessor company to ParAccel) is suing DATAllegro. Naturally, there's press coverage of the DATAllegro case, due in part to its surely non-coincidental timing -- right after the Microsoft acquisition was announced -- and in part to a vigorous PR campaign around it. And the Sybase case so excited one troll that he posted identical references to it on about 12 different threads in this blog, as well as to a variety of Vertica-related articles in the online trade press. But I think it's very unlikely that either of these cases turns out to matter much.
>>Continue reading "Patent Nonsense in the Data Warehouse DBMS Market"
Posted Friday, August 15, 2008 10:15 AM
>>Comments