Breakthrough Analysis, by Seth Grimes
Seth Grimes is an analytics strategist with Washington, DC-based Alta Plana Corporation. He consults on data management and analysis systems.

TDWI Selection Bias: It Depends Whom You Ask
The saying "There are three kinds of lies: lies, damned lies, and statistics" is attributed to Benjamin Disraeli, and it's nicely illustrated in a couple of Intelligent Enterprise reality-check photos from this week's TDWI conference. Check out the TDWI Image Gallery photos posted by Intelligent Enterprise Editor-in-Chief Doug Henschen. Executive Summit attendees used the "dots method" to identify Important BI Technologies and Biggest BI Challenges. You get simple histograms showing what's hot and what's not. And you get clear illustrations of "selection bias": not TDWI's fault, but an effect to keep in mind when you assess formal and informal research findings.

The Important BI Technologies results: 1 Dashboards/Scorecards; 2 Predictive Analytics; 3 Operational BI; and 4 BI Portals. My favorite topic, Text Analytics, is way down the list in a third tier. That's the reality check. (I hope this won't jeopardize the Text Analytics session planned for the August TDWI Executive Summit.) Open Source BI is even lower on the list.

These results aren't lies; no, they're statistics. Selection bias comes into play because a BI-themed session will attract folks grappling with technologies and solutions that are, well, in the solid BI mainstream. If you're out in front looking at text analytics, a BI-themed summit is probably not the place for you. Similarly, as I have reported, it's Java developers and not IT execs who are most interested in Open Source BI. It's a good guess that text-analytics types and OSBI types tend to self-select away from TDWI.

TDWI recognizes this effect. Philip Russom's 2007 report, BI Search and Text Analytics: New Additions to the BI Technology Stack, includes the explanation:

In an Internet survey conducted in late 2006, TDWI asked each respondent to estimate "the approximate percentages for structured, semi-structured, and unstructured data across your entire organization."

Averaging the responses to the survey puts structured data in first place at 47%, trailed by unstructured (31%) and semi-structured data (22%). Even if we fold semi-structured data into the unstructured data category, the sum (53%) falls far short of the 80-85% mark claimed by other research organizations. The discrepancy is probably due to the fact that TDWI surveyed data management professionals who deal mostly with structured data and rarely with unstructured data. All survey populations have a bias, as this one does from daily exposure to structured data.

You'll find similar, similarly understandable examples of selection bias in the Biggest BI Challenges dots results, where #1 is "Gaining consensus on data definitions." Yup, that sounds like a BI/DW manager or exec speaking; the rest of us are stuck in our cubes, waiting for the execs to get back from TDWI in Vegas, keen to consensusize us.
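For readers who want to see the arithmetic, here is a rough Python sketch of how the mix of survey respondents can move an averaged estimate. The 47/31/22 figures are the ones quoted above from the TDWI survey. Everything else is an assumption made up for illustration: the two respondent groups, their per-group estimates (50% and 85%), and the 9:1 and 1:1 respondent mixes are hypothetical and do not come from TDWI or any other study.

```python
def blended_share(group_shares, group_weights):
    """Weighted average of per-group estimates, weighted by respondent counts."""
    total = sum(group_weights.values())
    return sum(group_shares[g] * group_weights[g] for g in group_shares) / total


# Figures quoted from the TDWI survey (percent of organizational data).
structured, unstructured, semi_structured = 47, 31, 22
combined_unstructured = unstructured + semi_structured  # 31 + 22

# HYPOTHETICAL per-group estimates of the unstructured share (percent).
# These numbers are invented purely to illustrate the effect.
group_shares = {"data_management": 50.0, "content_management": 85.0}

# HYPOTHETICAL respondent mixes: a pool dominated by data management
# professionals versus an evenly split pool.
skewed_mix = {"data_management": 9, "content_management": 1}
even_mix = {"data_management": 1, "content_management": 1}

if __name__ == "__main__":
    print("TDWI combined unstructured share:", combined_unstructured)
    print("Skewed pool estimate:", blended_share(group_shares, skewed_mix))
    print("Even pool estimate:", blended_share(group_shares, even_mix))
```

With these made-up inputs, neither group changes its answer between the two runs; only the proportion of each group in the pool changes, and that alone is enough to move the average. That is the self-selection effect in miniature.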










