Distributed Decision Support

Are grids a good step - or even a likely one - for mainstream decision support?

by Seth Grimes

Continued from Page 1

IBM's Globus backing is consistent with its support for not only its own operating systems but also Windows and Linux. By contrast, Sun's Grid Engine runs only on homogeneous Solaris and Linux platforms. Although I know of no inherent scalability limit in Sun's grid solution, it targets only campus and departmental users, perhaps because the hardware systems outside these contained environments are likely to be running something other than the two supported operating systems. And Microsoft announced a $1 million cash and in-kind investment in Globus earlier this year, reportedly so the toolkit would be ported to Windows XP. Lack of Windows support could have affected Microsoft's .Net strategy, which in essence aims to open the Windows platform to distributed network computing.

BI Adoption

Business intelligence (BI) and operational-application vendors have held back even though their applications could reap huge benefits by distributing processing load and opening up to distributed data sources. Mark Battaglia, president of SPSS Inc.'s BI division, explained it to me this way: "This topic comes up from time to time as people worry about the explosion of data in certain industries, especially telecommunications. It also gets some attention in the government market. But it's very much the exception rather than the rule or even a trend among our customers and prospects. So it's still in the 'that's cool' stage, meaning that people are interested in it, but it doesn't seem to be on their to-do lists yet."

BI vendors are naturally reluctant to get ahead of customer demand given intense competitive pressures. Their market is mature; basic reporting and online analytic processing (OLAP) are now commodities.
Vendors are working to add value by extending their software to model and analyze cross-functional business processes, which can involve creating new Web services interfaces, and to offer true personalization and real-time capabilities. A quick leap into the grid world, when vendors are already dealing with new functional goals and still-evolving standards, would introduce undesirable customer-relationship and revenue-model uncertainties.

BI vendors could also face substantial technical difficulties in grid-enabling their software. To start, BI vendors have limited experience with scalability features that are precursors to distributing processing. For example, SAS Institute Inc., the world's largest analytic-software vendor, has brought multithreading to its flagship software system only in version 9, due for general-availability release later this year. Second, most analytic services have significant operating-system and hardware dependencies, although adopting Web-service messaging interfaces and protocols such as XML for Analysis will reduce the effects. Third, BI typically relies on extract, transform, and load (ETL) operations to populate data warehouses. These operations entail substantial metadata development and data cleansing. The industry is shifting more to real-time data interchange via enterprise application integration technologies, which would play better in a grid, to tie together disparate operational and analytic systems. ETL, however, still dominates data movement.

I'd venture that vendors, such as Alphablox Corp., with tools coded in Java for the J2EE Web services environment will be first to the grid. They're already partway there, but we'll see.

To Grid or Not To Grid?

Organizations that code their own software are in a good position to evaluate grid computing's applicability in their environments. The most suitable problems can be decomposed into discrete, independent tasks that can be run in parallel and, perhaps, asynchronously.
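To make that profile concrete, here is a minimal sketch in Python of a problem that decomposes cleanly. It is an illustration only, not any vendor's or toolkit's API: the function names, the chunk size, and the made-up "sales" figures are all assumptions, and a local thread pool stands in for the separate machines a real grid would use.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def split_into_chunks(rows, chunk_size):
    """Break the data set into independent chunks, one per task."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def summarize_chunk(chunk):
    """A discrete task: it reads only its own chunk and shares no state."""
    return {"count": len(chunk), "total": sum(chunk)}


def combine(partials):
    """Merge partial results. Addition is order-independent, so it does
    not matter which task finishes first."""
    count = sum(p["count"] for p in partials)
    total = sum(p["total"] for p in partials)
    mean = total / count if count else 0.0
    return {"count": count, "total": total, "mean": mean}


def run_decomposed(rows, chunk_size=1000, workers=4):
    """Submit one task per chunk and collect results as they complete."""
    chunks = split_into_chunks(rows, chunk_size)
    # Hypothetical stand-in: a thread pool plays the role of grid nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(summarize_chunk, c) for c in chunks]
        partials = [f.result() for f in as_completed(futures)]
    return combine(partials)


sales = list(range(1, 10001))  # made-up data, for illustration only
result = run_decomposed(sales)
print(result)
```

The tasks never talk to one another, and the combining step does not care about completion order. A computation that needs heavy intertask traffic or a strict sequence of steps would not split this neatly.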
Even if your decision-support problems fit that profile, you must still decide if the right approach for you is multiprocessing, clustering, grid, or even vector processing. Is your problem best answered by a small number of fast processors (for instance, if it's dominated by computationally intensive, serially executed tasks) or by a larger number of slower ones? Can the data also be broken into chunks that are each assigned to a small number of tasks? Is there significant data or intertask traffic?

Look for problems similar to yours that have been adapted for computational grids. Candidates include pattern recognition in large data sets, "finite element" style modeling, development of genetic algorithms, and large-scale simulations, in applications that include genomics, industrial and pharmaceutical design, financial-market and environmental modeling, and the like.

If you depend on commercial software vendors, take steps with your vendor partners to implement Web service interfaces, and you'll be in good shape to jump to grids down the road. Although you may not get to tout your buzzword compliance, you'll have a good chance to see real gains in interoperability, resource utilization, and performance.

Seth Grimes [grimes@altaplana.com] is a principal of Alta Plana Corp., a Washington, D.C.-based consultancy specializing in analytic computing systems and demographic and economic statistics.

RESOURCES

Grid Initiatives and Projects, Global Grid Forum: www.gridforum.org/L_Involved_Mktg/inint.htm
The Globus Project: globus.org
O'Reilly distributed computing links: www.openp2p.com/pub/t/73
"The Anatomy of the Grid," on grid-service protocols: www.globus.org/research/papers/anatomy.pdf
"The Physiology of the Grid," on grid-service functional issues: www.globus.org/research/papers/physiology.pdf
Platform Computing: www.platform.com
"Web Services Description Language (WSDL)," World Wide Web Consortium technical report: www.w3.org/TR/wsdl