Shrink Wrap
A look back at the (mostly) diminishing IT software sector
Continued from Page 1

A Sequel to SQL
DATABASE MANAGEMENT

In Intelligent Enterprise's roundtable discussion on the future directions for the database industry, the industry experts were unanimous in their opinion that upcoming extensions to the language will keep SQL vibrant for years to come. (See "What's Next for the Database?" May 9, 2002.) In fact, the committee working on the SQL standard is nearing completion of a new version of the standard, which will probably be published sometime in 2003. The new standard will have nine parts, most of which will call for extensions to the language.

SQL:2003, as the new version of the standard will likely be called, introduces a new data type called multiset. Conceptually, a multiset is an unordered collection of elements, all of the same type, with duplicates permitted.

Tablesample is a new online analytic processing feature introduced in the standard. It can be used in statistical analysis to compute aggregates on smaller samples of data while working with large data sets.

SQL:2003 also contains a new merge statement that you can use to update existing rows in a target table or insert new ones in a single SQL statement.

We have long been using identity columns and sequences to generate keys for unique columns. Finally, the new SQL standard formally identifies them.

The existing SQL/MED standard for management of external data, published in 2001, is also being enhanced in the new release. Now, it supports complex queries of externally stored nonrelational data and even in-place updating with full recoverability, using a feature called datalinks.

The biggest enhancement in the upcoming version of the standard is undoubtedly the XML extension to SQL, bundled together in a brand new part (Part 14: SQL/XML). It includes the basic mappings of SQL identifiers to XML names, SQL data types to XML schema data types, and SQL values to XML values.
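
To make the three basic mappings concrete, here is a minimal Python sketch of how one SQL row might be rendered as XML. It is an illustration under stated assumptions, not the standard's full rules: the function names, the `row` element name, the sample column names, and the choice to omit NULL columns are all hypothetical, the type table covers only a handful of types, and the identifier escaping handles only the common case of characters that are not allowed in an XML name.

```python
import xml.etree.ElementTree as ET

# Assumed, heavily simplified subset of the SQL type -> XML Schema type mapping.
SQL_TO_XSD = {
    "INTEGER": "xs:integer",
    "VARCHAR": "xs:string",
    "DATE": "xs:date",
    "BOOLEAN": "xs:boolean",
}


def sql_identifier_to_xml_name(identifier):
    """Map a SQL identifier to an XML name (simplified).

    Characters that cannot appear in an XML name are replaced by an
    _xHHHH_ escape built from their code point.
    """
    out = []
    for i, ch in enumerate(identifier):
        valid = ch.isalpha() or ch == "_" or (i > 0 and (ch.isdigit() or ch in "-."))
        out.append(ch if valid else "_x{:04X}_".format(ord(ch)))
    return "".join(out)


def sql_value_to_xml_text(value):
    """Map a non-NULL SQL value to its XML text form (simplified)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def row_to_xml(columns, row):
    """Render one row as a <row> element with one child element per column.

    NULL (None) columns are simply omitted here; that is one possible
    convention, chosen for brevity in this sketch.
    """
    row_el = ET.Element("row")
    for name, value in zip(columns, row):
        if value is None:
            continue
        col_el = ET.SubElement(row_el, sql_identifier_to_xml_name(name))
        col_el.text = sql_value_to_xml_text(value)
    return row_el


# Hypothetical sample data: a column name with a space, a NULL, and a boolean.
columns = ["EMP ID", "NAME", "ACTIVE"]
xml_text = ET.tostring(row_to_xml(columns, (1, None, True)), encoding="unicode")
print(xml_text)
```

The column name containing a space shows why an identifier mapping is needed at all: SQL permits delimited identifiers that are not legal XML names, so they have to be escaped before they can be used as element names.
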
Building on this framework, entire tables can be mapped to XML documents. Furthermore, an XML data type has been created in SQL, allowing us to produce XML from existing SQL data.

SQL:2003 will be a giant step for the database industry in standardizing on the key extensions to SQL.

Ganesh Variar [ganesh_variar@yahoo.com] is a lead analyst at Regence BlueCross BlueShield of Oregon. He has been managing and designing BI solutions for eight years.

Taming Text
UNSTRUCTURED DATA MINING

It's as true in text mining as elsewhere: With the arrival of what's new, previously advanced methods become more generically available, as well as creatively and synergistically combined with other technologies.

SAS Institute Inc., for example, has added basic text mining to its drag-and-drop Enterprise Miner package. Hewlett-Packard is using SAS Enterprise Miner to mine, in real time, conversations with customers to give salespeople a red, yellow, or green light as to the best time to make a proposal.

Attensity Corp. has taken text feature-recognition capabilities to new heights, letting users mine information on "who did what to whom where, when, and why," rather than just mining on single word distributions.

Metacarta Inc. combines feature recognition with global positioning. For example, upon entering the key word "resume," a distance, and your location, you get all the resumes of your neighbors available on the Internet.

Neuro-Technology Solutions lets you pull text from a vast library of literature about the brain by clicking on a 3D image of the brain.

SemanTx Life Sciences Inc. uses a large store of synonyms and grammatical and functional relations to allow finding articles valuable to your research, even when they are written in entirely different conceptual silos or languages.
After immense corporate disasters, some litigators and prosecutors have used Cricket Technologies LLC to rapidly extract the subsets of data relevant to their cases from tractor-truckloads of documents in hundreds of media types.

Blossom Software uses Web robots, which understand the syntax of the components of just about any Web page, to gather data and monitor changes for mining and contingent action-taking while automatically creating up-to-date searchable indexes.

Sageware Inc. brings the power of classification to large bodies of content, using rich ontologies. It answers the question "What is this segment unit about?" For example, for a supply chain application, the answer could be a universally identified commodity (even if it isn't mentioned directly).

Barry Grushkin [blg23@cornell.edu] is CTO of the Machine Intelligence Co., which implements and analyzes vertical solutions and technologies in CRM, business process optimization, and knowledge management using next-generation approaches.

Sergiu S. Simmel [sss@clepsydra.net] is a principal at Clepsydra Systems Inc. An entrepreneur and executive leader, he provides full-life cycle product management, interim executive leadership, and due diligence services.

RESOURCES
Attensity Corp.: www.attensity.com
Blossom Software: www.blossom.com
Cricket Technologies LLC: www.crickettechnologies.com
ISO/IEC/JTC1/SC32 & National Body Work Group (responsible for SQL:2003): www.sqlstandards.org
Macromedia Inc.: www.macromedia.com
Metacarta Inc.: www.metacarta.com
Neuro-Technology Solutions: sbeardsl@bu.edu
Sageware Inc.: www.sageware.com
SAS Institute Inc.: www.sas.com
SemanTx Life Sciences Inc.: www.semantxls.com