Open to Discussion

Will open source databases ever have a major role in business-critical applications?

By Protima Banerjee

Open source software is firmly embedded in enterprise computing today. The gcc compiler for C/C++, the gzip file compression program, the Emacs editor, and the Apache Web server are all open source software in widespread use. Most of them are distributed under the terms of the GNU General Public License (GPL), which gives users the right to freely modify and redistribute the original source code; Apache is distributed under its own Apache License. The recent and rapid adoption of Linux across a variety of industries, often in mission-critical applications, further emphasizes the point: Corporate usage of open source software is now nearly commonplace.
Database software forms the foundation for enterprise computing; after the operating system, it is the next significant layer in building an environment based entirely on open source. Yet to date, there has been relatively little buzz about open source databases, and only a handful of cases where businesses have adopted the technology. Both Oracle and IBM have touted their Linux releases, but even this publicity has not elicited much interest in open source technology for the database itself.

The question arises: Is open source database technology relatively immature when compared with its commercial counterpart? Or is there a more fundamental and pervasive fear at work here, one that cannot condone placing corporate data stores at risk by using software that is not heavily supported? After all, you can always restart an operating system or a Web server, but it is not so easy to recover order and product information after disaster strikes. The logistics associated with database management are many and varied, and it is the nature of a good IT executive to be conservative. Perhaps it is a combination of all these things that has made, and will continue to make, the progress of open source database software through the enterprise slow and cautious.

I believe, however, that there is a clear place for open source databases in the enterprise today, and that in the future, their role will expand greatly. In this article, I will give an overview of the state of open source databases today and assess how they can fit into enterprise computing in the near and far term.

Open Source Database Technology Today

The most prominent open source databases today are MySQL and PostgreSQL. Although several other open source databases are available (such as GNU SQL and mSQL), I will focus this discussion on these two products in particular.
Based on their current level of maturity, features, and usage in industry, they provide a good representation of the advantages and disadvantages associated with the technology as a whole.

MySQL is distributed under the terms of the GPL; PostgreSQL is distributed under the terms of the Berkeley Software Distribution (BSD) License. These two variants of open source licensing diverge slightly: the GPL requires that any direct copy or modified version of software distributed under its terms must also be open source and distributed under the same license. (In effect, this provision ensures that open source software stays "open" regardless of who packages or distributes the product.) The terms of the BSD License are more general, stating that redistribution in source or binary form must be accompanied by the BSD License and original disclaimer. Understanding the implications of these licenses is critical if the open source database is intended to be part of a larger system that will eventually be packaged and sold.

MySQL and PostgreSQL are available for download and use on most modern Unix variants and Windows; in my own experimentation and research, the Unix versions of both were considerably easier to install and better documented than their Windows counterparts. There also seems to be a larger user community for the Unix variants, which is an important factor to consider because open source software is influenced so directly by its user and test community.

Both MySQL and PostgreSQL are multithreaded servers, and both provide support for the set of SQL operations required to read and write data from a data store. Neither is fully compliant with the SQL92 standard, but this limitation is not uncommon among commercial database engines either. Complex SQL, in the form of subqueries and correlated queries, is largely unsupported.
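As a rough illustration of what the subquery limitation means in practice, the sketch below shows one common workaround: rewriting an `IN` subquery as a join. It uses Python's built-in `sqlite3` module purely as a stand-in engine so the example runs anywhere; the `customers` and `orders` tables and their columns are hypothetical, and nothing here is taken from the MySQL or PostgreSQL documentation.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
        INSERT INTO customers VALUES
            (1, 'Acme', 'East'), (2, 'Bolt', 'West'), (3, 'Cog', 'East');
        INSERT INTO orders VALUES
            (10, 1, 250.0), (11, 2, 75.0), (12, 3, 40.0), (13, 1, 15.0);
    """)
    return conn


# Subquery form: needs an engine that supports nested SELECT statements.
SUBQUERY_FORM = """
    SELECT id, total FROM orders
    WHERE customer_id IN (SELECT id FROM customers WHERE region = 'East')
    ORDER BY id
"""

# Join form: the same question expressed without a nested SELECT.
# The two are equivalent here only because customers.id is unique,
# so the join cannot duplicate any order row.
JOIN_FORM = """
    SELECT o.id, o.total FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE c.region = 'East'
    ORDER BY o.id
"""


def run(conn, sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


conn = build_demo_db()
subquery_rows = run(conn, SUBQUERY_FORM)
join_rows = run(conn, JOIN_FORM)
print(subquery_rows == join_rows)
```

Not every subquery can be flattened this cleanly. Correlated subqueries in particular may need temporary tables or application-side logic on an engine that lacks them, which is part of the migration cost to weigh.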
Both provide a set of standard functions for use in constructing queries and manipulating data, including basic arithmetic operations, set operations, string and date manipulation functions, and some complex mathematical functions. Users may also define their own functions for use in queries and data manipulation in both databases. However, unlike most commercial databases, neither lets users create their own stored procedures. C, C++, Perl, Python, and Tcl APIs are available for both databases. MySQL supports additional APIs for Java, Eiffel, and PHP.

The backup and restore capabilities for both MySQL and PostgreSQL are fully developed, and it is clear that developers have lavished a great deal of attention on this functionality. Backup and recovery scenarios are well documented for a variety of cases, and the backlog of support instances for recovery problems is kept online for reference. However, most recovery and backup documentation is for Unix and Linux platforms; I know of relatively little such documentation for Windows-based systems.

Replication servers for both databases are available, but with caveats: MySQL documentation implies that replication is in beta, and PostgreSQL's server has only been available since the beginning of the year. Judging from the number of items still on the developers' "to-do" lists for the replication servers, it seems that it might be a few months before replication is mature.

MySQL supports ODBC, which makes it a good alternative for Windows or Web-based applications currently connecting through ODBC. A full-text indexing and search feature is also available, which would prove valuable for Web-based applications as well. MySQL supports the binary large object data type.

The near-term releases of both MySQL and PostgreSQL focus primarily on bug fixes and small upgrades to existing features. At the top of the priority list for both systems are bug fixes for the respective replication servers.
For longer-term releases, enhanced SQL capabilities are being added to both servers. MySQL is available in multiple languages today (no surprise, given the diverse background of the development team), and both MySQL and PostgreSQL plan full Unicode support in the future.