CMP -- United Business Media

Intelligent Enterprise

Better Insight for Business Decisions

February 1, 2002

Finders, Keepers

Point-spatial searching is an emerging requirement for RDBMS-based applications — so don't get left behind

By Marty Himmelstein

Location information in databases is ubiquitous. Some notion of location — such as street address, city name, postal code, telephone number, or even the name of a landmark — is an integral part of many database implementations. Considering that e-business still accounts for less than 1 percent of retail transactions, information about location will continue to play a prominent role in the databases we design and use. Much of our daily activity is conducted outside the confines of our homes: salespeople visit clients, families take vacations, realtors sell houses, distributors replenish inventory. Databases that are built to model these activities must take location into account.

EXECUTIVE SUMMARY

The demand for databases that can perform efficient location-based searching is increasing, fueled by CRM, mobile apps, and users' expectations for database systems to duplicate functionality available on the Internet. This article discusses how efficient location-based searching can be added to database systems.

Business intelligence and CRM applications can use location information in many ways. For example, Ford Motor Co. sent direct-mail pieces to current and potential customers, each individualized with a map of the dealer closest to the recipient's address. For business intelligence, queries such as "Summarize, in 10-mile increments, the distance customers traveled in each of the last four quarters to complete an in-store transaction" enable spatial information to be integrated into analytic processing.
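To make the business intelligence query above concrete, here is a minimal Python sketch of the banding step. It assumes the distance each customer traveled has already been computed, and the input shape (pairs of quarter label and distance in miles) and the function name are hypothetical, chosen only for illustration.

```python
from collections import Counter


def summarize_by_band(transactions, band_miles=10):
    """Count transactions per (quarter, distance band).

    transactions: iterable of (quarter_label, distance_miles) pairs.
    This input shape is an assumption made for the sketch.
    Returns a Counter keyed by (quarter_label, band_start), where
    band_start is the lower edge of the band in miles.
    """
    counts = Counter()
    for quarter, distance in transactions:
        # Floor the distance to the lower edge of its band: 0, 10, 20, ...
        band_start = int(distance // band_miles) * band_miles
        counts[(quarter, band_start)] += 1
    return counts
```

In a database that understood location, the same grouping could be expressed directly in the query rather than in application code, which is the kind of integration this article argues for.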

Even in the virtual world of the Internet, applications rooted in the real world of bricks and mortar, such as electronic yellow pages apps and store locators, are still among the most popular. Location awareness will also play a major role in fueling the growth of mobile computing, a sector poised for eventual dramatic expansion.

POINT-SPATIAL SEARCHING DEFINED

The applications I've described are representative of a class that has two characteristics. First, regardless of their presentation platform (be it a traditional browser, handheld device, or phone), they rely on dynamically generated, centralized information for their results. Relational databases are the predominant tool for maintaining and serving this dynamic data. Second, they perform geospatial searching centered on the simplest geometric object, the point. The main point-spatial operations these applications perform are to:

  • Determine the locations (points) within a given circle or rectangle. The center of the circle or rectangle corresponds to the user's search center: a home, lodging address, location determined by a Global Positioning System (GPS) receiver, and so on. The search radius is chosen by the user or is predetermined by the application. It is often necessary to include nonspatial selection criteria when computing the locations to include in the result set.
  • Compute the distance between two locations.
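The two operations above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: it assumes locations are stored as latitude and longitude in degrees, uses the haversine great-circle formula on a spherical earth, and scans every candidate rather than using a spatial access method. The record layout (dicts with "lat" and "lon" keys) and the function names are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean earth radius; a spherical approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(center, radius_miles, locations, predicate=lambda loc: True):
    """Return (distance, location) pairs inside the search circle, nearest first.

    center: (lat, lon) of the user's search center.
    locations: iterable of dicts with 'lat' and 'lon' keys (assumed layout).
    predicate: optional nonspatial selection criterion.
    """
    clat, clon = center
    hits = []
    for loc in locations:
        if not predicate(loc):
            continue  # fails the nonspatial criteria
        d = haversine_miles(clat, clon, loc["lat"], loc["lon"])
        if d <= radius_miles:
            hits.append((d, loc))
    # Sort on distance only, so the dicts themselves are never compared.
    hits.sort(key=lambda pair: pair[0])
    return hits
```

The predicate argument stands in for the nonspatial selection criteria mentioned in the first bullet, for example restricting a store locator to locations that are currently open.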

More general two-dimensional (2D) spatial searching performs operations on lines and polygons, as well as points. The geometric algorithms and spatial access methods (SAMs) required to efficiently work with arbitrary 2D objects (often called geometries) are quite a bit more complex than those restricted to points. (Operations that involve the relationship of points to either lines or polygons cover a middle ground.) For example, a general-purpose spatial database must be able to determine if two polygons intersect, and if they do, be able to return a new object defined by their intersection.

Point-spatial searching has its own subtleties. However, 2D spatial access methods can be tailored for point-spatial searching so that they work well with standard RDBMSs. But the capabilities necessary to manipulate complex 2D geometries require either vendor-provided spatial extensions, such as Oracle's Spatial Data Option, or geographic information system (GIS) products, such as ESRI's ArcInfo.

CURRENT ALTERNATIVES

Some commercial RDBMS vendors provide spatial extensions for their core database products. (In addition to Oracle, IBM's DB2 and the open source PostgreSQL database provide spatial functionality.) These products offer a general set of capabilities for working with 2D data. But as is so often the case, generality comes at the cost of performance, usability, and specific functionality. The databases can be used, say, for point-spatial searching or map data retrieval, but are far from best-of-breed for either application. For the major RDBMS vendors to increase their market penetration for spatial data, they need to improve their products. A practical approach would be for them to tailor their spatial offerings for specific market segments. Point-spatial searching would be a good place to start for the reasons I previously described. And, as I'll describe further, it is less traumatic to integrate a good point-spatial service into an existing RDBMS than it is to support more complex geometries.

Apart from relying on database vendors for point-spatial support, two other approaches are currently more popular. The first is outsourcing: several Internet-based application service providers (ASPs) provide geocoding, point-spatial searching, mapping, and driving-direction services for Internet-based business locators. (Vicinity Corp. is a leading provider of such services.) An example is the "find drop-off locations" section of the FedEx Web site. User requests to find the closest package drop-off sites are routed to the ASP. The ASP initiates a dialog with users to determine their location, then executes a point-spatial search, and optionally provides maps and driving directions. It is the client's responsibility to periodically transfer accurate business data (such as name, phone number, address, and nonspatial attributes) to the ASP.

Sometimes, outsourcing is an ideal solution. In other cases, however, organizations are either unwilling or unable to store sensitive data offsite. And because spatial searching is performed as a separate application, you can't tightly integrate it with other corporate data and applications. The use of location information in BI applications will languish until RDBMS and BI tool vendors add capabilities to manipulate it efficiently, cost-effectively, and in a way that is easy to use and deploy.

Another way to accomplish point-spatial searching is to license a software package that contains this functionality. (Whereonearth.com, for example, provides good documentation on its SDK, which includes not only point-spatial searching but also geocoding and mapping services.) These packages require users to create data extracts that are loaded into a proprietary application. These solutions have some of the same drawbacks as the ASP approach; they're separate applications that require duplicated data and don't provide tight integration with other corporate information.

The best way to geoenable data in relational databases is for database vendors to provide appropriate functionality. Alternatively, independent software vendors could develop products that are closely integrated with various vendors' databases. Such an approach is feasible because the major database vendors have provided mechanisms for extending their products with customized functionality. In either case, the basic approach is to provide several built-in functions to do basic point-spatial searching. This way, users could incorporate point-spatial proximity queries into applications without facing an array of impediments.
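As a rough sketch of what such a built-in function might look like to an application developer, the following Python example registers a user-defined distance function with SQLite and uses it in an ordinary SQL query alongside a nonspatial predicate. SQLite is only a convenient stand-in here; it is not one of the products discussed in this article, and the table, column, and function names are hypothetical. The sketch also evaluates the function for every candidate row, so it shows the query interface only, not the spatial access method an efficient implementation would need.

```python
import math
import sqlite3

EARTH_RADIUS_MILES = 3958.8  # mean earth radius; a spherical approximation


def distance_miles(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in miles (inputs in degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under a hypothetical name.
conn.create_function("distance_miles", 4, distance_miles)

conn.execute("CREATE TABLE store (name TEXT, category TEXT, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO store VALUES (?, ?, ?, ?)",
    [  # made-up sample rows for illustration
        ("Downtown", "pharmacy", 40.0, -75.0),
        ("Suburb", "pharmacy", 40.5, -75.0),
        ("Corner", "grocery", 40.01, -75.0),
    ],
)


def nearby(conn, lat, lon, radius_miles, category):
    """Proximity query combined with a nonspatial predicate, nearest first."""
    sql = """
        SELECT name, distance_miles(?, ?, lat, lon) AS miles
        FROM store
        WHERE category = ?
          AND distance_miles(?, ?, lat, lon) <= ?
        ORDER BY miles
    """
    params = (lat, lon, category, lat, lon, radius_miles)
    return conn.execute(sql, params).fetchall()
```

Because the proximity test lives in the WHERE clause, it can be joined, filtered, and aggregated with other corporate data in the same statement, which is the tight integration the ASP and SDK approaches lack.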

SERVICES RELATED TO POINT-SPATIAL SEARCHING

Besides the point-spatial search itself, which I propose should be a standard database function, point-spatial searching commonly involves several processing steps that require separate applications. The first is geocoding, which is a necessary preprocessing step. After a point-spatial search is executed, the results are sometimes displayed on a map or used to compute driving directions, both of which are optional post-processing steps. Each of these services is provided by a different application (even if they are bundled together), with distinct data and processing requirements.

Geocoding refers to the process of converting textual location information into a pair of coordinates that correspond to a location on the earth's surface. Textual information is usually specified as a partial or complete address. The coordinate system most commonly used to express location is latitude and longitude. The accuracy with which an address can be converted into a pair of coordinates depends on the quality of the source address, the geocoding software, and the street reference database involved. For less than $200, you can purchase a simple database of mappings between ZIP Codes (or cities and states) and latitude and longitude (lat/lon). More thorough and expensive solutions are available from companies such as Geographic Data Technology Inc. (GDT) and NavTech, which license data and software capable of achieving high levels of geocoding precision. (Geocoding accuracy for countries outside the United States varies widely.)
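The simplest form of geocoding described above, a ZIP Code lookup, might be sketched as follows in Python. This is only an illustration of the coarse approach: the function name is hypothetical, the lookup table is supplied by the caller, and the sample coordinates in the usage note are rounded placeholders, not authoritative data. Street-level geocoding requires a street reference database and is well beyond this sketch.

```python
import re

# Matches a five-digit ZIP Code, optionally followed by a ZIP+4 suffix.
ZIP_PATTERN = re.compile(r"\b(\d{5})(?:-\d{4})?\b")


def geocode_zip(address_text, zip_table):
    """Return the (lat, lon) of the ZIP Code found in address_text, or None.

    zip_table: dict mapping five-digit ZIP strings to (lat, lon) tuples,
    for example loaded from a purchased ZIP-to-lat/lon database.
    The last five-digit number in the text is used, on the assumption
    that the ZIP Code follows any street number in the address.
    """
    candidates = ZIP_PATTERN.findall(address_text)
    if not candidates:
        return None  # no ZIP Code present in the text
    return zip_table.get(candidates[-1])  # None if the ZIP is not in the table
```

For example, with a table such as `{"10001": (40.75, -73.99)}` (rounded placeholder coordinates), the call `geocode_zip("350 Fifth Ave, New York, NY 10001", table)` would return that pair, while an address with no recognizable ZIP Code yields `None`. Every address in the same ZIP Code maps to the same point, which is why this approach is cheap but imprecise.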






