TOWARD THE GLOBAL INTELLIGENT ENTERPRISE

World Class

If your strategic business app is expanding beyond departmental and national boundaries, select your scalability strategy carefully.
By Debashish Bhattacharjee

Continued from Page 1

The benefits of caching include faster access times for cached content (because a cache responds faster than a remote Web server), improved bandwidth utilization (because serving from cache reduces upstream and overall network traffic), and reduced load on the Web servers (because the cache servers answer requests that would otherwise reach them). When using caching, however, you need to carefully consider your deployment strategy, or "cache hierarchy," for the cache servers.

Typically, cache servers are deployed in a forward or reverse proxy configuration. In a forward proxy configuration, client browsers are pointed to a forward proxy, transparently (using a Layer 4-7 capable switch) or nontransparently. The forward proxy intercepts all HTTP requests and serves the content from cache whenever it holds a copy. This configuration helps reduce upstream network bandwidth and improves response time for server content. In contrast, a reverse proxy configuration involves setting up the cache servers so that they sit in front of the Web servers. In this scheme, the cache servers reduce the load on the Web servers.

When in Doubt, Replicate

In the replicated model, instances of the application are replicated across strategic geographic locations. These instances are replicas of each other, and each services a local group of end users; basically, each instance adds to the application's global load-sharing capabilities. Each replicated site can be a fail-over for another site during disaster recovery scenarios. The location of each replica is a key factor in your distribution strategy because each replica is used to reduce dependency on the WAN. For example, in a company with a global presence, replicas in Sydney, São Paulo, and London could be used to service remote offices in Australia, Brazil, and the United Kingdom, respectively.
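The pairing of offices with replicas, and the idea that one site can stand in for another, can be pictured with a small Python sketch. The site names, the fail-over order, and the pick_replica function are hypothetical illustrations built on the example above, not part of any product.

```python
# Hypothetical map from each office's country to its local replica,
# following the article's Sydney / Sao Paulo / London example.
PRIMARY_REPLICA = {
    "Australia": "sydney",
    "Brazil": "sao-paulo",
    "United Kingdom": "london",
}

# Hypothetical fail-over order: the sites that stand in, in order of
# preference, when a given site is down.
FAILOVER = {
    "sydney": ["london", "sao-paulo"],
    "sao-paulo": ["london", "sydney"],
    "london": ["sao-paulo", "sydney"],
}


def pick_replica(country, available):
    """Return the site that should serve users in `country`.

    `available` is the set of site names that are currently up.
    The local replica is preferred; its fail-over sites are tried next.
    """
    primary = PRIMARY_REPLICA[country]
    for site in [primary] + FAILOVER[primary]:
        if site in available:
            return site
    raise RuntimeError("no replica is available")
```

With all three sites up, pick_replica("Brazil", {"sydney", "sao-paulo", "london"}) chooses the local replica; remove "sao-paulo" from the set and the same call falls back to the first fail-over site that is still up.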
The mechanics of the replication process require the replication of hardware, software, database, and application files. Installing a replicated site for the first time is easy enough to do. The problem is maintenance: the need for ongoing replication between the replicated sites. When application code is developed or changed, it has to be distributed to all the different sites; system configuration changes also need to be propagated.

The most difficult task, however, is the replication of the data in the database, which often forms the core of all strategic business applications. Database replication implementations vary depending on the vendor. However, the following principles apply to all databases.

To begin with, the end user needs to create a logical data model for the replication schema (which entities need to be replicated). A replication schema can consist of full, vertical, or horizontal replication. (Vertical replication means that only certain columns in a table will be replicated. Horizontal replication means that only certain rows in a table will be replicated.)

When designing the replication schema, the application architect will need to consider data consistency issues. You'll need to take primary keys, foreign keys, and relationships among tables for each of the replicated entities into account to ensure that replication entries don't violate referential or primary key constraints.

Furthermore, there's always the possibility of conflicts, which can arise when concurrent transactions lead to database inconsistencies. The application architect will need to decide how to deal with them. (For example, should the application roll back entire transactions?) Conflicts typically arise in two-way replication scenarios. One-way replication is simpler and always provides a more robust implementation. However, it's restrictive because updates can only be made in one instance.
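To make the difference between vertical and horizontal replication concrete, here is a minimal Python sketch that treats a table as a list of rows. The customers table, its column names, and the two helper functions are hypothetical; a real database would express the same choices through its vendor's replication tooling.

```python
# Hypothetical source table; every name and value here is illustrative.
customers = [
    {"id": 1, "name": "Acme", "country": "Brazil", "credit_limit": 5000},
    {"id": 2, "name": "Globex", "country": "Australia", "credit_limit": 9000},
    {"id": 3, "name": "Initech", "country": "Brazil", "credit_limit": 1200},
]


def vertical_slice(rows, columns):
    """Vertical replication: keep every row, but only the named columns."""
    return [{col: row[col] for col in columns} for row in rows]


def horizontal_slice(rows, predicate):
    """Horizontal replication: keep whole rows, but only those that match."""
    return [dict(row) for row in rows if predicate(row)]


# Replicate only some columns. The primary key ("id") stays in the slice
# so that the replica can still enforce its key constraints.
public_columns = vertical_slice(customers, ["id", "name", "country"])

# Replicate only the rows that a Brazilian site needs.
brazil_rows = horizontal_slice(customers, lambda row: row["country"] == "Brazil")
```

Full replication is simply the case in which neither filter is applied. Leaving a primary or foreign key column out of a vertical slice is one way a replication schema can end up violating the constraints discussed above.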
Balancing Act

The replicated model can be extended to incorporate global load balancing. The general concept is that each replica represents a possible processing unit, and all processing units contribute to the system's overall throughput and also help with automatic fail-over scenarios when a site is unavailable. This goal can be accomplished using Layer 4-7 capable load-balancing switches, such as those from Foundry Networks Inc.

Global load balancing can be accomplished by using switches to "front" for the authoritative DNS server. In this configuration, multiple DNS servers can be tied to the switch. The switch acts as a proxy for the authoritative DNS server. The IP switch reads the DNS map from the authoritative DNS server and performs sophisticated health checks to determine which servers are functional and which are available for processing. The switch also mitigates problems related to the "Time to Live" parameter associated with the authoritative DNS server. This model is more efficient than DNS round-robin because the switch's capabilities extend beyond IP (Layer 3) to Layer 4 and Layer 7. The switch is also aware of geography and the response times from other switches that load balance a local Web-server farm.

Content is King

Content delivery networks consist of "origin" servers and a massively distributed edge network. The edge servers sit on the "edge" of the network and report to the "core," which comprises origin servers and controllers that monitor the network's performance and dispatch users to the edge server that will provide the best performance. Customers can purchase these infrastructure components from vendors such as Inktomi and implement their own content delivery network, or they can opt for the outsourced model from vendors such as Akamai. Under the outsourced model, the customer generates the content and the CDN delivers it.
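The dispatch decision made by the core, which sends each user to the edge server expected to perform best, can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the EdgeServer record, its fields, and the rule of choosing the lowest measured response time among healthy servers are hypothetical, and real controllers weigh many more signals.

```python
from dataclasses import dataclass


@dataclass
class EdgeServer:
    """Hypothetical record a controller might keep for each edge server."""
    name: str
    healthy: bool        # result of the most recent health check
    response_ms: float   # most recently measured response time


def dispatch(servers):
    """Return the healthy edge server with the lowest response time."""
    candidates = [server for server in servers if server.healthy]
    if not candidates:
        raise RuntimeError("no healthy edge server")
    return min(candidates, key=lambda server: server.response_ms)


# Illustrative measurements only.
edges = [
    EdgeServer("edge-brazil", healthy=True, response_ms=40.0),
    EdgeServer("edge-us", healthy=True, response_ms=180.0),
    EdgeServer("edge-uk", healthy=False, response_ms=25.0),
]
best = dispatch(edges)
```

In this sample, the unhealthy server is skipped even though its last measured response time is the lowest, which mirrors the role that health checks play in the load-balancing switches described earlier.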
In this architecture, the origin server or application server and database remain within the customer's premises. A gateway server connects the application server to the external content delivery network, which consists of a massive set of edge servers that are distributed across the globe. When the user needs to access content, the content delivery network directs the user to the optimal edge server. The edge server attempts to service the HTTP request if it has fresh content. The edge server may request an object from the origin server if that object has expired. When the edge server has assembled all the components, it can then deliver the HTTP response to the user. For example, a user in São Paulo who connects to the local instance of the sales and marketing application would receive the product brochure from an edge server in Brazil and the latest in corporate sales figures from a server in the United States. The result is that pages in the sales and marketing application render more quickly.

Content delivery networks support high scalability and availability because of the large number of edge servers, which are geographically closer to end users. Furthermore, the content delivery network provider can use global load-balancing techniques to offload work from busy sites or redirect users to sites that are available. The architecture's weak link is the dependency on the origin server. However, persistent, multiplexed TCP/IP connections between the edge and origin servers can help mitigate this problem. Furthermore, the performance improvement with content delivery networks is self-evident when you consider that the majority of a page's download time is consumed by static content such as image files, which are easily cached on the edge servers.

Which Road?

Content delivery networks tend to be the most scalable option for applications that primarily feature static content, such as Web sites that feature news and entertainment and marketing literature.
Applications that are heavy on transactional data or have significant amounts of dynamic content, such as collaborative commerce apps, may not be as scalable because of the dependency on the origin server. Content delivery networks also tend to be highly available if an appropriate fail-over for the origin server exists. The issue that some companies may have with content delivery networks is the shift in the IT management paradigm toward outsourcing.

At the other end of the spectrum is the centralized option, which is simple and easy to maintain but lacking in scalability and availability. The replicated option is usually explored by companies that want to own and control their data and infrastructure but are interested in a highly scalable and available system. As with all business solutions, customers must select the distribution model that most closely meets their business requirements.

Debashish Bhattacharjee [dev.bhattacharjee@us.pwcglobal.com] is a management consultant with PricewaterhouseCoopers. He has 10 years of experience in the IT industry implementing projects for Fortune 500 clients.

RESOURCES

Akamai: www.akamai.com
Exodus: www.exodus.com
Inktomi: www.inktomi.com
Packeteer: www.packeteer.com