http://www.intelligententerprise.com/010130/feat1.jhtml

IT and the NOW Economy

XML technologies can provide more options and flexibility in enterprise messaging

By Michael Hudson & Craig Miller

Your company has decided to take the plunge into e-commerce, and supply chain management is at the top of the corporate priority list. Old business practices such as hand-generated purchase orders, phone calls to track shipment status, and warehousing large inventories are too inefficient and expensive to maintain in today's competitive market. "Just in time" has become the new mantra: Aggressively manage your relationships with suppliers who ship inventory to you on an as-needed basis, and -- instead of being locked into the same vendor -- you have suppliers bidding for your business.

Although the vision is compelling, realize that it can be harder than you imagined if your company is standardized on Unix servers, but many of your suppliers are Windows NT shops. No single standard for exchanging data exists, and most of the solutions you've considered purchasing are really batch-oriented systems that lack the direct interactivity you require.

Welcome to the world of realtime, distributed collaboration. One system may directly invoke the functionality of an application on another in what is known as a remote procedure call (RPC), or two systems may collaborate by exchanging data through brokering mechanisms that are often referred to as message-oriented middleware (MOM).

Traditionally, however, solutions in this market have been much more at home in homogeneous, closed LAN-based environments. They have often been expensive, mainframe-oriented, or bound to a specific platform or programming language. RPC mechanisms illustrate the point vividly.

Traditional RPC mechanisms included large-scale solutions such as the Distributed Computing Environment (DCE), and later, object-oriented RPC architectures such as Microsoft's Distributed Component Object Model (DCOM), the Common Object Request Broker Architecture (CORBA), or the Remote Method Invocation (RMI) APIs supported by Java.

All of these technologies have great merit and have served as the basis of many enterprise success stories. However, there are significant obstacles to their widespread adoption in open environments. Each component architecture is commonly tied to either a platform (DCOM), a programming language (RMI), or a small number of solution vendors. Reliable deployment of these technologies often requires installation of large, complex client-side libraries in the case of DCOM and CORBA.

Component architectures are often ambitious in their scope; although they provide a plethora of services for the developer, there are often steep learning curves, and some require significant per-user licensing fees for runtime deployment. Moreover, each of these component architectures is incompatible with the others unless you incorporate some form of bridging software. Even so, component architectures provide a robust foundation upon which business components reside. These business components are utilized behind the various interoperability mechanisms described in this article.

Using HTTP as an RPC Transport Protocol

The greatest challenge for many of these technologies, however, is getting through firewalls and proxy servers. Many firewall solutions employ a combination of packet filtering and network address translation to provide security. For instance, your company may block all traffic except that on the default HTTP port 80; moreover, many firewalls inspect the packets themselves to determine whether they constitute valid HTTP traffic. Finally, the actual target of a remote procedure call may have no publicly routable IP address; instead, packets destined for it are translated by a proxy mechanism to a private network address behind the firewall. These security measures often foil RPC mechanisms, which tend to rely on other ports (typically, port 135 and a range of higher-numbered ports) and on a direct connection to a known routable IP address.

Because HTTP has rapidly become the lingua franca of the Internet, it is a natural candidate to serve as an RPC transport protocol, particularly since it coexists with security measures and is not dependent on any platform or processor implementation. Indeed, the use of HTTP commands such as POST constitutes a kind of RPC mechanism, whereby "form" data is transmitted from a browser client as a lengthy concatenated string of values to a server that processes the data using server-side scripts. However, simple POST statements are poorly suited to passing complex, hierarchical data parameters of the kind commonly found in object-oriented programming languages and do not provide structured mechanisms for error handling.
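To make the limitation concrete, here is a minimal Python sketch (Python is our choice for illustration; the order data and the item0_sku-style field names are invented) showing what happens when a nested structure has to travel as form data. It is not taken from any particular product.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical order data; the field names are illustrative only.
order = {
    "customer": "ACME",
    "items": [
        {"sku": "PALM-V", "qty": 2},
        {"sku": "CRADLE", "qty": 1},
    ],
}

# Form encoding only understands flat name=value pairs, so the nested
# structure has to be flattened by hand with an ad hoc naming scheme.
flat = {"customer": order["customer"]}
for i, item in enumerate(order["items"]):
    flat[f"item{i}_sku"] = item["sku"]
    flat[f"item{i}_qty"] = str(item["qty"])

# This is the "lengthy concatenated string of values" a POST would carry.
body = urlencode(flat)
print(body)

# The receiver gets back only strings keyed by name; rebuilding the
# hierarchy depends on both sides agreeing on the naming scheme.
decoded = parse_qs(body)
```

Every value arrives as a string, and nothing in the encoding itself says that the two items belong to one list; that knowledge lives only in the naming convention.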

On the other hand, extensible markup language (XML) is extremely well-suited for expressing complex data types, such as arrays, master-detail relationships, records, and the like. Like HTTP, it is an open and platform-independent standard that has rapidly achieved widespread adoption. Perhaps it is not surprising that efforts are underway to link the two standards. The best known initiative to date is the Simple Object Access Protocol (SOAP).

SOAP is the brainchild of several organizations, notably Microsoft and IBM. Like XML, which is in part a greatly simplified descendant of the older SGML language, SOAP embodies a "less is more" viewpoint. One of its architects, Don Box, describes it as a "no new technology" approach.

SOAP is merely a specification -- to be precise, an XML vocabulary with a specific schema. It packages the information contained within a remote procedure call and provides standards for error handling. The SOAP content specifies the resource on the server responsible for performing the invoked functionality, known as an endpoint. But how the call is carried out is entirely at the server's discretion. (See sidebar, "SOAPing It Up.")

As is usual with XML content, the schema defining the structure of the specific RPC call between client and server is referenced as an XML namespace. The content of the call (in the sidebar example, a request for the current price of the item "Palm V") is contained within the SOAP "body," which is in turn contained within a SOAP "envelope" that provides the XML schema reference for the SOAP standard itself.

SOAPING IT UP

MAKING XML AND RPC WORK TOGETHER

To the Web server, a SOAP RPC call looks like a standard HTTP POST request, with a MIME type of XML and some specialized formatting:

POST /StockQuote HTTP/1.1
Host: www.supplierhost.com
Content-Type: text/xml
Content-Length: nnnn
SOAPMethodName: Some-Namespace-URI#GetCurrentItemPrice

<SOAP:Envelope xmlns:SOAP="urn:schemas-xmlsoap-org:soap.v1">
<SOAP:Body>
<m:GetCurrentItemPrice 
xmlns:m="Some-Namespace-URI">
<stockitem>Palm V</stockitem>
</m:GetCurrentItemPrice>
</SOAP:Body>
</SOAP:Envelope>

The call to the SOAP method name (GetCurrentItemPrice) contains a parameter, in this case, stockitem, whose value is "Palm V." The server responds with an HTTP message containing the result as a response element within the XML:

HTTP/1.1 200 OK
Content-Type: text/xml
Content-Length: nnnn

<SOAP:Envelope xmlns:SOAP="urn:schemas-xmlsoap-org:soap.v1">
<SOAP:Body>
<m:GetCurrentItemPriceResponse xmlns:m="Some-Namespace-URI">
<return>279.44</return>
</m:GetCurrentItemPriceResponse>
</SOAP:Body>
</SOAP:Envelope>
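
As a rough illustration of how little machinery this requires, the following Python sketch builds a request envelope like the one above and extracts the value from a response. It uses only the standard library's XML module; the helper names build_request and parse_response are our own, and a real SOAP toolkit would also handle the HTTP headers, faults, and typing that this sketch ignores.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "urn:schemas-xmlsoap-org:soap.v1"  # envelope namespace used in the sidebar
APP_NS = "Some-Namespace-URI"                # placeholder namespace used in the sidebar


def build_request(method, params):
    """Wrap a method name and its parameters in a SOAP envelope (hypothetical helper)."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_response(xml_text, method):
    """Return the text of the <return> element in a response envelope (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    path = f"{{{SOAP_NS}}}Body/{{{APP_NS}}}{method}Response/return"
    return root.findtext(path)


request_xml = build_request("GetCurrentItemPrice", {"stockitem": "Palm V"})

# The response body shown in the sidebar, collapsed onto fewer lines.
response_xml = (
    '<SOAP:Envelope xmlns:SOAP="urn:schemas-xmlsoap-org:soap.v1"><SOAP:Body>'
    '<m:GetCurrentItemPriceResponse xmlns:m="Some-Namespace-URI">'
    "<return>279.44</return>"
    "</m:GetCurrentItemPriceResponse></SOAP:Body></SOAP:Envelope>"
)
price = parse_response(response_xml, "GetCurrentItemPrice")
print(price)
```

The generated request uses different namespace prefixes than the hand-written listing, which is harmless: XML consumers match on the namespace URI, not the prefix.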

How Important Is SOAP?

Microsoft has made SOAP a critical technology for its .NET initiative; SOAP is the underlying RPC protocol that links events on HTML forms to specific triggers on Microsoft servers. Java and Apache tools for SOAP now exist as well. Whether SOAP will ultimately prevail as the dominant RPC mechanism in the Internet era remains to be seen. Not all major systems vendors are on board, and competing initiatives, such as the ebXML messaging standard, are beginning to arise. However, it is difficult to imagine that some XML-over-HTTP protocol will not ultimately emerge as the leading RPC mechanism.

This does not mean all existing RPC mechanisms will be displaced in the foreseeable future. While the lightweight nature of the SOAP specification is a virtue, some enterprises may still benefit from the integration of other services that middle-tier application servers based on older RPC mechanisms provide, such as integrated network authentication, transaction processing, and naming services.

In particular, many organizations require a MOM solution. While RPC-style invocations are very effective, they still rely on a synchronous link between the client and the server. They assume that an action or event will not only be sent, but that some response to it will be sent back right away. There are many times when an asynchronous solution is needed. For instance, multiple clients may ask a stock quote server to send them specific stock quotes. A request is made, and although an immediate response is not guaranteed, the request will eventually be handled. A solution for this kind of asynchronous communication between multiple systems is a MOM-based messaging architecture.

MOM is a simple concept to understand. It describes a messaging approach where one system independently creates or publishes an event or message. Then other systems subscribe to that event or message without ever having knowledge of the system that originally published them. This not only creates a kind of asynchronous communication, but it also allows for a more decoupled architecture because no one system has to know about any other systems out there. Because of this decoupling, a MOM-based messaging system can be scaled much more quickly and efficiently.
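
The decoupling described here can be sketched in a few lines of Python. This is a toy, in-process broker of our own invention (the Broker class and the "quotes" topic are illustrative names, not any vendor's API), and it delivers messages immediately rather than asynchronously; its only purpose is to show that publisher and subscribers share nothing but a topic name.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory broker: publishers and subscribers know only topic names."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        # Register a callable to be invoked for every message on the topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher never sees who, if anyone, receives the message.
        for handler in self._subscribers[topic]:
            handler(message)


broker = Broker()
received_a, received_b = [], []

# Two independent subscribers; neither knows about the other or the publisher.
broker.subscribe("quotes", received_a.append)
broker.subscribe("quotes", received_b.append)

broker.publish("quotes", {"item": "Palm V", "price": "279.44"})
broker.publish("unwatched-topic", {"note": "nobody is listening"})
```

Adding a third subscriber requires no change to the publishing code, which is the property that lets such systems grow without rewiring existing participants.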

Today, most MOM-based systems also save outgoing messages being published until the appropriate receiving systems are found, and thus do not require a constant connection between all of your systems. This process is called message queuing.

Getting the Message Across

In a pure publish/subscribe scenario, these systems do not care who gets the messages they are sending, or if the messages arrive. Sent messages are queued, and it is up to the receiving systems to request them. However, most queuing systems, if requested, can push these messages directly to the receiving systems without the messages being directly requested. Again, if the receiving system cannot be found, the messaging system will wait and continue trying until the receiving system is found and connected. This allows for asynchronous communication combined with an assurance that your messages will be received and handled by a specific system. For example, when a bid is made on an auction Web site, it may get stored in a queue of previous bids. The main system will then request each bid from that queue. If the system goes offline or must pause for a great length of time, the queue will still be there patiently waiting for the main system to come back and request the new bid. In this scenario, the message queuing system allows for the asynchronous nature of the transaction but also ensures that each bid will eventually be handled in some way by the main auctioning system.
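
The auction scenario can be sketched as follows. Again this is a simplified Python illustration with invented names (BidQueue, the bidders, and the amounts are all hypothetical); a real queuing product would persist messages to disk and handle delivery across machines.

```python
from collections import deque


class BidQueue:
    """Toy in-memory queue: holds bids until the consumer asks for them."""

    def __init__(self):
        self._pending = deque()

    def send(self, bid):
        # The sender returns immediately; it never waits for the consumer.
        self._pending.append(bid)

    def receive(self):
        # Return the oldest unprocessed bid, or None if nothing is waiting.
        return self._pending.popleft() if self._pending else None


queue = BidQueue()

# Bids arrive while the main auction system is offline.
queue.send({"bidder": "alice", "amount": 100})
queue.send({"bidder": "bob", "amount": 120})

# Later, the auction system comes back and drains the queue in arrival order.
processed = []
while (bid := queue.receive()) is not None:
    processed.append(bid["bidder"])
```

Neither bidder had to wait for the auction system to be available, yet both bids are handled, and in the order they were placed.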

In today's software development environment, two main MOM-based products support message queuing: IBM's MQSeries and Microsoft's MSMQ. Both provide immediate ways to implement a messaging system that can carry any number of requests or messages between multiple systems. If the platforms you are using are mostly Microsoft based, then MSMQ will be the more effective choice, since it is designed to work closely with Microsoft technologies such as DCOM. However, if you're using a large combination of platforms, many of them non-Microsoft, then MQSeries provides a wide range of ways to interact with heterogeneous systems.

A Java Approach

A somewhat recent entry to the world of messaging is the Java Message Service (JMS), Sun Microsystems' offering in the MOM-based world. The two products mentioned earlier depend -- albeit in different ways -- on the underlying platform upon which the messaging architecture will be implemented. It is also very difficult to get an MSMQ-based system to talk to an MQSeries-based system. Sun, on the other hand, has always touted its brainchild, Java, as a way to overcome the difficulties and hassles of dealing with multiple platforms and, in this case, multiple messaging systems. Thus the purpose and appeal of JMS is simply to provide a universal way to interact with multiple heterogeneous messaging systems and platforms in an easy and consistent manner.

You might mistake JMS for a separately implemented product. However, JMS is really only an API, or interface, that Sun proposes other messaging products implement so that there is a universal way of accessing them. This eases the platform-dependence problem and the difficulty of interacting between different messaging systems, but it also ties each system to using Sun's Java language to speak to the others. And although Java is a continually evolving and improving language, it still does not match the speed and efficiency of C++ -- the other main object-oriented language.
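
The idea of "an interface that products implement" can be shown by analogy. The Python sketch below is not the JMS API; MessageProvider, the two provider classes, and the "PO-1001" message are all invented for illustration. It shows application code written once against a common interface and run unchanged against two providers with different internals.

```python
from abc import ABC, abstractmethod


class MessageProvider(ABC):
    """Hypothetical common interface, standing in for the role JMS plays."""

    @abstractmethod
    def send(self, destination, text):
        ...

    @abstractmethod
    def receive(self, destination):
        ...


class ProviderA(MessageProvider):
    """Stand-in for one vendor's product: one list per destination."""

    def __init__(self):
        self._queues = {}

    def send(self, destination, text):
        self._queues.setdefault(destination, []).append(text)

    def receive(self, destination):
        pending = self._queues.get(destination, [])
        return pending.pop(0) if pending else None


class ProviderB(MessageProvider):
    """Stand-in for a second vendor: a single shared log of (destination, text)."""

    def __init__(self):
        self._log = []

    def send(self, destination, text):
        self._log.append((destination, text))

    def receive(self, destination):
        for index, (dest, text) in enumerate(self._log):
            if dest == destination:
                del self._log[index]
                return text
        return None


def relay_order(provider: MessageProvider):
    """Application code written once against the interface, not a vendor."""
    provider.send("orders", "PO-1001")
    return provider.receive("orders")


results = [relay_order(p) for p in (ProviderA(), ProviderB())]
```

The application function never names a vendor; swapping providers means changing only the object that is passed in.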

Benefits of JMS

However, JMS does provide many benefits. If RMI mechanisms are currently being used in your distributed environment, JMS is an easy addition, since it can be implemented on top of RMI or even Java sockets. JMS also provides a great API for simplifying a developer's use of message queues. The learning curve for many of the proprietary messaging systems can be steep, so the simple API that JMS offers can save new developers a lot of time. And since JMS is becoming more of a standard, it may be the only messaging API a developer needs to know.

JMS can also easily carry XML documents as the messages that a system publishes or subscribes to. For instance, you might have a real estate business that needs to send updated mortgage plans to all of its remote offices, each of which needs to process and use this data differently. You could put the raw data into an XML document, thus separating the actual data from how it may be processed or displayed, and then use JMS to publish the document. Each office's own processing systems would then retrieve the XML documents by subscribing to them through JMS.

The benefits here are numerous. Even if the systems at the remote offices are down or there are communication problems in general, each office will receive the mortgage information it needs the moment the connections are reestablished. Since the mortgage data is in XML, each office can do whatever it wants with the data without the main office having to be involved. In fact, because JMS is the constant interface and the remote offices are always subscribing to the same document regardless of the data in that document, the main real estate office does not need to change anything on its side to support any new uses of the XML document. It does not even need to know how that data is being used, nor does it need to be aware that a new remote office has opened and just begun subscribing to that XML document. Thus, by using the combination of JMS and XML, the original mortgage plan data can be used in ways that were not known in advance.
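
A small Python sketch of the same idea follows. The XML element names, the rates, and the two office functions are all invented for illustration, and the "publishing" step is reduced to a loop; the point is only that one document serves subscribers that interpret it in unrelated ways.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element names and rates are invented for illustration.
MORTGAGE_XML = (
    "<mortgagePlans>"
    '<plan name="30-year fixed" rate="7.25"/>'
    '<plan name="15-year fixed" rate="6.875"/>'
    "</mortgagePlans>"
)


def office_a(xml_text):
    """One office only wants the plan names, say for a brochure."""
    root = ET.fromstring(xml_text)
    return [plan.get("name") for plan in root.findall("plan")]


def office_b(xml_text):
    """Another office wants the lowest rate for its own pricing tool."""
    root = ET.fromstring(xml_text)
    return min(float(plan.get("rate")) for plan in root.findall("plan"))


# The main office publishes one document; each subscriber interprets it
# independently, and the publisher's code never mentions what they do with it.
subscribers = [office_a, office_b]
results = [handle(MORTGAGE_XML) for handle in subscribers]
```

A newly opened office would simply add its own function to the subscriber list; the document and the publishing side stay as they are.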

In terms of dealing with distributed component architectures, JMS is already closely integrated with Enterprise JavaBeans (EJB). The new 2.0 release of the EJB specification introduces a new bean type, the message-driven bean. This bean specifically uses the JMS API to create distinct components used only to handle enterprise messages and delegate them to the rest of the system, including other EJBs. Finally, JMS can be used to facilitate interactions between newer systems and legacy systems. By simply writing a JMS-conforming wrapper around the legacy system, every other system within your messaging architecture can communicate with that legacy system, thus sustaining the investment in that older system.

Obviously, with newer technologies like JMS and SOAP, messaging architectures and their accompanying technologies have evolved considerably over the last few years. Before the arrival of JMS and SOAP, companies had only a few choices when they wanted multiple systems to communicate; moreover, those options usually offered little flexibility in terms of asynchronous processing, platforms, or programming languages, and were usually heavy and complex to use. Spurred by the tremendous popularity of the Internet, these new technologies have greatly improved the choices and added new flexibility to messaging architectures. They provide elegant, simple ways to connect multiple heterogeneous systems using current standards and little overhead. Understanding these new technologies in relation to older ones should greatly improve the way multiple systems in your corporation communicate. And as with any good messaging architecture, you'll end up with a whole that is, we hope, much better than the sum of its parts.



Michael J. Hudson (mhudson@blueprinttech.com) is a framework engineer for Blueprint Technologies, a software architecture firm based in McLean, Va. His current work includes developing enterprise architectural solutions for clients such as NASA.

Craig Miller (cmiller@blueprinttech.com) is a senior software architect for Blueprint Technologies where he has worked on distributed Web-enabled systems, XML e-commerce solutions, large-scale systems architecture and software process engineering.
