http://www.intelligententerprise.com/010308/feat2_1.jhtml

Beyond the Shopping Cart

A case study of using offline data to find your best online customers
By Jesus Mena

Customer retention and repeat sales are really the only salvation for dot-com profitability. But as Wall Street has shown us over the past year, business-to-consumer (B2C) e-commerce profitability is a tough nut to crack. Studies have shown that e-retailers spend an average of $100 to $250 acquiring a new customer who then spends only about $24, with most never returning. On average, only 35 percent of buyers make a second purchase at a site they buy from initially. Given these dismal figures, it is quite apparent that if you want to survive and be profitable you need to go beyond cookies and meta tags. You need to learn what your online customers are like offline.

Unfortunately, the traditional ways of gathering information on your Web site visitors and customers are limited. To foster long-term customer relationships and repeat online sales, you need to do more than track clickstream behavior, which is what most of today's ad networks (DoubleClick), collaborative filtering tools (NetPerception), Web data warehouses (PrimaryKnowledge), and customer relationship managers (E.piphany) do. Imagine walking into a brick-and-mortar store where, on the basis of what aisle you wandered into and what you picked up or examined, the retailer claims to know what you will eventually purchase. This flimsy premise is why so many ad networks and e-commerce systems track cookies and meta tags to try to cross- and up-sell online.

Customer Retention Is the Key

The truth is that this kind of tracking may help e-retailers sell some low-hanging fruit like CDs, but it won't suffice for establishing profitable, long-term customer relationships or selling high-ticket items like sport utility vehicles. Fostering repeat sales requires knowledge about customers' preferences, consumption rates, behavior, and lifestyles.
This generally requires knowing things such as a customer's age, income, values, lifestyle, and life stage, which can let, for example, a financial site know whether to offer an auto loan, a credit card, or a savings account. A visitor from a ZIP Code with the demographics of a "college campus" is a prime candidate for a credit card but not a home loan. This approach is more than cross- or up-selling; it is anticipating the needs of your customers and providing solutions that make sense to them. For the credit card issuer, capturing a college student can lead to a long and profitable relationship. The point is that the content provider or e-commerce site that develops these types of demographic profiles will get a bigger share of its online customers' business over a longer period of time.

Who Buys Air Conditioners Online?

This case study involves an e-commerce site that sells air conditioners online, primarily the kind that fit into windows. We wanted to go beyond log analysis, which is all that this e-retailer had been doing, to determine the best way to reach its customers and increase sales. We wanted to identify the lifestyles and preferences of the online customers; we also wanted to find out which neighborhoods they came from, what their households were like, and the types of dwellings in which they resided.

To discover the characteristics of these online customers, it was important to know not only who the site's "buyers" were but also who its "browsers," or nonbuyers, were. To obtain such insight we used two data sets created from purchase and contest forms. Having registration and order forms on your site is critical to the creation of visitor and customer databases and, of course, consumer profiles. Forms represent a feed-forward system by which customers can tell a retailer what they want and who they are.
It is best to use commercially provided offline demographics matched by ZIP Code or physical address rather than asking online visitors to provide them. Two things can happen when you ask your visitors for age or income data: They will either not provide it or, even worse, provide incorrect information. Commercial demographics include information on consumers' personal, household, financial, recreational, home, and auto ownership characteristics and have been used by database marketers for years in segmenting their customers and potential prospects.

There are several sources of demographics at various levels; for this case study we used data from CACI, Acxiom, and DataQuick. CACI provided us with neighborhood demographics; Acxiom gave us household-level psychographics; and DataQuick appended real estate-related information. These external offline demographics can tell you who your online visitors and customers are, where they live, and subsequently how they think, behave, and are likely to react to your online offers and incentives. The demographics and socioeconomic profiles are aggregated from several sources including credit card issuers, county recorder offices, census records, and other cross-referenced statistics.

ZIP Code Analysis

We started our analysis by appending ZIP Code demographics from CACI's ACORN neighborhood segmentation system. This includes a breakdown of all U.S. ZIP Codes by the number of households, grouped by multiple consumer segments. Using a segmentation technique, we found that our client's online products were being purchased in neighborhoods where almost 40 percent of residents were foreign-born, and many spoke a language other than English at home.
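The append step described above amounts to a join between form records and a ZIP-level demographic table. The following is a minimal Python sketch of that idea, not the tooling used in the case study: the table contents, field names (`zip`, `bought`, `segment`, `pct_foreign_born`), and function names are all hypothetical, and the demographic values are invented placeholders rather than CACI data.

```python
# Hypothetical ZIP-level demographics. A real table would come from a
# commercial vendor; these values are invented for illustration only.
ZIP_DEMOGRAPHICS = {
    "10001": {"segment": "urban high-rise renters", "pct_foreign_born": 38.0},
    "07030": {"segment": "urban high-rise renters", "pct_foreign_born": 41.0},
    "73301": {"segment": "suburban homeowners", "pct_foreign_born": 9.0},
}


def normalize_zip(raw):
    """Return a 5-digit ZIP from values like '10001-1234' or ' 7030'."""
    digits = "".join(ch for ch in str(raw).split("-")[0] if ch.isdigit())
    return digits[:5].zfill(5) if digits else None


def append_demographics(records, demographics=ZIP_DEMOGRAPHICS):
    """Left-join form records to ZIP demographics.

    Records whose ZIP is missing from the table are kept, with None
    in the appended fields, so unmatched customers stay visible.
    """
    enriched = []
    for rec in records:
        zip5 = normalize_zip(rec.get("zip", ""))
        demo = demographics.get(zip5, {})
        enriched.append({
            **rec,
            "zip": zip5,
            "segment": demo.get("segment"),
            "pct_foreign_born": demo.get("pct_foreign_born"),
        })
    return enriched


def buyers_by_segment(enriched):
    """Count buyers per appended neighborhood segment."""
    counts = {}
    for rec in enriched:
        if rec["bought"]:
            counts[rec["segment"]] = counts.get(rec["segment"], 0) + 1
    return counts


if __name__ == "__main__":
    forms = [
        {"zip": "10001-1234", "bought": True},
        {"zip": " 7030", "bought": True},
        {"zip": "73301", "bought": False},
    ]
    print(buyers_by_segment(append_demographics(forms)))
```

Normalizing the ZIP before the lookup matters in practice because form input often carries ZIP+4 suffixes or drops leading zeros, which are common in New Jersey ZIP Codes.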
Our segmentation analysis found that a high percentage of online sales came from neighborhoods with a rich mix of ethnic and racial groups: neighborhoods where nearly 60 percent were married-couple or single-parent families, the median age was 37.9 years, and workers commuted, with over a third of them crossing county or state lines on their way to work. The consumers were affluent, with a median income of $48,900, well educated, and gainfully employed; most of them held a bachelor's or graduate degree and worked in professional or managerial positions.

We found that the Web site's online customers were primarily renters from the urban canyons of large cities, living in high-density, high-rise, pre-1950s apartment buildings. They tended to be urban, mobile apartment dwellers from densely populated, central city locales in the largest metropolitan areas on the East Coast. This indicated that apartment dwellers were purchasing window air conditioners for older buildings that did not have central climate control. Our ZIP Code analysis found that these consumers were very highly concentrated in New Jersey and New York (see Table 1).

Household Analysis

To further define the profile of these online air conditioner customers, we appended demographics from Acxiom to their physical addresses. We did this in order to obtain consumer information at the household level (see Table 2). These demographics represented important lifestyle information, which could reveal hidden associations between air conditioning sales and attributes such as type of dwelling and a consumer's age or income level. To explore these relationships we used a rule-generating inference tool to segment the data. We found that sales tended to be higher to households in multifamily dwellings:
This confirmed the findings of the ZIP Code analysis and the high concentration of renters. We also validated higher sales rates to households with single adults:
We also found that most of their online customers did not own automobiles:
Lastly, we found that the age of the site's online customers was also a factor affecting sales; we observed this recurring pattern in the ZIP Code analysis and then again in the household-level analysis:
These findings coincided with the findings from the ZIP Code analysis: sales were primarily made to single individuals in their 30s. We found that sales were higher to households in multiple-family dwellings; in other words, sales were highest to apartment dwellers, most of whom did not own automobiles but instead used public transportation.

Real Property-Level Analysis

We also appended real property data from DataQuick based on information extracted from county assessors' and recorders' offices. This data provided detailed information about property ownership, age, size, and structural dwelling type. This analysis was designed to find associations between air conditioner sales and the types of buildings or homes of this e-commerce site's online customers. Using their physical addresses, we appended the attributes to their Web data (see Table 3). The purpose of this analysis was to see whether unique physical real property features were affecting this air conditioner site's online sales. Among the first segments we discovered were the following IF/THEN rules, which coincide with a ZIP Code analysis that identified the neighborhoods of these consumers to be high-rise, 1950s apartment buildings:
The following rules found significance based on dwelling size:
The dwelling size indicated these buildings were primarily rental units. Yet another set of rules found an association between rate of sales and the size of the building structures:
Again, the real property analysis confirmed the trends discovered in the ZIP Code and household-level analyses; sales were higher than average for multiple-family dwellings, which physically represented rental units.

Web Mining Digs Deeper

Prior to this analysis, this e-commerce site had limited its analytics to log analysis using IBM's Surf Aid service. However, log analyzers like this one are confined to using Extended Log Format files for generating their limited reports, which means that they only report on the number of visitors and how they came to the site - such as the "keyword" or search engine visitors used. Another field is reserved for cookie reporting, but in most instances the only thing this field can tell you is whether a visitor is new or returning. This Web mining analysis, on the other hand, went much further in defining who the site's online customers were. It found that they tended to be renters; most were single, affluent, and in their mid-30s, did not own an automobile but commuted to work, and lived primarily on the East Coast.

Reaching the Right Customer

Because the majority of its online customers did not own cars, we did not recommend radio advertising, which this e-retailer had been using to market its Web site. Instead, we recommended that it place ads in buses and subways because the majority of its online customers took public transportation. Billboards near subway entrances would also be good places to position ads for the Web site. For online banner ads, we recommended regional (East Coast) sites catering to males in their mid-30s, such as renter locator sites. We also recommended bilingual ads in Spanish. This Web mining analysis quantified the suspicions this e-commerce site had about its online sales.
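The IF/THEN rules referred to in the household and real property analyses compare the sales rate inside a segment with the overall rate. The sketch below shows that comparison for a single attribute in Python. It is only a stand-in for the unnamed rule-generating tool, which would also search combinations of attributes; the toy records, the `dwelling` and `bought` field names, and both function names are hypothetical.

```python
def segment_rules(records, attribute, outcome="bought", min_support=2):
    """Return (value, rate, lift, n) per attribute value, best lift first.

    rate is the purchase rate inside the segment; lift is that rate
    divided by the overall purchase rate across all records.
    """
    if not records:
        return []
    overall = sum(1 for r in records if r[outcome]) / len(records)
    if overall == 0:
        return []  # no sales at all, so lift is undefined

    # Group the outcome flags by attribute value.
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(bool(r[outcome]))

    rules = []
    for value, outcomes in groups.items():
        n = len(outcomes)
        if n < min_support:
            continue  # skip segments too small to say anything about
        rate = sum(outcomes) / n
        rules.append((value, rate, rate / overall, n))
    return sorted(rules, key=lambda rule: rule[2], reverse=True)


def format_rule(attribute, rule):
    """Render one rule in IF/THEN form."""
    value, rate, lift, n = rule
    return (f"IF {attribute} = {value} "
            f"THEN purchase rate = {rate:.0%} (lift {lift:.2f}, n={n})")


# Invented toy data: 3 of 4 multifamily records bought, 1 of 4 single-family.
TOY_RECORDS = (
    [{"dwelling": "multifamily", "bought": True}] * 3
    + [{"dwelling": "multifamily", "bought": False}]
    + [{"dwelling": "single-family", "bought": True}]
    + [{"dwelling": "single-family", "bought": False}] * 3
)

if __name__ == "__main__":
    for rule in segment_rules(TOY_RECORDS, "dwelling"):
        print(format_rule("dwelling", rule))
```

A lift above 1 marks a segment that buys more often than the average visitor, which is the sense in which the article describes sales as "higher than average" for a segment.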
Furthermore, using the same data we used for this segmentation analysis, we constructed a "propensity to purchase" model using a multilayer perceptron neural network. The purpose of the model was to identify online prospects based on their neighborhood demographics. The model used a ZIP Code as an input in order to predict whether a new visitor was likely to make an online purchase. The model had an overall accuracy rate of 84 percent on a training sample size of 3,501 records, correctly predicting the outcome of 2,945. To validate the model, we tested it on a held-out data sample of 19,319 records. Of 2,555 sale accounts, the model was able to correctly classify 1,832 or 72 percent of them. In other words, the model could correctly spot seven out of 10 sales prospects as they visited this online store. We could generate XML or Java code from this data mining model in order to identify new visitors with similar demographics by simply passing their ZIP Code via a form in order to make dynamic offers in real time while they were still at this e-commerce site. It is absolutely critical to make such quick offers to potential prospects, because on average only about one or two out of 100 visitors to a site are buyers - with most visits lasting an average of eight to 20 seconds. Furthermore, a recent Media Metrix and McKinsey & Co. study found that 55 percent of all visitors go to fewer than 100 sites out of the millions that are out there. The message to all e-commerce and content provider sites is clear: Be quick in identifying your potential prospects because you have a very small window of opportunity to close a sale and hold on to a visitor.
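The model figures quoted above can be checked with simple arithmetic, and the real-time step can be pictured as a lookup keyed on ZIP Code. The Python sketch below does both. The record counts come from the article; the `ZIP_SCORES` table, its values, the 0.5 threshold, and the function names are hypothetical, standing in for whatever the exported model would compute. Note that the 72 percent figure is the share of actual sale accounts the model caught; the article does not give enough counts to compute overall accuracy on the 19,319 held-out records.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts reported in the article.
TRAINING_ACCURACY = percent(2945, 3501)      # correct / training records
HELD_OUT_SALES_RECALL = percent(1832, 2555)  # sales caught / actual sales

# Hypothetical precomputed propensity scores per ZIP Code. In practice
# these would be produced by the trained model, not typed in by hand.
ZIP_SCORES = {"10001": 0.81, "07030": 0.77, "73301": 0.12}


def is_likely_buyer(zip_code, scores=ZIP_SCORES, threshold=0.5):
    """Flag a visitor as a prospect if the ZIP's score meets the threshold.

    Unknown ZIP Codes score 0.0, so they are never flagged.
    """
    return scores.get(zip_code, 0.0) >= threshold


if __name__ == "__main__":
    print(f"training accuracy: {TRAINING_ACCURACY:.1f}%")
    print(f"held-out sales recall: {HELD_OUT_SALES_RECALL:.1f}%")
    print("offer to 10001:", is_likely_buyer("10001"))
```

Precomputing scores per ZIP Code keeps the per-request work to a single dictionary lookup, which suits the short visits the article describes.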
Jesus Mena (jmena@webminer.com) is the author of Data Mining Your Website (Digital Press, 1999) and is the CEO of WebMiner.