Overview of HCE Features

Menu Buttons
: Import Clustering and Export Clustering
: You can choose to either filter out or gray out the items not selected by the Minimum similarity bar. Toggle this button to switch between the two.
: You can compare two different clustering results by clicking on this button.
Analysis Sample
The comScore panelist-level database captures detailed browsing and buying behavior by one hundred thousand Internet users across the United States. The panel is based on a random sample from a cross-section of more than 1.5 million global Internet users who have given comScore explicit permission to confidentially capture their Web-wide activity. Click here to access the data legend.
The data analyzed is a subset of transactions that occurred on a specific day on several e-retailer websites such as amazon.com, buy.com, walmart.com and target.com.
Screen #1

This screen snapshot shows that there are 11 clusters at this point. A closer look at the clusters shows that individual observations were merged
based on the online buyer profile. For instance, the buyer with machine ID# 3799719 has 2 observations that have been merged into one
cluster. The buyer is shopping on bestbuy.com for two items priced at $29: a Microsoft Xbox controller and Gathering of Developers'
Max Payne. Demographic information tells us that the buyer is white with an income greater than $100k, is 35 to 39 years old,
and uses a narrowband (dial-up) connection.
Note that I used the hierarchical clustering method; you can choose another method by clicking on the Clustering menu item. This method repeatedly merges
the two closest observations or clusters (those with the smallest distance) into one cluster. At the highest level of similarity (=1), this results in 11 clusters.
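To make the merging step concrete, here is a minimal sketch in Python. It is not HCE's implementation: it assumes single linkage with Euclidean distance, and it stops merging at a hypothetical `max_distance` cutoff, which only loosely stands in for the Minimum similarity bar (the scale HCE uses for similarity is not assumed here). The function name `single_linkage` and the toy observations are made up for illustration.

```python
from itertools import combinations
from math import dist


def single_linkage(points, max_distance):
    """Repeatedly merge the two closest clusters until the closest pair
    is farther apart than max_distance (a hypothetical cutoff).

    Returns a sorted list of clusters, each a sorted list of indices
    into points.
    """
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        # Single linkage (an assumption): the distance between two clusters
        # is the distance between their closest members.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: gap(clusters[pair[0]], clusters[pair[1]]),
        )
        if gap(clusters[a], clusters[b]) > max_distance:
            break  # nothing left that is close enough to merge
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Made-up observations: (product price, basket total).
toy = [(29.0, 58.0), (29.0, 58.0), (30.0, 60.0), (200.0, 210.0)]

tight = single_linkage(toy, max_distance=0.0)  # only identical rows merge
loose = single_linkage(toy, max_distance=5.0)  # nearby rows merge as well
```

With a cutoff of 0, only the two identical observations end up together, much like the two observations of one buyer described above; a looser cutoff also pulls in the nearby third observation.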
Screen #2

I decreased the level of similarity to 0.8. I now have 26 clusters, and the buyer with machine ID# 3799719 (let's call him John Doe)
is now in a cluster with the buyer with machine ID# 82933. A closer look at the cluster is shown below.

Screen #3 (Profile Search)

The grayed area, called the silhouette, illustrates the “profile” of the population in this data set, whereas the two lines show the profiles for ID# 3799719 and ID# 82933.
What does the silhouette tell us about these online shoppers? There is a wide gap in the prices of products. The most expensive product costs
Noteworthy Findings
"Households with children are more likely to have a broadband internet connection."

At this level of the similarity bar, there are two clusters based on the connection speed. I have highlighted the cluster with dialup internet speed (left). Looking at the second bar from the bottom, you can see the distribution of CHILD_PRESENT across the clusters. There appear to be more red data points (child present = yes) in the non-highlighted sector (broadband cluster) than in the highlighted one (dialup cluster).
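One way to check this visual impression is to compute the share of households with a child in each cluster. The sketch below is in Python and uses made-up records; the helper `share_with_children` and the tuple layout are assumptions for illustration, not part of HCE or the comScore data.

```python
from collections import Counter

# Made-up records: (connection speed, child present in household).
records = [
    ("broadband", True), ("broadband", True), ("broadband", False),
    ("dialup", True), ("dialup", False), ("dialup", False),
]


def share_with_children(rows):
    """Return, for each connection speed, the fraction of rows
    where a child is present."""
    totals = Counter(speed for speed, _ in rows)
    with_child = Counter(speed for speed, child in rows if child)
    return {speed: with_child[speed] / totals[speed] for speed in totals}


shares = share_with_children(records)
```

Comparing the two fractions gives a number to set beside the red and non-red points in the CHILD_PRESENT bar.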
An e-retail website could use this type of information to launch a marketing campaign. For instance, it could place banners on popular websites that change depending on the visitor's internet connection speed. A dialup customer would be targeted with more adult-centric products, whereas a broadband customer would be shown products geared more towards parents and/or children.
This is just one example of how the tool gives greater insight to managerial decision makers.
"Online shoppers are very focused on their shopping. Their basket at the end of their shopping is not as full as that of offline shoppers."

A look at the scatterplot tells us that there is a strong correlation (0.511) between the individual product price and the total basket price. As a matter of fact, this is the second strongest correlation when we analyze all data side by side (see the "Sorted Scatterplots" window). In other words, online shoppers have more of a "one item at a time" shopping behavior. Amazon is an online mall, but because shopping online is more convenient than shopping at a mall, shoppers buy on an as-needed basis. Offline shoppers buy more items during a day at the mall because that is more convenient than having to come back a couple of hours or days later for another item.
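The link between this correlation and "one item at a time" shopping can be illustrated with a small Python sketch of the Pearson correlation coefficient. This assumes Pearson's r is the measure the scatterplot reports, which is not confirmed here. The values and the `prod_price` name are made up: when every basket holds a single item, the product price equals the basket total and the coefficient is 1; it drops as baskets fill up with other items.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up product prices (hypothetical column name).
prod_price = [10.0, 20.0, 30.0, 40.0]

# Case 1: one item per basket, so the basket total equals the product price.
basket_single = [10.0, 20.0, 30.0, 40.0]

# Case 2: baskets padded with other items of unrelated price.
basket_multi = [60.0, 25.0, 90.0, 45.0]

r_single = pearson(prod_price, basket_single)
r_multi = pearson(prod_price, basket_multi)
```

On real data the coefficient falls somewhere between these two extremes, so a value such as 0.511 is consistent with many, but not all, baskets holding a single item.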
Is there a difference between "dialup shoppers" and "broadband shoppers"? Let's look at the screen below.

I have highlighted the cluster with a broadband internet connection. On the Profile Search tab, you can see the "silhouette", which is the gray area based on the complete sample. The dark gray bars are the data points for shoppers with a broadband connection. Looking at the BASKET_TOT item, we clearly see that the most expensive items were bought by dialup shoppers. It could be that these shoppers tend to live in remote areas, far from shopping malls. Consequently, they rely more on the internet and shop for a wider variety of products in terms of price.
We have gained useful insights from this Amazon.com data. Using the intelligence gained, managers can then make more effective marketing, financial or strategic decisions.