How eCommerce Businesses Are Using Demographics and Data To Understand You Better


A Case Study

In October 2006, Netflix announced a public contest to improve its internally developed recommendations engine, Cinematch, because its own efforts to improve it had long since plateaued. Cinematch is the software embedded in the Netflix website that analyzes each customer’s movie-viewing habits and recommends other movies that the customer might enjoy. Netflix saw the potential sales lift as so great that it offered a grand prize of $1 million to anyone who could improve the accuracy of Cinematch’s predictions by at least 10%. There were over 50,000 contestants.

Teams from around the globe entered the competition, and the prize was awarded nearly three years later. They all worked from the same training data set provided by Netflix. At first, most contestants gravitated toward a technique called collaborative filtering. It would take, for example, a movie you rated highly, find other people who also rated it highly, and suggest another movie that those people also liked. But this approach alone could not reach the goal of a 10% improvement. As the contest progressed, most teams moved toward a different approach called Singular Value Decomposition (SVD), a technique from linear algebra that breaks a large matrix (here, the table of customers’ movie ratings) down into a small number of underlying factors. Finally, the winners, team BellKor’s Pragmatic Chaos, added a technique known as the Restricted Boltzmann Machine (RBM) to their blend of models to push their algorithm over the top and claim the million-dollar grand prize. An RBM is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs. You can read more about their winning solution here.
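To make the collaborative filtering idea concrete, here is a minimal sketch in Python. It is not the code any contestant used. The ratings, the user names, and the helper functions `cosine_similarity` and `recommend` are all made up for illustration, and the scoring rule (a similarity-weighted sum of other people's ratings) is just one simple choice among many.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {movie: stars}); not Netflix data.
ratings = {
    "ann": {"Alien": 5, "Heat": 4, "Up": 1},
    "bob": {"Alien": 5, "Heat": 5, "Blade Runner": 5},
    "cat": {"Up": 5, "Toy Story": 5, "Alien": 1},
    "dan": {"Alien": 4, "Blade Runner": 4, "Up": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the movies both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=2):
    """Rank movies the user has not rated by a similarity-weighted sum
    of the ratings given by every other user."""
    seen = ratings[user]
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(seen, other_ratings)
        for movie, stars in other_ratings.items():
            if movie not in seen:
                scores[movie] = scores.get(movie, 0.0) + sim * stars
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]


print(recommend("ann", ratings))
```

In this toy data, the people whose ratings most resemble Ann's both liked "Blade Runner", so it ranks ahead of "Toy Story", which was rated only by someone whose tastes differ from hers.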

After the contest, Cinematch’s suggestions were so good, and so critical to Netflix’s growth and success, that they drove 60 percent of DVD rentals. Think about that for a moment. Sixty percent of their customers’ delight, their interaction with Netflix, was driven by a recommendations engine.

Interestingly, the Netflix Prize competitors did not know anything about the demographics of the customers whose taste they were trying to predict. Even though Netflix has a good deal of demographic information about its users, the company does not currently use it much to generate movie recommendations. Knowing who people are isn’t very predictive of their movie tastes.

This is not always the case, however, as many outcomes, tastes, and behaviors are better predicted with demographics considered. Examples where demographics are good predictors include restaurant preferences, voting habits, the propensity to buy diapers or a minivan, and so forth.

Lastly, one should note that eCommerce businesses often have a significant advantage over brick-and-mortar stores. They can track everything their customers do on their website. Every page visit, every purchase made, every item rated or reviewed on the site can be recorded, mined and used to develop a recommendations engine.
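As a rough illustration of how recorded site activity can feed a recommendations engine, the sketch below counts which items are bought by the same customers and uses those counts for a simple "customers who bought this also bought" list. The event log, its field layout, and the function names `co_purchase_counts` and `also_bought` are assumptions made up for this example, not a description of any particular platform.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical event log: (customer_id, event_type, item_id).
events = [
    ("c1", "purchase", "tent"),
    ("c1", "purchase", "stove"),
    ("c1", "view", "kayak"),
    ("c2", "purchase", "tent"),
    ("c2", "purchase", "stove"),
    ("c2", "purchase", "lantern"),
    ("c3", "purchase", "tent"),
    ("c3", "purchase", "lantern"),
    ("c3", "purchase", "stove"),
    ("c4", "purchase", "kayak"),
]


def co_purchase_counts(events):
    """Count, for each pair of items, how many customers bought both."""
    baskets = defaultdict(set)
    for customer, event_type, item in events:
        if event_type == "purchase":  # this sketch ignores page views
            baskets[customer].add(item)
    counts = defaultdict(Counter)
    for items in baskets.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def also_bought(item, counts, top_n=3):
    """Return the items most often bought by customers who bought `item`."""
    return [other for other, _ in counts.get(item, Counter()).most_common(top_n)]


counts = co_purchase_counts(events)
print(also_bought("tent", counts))
```

A production system would typically weigh views, ratings, and reviews as well, but even simple co-purchase counts like these show why capturing every interaction matters.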

Does a recommendations engine make sense for your business?

Do you have a lot of data waiting to be mined? (Or do you need to start capturing more data?)

Numtra can help.

  • We can evaluate your data for applicability and readiness for model development
  • We can co-develop a recommendations engine with you (or build one with minimal input)
  • We can deploy your model or recommendations engine in a production environment scaled to your business, providing real-time results