As data becomes increasingly cheap to collect and store, companies are relying more heavily on data analytics and machine learning to make business decisions. The applications of machine learning are growing rapidly, with the principal areas of application listed in an excellent blog by Nikki Castle – “6 Common Machine Learning Applications for Business”. Our blog builds on each of these categories to illustrate how machine learning applications can add value to a business by increasing agility and knowledge of the customer base.
It is often useful to segment customers into different groups for marketing campaigns. Although marketers generally have good intuition for this, machine learning algorithms have been shown to do a better job at clustering or classifying customers by purchasing behavior. Demographics such as age, race, and gender, as well as online browsing habits, can serve as features for the model. It is naive to segment customers by arbitrary features (e.g., to take white males aged 30-35 and call that one segment) when a machine learning algorithm can determine the combination of features that best separates customers into groups such as frequent, sporadic, or infrequent shoppers. We could also consider binning customers by total dollars spent annually at the company. Appropriate customer segmentation also helps us better understand larger issues, such as customer churn and lifetime value.
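As a rough illustration of clustering by purchasing behavior, the sketch below runs a plain k-means on two made-up features per customer. The function name `kmeans`, the feature choice (orders per year, annual spend in hundreds of dollars), and the sample data are all assumptions for illustration; a real project would more likely use a library implementation and many more features.

```python
from math import dist

def kmeans(points, k, iterations=50):
    """Plain k-means: returns (centroids, labels) for a list of numeric tuples."""
    # Deterministic initialisation: pick k points spread evenly through the sorted data.
    ordered = sorted(points)
    centroids = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each customer joins the nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, labels

# Hypothetical customers: (orders per year, annual spend in hundreds of dollars).
customers = [
    (48, 30.0), (12, 9.0), (1, 0.5),
    (52, 28.0), (10, 11.0), (2, 1.2),
    (45, 33.0), (14, 8.0), (1, 0.8),
]
centroids, labels = kmeans(customers, k=3)
for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

With this toy data the three segments correspond to infrequent, sporadic, and frequent shoppers. Because k-means relies on distances, features measured on very different scales should normally be standardised first; the sample values here were chosen to be roughly comparable.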
The highest-value customers at a company are the most important for the company to understand. Models that predict the future revenue an individual customer will bring to your business in a given period are invaluable. Customer lifetime value (CLV) is a central concern for any commercial endeavor, and is generally defined as the discounted value of future profits generated by a customer. Because the individual costs associated with each customer are typically easy to predict, machine learning is used predominantly to model the revenue side of the CLV. The data used to build these models can come from a variety of sources, but typically takes the form of transactional and behavioral data, which are easy to collect in the modern online world. At successful companies, CLV informs marketing decisions at every step of the customer lifecycle.
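The definition of CLV as a discounted value of future profits can be written as a short calculation. This is a minimal sketch that assumes the expected profit for each future period is already known; the function name `customer_lifetime_value`, the parameter names, and the sample figures are illustrative, not taken from any particular model.

```python
def customer_lifetime_value(expected_profits, discount_rate):
    """Discounted sum of expected per-period profits.

    expected_profits[0] is the profit expected one period from now,
    expected_profits[1] two periods from now, and so on.
    """
    return sum(
        profit / (1 + discount_rate) ** t
        for t, profit in enumerate(expected_profits, start=1)
    )

# Hypothetical customer: 100 in profit per year for three years, 10% discount rate.
clv = customer_lifetime_value([100.0, 100.0, 100.0], discount_rate=0.10)
print(f"{clv:.2f}")  # prints 248.69
```

In practice the hard part is estimating the profit stream itself, which is where the revenue models described above come in.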
Some models rely on a heuristic approach, where a customer’s past purchase behavior and recent activity (or inactivity) are used as predictors of future behavior. This approach has its limitations: it classifies customers who haven’t purchased a product in a given period (e.g., six months) as inactive, when this may be typical behavior for that individual. A probabilistic approach gets around this by instead assigning each customer a probability of making another purchase in the stated time interval. Because different customers generate different patterns in successive purchasing events, each unique customer is modeled separately using probabilistic, generative models. Purchasing behavior, lifetime, and monetary value serve as the three latent parameters for such models.
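The difference between the two approaches can be seen in a deliberately simplified sketch. It assumes each customer’s purchases arrive at a constant, customer-specific rate (a Poisson process), so the chance of at least one purchase in a window is 1 - exp(-rate × window). This ignores the lifetime and monetary-value components mentioned above, so it is a toy stand-in for the generative models, not an implementation of them; all function names and figures are hypothetical.

```python
from math import exp

def heuristic_is_active(days_since_last_purchase, cutoff_days=180):
    """Fixed-cutoff rule: anyone silent for longer than the cutoff is 'inactive'."""
    return days_since_last_purchase <= cutoff_days

def purchase_probability(n_purchases, observed_days, window_days):
    """Probability of at least one purchase in the window, assuming purchases
    follow a Poisson process whose rate is estimated from this customer's history."""
    rate = n_purchases / observed_days  # purchases per day for this customer
    return 1 - exp(-rate * window_days)

# Hypothetical customer who buys about once a year and last bought 200 days ago.
print(heuristic_is_active(days_since_last_purchase=200))  # False: labelled inactive
yearly_buyer = purchase_probability(n_purchases=3, observed_days=1095, window_days=180)

# Hypothetical customer who buys about once a month.
monthly_buyer = purchase_probability(n_purchases=36, observed_days=1095, window_days=180)
print(round(yearly_buyer, 2), round(monthly_buyer, 2))
```

The yearly buyer is written off by the cutoff rule even though a 200-day gap is normal for them, whereas the per-customer probability still gives them a substantial chance of buying again within six months.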
Recommendation engines calculate similarity metrics to measure how closely related products are to one another. They then use customer shopping history and/or product reviews to obtain a metric for customer taste. The model uses the product-product similarity matrix together with the customer-product “affinity” matrix to predict products the customer will like but has not yet seen.
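One simple way to combine the two matrices is item-based scoring: compute cosine similarity between product columns of the affinity matrix, then score each unseen product by its similarity to the products the customer already rated. The sketch below assumes ratings from 1 to 5 with 0 meaning "not seen"; the function names, the product list, and the ratings are invented for illustration, and production systems use considerably more elaborate methods.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(affinity, customer, top_n=1):
    """Rank the products a customer has not seen (rating 0), best first."""
    products = list(zip(*affinity))  # one rating vector per product (matrix columns)
    n = len(products)
    # Product-product similarity matrix.
    similarity = [[cosine(products[i], products[j]) for j in range(n)] for i in range(n)]
    ratings = affinity[customer]
    seen = [j for j in range(n) if ratings[j] > 0]
    # Score each unseen product: similarity-weighted sum of the customer's ratings.
    scores = {
        i: sum(similarity[i][j] * ratings[j] for j in seen)
        for i in range(n) if ratings[i] == 0
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical catalogue and customer-product affinity matrix (rows are customers).
names = ["running shoes", "running socks", "cookbook", "blender"]
affinity = [
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [5, 0, 0, 0],  # customer 3 has only rated the running shoes
]
top = recommend(affinity, customer=3, top_n=1)
print([names[i] for i in top])
```

Customer 3 has only rated running shoes, and running socks are the product most often rated highly alongside them, so socks rank first.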
Powerhouse recommender systems built at Netflix and Amazon are said to provide $1 billion in content suggestions per year and to drive a 20-30% lift in sales annually.
Acquiring new customers can be several times more expensive than retaining existing ones. It is therefore important for businesses to understand their customer base and, where applicable, customers’ reasons for leaving. Some customers switch to a competitor; others opt to leave the market altogether. Understanding customers who choose the former exposes weaknesses in the company’s marketing relative to its competitors, while understanding customers who choose the latter exposes new markets where the company can expand, and perhaps suggests advertising techniques to incentivize customers to re-enter the marketplace.
The churn rate is calculated by dividing the number of customer cancellations in a given time period by the number of active customers at the start of that period. Most models simply predict churn by engineering features that are thought to contribute to it. Some models go as far as to predict the type of churn a customer may experience, or build a different model for each type. There are four basic types of churn. Contractual churn occurs when customers in an ongoing payment contract, such as autopay, discontinue service, while non-contractual churn occurs when customers, who are free to leave the marketplace at any time, simply stop making new payments. Voluntary churn occurs when customers make a conscious choice to leave, and involuntary churn is a termination of service triggered by an event such as the expiry of a credit card. These types are, of course, not mutually exclusive. It should also be noted that for non-contractual businesses, the challenge in modeling churn lies in defining a clear churn event.
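The churn-rate definition above translates directly into code. This is a minimal sketch; the function name `churn_rate` and the sample counts are illustrative.

```python
def churn_rate(cancellations, active_at_start):
    """Cancellations during a period divided by active customers at its start."""
    if active_at_start <= 0:
        raise ValueError("active_at_start must be positive")
    return cancellations / active_at_start

# Hypothetical month: 1,000 active customers at the start, 50 cancellations.
rate = churn_rate(cancellations=50, active_at_start=1000)
print(f"{rate:.1%}")  # prints 5.0%
```

For a non-contractual business, the `cancellations` count itself depends on how a churn event is defined, for example a fixed period of inactivity.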
Business insights can be gained from predictive churn models: the output probability scores for each customer may be used to inform retention campaigns, targeted discounts, or targeted e-mails. Feature importance can also be extracted from the model, giving us an understanding of the shared characteristics of customers who churn, and insights into their motivation.
Computer vision and artificial intelligence are growing fields with increasing business applications, and online retailers are beginning to rely more on similarities in product images to make recommendations. This is particularly useful in the fashion industry, where recommendations can be made for articles of clothing that look similar to those in the customer’s purchase history. Facial recognition, a specific application of image classification, is one of Facebook’s most popular features, and is also used for security (e.g., to analyze surveillance footage of an ATM after a robbery, or to detect counterfeit IDs and passports). With facial recognition now being used to unlock cellphones, it has already become a regular part of many people’s lives. Recent advancements in computer vision have also made autonomous driving a reality, with autonomous vehicles having a larger field of view than any human driver. Image data is a vast resource and perhaps the most promising source of untapped business value and customer insight. Complex data, whether images, video, audio, or forms of data we have not yet envisaged, will challenge data science and its practitioners for many years to come.
For more information on the uses of data science in business, Bernard Marr’s 2015 blog at Data Science Central provides a high-level categorization of business aspects which benefit from big data analyses, while a much finer-grained listing of use cases is available from Kaggle.