Blog – Predictive Analytics – a Soup Story by Geert Verstraeten


Predictive Analytics – a Soup Story

A simple metaphor for projects in predictive analytics 

By: Geert Verstraeten, Predictive Analytics advocate, Managing Partner and Professional Trainer, Python Predictions

The analytical scene has recently been dominated by the prediction that we would soon experience a serious shortage of analytical talent. In response, academic programs and massive open online courses (MOOCs) have sprung up like mushrooms after the rain, all with the purpose of developing skills for the analyst or their more modern counterpart, the data scientist. However, in the original McKinsey article, the shortage of analytics-oriented managers was predicted to become ten times more important than the shortage of analysts[1]. But how do we offer relevant concepts and tools to managers without drowning our ‘sweet victims’ in technology and jargon?

For managers, most analytics training falls short in a critical way. The vast majority of new analytics training focuses on core analytics algorithms and model building, not on the organizational process needed to apply them. In my opinion, the single most important tool for any manager lies in understanding the process that should be managed. When asked to supervise predictive analytical developments, the absolute essence lies in having a solid understanding of the main project phases. Obviously, we are not the first to realize that this is vital. Tools have been developed to describe the process methodology for developing predictive models[2]. However, it is difficult for non-experts to become excited about these tools, as they describe phases in a rather dry way.

We have experimented with different ways to present process methodology in a more fun and engaging way. Today, we no longer experiment. In our meetings and trainings with managers, we present the development of analytical models as being as simple as making soup in a soup bar.

Project definition

This first phase is concerned with understanding the organization’s needs, priorities, desires and resources. Taking the order basically means we should start by carefully exploring what it is that we need to predict. Do we want to predict who will leave our organization in the next year, and if so – how will we define this concretely? Once the order is clear, it is time to check the stock to make sure we will be able to cook the desired dish. This is equivalent to checking data availability. Additionally, it is important to have an idea about timing: will our client need to leave on time to catch the latest movie? This is pretty similar to drawing up a project plan.

Data preparation

The second phase deals with preparing all useful data so that they are ready to be used in the analysis. For those not familiar with (French) cooking jargon, mise en place is a term used in professional kitchens to refer to organizing and arranging the ingredients (e.g. freshly chopped vegetables, spices, and other components) that a cook will require for his shift[3]. Data are for predictive analytics what ingredients are for making soup. In predictive analytics, data are gathered, cleaned and often sliced and diced such that they are ready to be used in a later analytical stage.
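For readers who like to see the mise en place on the cutting board, here is a minimal sketch of what gathering, cleaning and slicing can look like. It is only an illustration: the records, the field names (`tenure_years`, `salary`, `left`) and the choice to fill gaps with the median are all assumptions made for this example, not a description of any real project.

```python
from statistics import median

# Hypothetical raw employee records; every field name is invented for illustration.
raw_records = [
    {"id": 1, "tenure_years": "3", "salary": "42000", "left": "no"},
    {"id": 2, "tenure_years": "", "salary": "39000", "left": "yes"},
    {"id": 2, "tenure_years": "", "salary": "39000", "left": "yes"},  # duplicate row
    {"id": 3, "tenure_years": "7", "salary": "51000", "left": "no"},
]


def prepare(records):
    """Clean raw records and reshape them into model-ready rows."""
    # Drop duplicates, keeping the first row seen for each id.
    unique = {}
    for rec in records:
        unique.setdefault(rec["id"], rec)
    rows = list(unique.values())

    # Fill missing tenure with the median of the known values (an assumed choice).
    known = [float(r["tenure_years"]) for r in rows if r["tenure_years"]]
    fallback = median(known)

    prepared = []
    for r in rows:
        tenure = float(r["tenure_years"]) if r["tenure_years"] else fallback
        prepared.append({
            "id": r["id"],
            "tenure_years": tenure,
            "salary_k": float(r["salary"]) / 1000,  # rescale to thousands
            "left": 1 if r["left"] == "yes" else 0,  # encode the target as 0/1
        })
    return prepared


clean_rows = prepare(raw_records)
print(clean_rows)
```

In a real kitchen the chopping takes far longer than this suggests, but the steps are the same: remove what does not belong, fill what is missing, and cut everything into a shape the next phase can use.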

Model building

The main task in cooking the soup lies in choosing exactly those ingredients that blend into a great result. This is no different in predictive modeling, where the absolute essence lies in selecting those variables that are jointly capable of predicting the event of interest. One does not make a great soup with only onions. Obviously, not only the presence of ingredients is relevant, but also the proportions in which they are used – compare this to the parameters of predictors: not every predictor is equally important for obtaining a high quality result. Finally, cooking techniques matter just as much as algorithms do in predictive analytics – they represent essentially different ways to combine the same data into the best soup.

Model validation

In cooking it is crucial to taste a dish before it is served. This is very similar to model validation in predictive model building. Both technical and business relevant measures can be used to objectively determine whether a model built on a specific data set will hold true for new data. As long as the soup does not taste good, we can iterate back to cooking, until the final soup is approved – i.e. the champion model is selected.
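The tasting step can be sketched in a few lines. What follows is a toy illustration under loud assumptions: the data are synthetic, the "true" leaving behaviour is made up, and the two candidate recipes are simple cut-off rules rather than real algorithms. The point is only the pattern: build on one part of the data, taste on a part that was kept aside, and let the holdout score pick the champion.

```python
import random


def make_data(n, seed):
    """Generate synthetic (tenure, satisfaction, left) rows; purely illustrative."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        tenure = rng.uniform(0, 10)
        satisfaction = rng.uniform(0, 1)
        # Assumed ground truth: unhappy staff with short tenure leave more often.
        p_leave = 0.8 if (tenure < 4 and satisfaction < 0.4) else 0.1
        rows.append((tenure, satisfaction, int(rng.random() < p_leave)))
    return rows


def accuracy(model, rows):
    """Share of rows where the prediction matches the observed outcome."""
    return sum(model(t, s) == y for t, s, y in rows) / len(rows)


def fit_tenure_only(rows):
    """Pick the tenure cut-off that scores best on the training rows."""
    cut = max(range(11), key=lambda c: accuracy(lambda t, s: int(t < c), rows))
    return lambda t, s: int(t < cut)


def fit_tenure_and_satisfaction(rows):
    """Grid-search a tenure cut-off and a satisfaction cut-off together."""
    grid = [(c, k / 10) for c in range(11) for k in range(11)]
    cut, sat = max(
        grid,
        key=lambda p: accuracy(lambda t, s: int(t < p[0] and s < p[1]), rows),
    )
    return lambda t, s: int(t < cut and s < sat)


data = make_data(1000, seed=42)
train, holdout = data[:700], data[700:]  # build on train, taste on holdout

candidates = {
    "tenure_only": fit_tenure_only(train),
    "tenure_and_satisfaction": fit_tenure_and_satisfaction(train),
}

# Score every candidate on rows it has never seen, then keep the best one.
scores = {name: accuracy(model, holdout) for name, model in candidates.items()}
champion = max(scores, key=scores.get)
print(scores, "champion:", champion)
```

Accuracy is used here only because it is the easiest measure to read; in practice the business relevant measures mentioned above often matter more than any single technical score.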

Model usage

This phase is all about presentation and professional serving. A great soup served in an awful bowl may not be fully appreciated. The same holds true for predictive models – a model with fantastic performance may fail to convince potential users when key insights are missing. Drawing a colorful profile of the results may prove instrumental in convincing the audience of the model’s merit. If done successfully, this will likely result in an in-field experiment, for example designing a set of retention campaigns targeting those with the highest potential to leave. At that point, the engaged analyst should check in on whether the meal is being enjoyed.

Conclusion

This simple, intuitive metaphor has been important to us because it allows managers to engage with the process in a fun way. Presenting the process in a non-technical way makes it digestible (to be fair, I’ve stolen this phrase from my friend Andrew Pease, Global Practice Analytics Lead at SAS, because it makes such great sense in this context). However, it should remain clear that it is only a metaphor. At some point, building predictive models is obviously also different from making soup. Every phase, especially project definition, involves many more components than those where a link with soup can be found. But the metaphor gets us where we want to be – a point where a discussion is possible on what is needed to develop predictive models, and where a minimum of trust can exist: it ensures that we get on speaking terms with decision makers and all those who will be impacted by the models developed.

Notes and further reading

We fully realize this is not completely different from CRISP-DM, the Cross Industry Standard Process for Data Mining, which was developed in 1996, and is still the leading process methodology used by 43% of analysts and data scientists. However, unless you are a veteran and/or an analyst, it is difficult to get really excited about CRISP-DM or its typical visualization. For those looking for a more in-depth understanding of the process, I recommend reading the modern answer to CRISP-DM, the Standard Methodology for Analytical Models (by Olav Laudy, Chief Data Scientist, IBM).

[1] In a previous post, we have also argued that the analytics-oriented manager is the main lever for success with predictive analytics.

[2] For the sake of clarity: a predictive model is a representation of the way we understand a phenomenon – or if you will, a formulaic way to combine predictive information so as to optimally predict future behavior or events.

[3] See the Wikipedia definition of mise en place.

About Geert

Geert Verstraeten is Managing Partner at Python Predictions, a niche player in the domain of Predictive Analytics. He has over 10 years of hands-on experience in Predictive Analytics and in training predictive analysts and their managers. His main interest lies in enabling clients to take their adoption of analytics to the next level. His next training will be organised in Brussels on October 1st 2015.

 

Gratitude goes to Eric Siegel, Andrew Pease and our team at Python Predictions for delivering great suggestions on an earlier version of this article. All remaining errors are my own.



Book – Fraud Analytics by Veronique, Bart and Wouter available on Amazon

Fraud Analytics

Using descriptive, predictive and social network techniques.

by Veronique Van Vlasselaer, Bart Baesens and Wouter Verbeke.

We are pleased to announce that the book about Fraud Analytics is now available for purchase on Amazon.

Here is the full video of this presentation:


The essence of Predictive Analytics for Managers – Training – Oct 1st 2015 – @Data Innovation Hub


The essence of Predictive Analytics for Managers

Defining and managing projects in predictive analytics

Target audience

Managers of analytical teams, Managers of functional departments (marketing, risk, operations, HR,…), Project managers, CXO.

Details

  • Duration: One afternoon workshop (4h):
    • October 1st, 2015
  • Location: European Data Innovation Hub @ AXA, Vorstlaan 23, 1170 Brussel
  • Price: 570€ per manager
  • Limited to 8 – 12 participants

Registration:

Please register using Eventbrite following this link.

Motivation

Fueled by the energy around Big Data projects, an increasing number of managers are attracted to the domain of analytics. When successfully applied, analytics provides the key to turn data into big value. But how do we ensure organisations reap the maximum return on their investments? How to increase the impact advanced analytical teams have on their organisations? This training provides a backbone for managing projects in Predictive Analytics that maximally impact the organisation. Additionally, this training establishes the foundation for fruitful collaboration between analysts, their peers and decision makers.

Overview

Applying predictive analytics has the potential to bring a huge impact to an organisation. Typical goals of such high-impact projects involve:

  • (i) increase targeted marketing success by predicting response,
  • (ii) increase marketing relevance by offer personalisation,
  • (iii) decrease risk exposure by predicting credit or fraud risk,
  • (iv) increase process efficiency,
  • (v) retain crucial staff members, etc.

In this training, we focus on the key elements needed for managers to obtain success through predictive analytics.
The workshop is designed as an interactive learning experience packed with best practices illustrated with real domain experience.

Learning objectives

After the training, participants will be able to define and manage developments in Predictive Analytics. In practice participants will be able to:

  • define a project in predictive analytics in detail & understand:
    • a simple process useful for managing the development of predictive models
    • the intuition of each step in the development process
    • the underlying principles for predictive analytics
    • the requirements and limitations of predictive analytics
    • how to manage and increase usage of predictive analytics

Topics

  1. what is predictive analytics
  2. basic underlying principles for predictive analytics
  3. requirements for predictive analytics
  4. limitations of predictive analytics
  5. a reliable process for managing and building predictive models

Prerequisites

Before the start of the first session, participants should attempt to define a practical and relevant project within their organisation. At completion of the workshop, it is the aim that managers will be able to further define and understand the steps needed to manage this challenge to success.

About the trainer

Geert Verstraeten

Geert Verstraeten (PhD) is a dynamic trainer with a solid background both in predictive analytics and in professional training. Geert has 14 years of experience in building predictive models for organisations in a wide array of industries. Additionally, he has 14 years of academic and industry experience in training and coaching managers and analysts to succeed with Predictive Analytics. Since 2006, he has been managing partner at Python Predictions (www.pythonpredictions.com), a Brussels-based niche player in Predictive Analytics, and he was involved in building analytical communities both in Belgium (www.BAQMaR.eu) and abroad (www.pawcon.com/london/). Geert is a frequent speaker at academic and business conferences in analytics. Since 2014, Geert has been a certified professional trainer (http://www.hrdacademy.be/).

Certification


Attendees receive an electronic version of the handouts and a proof of participation at the conclusion of the workshop.

More information


Here is the link to a nice write up about this executive class.
