Data Scientists and Data Analysts have a number of skills unique to their positions, with analysts focusing more on creating business value and data scientists including simulation and optimization in their jobs.
By Ray Major, Halo Business Intelligence, July 2014.
The problem is that the term carries a mystique, and its features are hard to distinguish within the crowded taxonomy of data terms. This revisionism is commonplace in many fields, but particularly in the fast-growing data science field. I know many colleagues who rebranded themselves as data scientists to be more attractive in the market. They would argue that an analyst makes reports while a data scientist makes visualizations, even if both have exactly the same content.
There is a slew of other terms that get lumped into these categories and cause confusion when talking about statistics, business intelligence, or data science, but none is more elusive than the most nebulous of terms: Analytics.
The graphic below is an effort to bring clarity to the analytics domain and to develop a simple glossary for framing the actual work being performed.
The overlaps between business intelligence and data science are significant. In fact, a small majority of the tasks, and the vast majority of the labor, are shared between the two related disciplines.
Here are some additional comments on the infographic:
- Business analytics is a spectrum of different types of analytics with increasing complexity and business value from descriptive to prescriptive.
- The overlaps between analytics phases reflect finer points that, if drawn in detail, would make the graphic less readable.
- The overlap between descriptive and predictive analytics reflects that some machine learning techniques, like K-means clustering, are unsupervised, so there is no measurement of accuracy that can apply to the model’s output.
- Supervised learning, such as a neural network, can benefit from simulation testing or from simple optimization such as a profit chart. More advanced use cases fall squarely into data science, like using simulations as inputs for predictive models, which can in turn drive optimization.
- The statistics in the Data Mining sphere refer to basic quantitative analysis, like cumulative distributions and correlation.
- Decision Management is the combination of business rules and processes to augment the business value of the analytic output.
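To make the "profit chart" mentioned above a little more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a method taken from the infographic: the scores, the response outcomes, the revenue per response, and the cost per contact are all made-up numbers, and the function names `profit_curve` and `best_cutoff` are hypothetical. The idea is simply to rank customers by a supervised model's predicted score, accumulate profit as more of them are contacted, and pick the cutoff where cumulative profit peaks.

```python
def profit_curve(scores, responded, revenue_per_response, cost_per_contact):
    """Cumulative profit from contacting the top-k scored customers, for k = 0..n."""
    # Rank customers from highest to lowest predicted score.
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    curve = [0.0]  # contacting nobody yields zero profit
    for _, did_respond in ranked:
        # Every contact costs money; only responders bring revenue.
        gain = (revenue_per_response if did_respond else 0.0) - cost_per_contact
        curve.append(curve[-1] + gain)
    return curve


def best_cutoff(curve):
    """Number of customers to contact that maximizes cumulative profit."""
    return max(range(len(curve)), key=lambda k: curve[k])


# Hypothetical hold-out data (assumed for illustration only):
# model scores and whether each customer actually responded.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
responded = [True, True, False, True, False, False]

curve = profit_curve(scores, responded, revenue_per_response=10.0, cost_per_contact=3.0)
k = best_cutoff(curve)
print(curve)        # cumulative profit for k = 0..6 contacts
print(k, curve[k])  # best cutoff and the profit at that cutoff
```

With these made-up numbers the curve peaks when the top four customers are contacted. Plotting `curve` against the number of contacts gives the profit chart itself; choosing the peak is the "simple optimization" step layered on top of the predictive model.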
While this graphic may not be enough to update your title from “analyst” or “business intelligence developer” to “data scientist” (though you would not be the first), it certainly helps give boundaries to very elusive labels.
Ray Major is the Chief Strategist of Halo Business Intelligence. A data scientist, economist and statistician by training, he’s a life-long practitioner in the mysterious arts of data intelligence and analytics. You can contact Ray by email at firstname.lastname@example.org and also follow him on Twitter at @majorbi