The question “Are all Data Scientists nerds?” answered thanks to the Data Innovation Survey 2015

This article was originally published here

Although Data Scientist has been declared the sexiest job of the 21st century by HBR and others, if we are honest, we need to admit that data scientists are still associated with nerds by the mainstream population. This data innovation survey was the perfect opportunity for me to investigate whether data scientists are really as nerdy as many perceive them to be.

I started this article by looking up some background information on nerds (after all, I do consider myself a data scientist). I found a very appropriate description on Wikipedia:

Nerd (adjective: nerdy) is a descriptive term, often used pejoratively, indicating that a person is overly intellectual, obsessive, or socially impaired. They may spend inordinate amounts of time on unpopular, obscure, or non-mainstream activities, which are generally either highly technical or relating to topics of fiction or fantasy, to the exclusion of more mainstream activities. Additionally, many nerds are described as being shy, quirky, and unattractive, and may have difficulty participating in, or even following, sports. Stereotypical nerds are commonly seen as intelligent but socially and physically awkward. Some interests and activities that are likely to be described as nerdy are: Intellectual, academic, or technical hobbies, activities, and pursuits, especially topics related to science, mathematics, engineering and technology.

Does any of this sound familiar to you?

Let’s dive into the results of the data innovation survey, together with my best friend SAS Visual Analytics, to check if these stereotypes are true in the Belgian Data Science Landscape.

Stereotype n°1: All data scientists are young males

It probably doesn’t come as a surprise to you that 87.2% of the respondents are male, but I’m glad to see that 36 other women took the survey along with me. In terms of age, we do find a lot of youngsters, but the categories above 35 seem to be well represented too.

ds1
Note to the designer of the survey: next time, please don’t provide fixed age categories but let people type their real age if you want to see more interesting graphs than poor pie charts…

Stereotype n°2: Data scientists are in front of their computer all night

Participants had nine days to respond to the survey. In the bar chart below you can see on which days the 289 respondents submitted the survey. We observe a clear peak at the beginning of both weeks and, strangely enough, a drop towards Friday 13th… Maybe data scientists are more superstitious than they would like to admit?

Even more interesting to analyze are the times of the day when people took the survey. To my great surprise, there’s a peak in the morning, so the Belgian data scientists seem to be early birds!

ds2

As we received the start time and the end time, I also calculated how long the average data scientist took to complete the questionnaire: 12.66 minutes, whereas the median data scientist had the job done in 10 minutes. We all remember our first statistics class: when the median is not equal to the mean, the distribution is not symmetric. Here the mean lies above the median, which points to a tail of slow respondents on the right…

ds3 ds4
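For readers who want to repeat the mean-versus-median check on their own data, here is a minimal Python sketch. The timestamps are made up for illustration and are not the survey responses, and the names `responses` and `duration_minutes` are my own assumptions.

```python
from datetime import datetime
from statistics import mean, median

# Hypothetical (start, end) timestamps; NOT the real survey data.
responses = [
    ("2015-03-09 08:05", "2015-03-09 08:12"),
    ("2015-03-09 09:30", "2015-03-09 09:39"),
    ("2015-03-10 08:45", "2015-03-10 08:55"),
    ("2015-03-11 14:00", "2015-03-11 14:10"),
    ("2015-03-12 21:10", "2015-03-12 21:22"),
    ("2015-03-16 07:50", "2015-03-16 08:20"),  # one slow respondent
]

TIME_FORMAT = "%Y-%m-%d %H:%M"


def duration_minutes(start, end):
    """Return the time between two timestamps in minutes."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return delta.total_seconds() / 60


durations = [duration_minutes(start, end) for start, end in responses]
avg = mean(durations)
med = median(durations)

print(f"mean: {avg:.2f} min, median: {med:.2f} min")
if avg > med:
    # A mean above the median suggests a long right tail (a few slow respondents).
    print("The distribution looks right-skewed.")
```

A single slow respondent is enough to pull the mean well above the median, which is the same kind of gap as the one between 12.66 and 10 minutes in the survey.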

Stereotype n°3: Data scientists are disconnected from the real world

If all data scientists are actually nerds, then they should all be quite “unworldly”. According to the Belgian Data Science survey, almost one third work for a business organization or NGO with 7 777 employees worldwide on average. That doesn’t sound that nerdy to me…

ds5

In total, 42% of the Belgian data scientists who took the survey are employed in the IT and technology industry. Ok, what else did you expect?

ds6

ds7

If data scientists were really as socially inadequate as some bad influences would have us believe, they would never make it to a management position in their organization. And look, almost 55% of our respondents have management responsibilities to a certain extent.

Stereotype n°4: All Data scientists hold a PhD in science or mathematics

Wrong again! Only 18.3% of the Belgian Data Scientists hold a PhD degree. Although the majority graduated in science & math, ICT or engineering, a significant number completed commerce or social studies.

ds8

ds9

Stereotype n°5: All Data scientists are programming geeks and only use non-mainstream techniques

In part 6 of the survey, participants were asked to rate their skills with a score between 1 (don’t know this technique) and 5 (I’m a guru). It turns out that data scientists are not all gurus in the newer techniques like big data and machine learning but are mostly familiar with traditional techniques like data manipulation (regexes, Python, R, SAS, web scraping) and structured data (RDBMS, SQL, JSON, XML, ETL).

ds10

Although we observe some quite high correlations (between math & optimization 0.73, big data & unstructured data 0.67, …), a high correlation doesn’t necessarily mean that the scores are high on these topics. This is clearly illustrated with the heat maps below. On the left we have math and optimization, which are highly correlated but with low scores, and on the right there is data manipulation and structured data, with a moderate correlation of 0.42 but with the highest scores.

ds11 ds12
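To make the difference between correlation and score level concrete, here is a small Python sketch. The ratings are invented for illustration and are not the survey answers; the `pearson` helper and the variable names are my own assumptions, and the analysis in this article was done in SAS Visual Analytics, not with this code.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical self-ratings on the 1-5 scale; NOT the real survey data.
math_scores = [1, 1, 2, 2, 3, 1, 2, 3]
optimization = [1, 2, 2, 2, 3, 1, 1, 3]
data_manipulation = [4, 5, 4, 5, 3, 4, 5, 4]
structured_data = [5, 4, 4, 4, 4, 3, 5, 5]

r_math_opt = pearson(math_scores, optimization)
r_manip_struct = pearson(data_manipulation, structured_data)

print(f"math & optimization: r = {r_math_opt:.2f}, "
      f"mean scores {mean(math_scores):.2f} / {mean(optimization):.2f}")
print(f"data manipulation & structured data: r = {r_manip_struct:.2f}, "
      f"mean scores {mean(data_manipulation):.2f} / {mean(structured_data):.2f}")
```

In this toy data the first pair moves together closely while both averages stay low, and the second pair has high averages with only a weak correlation. Correlation describes how two scores vary together, not how high they are.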

Stereotype n°6: All Data scientists are socially isolated and afraid to appear in public

The Belgian Data Scientists attend the monthly meetups not only to learn about new developments in Data Science or to hear what’s happening on the Belgian Data Science scene; many of them also cite social and networking reasons as motivation to get away from their PC and attend these meetings.

ds13

Stereotype n°7: There are clear role models for data scientists, they all look up to the same people

Not many respondents seem to be influenced by other data scientists around the world: only a few of them answered this question with the name of a fellow data scientist, and they mostly named different people. For Belgium, on the other hand, we do find two names that each appeared eight times among the answers. Congratulations to Bart Baesens and Philippe Van Impe, the Belgian Data Science gurus!

ds14

Conclusion

The conclusion of the analysis of the Data Innovation Survey is as straightforward as it is simple: Data Scientist is the sexiest job of the 21st century! Unfortunately I’ll have to finish off here as my pole dancing class is about to start…
