Event – Official Opening of the European Data Innovation Hub in Brussels – October 20th 10AM

We are so pleased to announce that our Hub will officially be inaugurated by Alexander De Croo, Bianca Debaets, Frank Koster, Marietje Schaake and Jörgen Gren. Over 100 directors and managers have already confirmed their attendance at this ribbon-cutting ceremony, which will be held in our new offices in the Axa building.

Join us on October 20th at 10AM to meet young data startups, talk to representatives of the academic world, share your ideas with the political representatives who support us and discover our data science training offering.

There are only a few spaces left, so please hurry and reserve your seat via our Eventbrite page.

I’m looking forward to meeting you in our new offices soon,

Philippe Van Impe
asbl European Data Innovation Hub vzw 
Inspire – Innovate – Connect

Datasciencebe comes second in the study from @marc_smith using NodeXL SNA Map

Selection criteria: data science OR #datascience Twitter NodeXL SNA Map and Report for Tuesday, 08 July 2014 at 17:54 UT

From: marc_smith,  Uploaded on: July 08, 2014
Description:
  • The graph represents a network of 6,564 Twitter users whose tweets in the requested range contained “data science OR #datascience”, or who were replied to or mentioned in those tweets. The network was obtained from the NodeXL Graph Server on Tuesday, 08 July 2014 at 17:59 UTC.
  • The requested start date was Tuesday, 08 July 2014 at 23:59 UTC and the maximum number of tweets (going backward in time) was 10,000.
  • The tweets in the network were tweeted over the 17-day, 1-hour, 40-minute period from Friday, 20 June 2014 at 21:48 UTC to Monday, 07 July 2014 at 23:28 UTC.
  • There is an edge for each “replies-to” relationship in a tweet, an edge for each “mentions” relationship in a tweet, and a self-loop edge for each tweet that is not a “replies-to” or “mentions”.
  • The graph is directed.
  • The graph’s vertices were grouped by cluster using the Clauset-Newman-Moore cluster algorithm.
  • The graph was laid out using the Harel-Koren Fast Multiscale layout algorithm.
  • The edge colors are based on edge weight values. The edge widths are based on edge weight values. The edge opacities are based on edge weight values. The vertex sizes are based on followers values. The vertex opacities are based on followers values.
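
The edge rule in the description can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not NodeXL's actual importer: the `Tweet` record, its field names and the `tweet_edges` helper are hypothetical, and the sketch assumes a tweet contributes a self-loop only when it has neither a reply target nor any mention.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical record for one tweet; the field names are assumptions,
# not NodeXL's data model.
@dataclass
class Tweet:
    author: str
    reply_to: Optional[str] = None
    mentions: List[str] = field(default_factory=list)

Edge = Tuple[str, str, str]  # (source, target, relationship)

def tweet_edges(tweet: Tweet) -> List[Edge]:
    """Return the directed edges one tweet contributes to the network."""
    edges: List[Edge] = []
    # One edge for the "replies-to" relationship, if the tweet is a reply.
    if tweet.reply_to is not None:
        edges.append((tweet.author, tweet.reply_to, "replies-to"))
    # One edge for each "mentions" relationship.
    for user in tweet.mentions:
        edges.append((tweet.author, user, "mentions"))
    # A tweet that is neither a reply nor a mention becomes a self-loop.
    if not edges:
        edges.append((tweet.author, tweet.author, "tweet"))
    return edges

# Example: a plain tweet and a reply that also mentions another user.
sample = [
    Tweet(author="alice"),
    Tweet(author="bob", reply_to="alice", mentions=["carol"]),
]
all_edges = [edge for tweet in sample for edge in tweet_edges(tweet)]
```

Because every edge keeps its direction (source to target), the resulting graph is directed, as the description notes.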

Overall Graph Metrics:

  • Vertices: 6564
  • Unique Edges: 7487
  • Edges With Duplicates: 4294
  • Total Edges: 11781
  • Self-Loops: 5169
  • Reciprocated Vertex Pair Ratio: 0.0284219703574542
  • Reciprocated Edge Ratio: 0.0552729738894541
  • Connected Components: 2411
  • Single-Vertex Connected Components: 1890
  • Maximum Vertices in a Connected Component: 3054
  • Maximum Edges in a Connected Component: 7070
  • Maximum Geodesic Distance (Diameter): 19
  • Average Geodesic Distance: 5.396585
  • Graph Density: 0.000136909565312827
  • Modularity: 0.537045
  • NodeXL Version: 1.0.1.331
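
Two of the figures above can be cross-checked with simple arithmetic. The sketch below assumes (the report does not state this) that the reciprocated vertex pair ratio p is the share of connected vertex pairs that are linked in both directions, and that the reciprocated edge ratio is the share of edges that have a reverse edge. Under those definitions the edge ratio equals 2p / (1 + p). The function and variable names are illustrative only.

```python
import math

# Figures copied from the Overall Graph Metrics list above.
unique_edges = 7487
edges_with_duplicates = 4294
total_edges = 11781
reciprocated_vertex_pair_ratio = 0.0284219703574542
reciprocated_edge_ratio = 0.0552729738894541

def edge_ratio_from_pair_ratio(p: float) -> float:
    """Convert a reciprocated vertex pair ratio into an edge ratio.

    If R pairs are reciprocated and A pairs are one-way, then
    p = R / (R + A) and the edge ratio is 2R / (2R + A) = 2p / (1 + p).
    """
    return 2 * p / (1 + p)

# Total edges should be the sum of unique edges and duplicated edges.
edges_add_up = unique_edges + edges_with_duplicates == total_edges

# The two reciprocity figures should agree under the assumed definitions.
ratios_agree = math.isclose(
    edge_ratio_from_pair_ratio(reciprocated_vertex_pair_ratio),
    reciprocated_edge_ratio,
    abs_tol=1e-9,
)
```

Both checks hold for the reported numbers, which suggests the report is internally consistent on these points.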

Top 10 Vertices, Ranked by Betweenness Centrality:

  1. kirkdborne
  2. datasciencebe
  3. kdnuggets
  4. analyticbridge
  5. jackwmson
  6. wsj
  7. datasciencedojo
  8. coursera
  9. zeynep
  10. data_nerd

“Do you need a Masters Degree to become a Data Scientist?” Read practical tips and interesting commentary.

Leading analytics experts answer the question: “Do you need a Masters Degree to become a Data Scientist?” Read practical tips and interesting commentary.

By Gregory Piatetsky, @kdnuggets, Jun 27, 2014.

The KDnuggets Analytics, Data Mining, and Data Science LinkedIn Group has many active discussions, and one recent discussion was prompted by a question from Alok Sharma:

Is it necessary to have Masters Degree to become a Data Scientist? Or are there any certificate courses that can help me to become a Data Scientist?

This discussion has now been going for 4 months and has drawn responses from many leading data scientists and professors, including Mark A. Biernbaum, Goutam Chakraborty, Michael Fahy, Myles Gartland, Vincent Granville, Daniel Dean Gutierrez, Steven Miller, Greta Roberts, and myself.

The consensus seems to be that good practical skills can take the place of an MS degree, but there are many interesting comments and practical tips – see below.

If you are interested in a Master’s degree, there are also Analytics and Data Science Education options.

Here are the most interesting answers selected from the discussion of the question:
Is it necessary to have Masters Degree to become a Data Scientist? Or are there any certificate courses that can help me to become a Data Scientist?

Steven Miller, Data Maestro, Talent & Skills Ecosystem at IBM
To learn data science — absolutely. A few schools are building undergraduate programs that will be akin to a computer science degree. You will learn core skills, but they won’t make you a scientist who will be advancing the field.

Sal DiStefano, Developer at Restaurant Technologies, Inc.
Remember that Data Scientist is just a title (a media-hyped title). Some give themselves or have this title because that’s the work they do, not because they have a particular degree. Some may hold degrees in Statistics, Mathematics, Computer Science; the disciplines vary.

You can learn data science anywhere. No single Masters Program could cover all the disciplines needed in significant depth for one to be an expert in all these areas. Selecting an area or two or three and having depth and expertise in those is common. Many companies do not have just a “Data Scientist” but teams comprised of experts from the different disciplines.

While some institutions are offering or creating Masters Programs with this title, most current Data Scientists have no such degree. Check out the following link: en.wikibooks.org/wiki/Data_Science:_An_Introduction/A_Mash-up_of_Disciplines to see a list of disciplines considered within the Data Science area.

The new Columbia Data Science Institute is offering a new Masters in Data Science Program idse.columbia.edu/masters as well as a certificate program for those who already hold advanced degrees.

There is the Johns Hopkins University Online Data Science Certificate program available on Coursera: bit.ly/1dTkXju.

Experience and doing are, in my opinion, the best way to become a “Data Scientist”. There are many ways to do this with or without an advanced degree program. There are many doing great “Data Science” work under other titles.

Joyce Crum, MBA, P.E., Operations & Business Manager
I agree, Data Scientist is just a title. A good Masters to get you in the arena is Operations Research. My experience was, I had to understand the system to apply the math or tools, not just how to do the math or work the tool. Good luck.

Gregory Piatetsky-Shapiro, Analytics/Data Mining Expert, KDnuggets President
There are many analytics certificates – see www.kdnuggets.com/education/analytics-data-mining-certificates.html. I also recommend you take part in some Kaggle competitions – a good result there shows your competence (but it is not easy – competition is stiff!).

Greta Roberts, Co-founder and CEO, Talent Analytics, Corp.
You may want to check out the hands on / practical / certificate that comes with a SAS certification at the end (a very valuable certification). All done online – in your own time. math.kennesaw.edu/academics/certificate/sas-dm/index.html

Nidhi Kohli, Online Marketing, content writer at Jigsaw Academy
Big data knowledge is not very difficult to obtain, and anyone with the needed prerequisites, such as existing knowledge of statistics, programming and database concepts, can become a big data professional.

Based on job requirements, the skills in most demand are Hadoop/Big Data, tools including R and SAS, and some domain knowledge. Theory is assumed as a prerequisite, but usually good data selection and engineering is more important than advanced algorithms.

However, there is a strong demand for analytic talent and a shortfall in supply. If you have a master’s degree, it will be an add-on for you, but if you don’t have one, many companies will overlook this as long as you have the right skills.

Please check the link below for India’s top 10 Analytics institutes: education.sulekha.com/top-10-analytics-training-institutes-in-india_602070_blog

Daniel Dean Gutierrez, Data Scientist at AMULET Analytics
I was at a Meetup last night where a guy, a “programmer for 31 years,” said that last year he decided to call himself a data scientist. He wanted to take advantage of the new hype. He said it was really easy to become a data scientist. He started by taking Andrew Ng’s class in machine learning through Coursera, but was “destroyed” by the class and had to drop. Then he took the Coursera “Computing for Data Analysis” class, 4 weeks to basically learn R. Then he took an expensive on-premise Data Science class. And voila! A data scientist is born.

Having an academic background in data science, I’m hard pressed to call this gentleman a data scientist. I think it takes more than a couple of MOOC classes and more time to take on that moniker.

Gregory Piatetsky-Shapiro, Analytics/Data Mining Expert, KDnuggets President
The person Daniel refers to may be a good data analyst/coder, but not yet a Data Scientist. Knowledge of R, Python or other tools is secondary to knowing how to approach the data, how to ask the right questions, and good intuition about what works and what does not. Those skills are critical to a good data scientist, but take more than a few weeks and

Jim Lola, Entrepreneur, Sr. Manager, Technologist, Architect, & Author
So what if you have undergrad degrees in History, Statistics, and CS, and then advanced degrees in CS and/or SE. Experience doing actuarial analysis, financial fraud analysis, failure analysis, BioStatistics, HUMINT, OSINT, and organizational theory analysis. And experience developing software for HPC systems, information management systems, DBMS (so a lot of information theory app), etc. in languages like C/C++, PHP, Java, Python, R, and Julia. And, of course, a natural curiosity on how things work and the ability to hire and manage other folks who also have a passion for information. And finally, have run a business and made business decisions. So does someone like this qualify as a Data Scientist? Just curious…

Matthew O’Connor, LTC Analyst
I think people are really getting way too hung up on the term “data scientist”. Due to it being completely nebulous as to what exactly it is at this point I would say it is merely a buzzword that may or may not end up getting a hard definition in the future. Speaking as someone who has minimal experience in the field and currently enrolled in Northwestern’s MSPA program, I can confidently say that when I complete my degree I will NOT be a data scientist (if I had years of experience prior to beginning the program I might be singing a different tune though). However, I do believe it will give me a solid foundation to build upon.

In the end, I believe it requires a combination of years of experience and a relevant degree. Furthermore I would say a certain amount of aptitude is also required. I’m sure there are people with a B.S. in Computer Science that have been doing analysis for over ten years that can and should be called data scientists just as there are PhD’s out there who should not.

So to conclude I believe it is wise for newcomers like myself just to become excellent at data analysis/mining and not worry about monikers – those will come naturally once a certain degree of success has been achieved. That’s my two cents at any rate.

Daniel Dean Gutierrez, Data Scientist at AMULET Analytics
Jim, what you describe is a “unicorn,” something many companies are seeking when hiring a data scientist.
Matthew … as one long-time “data scientist” I love the new term for what I do. I think it aptly describes what I do, what I’ve always done, with data.

Vincent Granville, Data Scientist, Startup Entrepreneur
Master’s programs will eventually change and adapt. I wouldn’t be surprised if some organizations/companies soon offer a solid master’s, at almost no cost, online and on-demand. We are actually working on delivering such high-quality training to practitioners with a quant background. The idea is to help interested candidates acquire all the useful experience and knowledge I gathered over my 25-year career, spanning multiple continents and various data science roles (Visa, Microsoft, eBay, Wells Fargo and start-ups), in a compact format delivered online on-demand in less than six months.

Aatash Shah, Founder & CEO at Edvancer Eduventures Pvt. Ltd.
One doesn’t become a data scientist overnight. You need to take it step by step, especially if you are new to the whole analytics/data science show. To become a data scientist you need to be hands-on with various tools and technologies, and these range from the basic MS Excel and SQL to statistical software like SAS/R/SPSS, languages like Python, Perl, C++, Java etc. and technologies that can handle Big Data like the Hadoop ecosystem. Apart from this you would be expected to have good knowledge of business to be able to eventually bring out the insights from the data.

To take a step by step approach an expensive Master’s degree may not be the best solution as no degree will eventually cover all these requisites and there will always be newer tech coming up. The best way would be to take a modular approach in learning all this stuff through short, inexpensive certificate courses. After all it is the knowledge which matters and certificate courses probably provide more hands-on, practical knowledge at a cheaper price than a Master’s.

Check out some certificate course providers in analytics here: analyticsindiamag.com/top-8-analytics-training-institutes-in-india/

Alok Sharma, Programmer Analyst at BitWise Inc
Thank you Sal, Greg, Daniel, Vincent, Aatash & others. Your comments were really insightful. I think I will start off by taking Data Analysis courses on Coursera, followed by an industry-relevant course in analytics, and work on developing my knowledge until I land a relevant job.

Myles Gartland, Professor and Director of Graduate Business Programs at Rockhurst University; Chief Analyst at Insightful Analytics
To me it is also like asking whether I need to have a graduate degree to be a CEO. Well, no. A data scientist does not require any licensure, so technically you need no specific credential. Your degree usually gets you in the door, and your skills let you keep and excel at your job. All that said, you do not need a graduate degree to DO the job, but you might need one to GET the job (look at many of the job postings and their requirements).

Vincent Granville, Data Scientist, Startup Entrepreneur
If you don’t need to sell something to someone (a real human – like selling yourself to get a job), but instead generate revenue via automated data science systems that do not require human interactions (stock trading, various arbitraging systems including keyword bidding, sport bets, data science publisher generating revenue via Google Adwords, some types of hacking), then you don’t need any diploma or certifications. Not even high school, not even primary school.

Goutam Chakraborty, Professor (Marketing) at Oklahoma State University and Management Consultant
This is a great discussion. I have taught, advised, counseled more than 500 students (in the last 10+ years) of our graduate certificate program in data mining at Oklahoma State University (analytics.okstate.edu). Most of our students work in the field of data mining, predictive analytics, marketing analytics, web analytics, marketing science, data science ….. After having talked to hundreds of major corporations and employers, I feel a data scientist (wanted by a company) is someone with a “multiple personality disorder” who can still function well! This person has knowledge and abilities in:

1. Programming (SQL, Python, Java, …..) and exposure to big data via Hadoop, MapReduce…
2. Statistical and numerical models along with ability to do visualization and optimization (using multiple software platforms including proprietary such as SAS and open-source such as R)
3. Domain expertise to understand how all these apply in the context of a business to create value
4. A good communicator so that the person can explain the models to users who are unlikely to accept the models if they do not understand them
5. A person with curiosity, determination, team-player, leader… you name it.

So, can you develop all these skills in one course? or a short program? NO!

How about through a series of well-designed courses (not 1 or 2, perhaps 4 or more spread over a span of 1-2 years so you have time to assimilate knowledge and put it to work) that build on each other, plus hands-on experience in working with complex data and models? Yes (that is what we do in our graduate certificate program for working professionals taking courses online). But, of course, I am biased because I run the program.

Atul Thatte, Senior Manager, Advanced Transaction & Consumption Analytics
I completely agree that “Data Scientist” is just a title. A Master’s degree can be beneficial; however, a set of courses that provide a balance between theoretical depth and practical breadth would be ideal. I would highly recommend Dr. Chakraborty’s Graduate Data Mining Certificate program at Oklahoma State University. Having completed it myself, I know firsthand that the program provides a solid theoretical foundation, a lot of experience with publishing/presenting in industry fora, and competing in industry-sponsored competitions such as the annual SAS analytics shootout. It is available online, and so it’s quite practical for full-time professionals. Hope this helps answer your question.

Michael Fahy, Associate Dean, School of Computational Sciences, Chapman University
You need a combination of Mathematics, Statistics and Computer Science.

Here is a sample list of courses from our MS degree in Computational and Data Sciences at Chapman University:

  • CS 510 Foundations of Scientific Computing
  • CS 520 Mathematical Modeling
  • CS 530 Data Mining
  • CS 540 High-Performance Computing
  • CS 555 Multivariate Data Analysis
  • CS 595 Computational Science Seminar
  • CS 611 Time Series Analysis
  • CS 612 Advanced Numerical Methods
  • CS 613 Machine Learning
  • CS 614 Interactive Data Analysis
  • CS 615 Digital Image Processing

Myles Gartland, PhD, Professor and Director of Graduate Business Programs at Rockhurst University; Chief Analyst at Insightful Analytics
To pile on to Michael’s and Daniel’s lists: let’s not forget context, domain and communication. A few classes in communication and basic business (assuming they will work for one) will go a long way too. People sitting in cubes writing models without an understanding of business questions and the ability to communicate their results lose some of their value.

Mark A. Biernbaum, PhD Researcher- 25 years experience; Children’s Institute, Clinical/Social Psychology, University of Rochester
The current crop of Data Scientists has learned the work on the job. A lot of great research work is learned on the job. A Master’s or credential program could create problems for the person obtaining them once they get on the job and find that the work is as much experience as it is education.

Vincent Granville, Data Scientist, Startup Entrepreneur
If you are a self-funded entrepreneur, you don’t even need a primary school education, not even kindergarten: your education and job title do not matter, only your capacity to generate value and profits. In my case, though having a PhD, I never mention my degree – nobody (except time-wasters) is asking anyway. Indeed, ignoring people asking about my (real!) credentials has been one of the best strategies to stop wasting time in discussions going nowhere.

Romakanta Irungbam, Analytics Consultant, Predictive Modeler
Except for companies that have just started doing a few things in analytics and dream of hiring a nonexistent master of all trades, experience is more valuable than a Masters or any other degree.

When I started out sometime in 2003 (in India), it was called Research and Analytics because most of the analytics work was linked with marketing research studies/surveys. Quantum and SPSS were very popular software then, while sampling and significance tests were the most used analytical techniques. I learned all of these on the job.

Sometime around 2007, I had to learn SAS programming because the new company I joined used SAS for all its analytics projects. I also learned techniques like logistic regression, cluster analysis, and factor analysis at this company.

From 2009 to 2012, I worked on Netezza and Teradata to extract and acquire the data I needed for my predictive modeling and other analytical projects. While I continued using SAS, I had to learn SPSS Modeler because one of my biggest clients used SPSS. I also became very good at a lot of statistical/data mining techniques – Decision Trees, Regression Models, Time Series, Mixed Models, etc.

And finally in my current role (starting 2012), I learned MS SQL and Tableau. I also did my first, very challenging piece of SAS macro programming: more than 800 lines of code that accomplish in about a day the same tasks that used to take weeks.

None of the software/programming or the statistical/data mining techniques were learned in college or a formal coaching environment. Most of the time, it was searching and reading on Google, a good discussion with the team, and in a few cases, a colleague or someone senior who would help when the right questions were asked. What I want to say is – the media, some companies and their ‘executives’ try to make everything sound very technical and critical/essential – R, Hadoop, Big Data, MongoDB, Deep Learning, etc., etc. – but at the end of the day, you can and will learn anything when there is a requirement. And sometimes, there will be someone who will push you into the water, someone who will make you take the first step when you have your doubts.

So, my answer is no. 🙂

33 most noted Data Scientists on Twitter by Baiju NT

Jun 27, 2014  by Baiju NT

Do you follow these 33 most notable Data Scientists on Twitter?

1. Hilary Mason @hmason

Data Scientist in Residence at @accel. I ♥ data and cheeseburgers.

2. John Myles White @johnmyleswhite

Scientist at Facebook and Julia developer. Author of Machine Learning for Hackers and Bandit Algorithms for Website Optimization. Tweets reflect my views only.

3. Peter Skomoroch @peteskomoroch

Creating intelligent systems to automate tasks & improve decisions. Entrepreneur, ex Principal Data Scientist @LinkedIn. Machine Learning, Product, Networks

4. Gregory Piatetsky @kdnuggets

KDnuggets President, Analytics/Big Data/Data Mining/Data Science expert, KDD & SIGKDD co-founder, was Chief Scientist at 2 startups, part-time philosopher.

5. Ryan Rosario @DataJunkie

Machine Learning Engineer at Facebook: statistician & computer scientist. Natural language processing, network analysis, modeling. UCLA Bruin. Opinions my own.

6. DJ Patil @dpatil

Building products at RelateIQ – http://www.relateiq.com

7. Jeff Hammerbacher @hackingdata

@techammer @hammer_lab @cloudera

8. David Smith @revodavid

Blogger, R evangelist and Chief Community Officer at Revolution Analytics — david@revolutionanalytics.com

9. Christopher D. Long @octonion

JMI data scientist and thoroughbred analyst. Consultant for the Houston Rockets and an unnamed MLB team.

10. Carla Gentry @data_nerd

Data Scientist, Data_Nerd Founder Analytical-Solution. What can your data do for you? Measure, Segment, Research and Data Analysis – keys for increasing ROI

11. Ben Lorica @bigdata

Chief Data Scientist @OReillyMedia. Director of Content Strategy @strataconf. Aspirant Parisien. Every Sunday is a Hack Day.

12. Siah @siah

Just finished a PhD at Berkeley. Silicon Valley Scientist. Machine Learning, HCI, Data Science and Statistics. Addicted to coffee and in love with @marjoo

13. Ferenc Huszar @fhuszar

VC data scientist at @Balderton. Previously at @PeerIndex

14. Drew Conway @drewconway

Data nerd, hacker, student of conflict.

15. Michael Wu Ph.D. @mich8elwu

Scientist: Big Data, Gamification, Influence, Predictive Social Analytic, Cyber Anthropology, Social Network Analysis, Machine Learning, Community Dynamics

16. Matt Wood @mza

Amazon Web Services

17. Olivier Grisel @ogrisel

Datageek, engineer @Parietal_INRIA, contributor to scikit-learn. I like Python, NumPy, Spark & interested in Machine Learning, NLProc, {Big|Linked|Open} Data.

18. Josh Wills @josh_wills

Data Scientist and Apache Crunch committer. I mostly tweet about #hadoop and postmodern lit. Yeah, I know.

19. John Foreman @John4man

Chief Data Scientist @MailChimp. My new book Data Smart (http://amzn.com/111866146X ) is out from @WileyTech. http://john-foreman.com

20. Jake Porway @jakeporway

Believer in tech+data for beautiful purposes || Director @DataKind || TV nerd @NatGeoChannel || Fiercely optimistic

21. Andrew Ng @AndrewYNg

Chief Scientist of Baidu; Chairman and Co-Founder of Coursera; Stanford CS faculty. #machinelearning, #deeplearning #MOOCs, #edtech

22. Eric Xu (徐宥) @mathena

Data Scientist at Google. Things I enjoy: arts, cooking, martial arts, musicals, painting, philosophy, skateboarding and Zen.

23. Monica Rogati @mrogati

Data @ Jawbone. Turned data into stories & products at LinkedIn. Text mining, applied machine learning, recommender systems. Ex-gamer, ex-machine coder; namer.

24. P. Oscar Boykin @posco

Human @ Twitter, runner, Programming, Hadoop, Scala, co-author of @scalding, @summingbird

25. Benedikt Koehler @furukama

Backpacker Data Scientist • Passionate about Number Crunching, Machine Learning & Big Data • Food & Craft Beer • Hadoop, Python & R • Soc PhD • Dad & Dog Owner

26. David Gutelius @gutelius

Founded the @dataguild. I like startups, data of all sizes, economics, network dynamics, adventure, national insecurities, and improving things.

27. Marck Vaisman @wahalulu

Data Scientist & consultant. Master munger and hacker. Data Community DC Co-Founder. Dad. Geek. Kid at heart ¡venezolano con orgullo!

28. Andreas Weigend @aweigend

Social Data Lab

29. Amy Heineike @aheineike

Director of Mathematics @Quidlabs. Bring on the algorithms!

30. Sebastian Thrun @SebastianThrun

CEO Udacity, Research professor at Stanford, member of National Academy of Engineering, serial entrepreneur.

31. Jen Lowe @datatelling

gonzo dataist for hire. I connect people + numbers + words. Teaching @SVADSI. Taught @ITP_NYU. Researched @S_I_D_L. Cofounded @sfpc_school. #desertgrown

32. Doug Cutting @cutting

33. Kirk Borne @kirkborne

PhD DataScientist Astrophysicist, Top #BigData Influencer. Passions: #DataScience, #DataMining, Astroinformatics, #CitizenScience http://kirkborne.net

53. Philippe Van Impe @pvanimpe

Social entrepreneur, growing the happiest European startups around Data Science and Analytics, Community builder & marketing expert. #DataScience, #BigData, #BeTech, #Startups, #CitizenScience http://datasciencebe.com

 

Please add the names of the people who should be mentioned in this list. Who else is missing?