Hilary Mason, one of the shining lights of the world of data science, tweeted recently:
‘Data people: What is the very first thing you do when you get your hands on a new data set?’
‘What I do when I get a new dataset’, a recent article on the Simply Statistics blog, is a response to this.
I’ve been thinking about my own data science process. My academic background is in Physics and Mathematics, so I am influenced by those disciplines. This is a personal blog post, just to document my own Data Science Process.
0) Try to understand the data set: I must admit there have been projects where I’ve forgotten this step. I’ve been so eager to apply ‘Cool algorithm X’ or ‘Cool model Y’ to a problem that I’ve forgotten that Exploratory Data Analysis, which always strikes me as low tech, is extremely valuable…
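As a rough illustration of what a low-tech first look at a data set might involve, here is a minimal sketch using only the Python standard library. The `summarize` helper, the column names and the toy CSV are all hypothetical, made up for this example; it is one possible starting point, not a prescribed process.

```python
import csv
import io
import statistics


def summarize(rows):
    """Per column: count missing and distinct values, plus basic stats if numeric."""
    summary = {}
    columns = rows[0].keys() if rows else []
    for col in columns:
        values = [r[col] for r in rows]
        present = [v for v in values if v not in ("", None)]
        info = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        try:
            nums = [float(v) for v in present]
        except ValueError:
            nums = None  # at least one value is not numeric: treat as categorical
        if nums:
            info.update(
                min=min(nums),
                max=max(nums),
                mean=statistics.mean(nums),
                median=statistics.median(nums),
            )
        summary[col] = info
    return summary


# Hypothetical toy data standing in for a freshly received data set.
SAMPLE_CSV = """city,temp_c,visits
Leeds,11.5,120
York,,95
Leeds,13.0,130
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = summarize(rows)
for col, info in report.items():
    print(col, info)
```

Even a summary this crude surfaces the missing temperature reading and shows which columns are categorical, which is the kind of thing worth knowing before reaching for ‘Cool algorithm X’.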