A – ACID – Atomicity, Consistency, Isolation and Durability
- Adventures in Analytics and Visualization, blog by Vivek Patil.
- All Things Data Science, Istvan Hajnal’s musings on Data Science, Big Data, Market Research and Data journalism
- AnalyticBridge by Vincent Granville, about advanced analytics
- Analytics Vidhya blog on development of analytical skills, analytic industry best practices, and more.
- Ann Maria’s Blog, by Dr. AnnMaria De Mars, President of the online statistics education company The Julia Group.
- Anil Batra’s Web Analysis (Analytics), Online Advertising and Behavioral Targeting blog
B – Big Data – Volume, Velocity, Variety
- Business. Statistics. Technology, by Galit Shmueli, Professor of Statistics at Indian School of Business, Hyderabad, India.
- Beyond the Box Score, a blog using statistics to analyse the game of baseball.
- Blog About Stats, by Armin Grossenbacher, a network for professionals mainly of statistical institutions.
- Brussels Data Science Community, by Philippe Van Impe, about European data science topics.
C – Columnar (or Column-Oriented) Database
- CoolData, by Kevin MacDonell, on analytics, predictive modeling and related cool data stuff for fund-raising in higher education.
- Cloud of Data blog, by Paul Miller, which aims to help clients understand the implications of taking data and more to the cloud.
- Calculated Risk, on finance and economics.
D – Data Warehousing – Relevant and very useful
- DataMiningApps by Bart Baesens
- Data Mining Research blog, by Sandro Saitta, on data mining research issues, recent applications, important events, interviews with leading actors, current trends, book reviews, etc.
- Data Science 101 by Ryan Swanstrom on becoming a data scientist.
- Data Science London on latest trends and research in data science.
- Data Science Renee, by Renee M. P. Teate, on becoming a data scientist.
- Data Tau a list of interesting articles submitted by readers.
- DecisionStats by Ajay Ohri, founder of DECISIONSTATS and author of “R for Business Analytics” and “R for Cloud Computing”.
- DiffusePrior by Alan Fernihough, on using R in econometric research.
- Domino Data Lab on startups, data science, R and Python.
- Data Scientist Journey, on the open source data science masters.
- Data Genetics
- Deep Data Mining Blog, mostly focused on the technical aspects of data mining, by Jay Zhou.
E – ETL – Extract, transform and load
- Edwin Chen’s blog, by Edwin Chen, who writes about math, machine learning, and data science.
- EMC Big data blog, by Mona Patel, big data solutions marketing at EMC
- Error Statistics Philosophy by Virginia Tech statistical philosopher Deborah G. Mayo
F – Flume – A framework for populating Hadoop with data
- Facebook Data Science Blog, the official blog of interesting insights presented by Facebook data scientists.
- FiveThirtyEight, by Nate Silver and his team, gives a statistical view of everything from politics to science to sports with the help of graphs and pie charts.
- Freakonometrics, by Charpentier, a professor of mathematics, who offers a nice mix of generally accessible and more challenging posts on statistics-related subjects, all with a good sense of humor.
- Freakonomics blog, by Steven Levitt and Stephen J. Dubner.
- FastML, covering practical applications of machine learning and data science.
- FlowingData, the visualization and statistics site of Nathan Yau.
G – Geospatial Analysis – A picture worth 1,000 words or more
- Geeking with Greg, exploring the future of personalized information.
H – Hadoop, HDFS, HBASE
- Harvard Data Science, thoughts on Statistical Computing and Visualization.
- Hyndsight by Rob Hyndman, on forecasting, data visualization and functional data.
I – In-Memory Database – A new definition of superfast access
- IBM Big Data Hub Blogs, blogs from IBM thought leaders.
- Insight Data Science Blog, on the latest trends and topics in data science, by alumni of the Insight Data Science Fellows Program.
- Information is Beautiful, by independent data journalist and information designer David McCandless, who is also the author of the book ‘Information is Beautiful’.
- Information Aesthetics, designed and maintained by Andrew Vande Moere, an Associate Professor at KU Leuven, Belgium. It explores the symbiotic relationship between creative design and the field of information visualization.
- Inductio ex Machina, Mark Reid’s research blog on machine learning & statistics.
J – Java – Hadoop gave it a nice push
- Jonathan Manton’s blog, by Jonathan Manton, with tutorial-style articles in the general areas of mathematics, electrical engineering and neuroscience.
- JT on EDM, James Taylor on Everything Decision Management
- Justin Domke blog, on machine learning and computer vision, particularly probabilistic graphical models.
- Juice Analytics on analytics and visualization.
K – Kafka – High-throughput, distributed messaging system originally developed at LinkedIn
- Kaggle blog “No Free Hunch”, covering Kaggle data science and machine learning competitions
L – Latency – Low Latency and High Latency
- Love Stats Blog, by Annie, a market research methodologist who blogs about sampling, surveys, statistics, charts, and more.
- Learning Lover, on programming and algorithms, with some flashcards for learning.
- Large Scale ML & other Animals, by Danny Bickson, who started GraphLab, an award-winning large-scale open source project.
M – Map/Reduce – MapReduce
- MarkTab Data Mining blog.
- MineThatData Blog by Kevin Hillstrom, views on Multichannel Marketing and Database Marketing.
- MDMgeek Blog by Prashant Chandramohan, on data management.
- Magnus Notitia, by Tevfik Kosar on Big data and beyond, data intensive scientific thought.
- Meta Analysis: Analytics You Can Take to the Bank, by Meta S. Brown, on predictive analytics, with a wide range of categories to explore.
- Mydatamine.com, a compilation of links and news for Data Mining geeks.
- Machine Learning Mastery by Jason Brownlee, on programming & machine learning.
- Machined Learning by Paul Mineiro, from Microsoft Cloud & Information Services Lab
N – NoSQL Databases – No SQL Database or Not Only SQL
- Nuit Blanche by Igor Carron, focuses on Compressive Sensing, Advanced Matrix Factorization Techniques, Machine Learning.
- Numbers rule your world, by Kaiser Fung
O – Oozie – Open-source workflow engine managing Hadoop job processing
- Occam’s Razor by Avinash Kaushik, examining web analytics and Digital Marketing.
- OpenGardens, Data Science for Internet of Things (IoT), by Ajit Jaokar.
- O’Reilly Radar, a wide range of research topics and books.
- Oracle Data Mining Blog, Everything about Oracle Data Mining – News, Technical Information, Opinions, Tips & Tricks. All in One Place.
- Observational Epidemiology, in which a college professor and a statistical consultant offer their comments, observations and thoughts on applied statistics, higher education and epidemiology.
- Overcoming Bias, by Robin Hanson and Eliezer Yudkowsky, presenting statistical analysis in reflections on honesty, signaling, disagreement, forecasting and the far future.
P – Pig – Platform for analyzing huge data sets
- Probability & Statistics Blog, by Matt Asher, statistics grad student at the University of Toronto. Check out Asher’s Statistics Manifesto.
- Perpetual Enigma, by Prateek Joshi, a computer vision enthusiast who writes compelling, question-style stories on machine learning.
- PracticalLearning by Diego Marinho de Oliveira on Machine Learning, Data Science and Big Data.
- Predictive Analytics World blog, by Eric Siegel, founder of Predictive Analytics World and Text Analytics World, and Executive Editor of the Predictive Analytics Times, which makes the how and why of predictive analytics understandable and captivating.
Q – Quantitative Data Analysis
R – Relational Database – Still relevant and will be for some time
- R-bloggers, the best blogs from the rich community of R, with code, examples, and visualizations.
- R chart, a blog about the R language written by a web application/database developer.
- R Statistics, by Tal Galili, a PhD student in Statistics at Tel Aviv University who also works as a teaching assistant for several statistics courses at the university.
- Revolution Analytics blog, hosted and maintained by Revolution Analytics.
- Rick Sherman: The Data Doghouse, on the business and technology of performance management, business intelligence and data warehousing.
- Random Ponderings by Yisong Yue, on artificial intelligence, machine learning & statistics.
S – Sharding (Database Partitioning) and Sqoop (SQL Database to Hadoop)
- Salford Systems Data Mining and Predictive Analytics Blog, by Dan Steinberg.
- Sabermetric Research, by Phil Birnbaum, who blogs about statistics in baseball, the stock market, sports predictors and a variety of subjects.
- Statisfaction, a blog jointly written by PhD students and post-docs from Paris (Université Paris-Dauphine, CREST). Mainly tips and tricks useful in everyday jobs, links to various interesting pages, articles, seminars, etc.
- Statistically Funny. True to its name, epidemiologist Hilda Bastian’s blog is a hilarious account of the science of unbiased health research with the added bonus of cartoons.
- SAS Analysis, a weekly technical blog about data analysis in SAS.
- SAS blog on text mining, voice mining and unstructured data, by SAS experts.
- SAS Programming for Data Mining Applications, by LX, Senior Statistician in Hartford, CT.
- Shape of Data, presents an intuitive introduction to data analysis algorithms from the perspective of geometry, by Jesse Johnson.
- Simply Statistics, by three biostatistics professors (Jeff Leek, Roger Peng, and Rafa Irizarry) who are fired up about the new era where data are abundant and statisticians are scientists.
- Smart Data Collective, an aggregation of blogs from many interesting data science people
- Statistical Modeling, Causal Inference, and Social Science by Andrew Gelman
- Stats with Cats, by Charlie Kufs, who has been crunching numbers for over thirty years, first as a hydrogeologist and, since the 1990s, as a statistician. His tagline: when you can’t solve life’s problems with statistics alone.
- StatsBlog, a blog aggregator focused on statistics-related content that syndicates posts from contributing blogs via RSS feeds.
- Steve Miller BI blog, at Information Management.
T – Text Analysis – The larger the information, the more analysis needed
- Tatvic blog, covering web analytics, R, Google Analytics, and related topics.
- The Mainstream Seer, by Venky Rao.
- The Geomblog by Suresh
- The Official Google Analytics Blog.
- Unofficial Google Analytics Blog from ROI Revolution.
- The New Data Scientist Blog, on how a social scientist jumps into the world of big data.
- The Analysis Factor, by Karen Grace Martin.
- The Guardian Data Blog, by the Guardian, on news-related topics and their analysis based on data.
- The Bad Science, by Dr. Ben Goldacre, an epidemiologist who uses statistics to debunk bad science.
- The Practical Quant, by Ben Lorica, O’Reilly Media Chief Data Scientist, on OLAP analytics, big data, data applications, etc.
- The Corner Office Blog, by John Sall & SAS executives, who post their thoughts on global business, analytics and technology.
- The John D. Cook blog. John D. Cook is a math professor and consultant who blogs about statistics, analyzing data, problem solving and integrating solution components.
- The Numbers Guy, a blog written by Wall Street Journal writer Carl Bialik, who ‘examines the way numbers are used, and abused’.
- Three Toed Sloth, a blog written by Professor Cosma Shalizi, who teaches statistics at Carnegie Mellon University.
U – Unstructured Data – Growing faster than the speed of thought
V – Visualization – Important to keep the information relevant
- Vincent Granville blog. Vincent, the founder of AnalyticBridge and Data Science Central, regularly posts interesting topics on Data Science and Data Mining
W – Whirr – Big Data Cloud Services i.e. Hadoop distributions by cloud vendors
- What’s the Big Data, by Gil Press. Gil covers the Big Data space and also writes a column on Big Data and Business in Forbes.
- Walking Randomly by Mike Croucher
- World of analytics by Gracy Poelman
- University of Wisconsin Data Science Blog – Data science blog for news and resources about data science, including interviews and updates about the University of Wisconsin’s program.
X – XML – Still eXtensible and no Introduction needed
- Xi’an’s Og, a blog written by a professor of Statistics at Université Paris Dauphine, mainly centred on computational and Bayesian topics.
Y – Yottabyte – Equal to 1,000 zettabytes, 1 million exabytes and 1 billion petabytes
Z – Zookeeper – Helps manage Hadoop nodes across a distributed network
Feel free to add your preferred blog in the comments below.
- Top 50 Data Science Resources: The Best Blogs, Forums, Videos and Tutorials to Learn All about Data Science
- A curated list of data science blogs on Github
- 85 active (recently updated) data mining, data science, and machine learning blogs by Qian Wang
- Data Science Top Articles