The ABC of Datascience blogs – collaborative update


A – ACID – Atomicity, Consistency, Isolation and Durability

B – Big Data – Volume, Velocity, Variety

C – Columnar (or Column-Oriented) Database

  • CoolData By Kevin MacDonell on Analytics, predictive modeling and related cool data stuff for fund-raising in higher education.
  • Cloud of data blog By Paul Miller, aims to help clients understand the implications of taking data and more to the Cloud.
  • Calculated Risk, Finance and Economics

D – Data Warehousing – Relevant and very useful

E – ETL – Extract, transform and load

F – Flume – A framework for populating Hadoop with data

  • Facebook Data Science Blog, the official blog of interesting insights presented by Facebook data scientists.
  • FiveThirtyEight, by Nate Silver and his team, gives a statistical view of everything from politics to science to sports with the help of graphs and pie charts.
  • Freakonometrics, by Arthur Charpentier, a professor of mathematics, offers a nice mix of generally accessible and more challenging posts on statistics-related subjects, all with a good sense of humor.
  • Freakonomics blog, by Steven Levitt and Stephen J. Dubner.
  • FastML, covering practical applications of machine learning and data science.
  • FlowingData, the visualization and statistics site of Nathan Yau.

G – Geospatial Analysis – A picture worth 1,000 words or more

H – Hadoop, HDFS, HBase

  • Harvard Data Science, thoughts on Statistical Computing and Visualization.
  • Hyndsight by Rob Hyndman, on forecasting, data visualization and functional data.

I – In-Memory Database – A new definition of superfast access

  • IBM Big Data Hub Blogs, blogs from IBM thought leaders.
  • Insight Data Science Blog, on the latest trends and topics in data science, by alumni of the Insight Data Science Fellows Program.
  • Information is Beautiful, by independent data journalist and information designer David McCandless, who is also the author of the book ‘Information is Beautiful’.
  • Information Aesthetics, designed and maintained by Andrew Vande Moere, an Associate Professor at KU Leuven, Belgium. It explores the symbiotic relationship between creative design and the field of information visualization.
  • Inductio ex Machina, Mark Reid’s research blog on machine learning & statistics.

J – Java – Hadoop gave it a nice push

  • Jonathan Manton’s blog, by Jonathan Manton, with tutorial-style articles in the general areas of mathematics, electrical engineering and neuroscience.
  • JT on EDM, James Taylor on Everything Decision Management
  • Justin Domke blog, on machine learning and computer vision, particularly probabilistic graphical models.
  • Juice Analytics on analytics and visualization.

K – Kafka – High-throughput, distributed messaging system originally developed at LinkedIn

L – Latency – Low Latency and High Latency

  • Love Stats Blog By Annie, a market research methodologist who blogs about sampling, surveys, statistics, charts, and more
  • Learning Lover on programming, algorithms with some flashcards for learning.
  • Large Scale ML & other Animals, by Danny Bickson, who started GraphLab, an award-winning large-scale open source project.

M – Map/Reduce – MapReduce

N – NoSQL Databases – No SQL Database or Not Only SQL

O – Oozie – Open-source workflow engine managing Hadoop job processing

  • Occam’s Razor by Avinash Kaushik, examining web analytics and Digital Marketing.
  • OpenGardens, Data Science for Internet of Things (IoT), by Ajit Jaokar.
  • O’Reilly Radar, a wide range of research topics and books.
  • Oracle Data Mining Blog, Everything about Oracle Data Mining – News, Technical Information, Opinions, Tips & Tricks. All in One Place.
  • Observational Epidemiology A college professor and a statistical consultant offer their comments, observations and thoughts on applied statistics, higher education and epidemiology.
  • Overcoming Bias, by Robin Hanson and Eliezer Yudkowsky, presents statistical analysis in reflections on honesty, signaling, disagreement, forecasting and the far future.

P – Pig – Platform for analyzing huge data sets

  • Probability & Statistics Blog By Matt Asher, statistics grad student at the University of Toronto. Check out Asher’s Statistics Manifesto.
  • Perpetual Enigma by Prateek Joshi, a computer vision enthusiast who writes compelling, question-style stories on machine learning.
  • PracticalLearning by Diego Marinho de Oliveira on Machine Learning, Data Science and Big Data.
  • Predictive Analytics World blog, by Eric Siegel, founder of Predictive Analytics World and Text Analytics World, and Executive Editor of the Predictive Analytics Times, makes the how and why of predictive analytics understandable and captivating.

Q – Quantitative Data Analysis

R – Relational Database – Still relevant and will be for some time

  • R-bloggers , best blogs from the rich community of R, with code, examples, and visualizations
  • R chart A blog about the R language written by a web application/database developer.
  • R Statistics By Tal Galili, a PhD student in Statistics at Tel Aviv University who also works as a teaching assistant for several statistics courses at the university.
  • Revolution Analytics blog, hosted and maintained by Revolution Analytics.
  • Rick Sherman: The Data Doghouse, on the business and technology of performance management, business intelligence and data warehousing.
  • Random Ponderings by Yisong Yue, on artificial intelligence, machine learning & statistics.

S – Sharding (Database Partitioning) and Sqoop (SQL Database to Hadoop)

  • Salford Systems Data Mining and Predictive Analytics Blog, by Dan Steinberg.
  • Sabermetric Research, by Phil Birnbaum, who blogs about statistics in baseball, the stock market, sports predictors and a variety of other subjects.
  • Statisfaction, a blog jointly written by PhD students and post-docs from Paris (Université Paris-Dauphine, CREST). Mainly tips and tricks useful in everyday jobs, links to various interesting pages, articles, seminars, etc.
  • Statistically Funny True to its name, epidemiologist Hilda Bastian’s blog is a hilarious account of the science of unbiased health research with the added bonus of cartoons.
  • SAS Analysis, a weekly technical blog about data analysis in SAS.
  • SAS blog on text mining, voice mining and unstructured data, by SAS experts.
  • SAS Programming for Data Mining Applications, by LX, Senior Statistician in Hartford, CT.
  • Shape of Data, presents an intuitive introduction to data analysis algorithms from the perspective of geometry, by Jesse Johnson.
  • Simply Statistics By three biostatistics professors (Jeff Leek, Roger Peng, and Rafa Irizarry) who are fired up about the new era where data are abundant and statisticians are scientists.
  • Smart Data Collective, an aggregation of blogs from many interesting data science people
  • Statistical Modeling, Causal Inference, and Social Science by Andrew Gelman
  • Stats with Cats, by Charlie Kufs, who has been crunching numbers for over thirty years, first as a hydrogeologist and, since the 1990s, as a statistician. His tagline: when you can’t solve life’s problems with statistics alone.
  • StatsBlog, a blog aggregator focused on statistics-related content that syndicates posts from contributing blogs via RSS feeds.
  • Steve Miller BI blog, at Information management.

T – Text Analysis – The larger the information, the more analysis is needed

U – Unstructured Data – Growing faster than the speed of thought

V – Visualization – Important to keep the information relevant

  • Vincent Granville blog. Vincent, the founder of AnalyticBridge and Data Science Central, regularly posts interesting topics on Data Science and Data Mining

W – Whirr – Big Data Cloud Services i.e. Hadoop distributions by cloud vendors

X – XML – Still eXtensible and no Introduction needed

  • Xi’an’s Og Blog A blog written by a professor of Statistics at Université Paris Dauphine, mainly centred on computational and Bayesian topics.

Y – Yottabyte – Equal to 1,000 zettabytes, 1 million exabytes, 1 billion petabytes and 1 trillion terabytes

Z – Zookeeper – Helps manage Hadoop nodes across a distributed network

Feel free to add your preferred blog in the comments below.

Join our community on LinkedIn and attend our meetups.
Follow our Twitter account: @datajobsbe

Improve your skills:

Why don’t you join one of our #datascience trainings to sharpen your skills?

Special rates apply if you are a job seeker.


Join the experts at our Meetups:

Each month we organize a Meetup in Brussels focused on a specific DataScience topic.

