The Brussels Data Science Community is pleased to announce a special event on “Data Unification”, which includes the essential first step, “Data Wrangling”. The term “Data Wrangling” describes the time-consuming tasks of data collection and preparation, and discovery of data sources and metadata, before the data can be extracted, loaded and analyzed. It is estimated that “Data Wrangling” consumes over 70% of Data Scientists’ time, with activities that are certainly “unsexy”. Most Data Scientists are far more passionate about doing advanced and predictive analytics on the data than about spending their time on tedious data preparation tasks.
There will be presentations by two leading local companies, AXA and Toyota Europe. They will talk about their data management environments and their usage of advanced techniques and tools in this area, and explain how they would like to evolve to support new big data and/or data science initiatives that serve business objectives.
There will also be a special introduction – overview and demo – of “Tamr Catalog”, a new open-source software tool from Tamr (www.tamr.com). Tamr is an initiative coming out of the “CSAIL” (Computer Science and Artificial Intelligence Lab) at MIT (Massachusetts Institute of Technology), led by Dr. Michael Stonebraker (2014 Turing Award Winner, and noted Database inventor [Ingres, Postgres, Vertica, VoltDB, Tamr, and others]).
Presentation 1: Overview of the AXA “Data Pipeline”
(Mr. Fabien Janssens, AXA)
Mr. Fabien Janssens, Solution Design Architect and Data Scientist at AXA, will provide an overview of the “Data Pipeline” within AXA, and the important issues and considerations for this in the AXA environment. Fabien has been working extensively with new tools, such as
Fabien completed his Master’s degree in Computer Science at the ULB, and has also worked on some advanced medical projects in the Belgian Congo.
Presentation 2: Introduction to Tamr Catalog (new, open-source software)
(Mr. Ted Gudmundsen, Tamr)
Mr. Ted Gudmundsen, Field Engineer at Tamr, will provide an overview and demonstration of the new Tamr Catalog, and answer questions about Tamr and its software.
Ted Gudmundsen has been involved in advanced data processing for over 7 years, first as a researcher at MIT Lincoln Laboratory working on quantum computing, and now working on machine learning and data integration at Tamr. Ted has an M.S. in Physics with a focus on nanofabrication from Cornell University, and a B.A. in Physics from Princeton University.
PIZZA BREAK (Around 8 PM)
Presentation 3: Data Architecture and Advanced Projects at Toyota Europe
(Dr. Wouter Dullaert, Toyota Europe)
Dr. Wouter Dullaert, an IT Architect at Toyota Europe, will speak about the advanced data management projects underway in their international environment. Wouter is (co-)responsible for the architecture of retailer-facing systems. This includes projects such as retailer digitalization, lead management systems, a single view of the customer, and the car configurator.
Wouter has a PhD in Electrical Engineering and Master’s degrees in Industrial Engineering and Electrical Engineering, all from the Universiteit Gent.