Woot Woot – The Hub is hosting the Spark4Devs training from Data Fellas
Why you should come to Spark4Devs
Nowadays, one of the rare areas where developers still have a lot to discover and to build is Big Data, or more precisely, Distributed Engineering.
Many developers have been active in this area for a while, but it is also gradually shifting towards Distributed Computing on Distributed Datasets.
Not only is Machine Learning important here; core development skills are also required to make production processes Scalable and Highly Available while preserving the Reliability and Accuracy of the results.
Clearly Scala has a role to play here: Typesafe has been spreading the word and was a precursor in this space, and Data Fellas and BoldRadius Solutions are on the same page. With the introduction of the Distributed facet, Scala is poised to be the natural successor to Python, R, or even Julia in Data Science.
That's why you, devs and of course data scientists alike, will want to come to this training on Apache Spark, which focuses on the Scala language.
Why Spark4Devs
Because Apache Spark is THE project you need for all your data projects, even when the data is not distributed!
The training runs over three days. The first day tackles the core concepts and batch processing: we will start by explaining why Spark is the way to go, then show how to hit that road by hacking on some datasets with it.
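As a small taste of the day-one material (a sketch, not the training's actual exercises): Spark's RDD API deliberately mirrors Scala's collection API, so the classic word count below, written against plain Scala collections, translates almost line for line to a distributed RDD, where you would swap `Seq(...)` for `sc.parallelize(...)` and the `groupBy`-based counting for `reduceByKey`. The sample lines are made up.

```scala
// Word count on plain Scala collections: the same flatMap/map shape
// applies unchanged to a Spark RDD, where each step runs distributed.
val lines = Seq("spark for devs", "devs love spark") // hypothetical dataset

val counts: Map[String, Int] = lines
  .flatMap(_.split("\\s+"))                    // split each line into words
  .groupBy(identity)                           // group identical words
  .map { case (word, ws) => (word, ws.size) }  // count each group

// counts == Map("spark" -> 2, "for" -> 1, "devs" -> 2, "love" -> 1)
```

The point of the exercise is that once this shape feels natural on collections, the distributed version is the same code running on a cluster.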
The second day will, in the same pragmatic and concrete manner, introduce the Spark Streaming library. Of course, we'll see how to crunch some tweets, but we'll also show how to consume a real broker, namely Apache Kafka.
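To illustrate the streaming model without a running cluster, here is a plain-Scala sketch of Spark Streaming's micro-batch idea: the stream is consumed as a series of small batches, and per-key state (here, hashtag counts) is carried across them, in the spirit of what `updateStateByKey` does in the real API. The batches and hashtags below are invented; in the training the data would come from Twitter or a Kafka topic.

```scala
// Three hypothetical micro-batches of tweets standing in for a live stream.
val microBatches = Seq(
  Seq("#spark is great", "learning #scala"),
  Seq("#spark streaming", "#kafka rocks"),
  Seq("more #spark")
)

// Fold over the batches, updating a running hashtag count after each one,
// mimicking how streaming state evolves micro-batch by micro-batch.
val hashtagCounts: Map[String, Int] =
  microBatches.foldLeft(Map.empty[String, Int]) { (state, batch) =>
    val tags = batch.flatMap(_.split("\\s+")).filter(_.startsWith("#"))
    tags.foldLeft(state) { (s, tag) => s.updated(tag, s.getOrElse(tag, 0) + 1) }
  }

// hashtagCounts == Map("#spark" -> 3, "#scala" -> 1, "#kafka" -> 1)
```

In the real library the outer fold is driven by the streaming scheduler at a fixed batch interval rather than by your own loop.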
Not only will you be on track to use Apache Spark in your projects; since the Spark Notebook will be the main driver for our analyses, you'll also learn to use Spark interactively and proficiently, with a shortened development lifecycle.
About the trainer:
Andy is a mathematician turned distributed-computing entrepreneur.
Besides being a Scala/Spark trainer, Andy has participated in many projects built with Spark, Cassandra, and other distributed technologies, in various fields including Geospatial, IoT, Automotive and Smart Cities.
He is the creator of the Spark Notebook (https://github.com/andypetrella/spark-notebook), the only reactive and fully Scala notebook for Apache Spark.
In 2015, Xavier Tordoir and Andy founded Data Fellas (http://data-fellas.guru) working on:
* the Distributed Data Science toolkit, Shar3
* the Distributed Genomics product, Med@Scale
After completing a Ph.D. in experimental atomic physics, Xavier focused on the data-processing side of the job, with projects in finance, genomics and software development for academic research. During that time, he worked on time series, on the prediction of biological molecular structures and interactions, and on applied Machine Learning methodologies, and he developed solutions to manage and process data distributed across data centers. Since leaving academia a couple of years ago, he has provided services and developed products related to data exploitation in distributed computing environments, embracing functional programming, Scala and Big Data technologies.
#datascience training programmes
Why don't you join one of our #datascience trainings to sharpen your skills?
Special rates apply if you are a job seeker.
Here are some training highlights for the coming months:
Check out the full agenda here.
Have you been to our Meetups yet?
Each month we organize a Meetup in Brussels focused on a specific Data Science topic.