Training – Spark4Devs – by Andy Petrella + Xavier Tordoir from Data Fellas



Woot Woot – The Hub is hosting the Spark4Devs training from Data Fellas

Why you should come to Spark4Devs

Why now

Nowadays, one of the rare areas where developers still have a lot to discover and build is Big Data, or more precisely, Distributed Engineering.

Many developers have been working in this area for a while, but it is steadily shifting towards Distributed Computing on Distributed Datasets.

Machine Learning matters here, but core development skills are just as necessary to make production processes scalable and highly available while keeping the results reliable and accurate.

Clearly Scala has a role to play here: Typesafe was a precursor and is spreading the word, and Data Fellas and BoldRadius Solutions are on the same page. With the introduction of the distributed facet, Scala is poised to be the natural successor to Python, R, or even Julia in Data Science.

That’s why you, devs and of course data scientists, will want to come to this training on Apache Spark, with a focus on the Scala language.

Why Spark 4 Devs

Because Apache Spark is THE project you need for all your data projects, even when the data is not distributed!

The training runs over three days. The first day tackles the core concepts and batch processing: we will start by explaining why Spark is the way to go, then show how to hit that road by hacking on some datasets with it.
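
To give a flavour of that batch side, here is a minimal sketch of the classic word count in Spark’s Scala API. It is an illustration, not the training material itself: the input path data/sample.txt is a hypothetical placeholder, and local[*] simply runs Spark on your laptop’s cores.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // Local mode is enough to experiment; on a cluster only the master URL changes.
    val conf = new SparkConf().setAppName("word-count").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    // Hypothetical input file; any line-oriented dataset will do.
    val counts = sc.textFile("data/sample.txt")
      .flatMap(_.split("\\s+"))   // lines -> words
      .map(word => (word, 1))     // word -> (word, 1)
      .reduceByKey(_ + _)         // sum the 1s per word (a distributed shuffle)

    counts.take(10).foreach(println)
    sc.stop()
  }
}
```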

The second day introduces the Spark Streaming library in the same pragmatic and concrete manner. We’ll crunch some tweets, of course, but also show how a real broker can be consumed, namely Apache Kafka.
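
As a taste of what consuming Kafka from Spark Streaming looks like, here is a minimal sketch using the direct-stream API of the spark-streaming-kafka module from the Spark 1.x era. The broker address localhost:9092, the tweets topic, and the hashtag counting are illustrative assumptions only.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

object TweetStream {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("tweet-stream").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(10))

    // Hypothetical broker and topic, for illustration only.
    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
    val topics      = Set("tweets")

    // Direct (receiver-less) stream: each Kafka partition maps to an RDD partition.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // Count hashtag occurrences in every 10-second batch.
    stream.map(_._2)              // keep the message value, drop the key
      .flatMap(_.split("\\s+"))
      .filter(_.startsWith("#"))
      .map(tag => (tag, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```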

Not only will you be on track to use Apache Spark in your projects, but since the Spark Notebook will be the main driver for our analyses, you’ll also be set to use Apache Spark interactively and proficiently, with a shortened development lifecycle.

Title

Spark 4 Devs

Subtitle

Discover and learn how to manage Distributed Computing on Distributed Datasets in a Big Data environment driven by Apache Spark.

Date

23, 24, 25 September

Target audience

Mainly developers who want to learn about distributed computing and data processing

Duration

3 days

Location

European Data Innovation Hub @ AXA, Vorstlaan 23, 1170 Brussel

Price

500€ per day, per trainee

Audience

minimum 8 – max 15 people

Registration

via http://spark4devs.data-fellas.guru/

Motivation

http://blog.bythebay.io/post/125621089861/why-you-should-come-to-spark4-devs

Overview

via http://spark4devs.data-fellas.guru/

Learning Objectives

via http://spark4devs.data-fellas.guru/

Topics

Distributed Computing, Spark, Spark Streaming, Kafka, Hands on, Scala

Prerequisites

Development skills in common programming languages

About the trainers:

Andy

Andy is a mathematician turned distributed computing entrepreneur.
Besides being a Scala/Spark trainer, Andy has participated in many projects built with Spark, Cassandra, and other distributed technologies, in various fields including geospatial, IoT, automotive and Smart City projects.
He is the creator of the Spark Notebook (https://github.com/andypetrella/spark-notebook), the only reactive and fully Scala notebook for Apache Spark.
In 2015, Xavier Tordoir and Andy founded Data Fellas (http://data-fellas.guru) working on:
* the Distributed Data Science toolkit, Shar3
* the Distributed Genomics product, Med@Scale

Xavier

After completing a Ph.D. in experimental atomic physics, Xavier focused on the data processing side of the job, with projects in finance, genomics and software development for academic research. During that time, he worked on time series and on the prediction of biological molecular structures and interactions, and applied Machine Learning methodologies. He developed solutions to manage and process data distributed across data centers. Since leaving academia a couple of years ago, he has provided services and developed products related to data exploitation in distributed computing environments, embracing functional programming, Scala and Big Data technologies.

#datascience training programmes

Why don’t you join one of our #datascience trainings to sharpen your skills?

Special rates apply if you are a job seeker.


Check out the full agenda here.

Have you been to our Meetups yet?

Each month we organize a Meetup in Brussels focused on a specific DataScience topic.

Brussels Data Science Meetup – Brussels, BE – 1,328 Business & Data Science pros

Next Meetup: Event – SAPForum – Sept 9 @ Tour & Taxis, Brussels (Wednesday, Sep 9, 2015)


Job – Big Industries – Hadoop Developer


Matthias Vallaey, Partner at Big Industries, asked us to post the following vacancy.

Big Industries (a Cronos Company) works together with you to translate your ideas into workable Big Data solutions that will create measurable value for your organisation.
We implement solutions using proven big data technologies from industry-leading vendors, integrating only the most appropriate, effective and sustainable technologies to deliver best-in-class products and services.
Big Industries helps to assess, identify and integrate effective refinements in order to increase the value that big data solutions bring.
We are fulfillment partners for Cloudera and MapR, the premiere Hadoop distributions, for BeLux, and offer expert consulting, systems integration and tailored application development, with knowledge and experience across a broad range of industries.

Specialties

Hadoop, Big Data, Systems Integration, Consulting, HBase, Spark, MapReduce, SolrCloud, Impala, Kafka

Job Description

As a Big Data Developer you will work in a team building big data solutions. You will be developing, maintaining, testing and evaluating big data solutions within organisations. Generally you will be working on complex, large-scale big data projects, with a focus on collecting, parsing, managing, analysing and visualising large datasets to turn raw data into insights using multiple toolsets, techniques and platforms.

Soft skills

Team player – embraces change, able to adapt to working in varied software delivery environments. Can-do attitude, pragmatic, results-oriented – lateral thinker.

Mandatory experience & skills

  • Computing or Mathematics diploma, or 4 years of active work experience within systems integration teams.
  • Thorough understanding of Java, and a solid grasp of software development best practices.
  • Experience using Hadoop and related technologies (e.g. Pig, Hive, Spark, Impala), ideally with popular Hadoop data processing pipeline patterns and technologies (Cascading, Crunch, Oozie).
  • Willing to work to become Cloudera Developer certified.
  • Development exposure on both cloud and classic compute environments.
  • Very good Linux systems and Linux shell scripting knowledge.

Apply:

Make sure that you are a member of the Brussels Data Science Community LinkedIn group before you apply. Join here.

Here is the original job post.

Contact Matthias Vallaey at matthias.vallaey@bigindustries.be (+32 496 57 66 27).