This training is a hands-on session introducing you to both Scala and Apache Spark for big-data processing. In the first part, we introduce Scala, covering the object-oriented and functional concepts that make it particularly suitable for writing concurrent, distributed systems. In the second part, we introduce Spark, a cluster-computing platform written in Scala that is revolutionizing large-scale data processing.
- Introduction to Scala
  - Elements of functional programming
  - Elements of object-oriented programming
  - Introduction to the Akka actor model for concurrency and parallelism
- Introduction to Apache Spark
  - The big data challenge
  - Limitations of Hadoop
  - Components of Spark
  - Hands-on session on the Spark Scala API
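As a small taste of the functional style covered in the first part, here is a word count over plain Scala collections. This is an illustrative sketch, not course material: the same map/group/reduce pattern carries over almost unchanged to Spark's RDD API, where the input list would come from a distributed source instead.

```scala
// Word count using Scala's functional collections (illustrative example).
// The same transformation style applies to Spark RDDs.
val lines = List("spark makes big data simple", "scala makes spark simple")

val counts: Map[String, Int] =
  lines
    .flatMap(_.split("\\s+"))  // split each line into words
    .groupBy(identity)         // group identical words together
    .map { case (word, ws) => word -> ws.size } // count each group

println(counts("simple")) // prints 2
```

Note how the pipeline is built from small, composable transformations rather than explicit loops; this is the style the hands-on session builds on.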
- Prior knowledge of at least one programming language is advised but not required.
- Bring your own laptop (Windows, Linux or Mac) and install:
  - JDK, the Java Development Kit, version 1.7, from http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html
  - sbt, a build tool for Scala, version 0.13.0 or higher, from http://www.scala-sbt.org/download.html
  - The Scala IDE for Eclipse and the Scala Worksheet (version 2.11 or higher), from http://scala-ide.org/download/sdk.html
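To verify the setup before the session, you can create a minimal sbt project. The sketch below is an assumption for illustration only (project name, Scala version and Spark version are not part of the official instructions; adjust them as needed):

```scala
// build.sbt — minimal project definition (illustrative sketch)
name := "spark-training"

version := "0.1.0"

scalaVersion := "2.11.7" // a Scala 2.11 release, matching the IDE requirement above

// Spark core dependency; 1.5.0 is assumed here as a current release
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.0"
```

Running `sbt compile` in the project directory should download the dependencies and confirm the toolchain works.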
Holding a civil engineering degree in applied mathematics, Thomas Ghyselen has more than 8 years of international experience in the consulting and services industry. Combining business knowledge, mathematics and programming skills, Thomas focuses strongly on data science and large-scale data processing to drive business decisions. He is currently working as a data scientist for Ordina (www.ordina.be), helping companies with problems such as stock optimization, forecasting and stochastic algorithms, and specializing in Spark for cluster computing.
- Start at 9:00 AM.
- September 23, 2015
- Duration: 6h
- Location: European Data Innovation Hub @ AXA, Vorstlaan 23, 1170 Brussel
- Price: 300€