Job – Junior Data Scientist


Are you pursuing a career in data science?

We have a great opportunity for you: an intensive training program combined with interesting job opportunities!

Interested? Check out http://di-academy.com/bootcamp/, follow the link to our data science survey, and send your CV to training@di-academy.com

Once selected, you’ll be invited for the intake event that will take place in Brussels this summer.

Hope to see you there,

Nele & Philippe

Training – Hands-on with SparkR – Brussels – November 24


As of June 2015, SparkR is integrated into Spark 1.4.0. However, this is still a work in progress: in the original version, no Spark MLlib machine learning algorithms were accessible from R. As of Spark 1.5.0 it is possible to create generalized linear models (glm).

In this one-day SparkR course, you will learn how Spark works under the hood (the MapReduce paradigm, lazy evaluation, …) and how to use SparkR. You will start by setting up a local Spark cluster and accessing it from R. Next, you will learn basic data transformations in SparkR, either via R code or via Spark SQL. Finally, we will use SparkR’s glm, compare it to R’s glm, and implement our own machine learning algorithm.
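To make the “lazy evaluation” idea concrete before the course: Spark builds up a plan of transformations and only executes it when an action forces a result. Here is a minimal, language-agnostic sketch of that behaviour in plain Python (generators stand in for RDDs; this is illustrative, not the Spark or SparkR API):

```python
# Spark-style lazy evaluation in plain Python: "transformations" build
# up a recipe and nothing is computed until an "action" (here, collect)
# forces evaluation -- just as Spark defers work until count()/collect().

def parallelize(data):
    return iter(data)                      # lazy source, like an RDD

def map_t(rdd, f):
    return (f(x) for x in rdd)             # lazy transformation

def filter_t(rdd, pred):
    return (x for x in rdd if pred(x))     # lazy transformation

def collect(rdd):
    return list(rdd)                       # action: triggers computation

rdd = parallelize(range(10))
rdd = map_t(rdd, lambda x: x * 2)
rdd = filter_t(rdd, lambda x: x > 10)      # still nothing computed here
print(collect(rdd))                        # [12, 14, 16, 18]
```

The same chain in SparkR or PySpark behaves identically: the pipeline is only a description until the action runs.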

This training event is organised in collaboration with Oak3 (http://www.oak3.be). The Oak3 Academy is an IT Learning Center providing hands-on, intensive training and coaching to help students develop the skills they need for a successful career as an Information Technology Professional or as a knowledge worker (end user of software). Our goal is to provide the highest-quality training and knowledge transfer that enables a person to start or enhance his or her career as an IT professional or knowledge worker in a short period of time. We therefore offer knowledge assimilation, facilitate expertise transfer and provide a rewarding learning experience. Our training solutions are designed to help students learn faster, master the latest information technologies and perform smarter.

Prerequisites: Previous experience with R is required, notions of Apache Spark are useful but not required.

When: Tuesday, November 24, 2015 from 9:00 AM to 5:00 PM (CET)

Where: European Data Innovation Hub – 23 Vorstlaan Watermaal-Bosvoorde, Brussel 1170 BE

Registration: Eventbrite

Training – Hands-on – Spark Streaming – Brussels – December 1st


In this one-day Spark Streaming course you will learn to set up your very own Spark Streaming applications and do real-time data processing and analytics. You will start by setting up data ingestion from HDFS, Kafka, and even Twitter.

Next up, you will learn about the benefits of using one integrated framework for both batch and streaming processing. You will combine streaming and historical data in order to create valuable applications.

You will learn how fault-tolerance is built into Spark Streaming, and might even get a hint of how to combine it with the Spark MLlib machine learning library.
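The micro-batch model behind Spark Streaming can be sketched without Spark at all: the stream is chopped into small batches, each batch is processed like an ordinary batch job, and state is carried across batches. A toy plain-Python illustration (the names `micro_batches` and `running` are ours, not the Spark Streaming API):

```python
# Conceptual micro-batch word count: the idea behind DStreams and
# stateful operations such as updateStateByKey, in plain Python.

from collections import Counter

def micro_batches(lines, batch_size):
    # Cut the "stream" into fixed-size batches, as Spark Streaming
    # cuts input into batches per batch interval.
    for i in range(0, len(lines), batch_size):
        yield lines[i:i + batch_size]

stream = ["spark streaming", "kafka spark", "hdfs kafka spark"]
running = Counter()                    # state carried across batches

for batch in micro_batches(stream, 2):
    # each batch is processed like a normal batch job...
    batch_counts = Counter(w for line in batch for w in line.split())
    # ...and merged into the running state
    running.update(batch_counts)

print(running["spark"])                # 3
```

Real Spark Streaming adds checkpointing and write-ahead logs so this state survives worker failures, which is the fault-tolerance story covered in the course.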

This training event is also organised in collaboration with Oak3 (http://www.oak3.be); see the Oak3 Academy description above.

Prerequisites: Previous experience with programming for Apache Spark in Scala is required.

When: Tuesday, December 1, 2015 from 9:00 AM to 5:00 PM (CET)

Where: European Data Innovation Hub – 23 Vorstlaan Watermaal-Bosvoorde, Brussel 1170 BE

Registration: through Eventbrite

Job – Sentiance – Marketing Data Scientist – Antwerp


Hi Philippe,

Given the topic of the meetup next Thursday, I think the following job opportunity might be relevant to post on your blog 🙂
At Sentiance we’re looking for a data scientist with experience in market segmentation:
http://www.sentiance.com/team/marketing-data-scientist/
However, we always welcome applications from junior candidates too!
http://www.sentiance.com/team/junior-data-scientist/

Thanks, and hope to see you Thursday!
Vincent Spruyt
Twitter: @sentiance

As an experienced data analyst, you are ready to kick-off a new adventure in a fast-paced environment where you can work with the latest machine learning technologies and data science tools.

Job description

  1. You will be part of our Data Science Team and you are passionate about machine learning and data analysis.
  2. Using advanced data analytics, you will form hypotheses and draw meaningful insights about user behavior and user segmentation. As a marketing data scientist, you will explore relations between users and their preferences, discover interesting segments, and apply advanced clustering and dimensionality reduction techniques.
  3. You will carry out research that will improve our general understanding of our users, and communicate your findings to other team members in order to initiate new platform development cycles.
  4. You will apply your statistical and mathematical background to real-life big-data problems, and use your machine learning knowledge on a day to day basis.
  5. You will work closely with our Data Engineering Team, as your work is used to improve our models and is pushed through our release process.
  6. Your main objectives will be the design and implementation of data mining and analysis algorithms and the communication of reports and quality metrics for current production processes.

Desired Skills & Experience:

  1. You have a master’s degree or PhD in computer science or a related field.
  2. You are an expert in advanced analytics and are experienced in hypothesis testing.
  3. You possess a deep understanding of clustering, manifold learning and predictive modeling techniques.
  4. You have good knowledge of and experience with any of Python, Matlab or R.
  5. You have a strong mathematical background and analytical mindset.
  6. You are fluent in English. Dutch is a plus.
  7. You can work independently and take matters into your own hands.
  8. The ability to quickly learn new technologies and successfully implement them is essential.

Bonus

Experience with any of the following is considered a plus:

  • Advanced Python knowledge and experience
  • Scikit-learn, Pandas, Numpy, Matplotlib
  • Experience with Spark or the Hadoop eco-system
  • Machine learning, data mining, data visualization

Apply:

Make sure that you are a member of the Brussels Data Science Community LinkedIn group before you apply. Join here.

Please note that we also manage other vacancies that are not public. If you want us to bring you into contact with them too, just send your CV to datasciencebe@gmail.com.

Send your job application today! 

Please send Sentiance your resume and a strong motivation with reference sentiance/2015/MDS or apply on LinkedIn.

Job – Infofarm – Big Data Developer


InfoFarm is expanding and is looking for a new Big Data Developer!

COMPANY PROFILE

InfoFarm is a Data Science company that specialises in delivering high-quality Data Science and Big Data solutions to its clients. We owe our name to one of the many informal brainstorming sessions among colleagues that arise spontaneously during the lunch break. One pleasant session later, we had the whole farm-life analogy worked out: we plant ideas, we plough through our clients’ data, grow it with other data or insights, and harvest business value by applying various (machine learning) techniques to it.

We have a unique team with diverse talents and different backgrounds: Data Scientists (people with a research background in a quantitative field), Big Data Developers (strong technical Java programmers) and infrastructure people (the bits-and-bytes people). Together we develop great solutions for our clients across different sectors. To strengthen our team, we are looking for a Big Data Developer.

JOB DESCRIPTION

As a Big Data Developer you will mainly develop Big Data applications on the Apache Hadoop or Apache Spark platform. You work independently or in a mixed team, either in our offices or on secondment at the client. You are not afraid to come forward with creative solutions to complex problems. One day you work for a telecom company, the next day you get to know Belgium’s water purification system better, and after that you build a Big Data application in the logistics sector. At InfoFarm no two projects are alike, but that does not put you off. You look forward to learning about different businesses, following new developments and technologies on the market, and passing on this acquired knowledge to our clients and within the team.

REQUIREMENTS

You have at least 2-3 years of experience with Java development. Certifications are an asset.

You can work with Maven, Spring or EJB and one or more RDBMSs.

Knowledge of Hadoop, Hive and Pig is a plus, as is knowledge of Spark and Spark MLlib. Willingness to become certified in one of these domains is essential.

Knowledge of R and Scala is an advantage.

You have at least a Bachelor’s degree in Applied Computer Science.

Apply:

Make sure that you are a member of the Brussels Data Science Community LinkedIn group before you apply. Join here.

Please note that we also manage other vacancies that are not public. If you want us to bring you into contact with them too, just send your CV to datasciencebe@gmail.com.

Check out the full job details and send your CV in reply to jobs@infofarm.be!

(An English version can be requested via jobs@infofarm.be)

Check out the original post: http://www.infofarm.be/articles/were-hiring-big-data-developer-0

Training – Spark4Devs – by Andy Petrella + Xavier Tordoir from Data Fellas


Woot Woot – The Hub is hosting the Spark4Devs training from Data Fellas

Why you should come to Spark4Devs

Why now

Nowadays, for developers, one of the rare areas where there is still a lot to discover and to do is Big Data or, more precisely, Distributed Engineering.

This is an area where many developers have been around for a while, but it is also steadily moving towards Distributed Computing on Distributed Datasets.

Not only is Machine Learning important in this area; core development capabilities are certainly required to make production processes scalable and highly available while keeping the results reliable and accurate.

Clearly Scala has its role to play there. Typesafe is spreading the word and was a precursor on that, and Data Fellas and BoldRadius Solutions are on the same page: Scala is set to be the natural successor of Python, R or even Julia in Data Science, with the introduction of the distributed facet.

That’s why you, devs, but of course data scientists too, will want to come to this training on Apache Spark focusing on the Scala language.

Why Spark 4 Devs

Because the Apache Spark project is THE project you need for all your data projects, actually, even if the data is not distributed!

The training lasts three days. The first day will tackle the core concepts and batch processing. We will start by explaining why Spark is the way to go, then show how to hit that road by hacking on some datasets.

The second day will, in the same pragmatic and concrete manner, introduce the Spark Streaming library. Of course we’ll see how to crack some tweets, but also show how a real broker, Apache Kafka, can be consumed.

Not only will you be on track to use Apache Spark in your projects; since the Spark Notebook will be used as the main driver for our analyses, you’ll also be on track to use Apache Spark interactively and proficiently, with a shortened development lifecycle.

Title: Spark 4 Devs

Subtitle: Discover and learn how to manage Distributed Computing on Distributed Datasets in a Big Data environment driven by Apache Spark.

Date: 23, 24, 25 September

Target audience: mainly developers who want to learn about distributed computing and data processing

Duration: 3 days

Location: European Data Innovation Hub @ AXA, Vorstlaan 23, 1170 Brussel

Price: €500 per day per trainee

Audience: minimum 8 – maximum 15 people

Registration: via http://spark4devs.data-fellas.guru/

Motivation: http://blog.bythebay.io/post/125621089861/why-you-should-come-to-spark4-devs

Overview: via http://spark4devs.data-fellas.guru/

Learning objectives: via http://spark4devs.data-fellas.guru/

Topics: Distributed Computing, Spark, Spark Streaming, Kafka, hands-on, Scala

Prerequisites: development skills in common languages

About the trainer:

Andy

Andy is a mathematician turned distributed-computing entrepreneur. Besides being a Scala/Spark trainer, Andy has participated in many projects built using Spark, Cassandra and other distributed technologies, in various fields including geospatial, IoT, automotive and smart-city projects.
He is the creator of the Spark Notebook (https://github.com/andypetrella/spark-notebook), the only reactive and fully Scala notebook for Apache Spark.
In 2015, Xavier Tordoir and Andy founded Data Fellas (http://data-fellas.guru), working on:
* the Distributed Data Science toolkit, Shar3
* the Distributed Genomics product, Med@Scale

After completing a Ph.D. in experimental atomic physics, Xavier focused on the data-processing part of the job, with projects in finance, genomics and software development for academic research. During that time he worked on time series, on prediction of biological molecular structures and interactions, and applied machine learning methodologies. He developed solutions to manage and process data distributed across data centers. Since leaving academia a couple of years ago, he has provided services and developed products related to data exploitation in distributed computing environments, embracing functional programming, Scala and Big Data technologies.

#datascience training programmes

Why don’t you join one of our #datascience trainings to sharpen your skills?

Special rates apply if you are a job seeker.

Here are some training highlights for the coming months:

Check out the full agenda here.

Have you been to our Meetups yet?

Each month we organize a Meetup in Brussels focused on a specific DataScience topic.

Brussels Data Science Meetup

Brussels, BE
1,328 Business & Data Science pro’s

The Brussels Data Science Community:Mission:  Our mission is to educate, inspire and empower scholars and professionals to apply data sciences to address humanity’s grand cha…

Next Meetup

Event – SAPForum – Sept9 @Tour & Taxis, Brussels

Wednesday, Sep 9, 2015, 12:00 AM
2 Attending

Check out this Meetup Group →

Data Science Trainings Belgium



The European Data Innovation Hub facilitates a full series of Data Science and Big Data training programmes organized by its partners.

You can expect

  • a series of executive trainings to support your management in understanding the benefits of analytics
  • a series of coached MOOCs on machine learning and big data technology
  • a series of hands-on trainings on the different data science technologies

All members of the European Data Science and Big Data communities are welcome to use our Brussels based professional facilities to give their training. The members of the hub will promote your training and include it on our e-learning platform for further use.

The full list is available here.

Here are some highlights for the coming months:

Check out the full agenda here.

How to get the best price:

You can always use Eventbrite to order and pay for your ad-hoc trainings, but if you want to benefit from volume discounts, contact Philippe on 0477/23.78.42 | pvanimpe@dihub.eu.

Have you been to our Meetups yet?

Each month we organize a Meetup in Brussels focused on a specific DataScience topic.

Brussels Data Science Meetup

Brussels, BE
1,608 Business & Data Science pro’s


Next Meetup

IBM Bluemix and Analytics – Introduction

Tuesday, Feb 9, 2016, 6:30 PM
22 Attending

Check out this Meetup Group →

Coached Mooc – Introduction to Big Data with Apache Spark


What

  • Learn how to apply data science techniques using parallel programming in Apache Spark to explore big (and small) data.
  • Study online but work in group
  • Get help from a local expert

Why we coach MOOCs

The European Data Innovation Hub is partnering with top experts to offer MOOC participants the possibility to do these online courses in a group. For the duration of the MOOC, participants are welcome to come to the Hub in Brussels to work and go through exercises with other participants. On specific days, one or more domain experts will be present to coach the students.

Planning

  1. Sign up to this course here
  2. Join the meetup group here

About this course

Organizations use their data for decision support and to build data-intensive products and services, such as recommendation, prediction, and diagnostic systems. The collection of skills required by organizations to support these functions has been grouped under the term Data Science. This course will articulate the expected output of Data Scientists and then teach students how to use PySpark (part of Apache Spark) to deliver against these expectations. The course assignments include Log Mining, Textual Entity Recognition, and Collaborative Filtering exercises that teach students how to manipulate data sets using parallel processing with PySpark.

This course covers advanced undergraduate-level material. It requires a programming background and experience with Python (or the ability to learn it quickly). All exercises will use PySpark (part of Apache Spark), but previous experience with Spark or distributed computing is NOT required. Students should take this Python mini-quiz before the course and take this Python mini-course if they need to learn Python or refresh their Python knowledge.
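As a warm-up for the PySpark assignments, the map/reduce-by-key pattern behind the word-count style exercises can be tried in plain Python without a Spark installation (a conceptual sketch, not the PySpark API; the helper `reduce_by_key` is our own name):

```python
# Word count in the PySpark map/reduceByKey style, using only the
# Python standard library: map each word to (word, 1), then sum per key.

from collections import defaultdict
from functools import reduce

lines = ["to be or not to be", "to see or not to see"]

# map phase: each line -> (word, 1) pairs, flattened into one list
pairs = [(w, 1) for line in lines for w in line.split()]

# reduce-by-key phase: sum the 1s per word
def reduce_by_key(acc, pair):
    word, count = pair
    acc[word] += count
    return acc

counts = reduce(reduce_by_key, pairs, defaultdict(int))
print(counts["to"])    # 4
```

In PySpark the same logic becomes `rdd.flatMap(str.split).map(lambda w: (w, 1)).reduceByKey(add)`, with the reduce step running in parallel across partitions.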

What you’ll learn

  • Learn how to use Apache Spark to perform data analysis
  • How to use parallel programming to explore data sets
  • Apply Log Mining, Textual Entity Recognition and Collaborative Filtering to real world data questions
  • Prepare for the Spark Certified Developer exam

Meet the online instructor:


Anthony D. Joseph

Meet the coach:

Kris Peeters

Kris Peeters from Dataminded

Certificate

Pursue a Verified Certificate to highlight the knowledge and skills you gain ($50)

View a PDF of a sample edX certificate
  • Official and Verified

    Receive a credential signed by the instructor, with the institution logo to verify your achievement and increase your job prospects

  • Easily Shareable

    Add the certificate to your CV, resume or post it directly on LinkedIn

  • Proven Motivator

    Get the credential as an incentive for your successful course completion

Job opportunities ?


Click here for data-related job offers.
Join our community on LinkedIn and attend our meetups.
Follow our Twitter account: @datajobsbe

Have you been to our Meetups yet?

Each month we organize a Meetup in Brussels focused on a specific DataScience topic.

Brussels Data Science Meetup

Brussels, BE
1,239 Business & Data Science pro’s


Next Meetup

Launch MOOC Coaching activities, First course is the Machine…

Thursday, May 28, 2015, 7:00 PM
15 Attending

Check out this Meetup Group →

Job – Big Industries – Hadoop Developer


Matthias Vallaey, Partner at Big Industries, asked us to post the following vacancy.

Big Industries (a Cronos Company) works together with you to translate your ideas into workable Big Data solutions that will create measurable value for your organisation.
We implement solutions using proven big data technologies from industry-leading vendors, integrating only the most appropriate, effective and sustainable technologies to deliver best-in-class products and services.
Big Industries helps to assess, identify and integrate effective refinements in order to increase the value that big data solutions bring.
We are fulfilment partners for Cloudera and MapR, the premier Hadoop distributions, for BeLux, and offer expert consulting, systems integration and tailored application development, with knowledge and experience across a broad range of industries.

Specialties

Hadoop, Big Data, Systems Integration, Consulting, HBase, Spark, MapReduce, SolrCloud, Impala, Kafka

Job Description

As a Big Data Developer you will work in a team building big data solutions. You will be developing, maintaining, testing and evaluating big data solutions within organisations. Generally you will be working on implementing complex and large scale big data projects with a focus on collecting, parsing, managing, analysing and visualizing large datasets to turn raw data into insights using multiple toolsets, techniques and platforms.

Soft skills

Team player – embraces change, able to adapt to working in varied software delivery environments. Can-do attitude, pragmatic, results-oriented – lateral thinker.

Mandatory experience & skills

  • A computing or mathematics diploma, or 4 years of active work experience within systems-integration teams.
  • Thorough understanding of Java and a solid grasp of software development best practices.
  • Experience using Hadoop and related technologies (e.g. Pig, Hive, Spark, Impala), ideally with popular Hadoop data-processing pipeline patterns and technologies (Cascading, Crunch, Oozie).
  • Willingness to work towards becoming Cloudera Developer certified.
  • Development exposure in both cloud and classic compute environments.
  • Very good knowledge of Linux systems and Linux shell scripting.

Apply:

Make sure that you are a member of the Brussels Data Science Community LinkedIn group before you apply. Join here.

Here is the original job post.

Contact Matthias Vallaey matthias.vallaey@bigindustries.be (+32 496 57 66 27).

Data & The New Era of Interactive Storytelling–Strata+Hadoop 2014 (Video)

Why Interactive Storytelling is becoming so key to understanding data.

76% of executives share data via email!?

Dashboards are not interactive.

What's The Big Data?

Data is an evolving story. It’s not a static snapshot of a point in time insight. With data from internal and external sources constantly updating, we are evolving from rear-view mirror dashboard views into an era of interactive Storytelling. Data Storytelling is both a visual art and a method of interpreting analytic results. Data Stories shed insights every minute, every hour, everyday, every week. This keynote will discuss how data dashboards are no longer adequate and how companies are using Interactive Storytelling to discover faster insights across many disparate data sources.

About Sharmila Shahani-Mulligan:
Sharmila has spent 18+ years building game-changing software companies in a variety of markets. She has been EVP & CMO at numerous software companies, including Netscape, Kiva Software, AOL, Opsware, and Aster Data. She drove the creation of several multi-billion dollar market categories, including application servers, data center automation and big data analytics. She is on…
