Job – InfoFarm – Big Data Developer


InfoFarm is expanding and is looking for a new Big Data Developer!


InfoFarm is a Data Science company that focuses on delivering high-quality Data Science and Big Data solutions to its clients. We owe our name to one of the many informal brainstorming sessions among colleagues that arise spontaneously during the lunch break. One convivial session later we had worked out the whole farm-life analogy: we plant ideas, we plough through our clients' data, let it grow with other data or insights, and harvest business value by applying various (machine learning) techniques to it.

We have a unique team with a variety of talents and backgrounds: Data Scientists (people with a research background in a quantitative field), Big Data Developers (strongly technical Java programmers) and infrastructure people (the bits-and-bytes people). Together we develop great solutions for our clients in different sectors. To strengthen our team, we are looking for a Big Data Developer.


As a Big Data Developer you mainly develop Big Data applications on the Apache Hadoop or Apache Spark platform. You work independently or in a mixed team, either at our offices or on secondment at the client. You are not afraid to come forward with creative solutions to complex problems. One day you work for a telecom company, the next you get to know Belgium's water purification system, and after that you build a Big Data application in the logistics sector. At InfoFarm no two projects are alike, but that does not put you off. You look forward to learning about different businesses and to following new developments and technologies on the market, as well as to sharing this knowledge with our clients and within the team.


You have at least 2-3 years of experience with Java development. Certifications are an asset.

You can work with Maven, Spring or EJB and one or more RDBMSs.

Knowledge of Hadoop, Hive and Pig is a plus, as is knowledge of Spark and Spark MLlib. Willingness to get certified in one of these areas is required.

Knowledge of R and Scala is an advantage.

You have at least a Bachelor's degree in Applied Computer Science.


Make sure that you are a member of the Brussels Data Science Community LinkedIn group before you apply. Join here.

Please note that we also manage other vacancies that are not public. If you want us to put you in contact with them too, just send your CV to .

See the full job information below and send your CV in reply to !

(An English version can be requested via

check out the original post:

Job – NG-Data – Big Data Scientist – US and Belgium


Jo Buyl shared this job opportunity with us for a Big Data Scientist.

Job Description

In the era of Big Data, data is not useful until we identify patterns and apply context and intelligence. The data scientist, as an emerging career path, is at the core of organizational success with Big Data and of humanizing the data to help businesses better understand their consumers.

As a data scientist, you sift through the explosion of data to discover what the data is telling you. You figure out “what questions to ask” so that relevant information hidden in the large volumes and varieties of data can be extracted. The Data Scientist will be responsible for designing and implementing processes and layouts for complex, large-scale data sets used for modeling, data mining, and research purposes.


  • Be a true partner in defining the solutions, have and develop business acumen and bring technical perspective in furthering the product and business;
  • Aggregate data from various sources;
  • Help define, design, and build projects that leverage our data;
  • Develop computational algorithms and statistical methods that find patterns and relationships in large volumes of data;
  • Determine and implement mechanisms to improve our data quality;
  • Deliver clear, well-communicated and complete design documents;
  • Work in a team as well as independently and deliver on aggressive goals;
  • Exhibit creativity and resourcefulness in problem solving while collaborating and working effectively with best-in-class designers, engineers of different technical backgrounds, architects and product managers.

Personal Skills

  • You have a logical approach to the solution of problems and good conceptual ability and skills in analysis;
  • You have the ability to integrate research and best practices into problem avoidance and continuous improvement;
  • You possess good interpersonal skills;
  • You are self-reliant and capable of working both independently and as a member of a team;
  • You are persistent, accurate and imaginative;
  • You are able and have the discipline to document and record results;
  • You are customer-service oriented;
  • You are open-minded and solution-oriented;
  • You enjoy constantly expanding your knowledge base;
  • You are willing to travel up to five days per month.

Technical Background

The successful candidate should have 5+ years of experience in large-scale software development, with at least 3 years in Hadoop. They should also have a strong cross-functional technical background, excellent written and oral communication skills, and a willingness and capacity to expand their leadership and technical skills.

  • BS / MS in Computer Science;
  • Strong understanding of data mining and machine learning algorithms, data structures and related core software engineering concepts;
  • Understanding of the concepts of Hadoop, HBase and other big data technologies;
  • Understanding of marketing processes in the financial and/or retail market;
  • Sound knowledge of SPSS and SQL.



Apply Today!

Upload your resume or send it to . We look forward to your application!

Job – Big Industries – Hadoop Developer


Matthias Vallaey, Partner at Big Industries, asked us to post the following vacancy.

Big Industries (a Cronos Company) works together with you to translate your ideas into workable Big Data solutions that will create measurable value for your organisation.
We implement the solution using proven big data technologies from industry-leading vendors, integrating only the most appropriate, effective and sustainable technologies to deliver best-in-class products and services.
Big Industries helps to assess, identify and integrate effective refinements in order to increase the value that big data solutions bring.
We are fulfillment partners for Cloudera and MapR, the premier Hadoop distributions, for BeLux and offer expert consulting, systems integration and tailored application development with knowledge and experience across a broad range of industries.


Hadoop, Big Data, Systems Integration, Consulting, HBase, Spark, MapReduce, SolrCloud, Impala, Kafka

Job Description

As a Big Data Developer you will work in a team building big data solutions. You will be developing, maintaining, testing and evaluating big data solutions within organisations. Generally you will be working on implementing complex and large scale big data projects with a focus on collecting, parsing, managing, analysing and visualizing large datasets to turn raw data into insights using multiple toolsets, techniques and platforms.

Soft skills

Team player – embraces change, able to adapt to working in varied software delivery environments. Can-do attitude, pragmatic, results-oriented – lateral thinker.

Mandatory experience & skills

  • Computing or Mathematics diploma, or 4 years of active work experience within systems integration teams.
  • Thorough understanding of Java, and solid grasp of software development best practices.
  • Experience using Hadoop and related technologies (e.g. Pig, Hive, Spark, Impala), ideally with popular Hadoop data processing pipeline patterns and technologies (Cascading, Crunch, Oozie).
  • Willing to work to become Cloudera Developer certified.
  • Development exposure on both cloud and classic compute environments.
  • Very good Linux systems and Linux shell scripting knowledge.


Here is the original job post.

Contact Matthias Vallaey (+32 496 57 66 27).

Job – Netlog/Twoo/Stepout – Data Scientist – Gent


Massive Media is the social media company behind the successful digital brands Netlog and Twoo. In November 2013 Massive Media bought and relaunched the social discovery platform Stepout. We enable members to meet nearby people instantly. Over 100 million people have joined our sites on web and mobile. Check it out & apply here.


Data Scientist – Massive Media Match – Gent

We want to add some fresh talent to our data team to make sure it can fully continue its mission of turning the huge amounts of data we gather into gold.

Are you fascinated with big data technologies such as Hadoop and HBase?

Can you impress us with a solid technical background and substantial Python and SQL knowledge?

Are you familiar with the UNIX shell and common web technologies like JavaScript and HTML?

Are you blessed with a healthy interest in data visualisation, statistics and machine learning?

Does an agile and fast-paced development atmosphere sound like your perfect work environment?

Do you have the creativity, drive and discipline to get things done?

If your answer to all of these questions is “Yes, show me the data!” then we have a great job for you. Apply now and become part of an exceptional team of data scientists who are determined to teach you everything there is to know in one of the most exciting areas of computer and information science!

More jobs?


Click here for more job offers

Check out our Twitter account: @datajobsbe

How Sears Became a Real-Time Digital Enterprise Due to Big Data

Original post here

Sears is a large US retailer that has been a true Big Data pioneer for quite some years. It has learned, made mistakes and achieved success through hands-on effort, and it currently operates a very large enterprise deployment of Hadoop.

Sears was founded in 1893 and started as a shipping and mail order company. In 2005 it was acquired by Kmart, but it continued to operate under its own brand. In 2013 it had 798 stores and revenue of over $21 billion. It is the fourth largest department store in the US and offers millions of products across its stores. It holds data on over 100 million customers, which it analyses to make real-time, relevant offers to them. Sears is deep into Big Data and combines massive amounts of data to become a real-time digital enterprise.


Sears was ahead of its time, and of its competitors, regarding Big Data. As early as 2010 it had a 10-node Hadoop cluster, a size Walmart only reached in 2012. These days, Sears has a Hadoop cluster of 300 nodes that is populated with over 2 petabytes of structured customer transaction data, sales data and supply chain data. Its data used to sit in silos in many locations, but the objective now is to get all data in one place in order to achieve a single point of truth about the customer. But that is not all: Sears also applies Big Data to combat fraud, track the effectiveness of marketing campaigns, and optimize (personal) pricing, the supply chain and promotional campaigns.

Personalized Pricing

Sears combines vast amounts of data to help set (personal) prices in near real-time. Data such as product information, local economic conditions and competitor prices are combined and analysed using a price elasticity algorithm, which enables Sears to find the best price for the right product at the right moment and location, delivered via customized coupons. These coupons are given to loyal shoppers and are also used to move inventory if necessary. Just a few years ago this would still have been a dream scenario, as legacy systems meant it used to take Sears up to 8 weeks to find the best price, but nowadays this can be done almost in real-time.

In the past years, Sears went from nationwide pricing strategies to regional and now also personal pricing. The coupons that customers receive are based on where the customers live, the number of products that are available, the products that need to go, and which products Sears believes the customer will like and consequently buy.
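
The article does not describe Sears' actual price elasticity algorithm. Purely as an illustration of the idea, the Python sketch below assumes a simple constant-elasticity demand model: it estimates elasticity from past price and quantity observations and then picks the most profitable price from a list of candidates. The function names, the sample figures and the candidate prices are all hypothetical.

```python
import math


def estimate_elasticity(prices, quantities):
    """Estimate price elasticity as the least-squares slope of
    log(quantity) against log(price). Assumes constant elasticity."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def best_price(candidates, base_price, base_quantity, elasticity, unit_cost):
    """Return the candidate price with the highest predicted profit,
    using demand = base_quantity * (price / base_price) ** elasticity."""
    def profit(price):
        quantity = base_quantity * (price / base_price) ** elasticity
        return (price - unit_cost) * quantity

    return max(candidates, key=profit)


# Hypothetical observations: demand falls as the price rises.
observed_prices = [8.0, 10.0, 12.5]
observed_quantities = [1000 * p ** -2 for p in observed_prices]

elasticity = estimate_elasticity(observed_prices, observed_quantities)
chosen = best_price(
    candidates=[8, 9, 10, 11, 12],
    base_price=10.0,
    base_quantity=10.0,
    elasticity=elasticity,
    unit_cost=5.0,
)
print(f"estimated elasticity: {elasticity:.2f}, chosen price: {chosen}")
```

A real system would add the other signals the article mentions, such as local economic conditions, competitor prices and inventory levels, as further inputs to the demand model.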

Shop Your Way Rewards loyalty program

In 2011, Sears launched a new loyalty program called the Shop Your Way Rewards loyalty program. This program also runs on Hadoop, which enables Sears to make use of 100% of the data that is collected. This results in better targeting of customers for certain online and mobile scenarios.

The key for Sears is to maximize multi-channel customer engagement through the loyalty program. Customers provide their personal data in return for relevant interaction through the right channel, according to Dr. Phil Shelley, CTO at Sears Holdings Corporation, in an interview on Forbes.

Sears’ Big Data platform

In the past, Sears used many different tools on data that sat in silos across the organisation. These legacy systems prevented Sears from offering the right product at the right moment for the right price. Sears started by experimenting and innovating with Big Data, exactly as companies should when starting with Big Data. It began with a Hadoop cluster running on a netbook computer and experimented from there. It learned the hard way, through trial and error, partly because few outside Big Data experts were available to guide it with the platform. Sears has managed to build a large centralized platform where all data is stored. The platform uses a variety of (open-source) tools such as Hive, Pig, HBase, Solr, Lucene and MapReduce. This makes it possible to have personalized interactions with the customer as well as to use the data for different applications across the company.


In addition to Hadoop, Sears also uses Datameer, a data exploration tool that enables visualization directly on top of Hadoop, for its ad-hoc queries, without the need for IT to be involved. Previously such queries required ETL jobs that could take up to a few weeks. At the moment, Sears only gives its users access to Hadoop data via Datameer.

Sears started using Big Data because of declining sales, while major competitors such as Amazon kept growing. In the past years it has managed to move rapidly into the Big Data era and is turning itself into a real-time digital enterprise. A great achievement for a company that is over a century old.