Hackathon – Process Manufacturing – Kick Off



This is the first call for the Process Manufacturing Hackathon that we are organizing with P&G, one of the world's major FMCG companies.

Please block the afternoon of Feb 26 to attend, at their pilot plant, the presentation of the manufacturing process and the data structure that will be used during the Hackathon.

The presentation will take place at the company's pilot site, located near the Heysel, Brussels. More details will follow as soon as formal approval for communication is granted.

About this Hackathon:

We have one year of data from a production site in Europe.

We will have the support of the process engineers, who will explain to us how the production process works and how the data is structured.

IBM will provide us with its Bluemix environment and train us on 9/2 at 18:30.

RSVP for the launch to get all the detailed information. We will meet on 4 evenings spread over 4 weeks (3/3, 10/3, 17/3, 21/3) to work on one of these 3 challenges:

  • Process Improvement
  • Nicer Dashboards
  • Out-of-the-box Sustainability concept

Join us at the launch and start building your team of experts right away.

Finalists will present their project at the DIS2016 event on March 23.



A formal registration to participate will be issued soon.

For now, block the date and register on Meetup; more info will follow soon.

Each month we organize a Meetup in Brussels focused on a specific DataScience topic.

Brussels Data Science Meetup

Brussels, BE
1,609 Business & Data Science pro’s

The Brussels Data Science Community: Mission: Our mission is to educate, inspire and empower scholars and professionals to apply data sciences to address humanity’s grand cha…

Next Meetup

Hackathon – Process Manufacturing – Kick Off

Friday, Feb 26, 2016, 2:00 PM
9 Attending






Challenge – Online – Integra Gold Rush


Who wants to be part of our team of Belgian gold diggers? (Please add a comment if interested.)

I just received this exciting message from Canada. Should we put a team together? It seems to be the perfect challenge for real miners, no?

Hello Philippe,

We are excited to launch the Integra Gold Rush Challenge this September – one of the largest ever mining industry incentive prize competitions to hunt down the next big gold discovery in Val-d’Or, Canada.

Opened in 1935, the Lamaque mine remained in operation for 50 years, processing more than 24 million tons of ore and producing 4.5 million ounces of gold. The Lamaque mine remained untouched and relatively forgotten for nearly 30 years, until Integra Gold purchased it in October 2014.

Along with the purchase came 6 terabytes of information, spanning 75 years of mining history. This database is being opened to the public to help unlock its value through innovative and comprehensive solutions.

The Integra Gold Rush Challenge invites people from around the world, from any background, to analyze this data and win prizes totaling CAD $1 million.

You may visit the challenge page to find out more and to register as an innovator. 

Please let me know if you would like to participate or know anyone else who might be interested in participating.



Join our next meetup on Sept 24th @VUB:

How Data Science is Transforming Sales and Marketing

Thursday, Sep 24, 2015, 6:30 PM

VUB – Aula QD
Pleinlaan 2B – 1050, Brussels Brussels, BE

149 Business & Data Science pro’s Attending

Agenda:

18:30 monthly update by Philippe Van Impe

  • Official opening of the HUB with Alexander De Croo on 20/10
  • How to benefit from the training facilities of the ‘European Data Innovation Hub’ – here is the overview of the trainings organized.
  • How to benefit from the DataScience co-working space
  • Nice list of outstanding job opportunities

19:00…


Treat yourself to some #datascience education:

Why not join one of our #datascience trainings to sharpen your skills?

Special rates apply if you are a job seeker.

Here are some training highlights for the coming months:

Check out the full agenda here.

Let’s participate in the DARPA Forecasting Chikungunya Challenge

Original post: DARPA Forecasting Chikungunya Challenge | DEADLINE: 2/01/15 | ACTIVE SOLVERS: 343 | POSTED: 8/15/14

DARPA (Defense Advanced Research Projects Agency) seeks methods to accurately forecast the spread of chikungunya virus in the Caribbean, and North, Central, and South America.

This Challenge has a special award structure with awards of $150,000 and $100,000 for the top two overall Solvers and four honorable mention awards of $50,000 each. In addition, top Solvers in each Methodology Category (data, robustness, applicability, presentation, and computation) may win $10,000. The top six overall Solvers will be invited to DARPA for the Program Finale Meeting where they will participate in an interactive meeting to share best practices, collaborate, and facilitate continuing Solver community cohesion.

This is a Reduction-to-Practice Challenge that requires written documentation and multiple submissions of forecasts for the virus’ spread. Additionally, as a Prodigy Challenge, an online leaderboard will be available to track Solver performance.

Privacy Advisory

This web site is hosted by a private entity and is not a service of the Defense Advanced Research Projects Agency (DARPA) or the Department of Defense (DoD). The solicitation and collection of your personal or individually identifiable information is subject to the host’s privacy and security policies and will not be shared with DARPA or the DoD unless you win the Challenge. Challenge winners’ personally identifiable information must be made available to DARPA in order to collect an award. Please consult the Challenge Specific Agreement.

Source: InnoCentive      Challenge ID: 9933617
Challenge Overview

Chikungunya virus (CHIKV) has recently been detected in the Western Hemisphere. Previously, the virus had not been detected in the Americas for many decades. This DARPA Challenge seeks methods to forecast outbreaks and the potential spread of CHIKV throughout the Americas. This Challenge also seeks to develop forecasting capabilities for infectious diseases, with the intent of applying these capabilities to the mitigation of infectious disease outbreaks.

There will be nine submissions to this Challenge distributed throughout the Challenge period:

  1. Methodology. An initial submission containing a detailed description of the planned data sources and model applicability and a final submission with a detailed description of the full methodology used for the forecasts are required. The initial submission must be received by September 1, 2014 in order to be eligible for points. The final submission is due February 1, 2015. Both initial and final methodology submissions should be completed, although only those submitted by the due dates will be considered for points. The project documentation should include a well-articulated rationale for the methodology and choice of data sources.
  2. Accuracy Forecasts. An initial forecast submission, due September 8, 2014, with predictions for the next six months, followed by five monthly update submissions, due on the 1st of each subsequent month, with predictions for the remaining period of the Challenge.
  3. Peak Forecasts. A forecast of Peak New Cases per Country or Territory, due October 1, 2014.

Solvers are encouraged to submit all nine deliverables outlined above. However, submissions are accepted throughout the Challenge. Late submissions will not be eligible to receive points associated with the deliverable.  Solvers will make a new submission that includes all deliverables due on a particular date rather than updating a previous submission.

Awards are contingent upon evaluation and validation of the submitted Solutions by the Seeker.

This Challenge has a special award structure with awards of $150,000 and $100,000 for 1st and 2nd place, respectively. The next four top overall Solvers will receive awards of $50,000 each. In addition to winning awards for the highest overall points, top Solvers in each Methodology Category (data, robustness, applicability, presentation, and computation) may win $10,000. The top six overall Solvers will be invited to DARPA for the Program Finale Meeting where they will participate in an interactive meeting to share best practices, collaborate, and facilitate continuing Solver community cohesion.

DARPA claims no rights to intellectual property developed by Solvers as a result of participation in the CHIKV Challenge. DARPA may negotiate a license for the use of intellectual property developed by a Solver.


The CHIKV Challenge is open to any academic institution, business, or individual (18 years of age or older). A Solver may be an individual competing alone or a team representing an academic institution, business, or group of individuals. Only one submission per team per deliverable should be submitted.

Foreign Participation

Non-U.S. organizations and/or individuals may participate to the extent that such participants comply with any necessary non-disclosure agreements, security regulations, export control laws, the CHIKV Challenge Rules and other governing statutes applicable under the circumstances. Employees of foreign governments are not eligible to participate in this Challenge.

Pan American Health Organization (PAHO) and Member State Reporting Institutions

Individuals affiliated with the Pan American Health Organization (PAHO) and its Member State Institutions that provide health surveillance data cannot participate in the Challenge in any capacity.

Federally Funded Research and Development Centers (FFRDCs)

FFRDCs are encouraged to participate in this Challenge but are not eligible to receive any prize award. In order to participate, FFRDCs must provide a letter on official letterhead from their sponsoring organization citing the specific authority establishing the FFRDC’s eligibility to participate in Government Challenges. Individuals that supported the development of the Challenge are not eligible to participate.

Other Eligibility Requirements

DARPA employees and DARPA support contractors, including spouses, dependents, and household members, are not eligible to participate in the CHIKV Challenge. Federal employees acting within the scope of their employment are not eligible to participate in the Challenge. Federal employees acting outside the scope of their employment should consult their ethics official and appropriate management before participating in the Challenge. Federal employees may provide subject matter expertise to participants so long as they grant equal opportunity for access to each participating team or individual.

Any entities and personnel funded by DARPA to support the CHIKV Challenge are not eligible to participate in the CHIKV Challenge.

About the Seeker

Since its establishment in 1958, DARPA has demonstrated time and again how thinking beyond the borders of what is generally deemed possible can yield extraordinary breakthroughs. The Agency’s mission is to foster and demonstrate revolutionary new technologies and capabilities that provide practical options for sustaining U.S. security into the future. Importantly, DARPA conceives national security broadly, and the technologies it creates frequently transition either directly or indirectly to the commercial world, where they have bolstered such critical sectors as healthcare, transportation, communications, and computing. Many of the technologies people depend on today have their origins in DARPA-funded research.

To achieve its ambitious goals, DARPA supports world-class teams of experts from academia, industry, and government laboratories, empowers them with resources, encourages them to take risks, and provides them with the flexibility to transcend conventional organizational constraints. It isn’t just established research institutions that can contribute, though. DARPA recognizes that the novel capabilities it seeks may, in many instances, emerge from novel sources—including individuals or consortia that have never contributed to government research efforts or considered how their expertise might be applied to the national security domain.

It is in recognition that extraordinary solutions can emerge from unconventional sources that DARPA periodically launches prize-based challenges—ensuring that the full diversity of America’s innovative potential is brought to bear toward the goal of achieving a better and more secure future.

Challenge: M&A: predict possible future acquisition

M&A: predict possible future acquisition

According to Reuters, 2014 will be marked by growth in the M&A market.


Mergers and acquisitions (M&A) are a class of economic processes that consolidate businesses and capital at the macro and micro levels, resulting in one larger company in place of several smaller ones.

An acquisition is a transaction made to establish control over a company by acquiring more than 30% of its share capital (stocks, shares, etc.), while the acquired company keeps its legal independence.

Solvers are invited to forecast the probability of a company being acquired in the coming year.

Timeline of the competition

  • 09.06.2014     start of the competition
  • 22.08.2014     results of the competition and award winners ($5000 prize fund). Evaluation is based on two criteria: the score given by the evaluation functional (see below) and expert judgment (simplicity of the solution, reproducibility of the algorithm, expert opinion). Both criteria carry equal weight in choosing the winners.
  • 01.03.2015     the final results of the competition and award winners ($10,000 prize fund). By March 2015 the companies’ financial year will be over, so the model will be tested on real data about which companies were actually acquired. The main prize is awarded on this objective data alone.

Data Description

Data is provided for the task in tables containing information about the companies on the following parameters:

  1. Cash and cash equivalents (columns 1–21)
  2. Inventories (columns 22–42)
  3. Total Current Assets (columns 43–63)
  4. Total Current Liabilities (columns 64–84)
  5. Total Assets (columns 85–105)
  6. Property, Plant and Equipment, Net (columns 106–126)
  7. Goodwill (columns 127–147)
  8. Short-Term Debt (columns 148–168)
  9. Long-Term Debt (columns 169–189)
  10. Net Debt (columns 190–210)
  11. Total Liabilities (columns 211–231)
  12. Depreciation and amortization (columns 232–252)
  13. CAPEX (columns 253–273)
  14. Net Sales (columns 274–294)
  15. Gross Margin (columns 295–315)
  16. EBITDA (columns 316–336)
  17. Dividend yield (columns 337–357)
  18. Market Capitalization (columns 358–378)
  19. Gross Income (columns 379–399)
  20. Financial Costs (columns 400–420)
  21. Net Income (columns 421–441)
  22. Book Value (columns 442–462)
  23. Free Cash Flow (columns 463–483)
  24. Sector (column 484)
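The column layout above is regular: 23 financial parameters with 21 columns each, followed by a single Sector column. A minimal sketch of that arithmetic is given here. The helper names are hypothetical, and the mapping from a year to a column assumes that the 21 columns of each parameter run chronologically from 1994 to 2014, which the description suggests but does not state.

```python
# Hypothetical helpers illustrating the column layout described above.
# Assumption: within each parameter, columns are ordered 1994, 1995, ..., 2014.
FIRST_YEAR = 1994
VALUES_PER_PARAMETER = 21   # one value per year, 1994-2014
NUM_YEARLY_PARAMETERS = 23  # parameters 1-23; parameter 24 (Sector) is one column


def column_range(parameter_number):
    """Return (first, last) 1-based column numbers for parameters 1..23."""
    if not 1 <= parameter_number <= NUM_YEARLY_PARAMETERS:
        raise ValueError("only parameters 1..23 span 21 columns")
    first = (parameter_number - 1) * VALUES_PER_PARAMETER + 1
    return first, first + VALUES_PER_PARAMETER - 1


def column_for(parameter_number, year):
    """Return the 1-based column holding a parameter's value for a given year."""
    offset = year - FIRST_YEAR
    if not 0 <= offset < VALUES_PER_PARAMETER:
        raise ValueError("year must be between 1994 and 2014")
    return column_range(parameter_number)[0] + offset


# The Sector column comes right after the 23 yearly blocks.
SECTOR_COLUMN = NUM_YEARLY_PARAMETERS * VALUES_PER_PARAMETER + 1

print(column_range(12))   # Depreciation and amortization
print(SECTOR_COLUMN)
```

These numbers are 1-based, as in the list; subtract 1 when indexing into a zero-based array.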

All data is divided into three files:

  • File Train_contest.csv – training sample with data for all the above parameters for the years 1994-2014 (21 values for each attribute, except for the sector, which is a single column), plus two columns with the answers:
    • Column with a binary value, showing whether the company was acquired;
    • Column in which, if the company was acquired, the date of the news about the acquisition is given, and otherwise  the column is blank;
  • File Valid_contest.csv – validation sample in which all parameters are unknown (replaced by NaN) starting from some year X. For this sample, the task is to predict the probability of acquisition news appearing during year X.
  • File FinalTest_contest.csv – final test sample, structured like the validation one; however, the companies in the two samples (validation and final test) are different. The result obtained by running the participant’s algorithm on this sample determines the participant’s result in the competition.

*Note that if a NaN value occurs “inside” a parameter (e.g., in some row column 236 holds a numeric value and column 237 holds NaN, while columns 238–252 hold numeric values), it must be treated as a gap in the data and cannot be interpreted as the year X that has to be predicted. In addition, to give a clearer picture of market conditions at any particular moment, participants are given weekly quotations of the S&P 500 index for 1994-2014 as additional information.
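The distinction between a gap and the start of the unknown period can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, one parameter's 21 values are passed as a plain list, and every NaN that comes before the trailing run of NaNs (including NaNs at the very start of the series) is counted as a gap.

```python
import math


def first_unknown_index(values):
    """Return the index where the trailing run of NaNs begins.

    Returns None when the last value is known, i.e. there is no unknown period.
    """
    i = len(values)
    while i > 0 and math.isnan(values[i - 1]):
        i -= 1
    return i if i < len(values) else None


def interior_gaps(values):
    """Return indices of NaNs that come before the trailing NaN run (data gaps)."""
    end = first_unknown_index(values)
    if end is None:
        end = len(values)
    return [i for i in range(end) if math.isnan(values[i])]


nan = float("nan")
series = [1.0, nan, 2.0, 3.0, nan, nan]  # toy series, not contest data
print(first_unknown_index(series))  # start of the unknown period
print(interior_gaps(series))        # positions treated as gaps
```

If the columns of a parameter are ordered chronologically from 1994 (an assumption), year X would be 1994 plus the index returned by `first_unknown_index`.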

The most popular economic indicators

Based on the available data, participants can derive and use the following economic indicators:


Functional for evaluating solutions

Submitted solutions will be evaluated using the tool getQualityOfSolution (Answer, Ideal), written in MATLAB. The idea behind this functional is a normalized measure of ranking quality (normalized Discounted Cumulative Gain, nDCG), which can be expressed by the following formula:


where SortedIdeal is the vector of correct responses sorted in descending order of the probabilities in Answer, and OnesZerosIdeal is the vector of correct answers sorted in descending order (all ones first, followed by all zeros).

Important note: if the predicted probabilities coincide for a set of companies, after sorting these companies keep the same order they have in the vector Answer.

Example calculation of solution quality

If the vector of predicted probabilities is (0.5, 0.8, 0, 0), and the ideal vector of acquisitions is (1, 0, 1, 0), then the quality of the solution can be calculated using the formula:
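As a rough illustration of this calculation, here is a minimal Python sketch. It is not the official getQualityOfSolution tool. It assumes the common nDCG form, in which the item at rank i contributes its relevance divided by log2(i + 1); the discount actually used by the organizers may differ. The function names are hypothetical. Ties are handled with a stable sort, so companies with equal probabilities keep their order from Answer.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with a log2 discount (assumed form)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def quality_of_solution(answer, ideal):
    """nDCG of `ideal` ranked by the predicted probabilities in `answer`."""
    # sorted() is stable: equal probabilities keep their original order.
    order = sorted(range(len(answer)), key=lambda i: -answer[i])
    sorted_ideal = [ideal[i] for i in order]        # SortedIdeal
    ones_zeros_ideal = sorted(ideal, reverse=True)  # OnesZerosIdeal
    best = dcg(ones_zeros_ideal)
    return dcg(sorted_ideal) / best if best > 0 else 0.0


answer = [0.5, 0.8, 0, 0]
ideal = [1, 0, 1, 0]
print(quality_of_solution(answer, ideal))
```

For the vectors above, SortedIdeal is (0, 1, 1, 0) and OnesZerosIdeal is (1, 1, 0, 0). Under the assumed discount the quality comes out at about 0.69; a ranking that places both acquired companies first would score 1.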



How to take part:

Begin to address the problem