Welcome to the MARGO Cheminformatics Hackathon, an exciting competition dedicated to advancing cheminformatics through machine learning!
Organizers
- MARGO - Main organizers
- Qubit Pharmaceuticals - Hackathon topic designers & cheminformatics expertise
- IBM - Provision of computing resources via the watsonx platform
About the challenge
In this hackathon, participants will focus on building a cardiotoxicity (heart toxicity) prediction model, a crucial challenge in drug discovery and chemical safety assessment.
By leveraging machine learning techniques, you'll work with real-world datasets to create models that can predict how chemical compounds might interact with biological systems, potentially identifying toxic compounds before they reach clinical trials or industrial applications.
This hackathon provides a unique opportunity to apply your data science and machine learning skills to solve complex biological problems and contribute to public health and safety. We are calling on bioinformaticians, data scientists, and ML experts to collaborate, innovate, and push the boundaries of predictive modeling.
The goal is simple: given a set of molecules labeled as toxic (1) or non-toxic (0), participants are expected to tackle the following three tasks:
(Task 1) - Predict the toxicity of a uniformly sampled set of molecules, denoted as test set 1.
(Task 2) - Predict the toxicity of 6 series of molecules. In drug discovery, a molecular series is a family of molecules that share a common global structure, differing only in fragments. These molecular series make up test set 2.
(Task 3) - Among the predictions of test set 1, select the 200 molecules for which predictions are the most reliable.
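For Task 3, one simple baseline, assuming your model outputs class probabilities, is to rank molecules by how far each predicted probability is from 0.5 and keep the most decisive ones. The helper below is a hypothetical illustration, not part of the official tooling:

```python
# Toy sketch for Task 3: select the k predictions whose class probability
# is furthest from 0.5, i.e. where the model is most decisive.
# `preds` pairs each SMILES with a hypothetical predicted probability
# of toxicity (class 1); real probabilities would come from your model.

def most_reliable(preds, k=200):
    """Return the k (smiles, probability) pairs ranked by confidence."""
    return sorted(preds, key=lambda p: abs(p[1] - 0.5), reverse=True)[:k]

# Tiny illustrative input (k lowered to 2 for the demo).
preds = [("CCO", 0.95), ("c1ccccc1", 0.52), ("CC(=O)O", 0.10)]
top = most_reliable(preds, k=2)
print(top)  # the two most confident predictions come first
```

Distance from 0.5 is only one possible confidence proxy; ensemble disagreement or similarity to the training set are common alternatives.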
These tasks are far from trivial, and there are many skills that can help you in your quest towards molecular toxicity estimation:
- Classic ML skills (Python & associated libraries)
- Statistics
- Computational and organic chemistry
- Graph theory
The accuracy of predictions will not be the only criterion taken into account when evaluating results. We encourage you to explore creative and innovative solutions!
Evaluation metrics
The following metrics will be used to assess the quality of the predictions:
(Task 1) - Cohen kappa score on test set 1.
(Task 2) - Accuracy on each series of test set 2 (6 accuracy metrics in total). These 6 scores will be averaged into a single evaluation metric.
(Task 3) - Accuracy on the first 200 rows of the submission file for test set 1.
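As a sanity check for Task 1, Cohen's kappa can be computed from scratch for binary 0/1 labels (scikit-learn's `cohen_kappa_score` gives the same number). This is a minimal sketch, not the official scoring code:

```python
def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for binary 0/1 labels: agreement corrected for chance."""
    n = len(y_true)
    # Observed agreement: fraction of matching labels.
    po = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected agreement if both labelings were independent, given their marginals.
    t1 = sum(y_true) / n   # fraction of class 1 in the true labels
    p1 = sum(y_pred) / n   # fraction of class 1 in the predictions
    pe = t1 * p1 + (1 - t1) * (1 - p1)
    return (po - pe) / (1 - pe)

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1]
print(round(cohen_kappa(y_true, y_pred), 3))  # → 0.333
```

Unlike plain accuracy, kappa stays near 0 for a classifier that ignores the inputs, which matters when the toxic/non-toxic classes are imbalanced.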
Leaderboards
The leaderboard for each task can be found here.
Submissions affect the leaderboards only after the judges have validated their content.
Requirements
What to Submit
Submit your source code in a compressed file format (zip).
Ensure that your code is well-documented, including clear instructions on how to run and test your model.
In addition to the code, you may include your predictions at the root of your zip file, in the following format:
pred_1.csv
| smiles | class |
| --- | --- |
| <molecule-SMILES> | 1 |
| ... | ... |
| <molecule-SMILES> | 0 |
pred_2.csv
| smiles | class |
| --- | --- |
| <molecule-SMILES> | 0 |
| ... | ... |
| <molecule-SMILES> | 1 |
If provided, these predictions will be used to evaluate performance metrics on the hackathon tasks.
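A submission file in the expected layout can be produced with the standard library alone; the SMILES strings and classes below are placeholders for your own predictions:

```python
import csv

# Hypothetical predictions: (SMILES, predicted class) pairs.
predictions = [("CCO", 1), ("c1ccccc1", 0)]

# Write pred_1.csv with the header the judges expect.
with open("pred_1.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["smiles", "class"])
    writer.writerows(predictions)
```

For Task 3, remember that row order in pred_1.csv matters: the first 200 rows are treated as your most reliable predictions.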
Prediction quality metrics
Each of the 3 tasks will be evaluated using either classification accuracy or the Cohen kappa score:
(task 1) - Cohen kappa score of the predictions in pred_1.csv.
(task 2) - Per-series accuracy of the predictions in pred_2.csv, averaged over all the series.
(task 3) - Accuracy of the first 200 molecules (rows) in pred_1.csv.
Judging Criteria
Your submission will be evaluated based on the following criteria:
- Model Performance on task 1 (20%)
- Model Performance on task 2 (20%)
- Model Performance on task 3 (20%)
- Innovation & Approach (30%)
- Reproducibility & Documentation (10%)
Prizes
1st Place
2nd Place
3rd Place
Judges
Antoine Mazarguil
ML expert / MARGO
Judging Criteria
- Task 1 performance
- Task 2 performance
- Task 3 performance
- Innovation & Approach: Does the approach seem relevant? Did the participants neglect important phenomena (bias, overfitting, ...)?
- Reproducibility & Documentation: Description of the preprocessing/method; clear specification of the tools (packages, ML algorithms, ...); clarity of the code.
Questions? Email the hackathon manager