To access the computation resources provided by IBM Watson AI, complete the following steps:
- Open the following link: https://skills.yourlearning.ibm.com/activity/URL-F86CD1F49D47?ngo-id=0302
- Select "Mark complete"
- Register on IBM SkillsBuild via your LinkedIn account

Three datasets are provided in CSV (comma-separated values) format:
- train.csv (9415 rows)
- test_1.csv (750 rows)
- test_2.csv (478 rows)
Each row of a dataset corresponds to a molecule.
Each CSV file contains the following columns:
- smiles : Chemical formula of the molecule in the SMILES format.
- 199 molecular features computed with the rdkit package (columns BalabanJ to qed).
- ecfc_0000 to ecfc_2047 (2048 features) : bit vector representation of Morgan fingerprints
- fcfc_0000 to fcfc_2047 (2048 features) : bit vector representation of pharmacophore feature-based Morgan fingerprints
- class (train.csv only) : The label to predict (1 for hERG inhibitor, 0 otherwise)
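
The column layout above can be turned into named groups, which makes it easier to train on one feature family at a time. The following is a minimal sketch using only the Python standard library. The helper names `read_header` and `group_columns` are hypothetical, and the code assumes that every column other than smiles, class, ecfc_* and fcfc_* is one of the rdkit features.

```python
import csv


def read_header(path):
    """Return the column names found on the first line of a CSV file."""
    with open(path, newline="") as handle:
        return next(csv.reader(handle))


def group_columns(columns):
    """Split column names into the groups described in the column list."""
    groups = {"smiles": [], "rdkit": [], "ecfc": [], "fcfc": [], "label": []}
    for name in columns:
        if name == "smiles":
            groups["smiles"].append(name)
        elif name == "class":
            groups["label"].append(name)
        elif name.startswith("ecfc_"):
            groups["ecfc"].append(name)
        elif name.startswith("fcfc_"):
            groups["fcfc"].append(name)
        else:
            # Assumption: all remaining columns are rdkit features.
            groups["rdkit"].append(name)
    return groups
```

For example, `group_columns(read_header("train.csv"))` should return a non-empty "label" group, while the same call on test_1.csv or test_2.csv should leave it empty, since class is only present in train.csv.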

The best predictors will likely not use the complete set of 4295 features provided in the datasets.
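
One simple way to start reducing the feature set is to drop columns that take a single value across the training set, since they carry no information for a classifier. This is only an illustrative first step, not a method prescribed by the challenge. The function name `non_constant_columns` is hypothetical, and rows are assumed to be dictionaries such as those produced by `csv.DictReader`.

```python
def non_constant_columns(rows, columns):
    """Return the names of columns that take more than one distinct value.

    rows: list of dicts mapping column name -> value
    columns: candidate feature names to check
    """
    kept = []
    for name in columns:
        distinct = {row[name] for row in rows}
        if len(distinct) > 1:
            kept.append(name)
    return kept
```

The columns to keep should be chosen on train.csv only and then applied unchanged to the test files. With scikit-learn, `sklearn.feature_selection.VarianceThreshold` offers a comparable filter.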

rdkit is the most widely used package for processing molecules and computing molecular properties (e.g. molecular weight, charge, ...).
Molecular fingerprints are commonly used features in the literature on molecular predictions. They are the result of applying a local kernel at multiple positions of the molecule, with the results aggregated into a fixed-length vector.
Pat Walters' tutorial on cheminformatics presents a wide variety of ML baseline predictors that jointly use rdkit, scikit-learn and other ML packages.
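
To make the fingerprint description above more concrete, here is a toy sketch of the idea: each atom's neighbourhood is hashed at increasing radii, and the hashes are folded into a fixed-length count vector. This is not rdkit's implementation and will not reproduce the ecfc_* or fcfc_* columns. The function names, the radius, the vector length and the atom/bond input format are all assumptions chosen for illustration.

```python
import hashlib


def stable_hash(text):
    """Deterministic integer hash (the built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha1(text.encode()).digest()[:8], "big")


def toy_fingerprint(atoms, bonds, radius=2, n_bits=16):
    """Fold hashed atom neighbourhoods into a fixed-length count vector.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) atom index pairs
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    # Radius 0: each atom is described by its element symbol only.
    identifiers = [stable_hash(symbol) for symbol in atoms]
    vector = [0] * n_bits
    for _ in range(radius + 1):
        # Aggregate: fold every current identifier into the vector.
        for identifier in identifiers:
            vector[identifier % n_bits] += 1
        # Local kernel: grow each neighbourhood by one bond by combining an
        # atom's identifier with the sorted identifiers of its neighbours.
        updated = []
        for i in range(len(atoms)):
            around = sorted(identifiers[j] for j in neighbours[i])
            updated.append(stable_hash(f"{identifiers[i]}|{around}"))
        identifiers = updated
    return vector
```

Calling `toy_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])` on an ethanol-like skeleton returns a vector of length 16 whose entries sum to 9 (3 atoms times 3 radii). Because neighbour identifiers are sorted before hashing, the result does not depend on the order in which atoms are listed.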

You may use basically any model or library you can find.

If you use any external model, you need to mention it in your submission file.