Takeaways from the Harvard-Stanford Symposium on Drug Discovery

Regina Barzilay — MIT

  • Representation
  • Generalization in Low Resource Training
  • Uncertainty Estimation
  • Mechanism Understanding


  • SMILES string → A text-based representation of a molecule.
  • ECFP bit fingerprint → A vector of zeros and ones. Each element represents the presence or absence of a particular molecular substructure.
  • RDKit descriptors → Engineered features: specific properties computed from the molecular structure rather than the structure itself. Descriptors include information on the molecule’s polarity, electron hybridization, and physical properties such as LogP (a measure of hydrophobicity).
  • Graphs → Simple network structures with atoms as nodes and bonds as edges.
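
To make the representations above concrete, here is a minimal, hand-built sketch for ethanol in plain Python. It is only an illustration: the substructure vocabulary and the helper names `adjacency` and `bond_fingerprint` are hypothetical, and a real ECFP fingerprint hashes circular atom neighborhoods into a fixed-length vector rather than checking a fixed list of bonds. In practice a cheminformatics toolkit such as RDKit would derive all of these from the SMILES string.

```python
# Toy illustration of three representations of ethanol.
# Everything here is hand-built for clarity, not derived by a toolkit.

smiles = "CCO"  # SMILES string for ethanol

# Graph: atoms are nodes, bonds are edges (hydrogens left implicit).
atoms = {0: "C", 1: "C", 2: "O"}
bonds = [(0, 1), (1, 2)]  # single bonds C-C and C-O


def adjacency(atoms, bonds):
    """Build an adjacency list from the node and edge lists."""
    adj = {i: [] for i in atoms}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj


# Bit fingerprint: one position per substructure.
# This four-entry vocabulary is a made-up stand-in for real ECFP features.
SUBSTRUCTURES = ["C-C", "C-O", "C-N", "C=O"]


def bond_fingerprint(atoms, bonds):
    """Return 1/0 per vocabulary entry, based on the single bonds present."""
    present = {"-".join(sorted((atoms[a], atoms[b]))) for a, b in bonds}
    return [1 if s in present else 0 for s in SUBSTRUCTURES]


print(adjacency(atoms, bonds))
print(bond_fingerprint(atoms, bonds))
```

The graph keeps the full connectivity, while the fingerprint only records which substructures occur, which is why the two suit different model families.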

Andreas Bender — AstraZeneca

Visual from Andreas Bender’s talk

Question → Data → Representation → Method

Model Validation is Process Validation

James Collins — Broad Institute

Visual from Dr. Collins’s talk
  1. The molecule’s structural similarity to other antibiotics (using a Tanimoto threshold)
  2. The molecule’s toxicity according to ClinTox
  3. And most importantly, its predicted potency as an antibacterial
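
The three criteria above can be sketched as a simple filter. This is a rough illustration rather than the pipeline from the talk: the threshold values, the dictionary keys, and the function `passes_filters` are all assumptions, and it assumes the goal is to keep candidates that are structurally distinct from known antibiotics. Fingerprints are represented as sets of on-bit indices, and the Tanimoto similarity is the size of the intersection divided by the size of the union.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def passes_filters(candidate, known_antibiotics,
                   max_similarity=0.4, max_toxicity=0.2, min_potency=0.8):
    """Apply the three criteria; all thresholds are hypothetical placeholders."""
    # 1. Highest similarity to any known antibiotic must stay below the threshold.
    nearest = max((tanimoto(candidate["bits"], k) for k in known_antibiotics),
                  default=0.0)
    # 2. Predicted toxicity must be low. 3. Predicted potency must be high.
    return (nearest <= max_similarity
            and candidate["toxicity"] <= max_toxicity
            and candidate["potency"] >= min_potency)


# Made-up candidate: on-bits plus model scores in [0, 1].
candidate = {"bits": {1, 2, 3}, "toxicity": 0.1, "potency": 0.9}
print(passes_filters(candidate, [{3, 9, 10, 11}]))
```

Taking the maximum similarity over the known set means a single close analogue is enough to reject a candidate.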

Yoshua Bengio — MILA

Visual from Dr. Bengio’s talk. Graph ML applied to the host interactome and to outside perturbations (drug synergies) provides more meaningful representations.

Challenges and Future Directions



Mukundh Murthy

Innovator passionate about the intersection of structural biology, machine learning, and cheminformatics. Currently @ 99andbeyond.