PostEra

Ensemble-hybrid-oedocking: Ensemble hybrid docking to fragment-bound Mpro structures

Summary

I used an ensemble hybrid docking protocol (built with the OpenEye Toolkit 2019.10.2) to dock the 2020-05-03 snapshot of COVID Moonshot compounds.
The CYS145(-) HIS41(+) form of the Mpro monomer was used for docking.

Each compound was hybrid docked (fragment shape/color overlay followed by docking minimization) to every fragment structure successfully prepared by SpruceTK, using multiple likely protonation and tautomeric states, and the best-scoring pose was selected.
For covalent inhibitors, a constraint was imposed to place the relevant heavy atom for any covalent warhead within 4A of the CYS145 SG atom, rejecting poses that could not satisfy this constraint.
These docked poses were then solvated and simulated for 10 ns, and the minimum and average distances of the closest warhead heavy atom and CYS145 SG were reported.
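
The covalent constraint described above can be illustrated with a small post-hoc filter. This is only a sketch of the geometric criterion, not how the OEDocking Toolkit applies the constraint internally; the pose dictionaries, coordinates, and function names are hypothetical and exist only for illustration.

```python
import math

CUTOFF_ANGSTROMS = 4.0  # constraint distance stated in the post


def min_warhead_distance(sg_xyz, warhead_xyzs):
    """Smallest distance (Angstroms) from CYS145 SG to any warhead heavy atom."""
    return min(math.dist(sg_xyz, xyz) for xyz in warhead_xyzs)


def filter_poses(poses, sg_xyz, cutoff=CUTOFF_ANGSTROMS):
    """Keep only poses with at least one warhead heavy atom within `cutoff` of SG."""
    return [
        pose for pose in poses
        if min_warhead_distance(sg_xyz, pose["warhead_xyzs"]) <= cutoff
    ]


# Hypothetical coordinates, chosen only to exercise the filter.
sg = (0.0, 0.0, 0.0)
poses = [
    {"name": "pose_a", "warhead_xyzs": [(3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]},
    {"name": "pose_b", "warhead_xyzs": [(5.0, 0.0, 0.0)]},
]
kept = filter_poses(poses, sg)
print([pose["name"] for pose in kept])
```

Here pose_b would be rejected because its only warhead heavy atom is 5 Angstroms from SG.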

Scores

The Chemgauss4 and CYS145-warhead dist mean (A) scores should be uploaded to CDD.

  • Chemgauss4 is the Chemgauss4 docking score of the best scoring pose and protonation/tautomeric state to any Mpro structure; more negative is better, and fragments score ~ -10
  • docked_fragment is the name of the fragment (and corresponding Mpro fragment structure) with the best scoring hybrid docking pose
  • CYS145-warhead dist minimum (A) is the minimum distance (in Angstroms) between CYS145 SG and any covalent warhead heavy atom during 10 ns simulation, where distances < 4A indicate the warhead is potentially well-positioned for irreversible binding (covalent inhibitors only)
  • CYS145-warhead dist mean (A) is the mean distance (in Angstroms) between CYS145 SG and any covalent warhead heavy atom during 10 ns simulation, where distances < 4A indicate the warhead is potentially well-positioned for irreversible binding (covalent inhibitors only)
  • CYS145-warhead dist stddev (A) is the standard deviation (in Angstroms) of the distance between CYS145 SG and any covalent warhead heavy atom during 10 ns simulation, where large values may indicate the warhead samples a variety of distances from CYS145 (covalent inhibitors only)
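
The three CYS145-warhead distance scores above can be sketched as a reduction over trajectory frames. This is a minimal illustration, assuming each frame has already been reduced to a list of SG-to-warhead-heavy-atom distances; the function name and the example numbers are hypothetical, and the use of the population standard deviation is an assumption (the post does not say which estimator was used).

```python
import statistics


def warhead_distance_stats(frames):
    """Summarize SG-warhead distances over a trajectory.

    `frames` is a list with one entry per frame; each entry lists the
    distances (Angstroms) from CYS145 SG to every warhead heavy atom.
    """
    # Per frame, keep the closest warhead heavy atom.
    closest = [min(frame) for frame in frames]
    return {
        "minimum": min(closest),
        "mean": statistics.fmean(closest),
        # Assumption: population standard deviation over frames.
        "stddev": statistics.pstdev(closest),
    }


# Hypothetical distances for a four-frame trajectory.
frames = [[3.5, 5.0], [4.0, 4.2], [4.5, 6.0], [4.0, 4.1]]
stats = warhead_distance_stats(frames)
print(stats)
```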

Source and files

Edit: I’m unable to upload this file to Fragalysis because of warnings about unknown fragments and PDB structures, but it can be downloaded here.

Edit: Covalent warhead distances have now been included.

All docking results will be posted here:

All scripts and receptor models are here:

Approach

Each fragment-bound X-ray structure provided by DiamondMX was prepared using the OpenEye Spruce Toolkit to clean up the structure and extract protein and small molecule structures with missing atoms added and missing loops either reconstructed or truncated.
CYS145 was deprotonated and the hydrogens reoptimized, which produced a doubly-protonated HIS41.

For each COVID Moonshot designed molecule, a set of reasonable protonation and tautomeric states was enumerated using OEGetReasonableProtomers from the OpenEye QuacPac Toolkit.
Each protomer was then expanded into a dense set of conformers using the OpenEye Omega Toolkit.
For each active-site (non-covalent or covalent) and dimer-interface fragment structure, one docked pose was generated in the corresponding X-ray structure, with the binding site defined by the bound fragment.

Each multiconformer molecule was docked to the specified fragment site with the OpenEye OEDocking Toolkit, using the Hybrid2 search method (which exploits shape overlays with the corresponding bound fragment) at the High search resolution.
The best (most negative) docking score across all docked structures was recorded, and the corresponding fragment structure noted.
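
The selection step can be sketched as a simple minimum over all (fragment structure, protomer) docking results, since more negative Chemgauss4 scores are better. This is an illustration only: the record layout, fragment names, protomer labels, and scores below are hypothetical and are not taken from the actual scripts or results.

```python
def best_docked_result(results):
    """Return the record with the most negative Chemgauss4 score."""
    return min(results, key=lambda record: record["chemgauss4"])


# Hypothetical docking results for one compound.
results = [
    {"fragment": "fragment_A", "protomer": "neutral", "chemgauss4": -8.7},
    {"fragment": "fragment_B", "protomer": "neutral", "chemgauss4": -11.2},
    {"fragment": "fragment_B", "protomer": "protonated", "chemgauss4": -10.4},
]
best = best_docked_result(results)
print(best["fragment"], best["chemgauss4"])
```

The fragment name of the winning record is what would be reported as docked_fragment.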


Hi John,

Why are you docking to the monomer and not the crystallographically observed dimer?
Do you expect the target to be mostly monomeric in vivo, do you also intend to interfere with dimerization, or do you think it's not that relevant at this stage?

Best,
Alex

This is just a technical limitation for now. I can generate the dimers with SpruceTK, but there was a technical issue in generating the structural files for simulation on Folding@home, so I stuck with the monomer for now. (The ensemble docking also allows dimerization inhibitors to be scored this way.)

I hope to find a way to correct this in future refinement iterations!

@matteoferla made an interactive view of the results in Michelanglo here: https://michelanglo.sgc.ox.ac.uk/data/7d371acc-2931-4bd5-8e44-4426c864f85c

@MarkC:

To aid in selecting noncovalent compounds from the pool of submitted Moonshot compound designs, I have filtered the docked list to retain only compounds without covalent warheads (but including nitriles, since many of those designs were intended to have nitriles make noncovalent interactions).

The ranked list of noncovalent compounds (in Fragalysis 1.2 format) is here:
compound-set_ensemble-hybrid-oedocking-noncovalent.sdf.zip (1.7 MB)

Also on GitHub:

ROC for activity data

Based on the downloadable activity data as of Mon 18 May 2020, I made an ROC plot showing where in the assayed set the compounds with measurable IC50 data (IC50 < 99 uM) appeared when ranked by docking score. (@dmoustakas: Is there a better criterion to use here?)
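
For readers unfamiliar with this kind of analysis, here is a minimal sketch of how such an ROC curve can be computed. It assumes compounds are ranked from most negative to least negative docking score and labeled active when a measured IC50 is below 99 uM; the function names and the data are hypothetical, ties in score are not treated specially, and this is not the actual ROC script.

```python
IC50_CUTOFF_UM = 99.0  # activity threshold mentioned in the post


def roc_points(scores, is_active):
    """Return (false positive rate, true positive rate) points.

    Compounds are ranked from most negative (best) score to least negative.
    Requires at least one active and one inactive compound.
    """
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    n_pos = sum(1 for active in is_active if active)
    n_neg = len(is_active) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, active in ranked:
        if active:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical data: None means no measurable IC50.
scores = [-12.0, -11.0, -9.0, -8.0]
ic50_um = [5.0, None, 40.0, None]
is_active = [v is not None and v < IC50_CUTOFF_UM for v in ic50_um]
points = roc_points(scores, is_active)
print(auc(points))
```

An area near 0.5 means the ranking is no better than random, which is the baseline any docking model has to beat.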

Noncovalent compounds:
ROC - noncovalent

Covalent compounds:
ROC - covalent

All compounds:
ROC - all

From examining these ROC plots, it doesn’t look like the ensemble docking provides any significant benefit over (and it may even perform worse than) the single-structure OEDocking model selected from an MD trajectory from @dmoustakas presented in OEDocking docking/scoring/selection of submitted compounds (model #02084).

That model also has the added convenience of being very rapid to score since only a single receptor model—and not dozens—is needed.

Scripts and output files

All scripts and output files are here:

Docked moonshot submissions are here:

ROC curve generation is here: