Data Release: 2020-05-10

Dear COVID Moonshot Community,

We are happy to release results from our first round of screening. The data so far include 28 hits with over 50% inhibition at 50 μM, 8 hits with IC50 <10 μM, and over 20 new crystal structures of compounds bound to the main protease. Since you’ve made it to this post from Twitter or our email, I’ll assume you want more details!

1. Testing for main protease activity?
First, a big thank you to @londonir and Haim Barr at Weizmann who worked so hard to get this assay up and running.

We just got new results in today, and even more IC50 values will be back by Tuesday. All results (including Trypsin counterscreen and solubility data obtained at Enamine) can be viewed on the PostEra website here: https://covid.postera.ai/covid/activity_data

Procedure:
Compounds were seeded into assay-ready plates (Greiner 384 low volume 784900) using an Echo 555 acoustic dispenser, and DMSO was back-filled for a uniform concentration in assay plates (maximum 1%). Screening assays were performed in duplicate at 20 µM and 50 µM. Hits of greater than 50% inhibition at 50 µM were confirmed by dose-response assays. Reagents for the Mpro assay were dispensed into the assay plate in 10 µl volumes for a final volume of 20 µl. Final reaction concentrations were 20 mM HEPES pH 7.3, 1 mM TCEP, 50 mM NaCl, 0.01% Tween-20, 10% glycerol, 5 nM Mpro, and 375 nM fluorogenic peptide substrate ([5-FAM]-AVLQSGFR-[Lys(Dabcyl)]-K-amide). Mpro was pre-incubated with compound for 15 minutes at room temperature before addition of substrate. The protease reaction was measured continuously in a BMG Pherastar FS with a 480/520 ex/em filter set. Data were mapped and normalized in Genedata Screener prior to loading into CDD Vault.
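For anyone who wants to sanity-check the hit calling from raw rates, the normalization can be sketched roughly as below. This is only an illustration: the control definitions (DMSO-only wells as 0% inhibition, no-enzyme wells as 100%), the rate values, and the function name `percent_inhibition` are assumptions, not the actual Genedata Screener configuration.

```python
from statistics import mean


def percent_inhibition(sample_rate, uninhibited_rate, background_rate):
    """Normalize a reaction rate to percent inhibition.

    Assumed controls: 0% is the DMSO-only (uninhibited) well and
    100% is the no-enzyme (background) well.
    """
    window = uninhibited_rate - background_rate
    if window <= 0:
        raise ValueError("uninhibited control must exceed background")
    return 100.0 * (1.0 - (sample_rate - background_rate) / window)


# Hypothetical fluorescence slopes (RFU/min) for duplicate wells at 50 µM.
duplicate_rates = [412.0, 388.0]
dmso_control = 1000.0
no_enzyme_control = 0.0

values = [
    percent_inhibition(rate, dmso_control, no_enzyme_control)
    for rate in duplicate_rates
]
avg_inhibition = mean(values)

# Threshold from the procedure: more than 50% inhibition at 50 µM.
is_hit = avg_inhibition > 50.0
print(f"mean inhibition = {avg_inhibition:.1f}%, hit = {is_hit}")
```

Averaging the duplicates before applying the threshold is a design choice made here for simplicity; the real pipeline may treat replicates differently.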

Overall, we have been pleasantly surprised by the reproducibility of the assays. Here are two examples:
TRY-UNI-714a760b-6 has two IC50 curves that agree quite well.

MAT-POS-916a2c5a-1 again has dose-response curves that agree nicely.

In terms of testing, we also plan to perform thiol reactivity assays in the next few days for the ~80 covalent compounds tested so far.

2. Getting Structures

A huge thank you to @frankvondelft, @Daren_Fearon, @AnthonyA, @Waztom, @reskyner and the whole Diamond UK team who have been working on getting these amazing structures. The speed at which this team has obtained over 20 new structures beyond the initial fragments is incredible. You can browse the structures on Fragalysis or download all structures from the top of the page, as shown on the right of the screenshot below.

We have also collected many other structures from the literature which may be of use in designs here: https://covid.postera.ai/covid/structures#literature_SARS-COV-2

3. What’s Next?

We are now in the process of synthesizing and assaying another 200 compounds, with plans for hundreds more follow-up compounds to expand and optimize the hits that we have obtained. As a recap, we’ve received over 4600 unique designs and selected 700 compounds to make and test. In the first wave we focused on compounds that could be rapidly synthesized. We quickly triaged compounds for synthetic feasibility and sent our first batch of compounds with synthetic route designs at the start of April (PostEra), and have successfully completed synthesis (Enamine), main protease biochemical assays (Weizmann Institute) and X-ray crystallography (Diamond/Oxford).

From you all, we hope to continue obtaining insights and designs as we continue to look for promising leads. Expanding upon the current actives is great, and @frankvondelft has even set up a category seeking suggestions on how to best merge and enumerate the current fragments with structures. We are also setting up the infrastructure to better gain insight from community simulation results.

We now have over 20 synthetic chemists working on the project between the great folks at Sai Life Sciences and Enamine. Let’s make some stuff!

And for those of you who want to pay for your own designs to be synthesized and/or tested, please see this post (and make sure to respond to the post if you are sending compounds for synthesis/testing).

Some other highlights
I haven’t actually gotten a ton of time to look at the data (beyond checking it is in the right format). My week has been filled with mostly data processing, logistics, and some web development. But I’ll just point out a few things that I think may be of interest.

  • Covalent compounds: A lot of the early hits are chloroacetamides or Ugi compounds with an acrylamide warhead. Some work needs to be done to disentangle K_i vs k_{inact}. @RGlen, @BAL, and folks at Boehringer Ingelheim are doing some work to help probe this problem computationally!
  • 3-amino pyridine non-covalent hits: Some of the med-chemists who saw the data first noted that the ligand efficiency of this series of non-covalent inhibitors is quite good for a challenging protein like this.
  • MAT-POS-bfefc3ea-3, which is Z LVG CHN2, a cysteine protease inhibitor from the ReFrame library tested in https://www.biorxiv.org/content/10.1101/2020.04.16.044016v1.full.pdf, did not show great activity in our assays (10.4 ± 2.36% inhibition at 50 μM). However, it still showed promising results against the virus in the original paper. We thought it was worth testing if it was binding to MPro, but I guess we’ll just let it remain a mystery of biology :man_shrugging:.

Questions
I’m sure many of you have questions. Please reply to this thread if so. We’ll try to respond, though sending this out near midnight my time (PST) is perhaps a terrible idea!

Thanks
And lastly, thanks to all of you in the community who continue to make this initiative what it is.

-Matt, and the rest of the PostEra team helping out on this project (@alphalee, @aaron.morris, @ajajack, @milan.cvitkovic)


Hi @mc-robinson,

Exciting data and great way to visualize / report it!
It’s possible I’ve missed it, but is there some way to export or download the data as csv or sdf, to enable review and analysis in separate software? The ideal would be a single file (csv with smiles, or sdf) which contains not just the compounds tested and their data but also the designed compounds (i.e., with empty data values). That would allow, for example, swift identification of untested designs close to the actives via
Also, just a cosmetic suggestion, for the sorting ascending / descending aspect of each column is it possible to ignore empty values and always put empty values at the end of the sorted table? Right now it seems to be classifying empty values as = 0 so if you sort ascending you’ll always get the empty values first… for example there is no quick way to jump right to seeing what is the compound with the lowest IC50…need to sort and then scroll through pages until a compound with a value appears.

Ben


Thanks, @Ben_DNDi. Two great points, and we will try to implement those features soon.

As for data download, your best bet is still the GitHub repo here: https://github.com/postera-ai/COVID_moonshot_submissions
(we are working on giving some public access to our CDD Vault too, but that still needs a bit of cleaning and work on my side first!).

COVID_moonshot_submissions/covid_submissions_all_info.csv contains all designs including duplicates (and is consequently rather large). The one thing it does not contain right now is experimental data. But let me get on that and get you a new file.

As for the sorting problem, very good catch! That is an annoying default, and I’ll look to fix it.
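For what it's worth, the behaviour being asked for can be sketched like this. It is a minimal illustration with made-up compound IDs and a hypothetical `ic50_uM` field, not the site's actual table code.

```python
def sort_with_empty_last(rows, key, descending=False):
    """Sort dict rows by `key`, always placing rows with no value at the end."""
    present = [row for row in rows if row.get(key) is not None]
    missing = [row for row in rows if row.get(key) is None]
    present.sort(key=lambda row: row[key], reverse=descending)
    return present + missing


# Hypothetical table rows; None marks a compound with no measurement yet.
rows = [
    {"id": "cmpd-A", "ic50_uM": None},
    {"id": "cmpd-B", "ic50_uM": 4.2},
    {"id": "cmpd-C", "ic50_uM": 1.8},
    {"id": "cmpd-D", "ic50_uM": None},
]

ascending = [row["id"] for row in sort_with_empty_last(rows, "ic50_uM")]
descending = [
    row["id"] for row in sort_with_empty_last(rows, "ic50_uM", descending=True)
]
print(ascending)
print(descending)
```

Splitting the rows first, rather than mapping empty values to 0 or infinity, keeps the empty rows at the bottom in both sort directions.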


I’ll second Ben’s request and add another to the list if I may…
It would be nice to tag which fragments have been crystallised.
Thanks

Sure @krisbirchall, but just for clarification, is this what you are looking for?
https://covid.postera.ai/covid/structures

It’s good to have such a list, but I was thinking more about having it in the compound tracker and on individual compound pages; otherwise people will have to keep cross-checking between pages.

Another useful feature would be to have a tag for compounds that have been attempted in crystallography and what the outcome was - it would help us to see if there are new compounds that have not gone into crystallography but would be worthwhile to try. In fact on that note it would be good to provide a list of ‘closest to crystallised’ for each design to help identify the most suitable frameworks for building on the design (or if the compound is sufficiently novel to warrant crystallography).
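To make the ‘closest to crystallised’ idea concrete, here is a rough sketch using Tanimoto similarity over fingerprint bit sets. The compound IDs and bit sets are invented for illustration; in practice the fingerprints would come from a cheminformatics toolkit, which is assumed here and not shown.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not bits_a and not bits_b:
        return 0.0
    return len(bits_a & bits_b) / len(bits_a | bits_b)


def closest_crystallised(design_bits, crystallised):
    """Return (compound_id, similarity) for the most similar crystallised compound."""
    return max(
        ((cid, tanimoto(design_bits, bits)) for cid, bits in crystallised.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical fingerprints: sets of "on" bit indices, invented for illustration.
crystallised = {
    "xtal-1": {1, 4, 9, 16, 25},
    "xtal-2": {2, 4, 8, 16, 32},
}
design = {1, 4, 9, 16, 36}

best_id, best_similarity = closest_crystallised(design, crystallised)
print(best_id, round(best_similarity, 3))
```

A low best similarity would flag a design as novel enough that it might warrant its own crystallography attempt.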

Sorry for all the requests. There are so many things to be done, so of course it’s a question of time and priority.


Good point!

Yes, I think adding more info about what crystallized and what didn’t is a great idea. Obviously, we have a long list of things to start building… but we are getting there!

I’ll have this more automated in a day or two, but if you want to start exploring the data, here is all of the activity data so far. The CSV also includes all of the unique designs without data: moonshot_initial_activity_data.csv (307.5 KB)
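If it helps, separating tested compounds from untested designs in a file like this could look roughly as follows. The column names (`SMILES`, `CID`, `IC50_uM`) and the rows are hypothetical stand-ins, so check the real header before reusing this.

```python
import csv
import io

# Hypothetical excerpt; the real file's column names may differ.
sample = io.StringIO(
    "SMILES,CID,IC50_uM\n"
    "CCO,design-1,\n"
    "c1ccccc1,design-2,4.2\n"
    "CCN,design-3,\n"
)

tested, untested = [], []
for row in csv.DictReader(sample):
    # An empty cell means the design has no measurement yet.
    if row["IC50_uM"].strip():
        tested.append(row["CID"])
    else:
        untested.append(row["CID"])

print("tested:", tested)
print("untested:", untested)
```

To run it against the downloaded file, replace the `io.StringIO` object with `open(path, newline="")`.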


Great to see these data here, and I will certainly take a look. Will the next wave also contain alternative warheads to the chloroacetamide, as well as some testing of previously reported inhibitors? It will be nice to know where we stand there. Congrats to the team!


Hi @JoostU, Great questions!

Yes, we definitely expect to explore alternative warheads in coming rounds of synthesis. Actually, quite a lot of acrylamides have been ordered as part of Ugi hit expansions, such as here: https://covid.postera.ai/covid/submissions/adc59df6-a3dd-4d89-ae76-550f19fbfbe3. Last I checked, the feeling is that those hits may still be mostly k_{inact} driven, and some tuning needs to be done.

And I expect that is the feeling with many of the hits showing lower IC50 too.

As for past inhibitors, quite a few are being made! If memory serves, I believe we have at least 6 synthetic chemists on past inhibitors for the next two weeks. There is a lot of interest in getting them into structures as soon as we can. And I have also put through orders for the Real Space compounds you so kindly helped me look into in the thread “A brief exploration of past SARS small-molecule inhibitors”.

I just wanted to flag up a couple of points.

These certainly look interesting. One problem with the ligand efficiency metric is that perception of efficiency varies with the concentration unit in which affinity (or potency) is expressed. The article “The nature of ligand efficiency” discusses the problem and suggests a solution (which could also be applied to other measures of activity such as k_inact/K_i).
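As a quick illustration of the unit dependence, here is a minimal sketch using the common definition LE = RT ln(C0/IC50) / N_heavy, with IC50 standing in for the dissociation constant. The two compounds, the temperature, and the choice of standard concentrations C0 are assumptions made for illustration, and the snippet only shows the problem, not the solution proposed in the article.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal/(mol*K)
TEMPERATURE = 298.15  # assumed temperature, K


def ligand_efficiency(ic50_molar, heavy_atoms, standard_conc_molar=1.0):
    """LE = RT * ln(C0 / IC50) / N_heavy, in kcal/mol per heavy atom."""
    free_energy = R_KCAL * TEMPERATURE * math.log(standard_conc_molar / ic50_molar)
    return free_energy / heavy_atoms


# Two hypothetical compounds (values invented for illustration).
fragment = {"ic50": 1e-3, "heavy_atoms": 10}  # 1 mM, 10 heavy atoms
lead = {"ic50": 1e-6, "heavy_atoms": 20}  # 1 µM, 20 heavy atoms

for c0 in (1.0, 1e-3):  # 1 M and 1 mM standard concentrations
    le_fragment = ligand_efficiency(fragment["ic50"], fragment["heavy_atoms"], c0)
    le_lead = ligand_efficiency(lead["ic50"], lead["heavy_atoms"], c0)
    print(f"C0 = {c0:g} M: fragment LE = {le_fragment:.3f}, lead LE = {le_lead:.3f}")
```

With a 1 M standard state the two compounds look equally efficient, while with a 1 mM standard state the lead looks more efficient, even though nothing about the compounds has changed.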


I’ve been taking a look at the moonshot_initial_activity_data.csv list. Why are MAK-UNK-6435e6c2-3 and MAK-UNK-6435e6c2-2 missing from the list of IC50 measurements? The AVG % inhibition at 20 and 50 µM is over 50…


Hi, @vvoelz, thanks for pointing this out. The compounds with that high inhibition are actually the corresponding chloroacetamides (see below), and they were mislabelled with the wrong ID when they came back from synthesis. I apologize that they still made their way into the system (I will correct that soon), but the IC50 data below should be correct.

Thanks @pwkenny. Yes, the computational work is mostly on predicting “hot” warheads in order to know what to make next, while of course some real assays need to be done in the future. My understanding is that the thiol reactivity assay should at least give us some idea of how generally reactive these things are.

And thanks, I’ll look into the paper. Though, I don’t think anybody is relying on it too much here. I was using it pretty crudely in a “small thing bind not bad” kind of way.

-Matt

I thought I’d best cross-post. I posted a modification of the submission data from this release, but with combinations of different warheads both in reacted and unreacted forms here: Covalent attachments with "*"

Very nice work, thank you! :slight_smile:

A small suggestion from me on data visualisation, if I may take a stab at the “wishlist” (like @krisbirchall said, there are many other things to be done, and this is just something I thought of while playing with the data) :smile: :

Indicate in a column whether a compound is a reversible or an irreversible inhibitor. My guess is that most chemists currently determine this by checking for the warhead; for instance, chloroacetamides might count as irreversible inhibitors.

In the same vein, compounds could be differentiated by whether they are competitive or non-competitive, so we can quickly see whether they all bind (or are predicted to bind) to the active site. Or, in non-competitive cases, we can also see which binding sites they are binding to.

Thanks for the suggestion @Zhang-He, yes the covalent/non-covalent issue is one we are constantly dealing with. We are currently doing thiol reactivity screens to understand the reactivity, but it is still a “murky” issue. Obviously, the chloroacetamides are mostly very reactive; however, the acrylamides are kind of on the edge of irreversible/reversible which is an issue in need of elucidation on our end.

As for competitive / non-competitive, I believe almost everything in this project is aimed at targeting the active site (though a few of the crystals do indeed show compounds lurking in other sites).


If labeling inhibitors as reversible/irreversible then it would be necessary to say whether the label refers to a warhead-based classification (assumption) or the result from a reversibility assay (measurement).
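As a sketch of what such a label could look like in the data, each entry might carry its basis alongside the classification. The class names, field names, and compound ID below are hypothetical, not part of any existing schema.

```python
from dataclasses import dataclass
from enum import Enum


class Basis(Enum):
    """Where a reversibility label comes from."""

    WARHEAD_ASSUMPTION = "warhead-based classification (assumption)"
    REVERSIBILITY_ASSAY = "reversibility assay (measurement)"


@dataclass(frozen=True)
class ReversibilityLabel:
    compound_id: str
    reversible: bool
    basis: Basis

    def describe(self):
        kind = "reversible" if self.reversible else "irreversible"
        return f"{self.compound_id}: {kind}, from {self.basis.value}"


# Hypothetical compound labelled from its warhead alone.
label = ReversibilityLabel("cmpd-X", reversible=False, basis=Basis.WARHEAD_ASSUMPTION)
print(label.describe())
```

Making the basis a required field means a label can never be stored without saying whether it is assumed or measured.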


Many congratulations on this great work! The results clearly resolve between compounds with different potencies and there are IC50s as low as 1.8 µM. I am only an enzymologist, but to me several compounds look very interesting with good ligand efficiencies and clogP values. Unfortunately, I have been unable to determine whether there are crystal structures for complexes with these compounds.

Seventeen out of the 18 dose-response curves give slopes greater than 1, possibly because there is some positive co-operativity.
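To show what a slope above 1 means in practice, here is a minimal sketch of the Hill equation with 0% and 100% limits. The IC50 value and the slopes are hypothetical and are not fitted to the released data.

```python
def hill_inhibition(conc, ic50, slope=1.0):
    """Percent inhibition from a Hill equation with fixed 0% and 100% limits."""
    return 100.0 / (1.0 + (ic50 / conc) ** slope)


ic50_uM = 10.0  # hypothetical IC50, µM

for slope in (1.0, 2.0):
    below = hill_inhibition(ic50_uM / 3, ic50_uM, slope)
    above = hill_inhibition(ic50_uM * 3, ic50_uM, slope)
    print(f"slope {slope:g}: {below:.0f}% at IC50/3, {above:.0f}% at 3x IC50")
```

A larger slope compresses the transition from low to high inhibition into a narrower concentration range around the IC50.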

Why do compounds with covalent warheads appear to give reversible inhibition?

Several of these inhibitors have covalent warheads. However, the kinetic data strongly suggest reversible inhibition (because IC50s are well above the 5 nM enzyme concentration and the data don’t show a very steep dose-response which is characteristic of irreversible inhibition). Possible reasons for these compounds not giving irreversible inhibition are:

  1. The warheads are not positioned correctly relative to Cys145, or have insufficient reactivity.
  2. Covalent inhibition occurs, but is reversible.
  3. The compounds react with something (maybe TCEP?) and inhibition is by a product.

For med chem progress, it seems important to elucidate what is happening. Crystal structures will show if there is covalent bond formation.

I agree with Pete Kenny that direct kinetic studies are needed to characterise any irreversible inhibitors.
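For reference, the usual two-step model of covalent inhibition gives an observed inactivation rate k_obs = k_inact [I] / (K_I + [I]). The sketch below uses two hypothetical compounds with the same k_inact/K_I ratio to show why a single ratio does not separate binding from reactivity; all numbers are invented.

```python
import math


def k_obs(inhibitor_conc, k_inact, K_I):
    """Observed inactivation rate for the two-step covalent inhibition model."""
    return k_inact * inhibitor_conc / (K_I + inhibitor_conc)


def fraction_active(inhibitor_conc, k_inact, K_I, time_s):
    """Fraction of enzyme still active after time_s, assuming constant [I]."""
    return math.exp(-k_obs(inhibitor_conc, k_inact, K_I) * time_s)


# Hypothetical compounds with equal k_inact/K_I (1e-4 per µM per s).
weak_binder_hot_warhead = {"k_inact": 0.01, "K_I": 100.0}  # 1/s, µM
tight_binder_mild_warhead = {"k_inact": 0.001, "K_I": 10.0}  # 1/s, µM

for conc_uM in (1.0, 1000.0):
    rate_a = k_obs(conc_uM, **weak_binder_hot_warhead)
    rate_b = k_obs(conc_uM, **tight_binder_mild_warhead)
    print(f"[I] = {conc_uM:g} µM: k_obs = {rate_a:.2e} vs {rate_b:.2e} 1/s")
```

At low inhibitor concentration the two compounds inactivate at nearly the same rate, and only measurements across a range of concentrations pull k_inact and K_I apart.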

Experimental protocols

When using recombinant enzyme, it is important to be sure the assay is following the correct activity, rather than a contaminant. I assume these assays do follow MPro, but it would be reassuring to confirm that SAR from IC50s is in agreement with crystal structures.

For MPro from SARS-CoV, valid kinetics require a full-length construct with authentic N- & C-termini (Xue et al, 2007, J Mol Biol 366, 965). Given the 96% sequence identity, I expect MPro from SARS-CoV-2 has similar requirements. The MPro construct used for these IC50s is not stated.

A similar substrate against a similar enzyme gave a Km of 40 µM (MPro from SARS-CoV, Xue et al 2007), whereas the data from Nir London & Haim Barr used 375 nM substrate. (So not many turnovers for 5 nM enzyme, especially given that the rate tends to deviate from linearity after around 10% of the substrate has been used.) I often recommend using substrate at its Km value to balance sensitivity between competitive, noncompetitive and uncompetitive inhibitors. This assay, therefore, probably disfavours uncompetitive compounds. The same bias occurs when screening the free enzyme by crystallography and mass spec. This may be acceptable because uncompetitive inhibitors probably are rare.
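To put rough numbers on this bias, the Cheng-Prusoff relationships give IC50 = Ki (1 + [S]/Km) for competitive inhibitors and IC50 = Ki (1 + Km/[S]) for uncompetitive ones. The sketch below assumes the 40 µM Km from the related SARS-CoV system also applies here, which has not been measured for this assay.

```python
def ic50_over_ki_competitive(substrate_conc, km):
    """Cheng-Prusoff factor for a competitive inhibitor: 1 + [S]/Km."""
    return 1.0 + substrate_conc / km


def ic50_over_ki_uncompetitive(substrate_conc, km):
    """Corresponding factor for an uncompetitive inhibitor: 1 + Km/[S]."""
    return 1.0 + km / substrate_conc


substrate_uM = 0.375  # 375 nM substrate, from the released procedure
assumed_km_uM = 40.0  # assumption: Km borrowed from the SARS-CoV MPro study

competitive = ic50_over_ki_competitive(substrate_uM, assumed_km_uM)
uncompetitive = ic50_over_ki_uncompetitive(substrate_uM, assumed_km_uM)
print(f"competitive: IC50 is about {competitive:.3f} x Ki")
print(f"uncompetitive: IC50 is about {uncompetitive:.1f} x Ki")

# With substrate at Km, both factors are equal, which is the balance point.
balanced = ic50_over_ki_competitive(assumed_km_uM, assumed_km_uM)
```

Under these assumptions a competitive inhibitor shows an IC50 close to its Ki, while an uncompetitive inhibitor with the same Ki would appear roughly a hundredfold weaker.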

Choice of reducing agent

It can be difficult to optimise assays for enzymes involving a catalytic Cys residue, balancing the need to prevent oxidation with the possibility of reducing agents perturbing the results. Using “papain-like” protease from SARS-CoV, Lee et al (2012, Analyt Biochem 423, 46) determined kinetic parameters, hit rate and IC50 values to compare no reducing agent with the physiological reduced glutathione (GSH), DTT and TCEP. It was recommended to use 5 mM GSH. I have been involved with a project which found similar discrepancies with a cathepsin. I usually recommend use of 5 mM GSH, but it needs to be a fresh solution because it has limited stability. A stronger reducing agent may be required to store the stock enzyme solution.

The studies of Nir London & Haim Barr use 1 mM TCEP, which could affect potency of some inhibitors.

GSH stability data would be informative

It is advisable to measure stability in GSH, to ensure active compound is not depleted excessively inside cells. Poor stability in GSH (or TCEP) also could compromise isolated enzyme data, because IC50 measurements assume that inhibitor concentration is constant throughout the assay.

Suggestions for future work

I feel uncomfortable recommending further work, because I know everyone has so much to do and has already completed so much. Some of these suggestions have already come from other people:

• In the database, link IC50 and crystal structure on a compound by compound basis.
• If not already studied by crystallography, med chem may be helped by directly investigating whether active compounds with covalent warheads actually do give reversible inhibition (eg by dilution) or covalent inhibition (eg by mass spec).
• Stability in GSH should be measured for active compounds that contain covalent warheads.
• Kinetic characterisation (measurement of Ki and kinact) for compounds giving covalent inhibition.
• Consider whether the team are comfortable with the prospect of missing uncompetitive inhibitors.
• If any apparent inconsistencies develop with other data (eg activity in cells), it may be worth testing compounds using GSH rather than TCEP.

Once again, thanks for performing and sharing this excellent work.

Wal


@mc-robinson
We are doing some Mpro screening in the US DOE national labs, and started out using Bachem’s EDANS/Dabcyl substrate but think your fluorophores look better. We are looking for clarification on your fluorogenic peptide [5-FAM]-AVLQSGFR-[Lys(Dabcyl)]-K-amide: is it K(Dabcyl) and then another C-terminal K, or is K(Dabcyl) the terminal residue?
