Quarantine@Home

cosmo · May 4, 2020, 4:10pm

Hey all!

I’m a bit of an amateur on the computational chemistry side, but I threw together this open-source distributed project over the last month or so. It has a handful of volunteers and about 500k docking results so far.

We simulated both the main protease and the spike protein. Here are the results for the main protease; note the tabs that let you switch between FDA-cleared ligands and “everything else”.

Perhaps the only thing I’ve contributed here is a fancy interface to view the dockings for each job:

The download links for the AutoDock log files and ligand pose PDBQT files are on those pages, but I can definitely support bulk downloads if anyone wants them, including full SQL dumps.

Anyway, I’m doing this mostly to learn. Happy to collaborate with anyone who wants specific things queued or could make use of the dataset. The pipeline is all open source too:

How can I be of help to the community?

mc-robinson · May 4, 2020, 7:41pm

Thanks @cosmo. I guess my best suggestion would be to submit the top few compounds through the system, and we would be happy to send them through our triage process along with many other results. Obviously, putting the onus on us to sort through the most promising interactions out of the thousands is not ideal.

However, on another note, what might be very useful is if we could set up an automatic docking pipeline. Let’s say that a user wants to see their design docked, or we want to dock all of the new compounds each day. Can we find a way to let users submit the design for docking on this site, then call your pipeline and return an email with the PDB/viewing area and score? We were already considering a way to automatically dock the new compounds added here https://github.com/postera-ai/COVID_moonshot_submissions each day (by noting the newest additions in a separate folder). But it looks like you might have a lot of the infrastructure ready to do this?
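As a rough illustration of the "dock the new compounds each day" idea, here is a minimal Python sketch that compares two daily snapshots of a submissions table and returns only the newly added entries. The column names `CID` and `SMILES`, the helper names, and the sample rows are all assumptions made for illustration; the actual layout of the submissions repository may differ.

```python
import csv
import io


def read_compounds(csv_text, id_col="CID", smiles_col="SMILES"):
    """Parse CSV text into a {compound_id: smiles} dict.

    The column names are hypothetical defaults, not taken from the real repo.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[id_col]: row[smiles_col] for row in reader}


def new_compounds(previous, current):
    """Return the compounds present in `current` but absent from `previous`."""
    return {cid: smi for cid, smi in current.items() if cid not in previous}


# Made-up snapshots standing in for yesterday's and today's submissions.
yesterday = read_compounds("CID,SMILES\nabc-1,CCO\n")
today = read_compounds("CID,SMILES\nabc-1,CCO\nabc-2,c1ccccc1\n")

# Only the compounds that appeared since the last snapshot get queued.
to_dock = new_compounds(yesterday, today)
```

Keying on the compound ID rather than the SMILES string means a resubmitted structure under a new ID would be docked again; whether that is desirable depends on how the queue is meant to behave.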

Happy to chat more about this!

cosmo · May 6, 2020, 4:27am

Happy to submit the top compounds. I looked into this previously but paused because I was under the mistaken assumption that SDF files were needed; now I see your system will take SMILES strings. I can upload them!

Regarding submitting other molecules to the queue, I can definitely find a way. I think this would require some hefty API changes, but basically it’s a matter of finding the right PDBQT representations of the ligands.

I built this system around ZINC, so if it’s a molecule already in ZINC, it’s a matter of me locating it in their very large database dumps. If it’s not in ZINC, I’d need to make some changes to the SQL schema to accommodate that, basically removing a primary key reliant on the ZINC ID of said molecule. I’ll go back to the drawing board for that.
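One possible shape for that schema change, sketched in Python with SQLite: give each ligand its own surrogate key and make the ZINC ID an optional, unique column. The table name, column names, and the placeholder ZINC ID below are hypothetical and are not taken from the actual Quarantine@Home schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: ligand_id is a surrogate key, so rows no longer
# depend on having a ZINC ID. zinc_id stays UNIQUE but may be NULL.
conn.execute(
    """
    CREATE TABLE ligands (
        ligand_id INTEGER PRIMARY KEY,
        zinc_id   TEXT UNIQUE,
        smiles    TEXT,
        source    TEXT NOT NULL
    )
    """
)

rows = [
    ("ZINC-PLACEHOLDER-1", None, "zinc"),  # placeholder ID, not a real entry
    (None, "CCO", "user"),                 # user-submitted, no ZINC ID
    (None, "c1ccccc1", "user"),            # a second NULL zinc_id is allowed
]
conn.executemany(
    "INSERT INTO ligands (zinc_id, smiles, source) VALUES (?, ?, ?)", rows
)

user_count = conn.execute(
    "SELECT COUNT(*) FROM ligands WHERE zinc_id IS NULL"
).fetchone()[0]
```

SQLite treats NULLs as distinct under a UNIQUE constraint, so any number of non-ZINC ligands can coexist while duplicate ZINC IDs are still rejected. Other database engines handle NULLs in unique columns differently, so this is worth checking against whichever one the pipeline uses.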

Without a PDBQT structure for a desired compound, I’ll need to learn how to make those myself, perhaps from SMILES strings. I’ve got some studying to do!
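One commonly used route from SMILES to PDBQT is the Open Babel command-line tool, which can generate 3D coordinates and write PDBQT directly. The Python sketch below only assembles and runs such a command; it assumes `obabel` is installed and on the PATH, the function names are made up for illustration, and the pH value is an arbitrary default rather than a recommendation. It is not necessarily how the existing ZINC-derived PDBQT files were prepared.

```python
import subprocess


def obabel_command(smiles, out_path, ph=7.4):
    """Build an Open Babel command converting a SMILES string to PDBQT.

    -:<smiles>  read the molecule from the given SMILES string
    -opdbqt     write PDBQT format
    -O <file>   output file
    --gen3d     generate 3D coordinates
    -p <pH>     add hydrogens appropriate for the given pH
    """
    return [
        "obabel", f"-:{smiles}",
        "-opdbqt", "-O", out_path,
        "--gen3d", "-p", str(ph),
    ]


def smiles_to_pdbqt(smiles, out_path, ph=7.4):
    """Run the conversion; raises CalledProcessError if obabel fails."""
    subprocess.run(obabel_command(smiles, out_path, ph), check=True)


# Building the command does not require Open Babel to be installed.
cmd = obabel_command("CCO", "ethanol.pdbqt")
```

Calling `smiles_to_pdbqt("CCO", "ethanol.pdbqt")` would write the file. Before trusting the output for docking, it would be sensible to convert a ligand that already has a known-good PDBQT and compare the charges, protonation, and rotatable bonds.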