Cross-referencing the Project Moonshot compounds

It is sometimes difficult to cross-reference compounds between multiple sources (PostEra and Diamond), so I’ve downloaded the compounds with their associated data, calculated InChIKeys, and then used the InChIKey to link compounds from the different sources within Vortex. This means you have the biochemical data together with the PDB code (if available) or the Fragalysis code for the crystal structure. I’ve also annotated the compounds with identifiers from multiple databases (ChEMBL, PubChem, etc.), calculated physicochemical properties (LogP/D, TPSA, HBD/A, etc.) and exported everything in SDF format. I’ve also clustered the structures to aid navigation. You can download the file here: https://www.cambridgemedchemconsulting.com/news/index_files/334893d80c56ec5ad7e3f243c59f39bc-442.html
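For anyone who wants to try the linking step outside Vortex, here is a minimal sketch of the idea in plain Python. It assumes the InChIKeys have already been calculated for both sources, and the field names (`InChIKey`, `compound_id`, `structure_code`) and the function `link_by_inchikey` are hypothetical, chosen for illustration rather than taken from the PostEra or Diamond files.

```python
def link_by_inchikey(assay_rows, structure_rows, key="InChIKey"):
    """Attach crystal-structure codes to assay records that share an InChIKey.

    Both arguments are lists of dicts (e.g. rows read with csv.DictReader).
    Every assay record is kept; records without a structure get an empty list.
    """
    # Index structure codes by InChIKey. One compound may have several
    # crystal structures, so each key maps to a list of codes.
    structures = {}
    for row in structure_rows:
        structures.setdefault(row[key], []).append(row["structure_code"])

    linked = []
    for row in assay_rows:
        merged = dict(row)  # copy so the input rows are not modified
        merged["structure_codes"] = structures.get(row[key], [])
        linked.append(merged)
    return linked


if __name__ == "__main__":
    # Placeholder rows; the keys and codes below are made up, not real data.
    assay = [
        {"InChIKey": "PLACEHOLDER-KEY-1", "compound_id": "cpd-1"},
        {"InChIKey": "PLACEHOLDER-KEY-2", "compound_id": "cpd-2"},
    ]
    crystals = [
        {"InChIKey": "PLACEHOLDER-KEY-1", "structure_code": "xtal-0001"},
    ]
    for record in link_by_inchikey(assay, crystals):
        print(record["compound_id"], record["structure_codes"])
```

Keeping every assay record, whether or not a structure exists, mirrors the "if available" behaviour described above: compounds without a crystal structure stay in the table with an empty structure list.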


Hi @drc007, sorry you found it difficult to cross-reference. Let me know if you need help getting any of the data. If you are comfortable working with pandas or a similar tool, I would suggest working with the large _all_info.csv file found here: https://github.com/postera-ai/COVID_moonshot_submissions

But thanks for the plot, looks interesting.

Best,
Matt

Dear drc007,

It is a good idea to cross-reference the compounds. The sdf available at https://www.cambridgemedchemconsulting.com/news/index_files/334893d80c56ec5ad7e3f243c59f39bc-442.html does not contain the information you claimed. Please can you update the file?

Thank you

Apologies, typo in the link now corrected.