This Large Storage application is co-submitted with the Large Compute project SNIC 2020/2-21 (Jan 1 - Dec 31 2021), and is a continuation of Large Storage 2020/2-7 (July 1 - Dec 31 2020) and Large Compute 2019/2-21 (Jan 1 - Dec 31 2020).
Computational chemistry plays an important role in the development of new pharmaceuticals. With high-performance computer clusters coupled to the latest developments in algorithms and software, we are able to screen vast libraries of compounds in search of new drug candidates, create in silico models of target proteins, and explore protein-protein interactions crucial for, e.g., signaling pathways in cancer cells or toxin mechanisms of action.
We aim to continue our studies of several targets for cancer therapy in order to identify small-molecule inhibitors: the protein kinases IRE1 and PERK, essential for the unfolded protein response and XBP1 mRNA splicing, as well as the XBP1 ligase RtcB, crucial for cancer cell survival. We also explore the mechanism of, and identify possible small-molecule binders to, APAF-1, which triggers apoptosis; possible inhibition of the pro-caspase-8 mimic cFLIP, used by cancer cells to avoid apoptotic cascades; the heavily upregulated MTHFD2, which plays a key role in cancer cell drug resistance; and the small peptide AGR2, which recent results have shown to be an inducer of tumorigenesis.
We follow well-established protocols for these studies, involving protein preparation (homology modelling if needed) and protein-protein docking calculations according to a recent protocol developed in our group, followed by replica MD simulations when exploring signaling pathways. The MD simulations are normally 500-1000 ns long each and performed in triplicate, placing high demands on HPC resources. In the drug development projects, we perform systematic docking of the ZINC clean drug-like library and similar libraries, refined docking of the top-ranked ligands, and detailed MD simulations of the resulting complexes, followed by additional hit-to-lead optimizations and further computations. The compound libraries we use in our research contain over 1 billion compounds and require massively parallel execution. We have furthermore developed an inverse docking protocol to explore the selectivity and possible side effects (safety) of the obtained compounds. We are currently extending our work into the area of FEP+-guided machine learning, in particular in the drug development phase. The size and extent of the simulations, and the amount of data processed in the screening campaigns, justify the resources applied for; we therefore ask to increase the allocation to 400 TiB / 2 000 000 files on Klemming@PDC and 50 TiB / 1 000 000 files on Cephyr@C3SE.
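As a rough consistency check on the requested quotas, the mean file size implied by each allocation can be computed from the figures above. The short Python sketch below is only illustrative: it assumes each quota would be filled completely and uses binary units (1 TiB = 1024^4 bytes). The function and variable names are ours and not part of any SNIC tooling.

```python
# Back-of-the-envelope check: mean file size implied by each requested quota.
# Assumption: the capacity and file-count limits are both fully used.
TIB = 1024 ** 4  # bytes per TiB
MIB = 1024 ** 2  # bytes per MiB

# (capacity in TiB, maximum number of files), taken from the request above
requests = {
    "Klemming@PDC": (400, 2_000_000),
    "Cephyr@C3SE": (50, 1_000_000),
}


def mean_file_size_mib(capacity_tib, max_files):
    """Return the average file size in MiB if the quota is filled exactly."""
    return capacity_tib * TIB / max_files / MIB


mean_sizes = {name: mean_file_size_mib(*quota) for name, quota in requests.items()}

for name, size in mean_sizes.items():
    print(f"{name}: about {size:.1f} MiB per file on average")
```

Under these assumptions, the requests correspond to an average of roughly 210 MiB per file on Klemming and roughly 52 MiB per file on Cephyr.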