ChatGPT will bring revolution to knowledge management and insight generation. Are we ready for this?

June 28, 2023

I have never done a cyclic peptide design in my life. So, when I was approached with a question on how to optimize the permeability of one, I guessed that more lipophilicity would help, and likely there are some specific neutral "vectors" that will also help. But the usual step is to go thru several papers and get the answer, obvious, right? I started with a very good article: "Optimizing PK properties of cyclic peptides: the effect of side chain substitutions on permeability and clearance". Read it, made conclusions and decided to feed it to ChatGPT (3.5) with request to "please extract very concise medicinal chemistry insights that can be further reused for peptide optimization". Here is what I got and and that was very neat:

1. Modification of peptide side chains significantly alters their log D values and subsequently their in vitro properties. For instance, aromatic or polar aliphatic side chains didn't significantly reduce log D, whereas ionizable aspartic acid and lysine side chains did, but at the cost of permeability.

2. Non-natural N-methylated amino acids with heterocyclic side chains demonstrated favorable log D's and low clearance.

3. There's a correlation between log D and permeability, as well as log D and clearance. A log D value around three seems optimal for both permeability and clearance.

4. The oral bioavailability (F) of peptides can be significantly affected by their polarity and permeability. For example, peptide 2 had a significantly lower F than peptide 1 due to its greater polarity and lower permeability.

5. Some peptides, despite having low clearance, still demonstrated low oral bioavailability, suggesting that poor absorptive permeability can limit oral absorption.

6. Oral absorption of peptides might be improved through solubilizing formulations.

7. The introduction of polar groups into the peptide scaffold impacts structural diversity and property space, which is essential for the design of cyclic peptide-based therapeutics.

8. Substitution of certain amino acids could drastically affect the cell permeability and oral bioavailability, e.g., substitution of Leu with Ser significantly reduced both.

With the era of LLMs information processing changed forever. Large language models (LLMs) have a massive potential for knowledge management and insight generation. We cannot process the amount of data we are generating. We need to act on data fast. And adoption of them by pharmaceutical and biotech industries is slow because of the challenges that just became obvious: the cost of supporting those models is huge, the expertise gap is there, and data quality still suboptimal. Most common tasks are:

Identifying unusual patterns: "Company XYZ started hiring senior PK/PD people for therapeutic area V, which might impact project T. Consider increase budgeting for project T in preclinical stages."
Connecting A and B: "Phenotypic project X has similar hits as phenotypic project Y, likely they are connected via pathway Z and downregulating target A, B, and C. Consider merging assets of those projects together."
Summarizing knowledge into a machine-readable and actionable form that, in turn, will help generative models to improve: "Generation of the new set of compounds for hit series of project V finished. We have used insights from several references that indicate that the efficiency of a hit will likely increase if we increase logD by one unit and keep HBD between 3 and 5 for this therapeutic area. Those compounds are prone to Nav1.5-related cardiotoxicity, as indicated by publications [12-19] and confirmed by internal models.

PS: feeding the link didn't work in ChatGPT even with plugins, so I removed experimental part and tables.

Search This Blog

High-Dimensional Pharmacology

ChatGPT will bring revolution to knowledge management and insight generation. Are we ready for this?

Comments

Post a Comment

Popular posts from this blog

Paper comment: "MELLODDY: cross pharma federated learning at unprecedented scale unlocks benefits in QSAR without compromising proprietary information"

Can we "talk" with the data? A tiny case for testing pandas_ai for human clearance data