This model is a fine-tuned version of google/gemma-2-9b-it (Hugging Face card) with a LoRA adapter fused into the base weights for the context-based protein interaction retrieval task. It was developed as part of our study aimed at creating a hybrid approach that combines text-mining methods, graph neural networks, and fine-tuned large language models to expand biomedical knowledge graphs and interpret the predicted edges based on the context of literature sources. The model can be used either as part of a pipeline, where the preceding step predicts candidate connections with a graph neural network, or independently as a standalone unit.

Input:
The model was fine-tuned to use the following input format:
[INST]Context: [PMID: Number Abstract text] Question: Based on the provided context, do the terms [Protein 1 label] and [Protein 2 label] interact with each other? NOTE: Always begin your response with 'YES' or 'NO', indicating if the two terms are interacting in a similar way to the interacting or non-interacting examples, along with a confidence level (e.g., low, medium, high). If two names belong to the same entity, do not consider them as interacting. Examples: Interacting: "The combination of MLN4924 and imatinib "triggered a dramatic shift in the expression of MCL1 and NOXA" and that the combination "enhanced" the activation of an intra S-phase checkpoint, which indicates that the two drugs interact to produce a combined effect on the leukemia cells that is different from the effect of either drug alone." [YES, high confidence] Explanation: ...; Non-interacting: "While both drugs are used to induce the accumulation of hypoxia-inducible factor 1alpha and subsequently EBV reactivation, there is no direct interaction mentioned between pevonedistat and Desferal. Instead, they are presented as separate agents that stabilize HIF-1alpha and have effects on the induction of EBV reactivation." [NO, high confidence] Explanation: ...[/INST]

Example:
[INST]Context: [PMID: 9177227 Induction by leptin of uncoupling protein-2 and enzymes of fatty acid oxidation. We have studied mechanisms by which leptin overexpression, which reduces body weight via anorexic and thermogenic actions, induces triglyceride depletion in adipocytes and nonadipocytes. Here we show that leptin alters in pancreatic islets the mRNA of the genes encoding enzymes of free fatty acid metabolism and uncoupling protein-2 (UCP-2). In animals infused with a recombinant adenovirus containing the leptin cDNA, the levels of mRNAs encoding enzymes of mitochondrial and peroxisomal oxidation rose 2- to 3-fold, whereas mRNA encoding an enzyme of esterification declined in islets from hyperleptinemic rats. Islet UCP-2 mRNA rose 6-fold. All in vivo changes occurred in vitro in normal islets cultured with recombinant leptin, indicating direct extraneural effects. Leptin overexpression increased UCP-2 mRNA by more than 10-fold in epididymal, retroperitoneal, and subcutaneous fat tissue of normal, but not of leptin-receptor-defective obese rats. By directly regulating the expression of enzymes of free fatty acid metabolism and of UCP-2, leptin controls intracellular triglyceride content of certain nonadipocytes, as well as adipocytes.] Question: Based on the provided context, do the terms [leptin protein] and [uncoupling protein-2 protein] interact with each other? NOTE: Always begin your response with 'YES' or 'NO', indicating if the two terms are interacting in a similar way to the interacting or non-interacting examples, along with a confidence level (e.g., low, medium, high). If two names belong to the same entity, do not consider them as interacting. Examples: Interacting: "The combination of MLN4924 and imatinib "triggered a dramatic shift in the expression of MCL1 and NOXA" and that the combination "enhanced" the activation of an intra S-phase checkpoint, which indicates that the two drugs interact to produce a combined effect on the leukemia cells that is different from the effect of either drug alone." [YES, high confidence] Explanation: ...; Non-interacting: "While both drugs are used to induce the accumulation of hypoxia-inducible factor 1alpha and subsequently EBV reactivation, there is no direct interaction mentioned between pevonedistat and Desferal. Instead, they are presented as separate agents that stabilize HIF-1alpha and have effects on the induction of EBV reactivation." [NO, high confidence] Explanation: ...[/INST]
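
Below is a minimal Python sketch (not part of the original card) of one way to assemble a prompt in this format. The `build_prompt` helper is illustrative, and the few-shot Interacting/Non-interacting examples are abbreviated; in practice, the full examples from the template above should be included verbatim.

```python
# Minimal sketch (illustrative, not from the original card) for assembling the
# expected [INST] ... [/INST] prompt. The few-shot Interacting/Non-interacting
# examples are abbreviated here; append the full examples from the template
# above in practice.

NOTE = (
    "NOTE: Always begin your response with 'YES' or 'NO', indicating if the two "
    "terms are interacting in a similar way to the interacting or non-interacting "
    "examples, along with a confidence level (e.g., low, medium, high). If two "
    "names belong to the same entity, do not consider them as interacting. "
    "Examples: Interacting: \"...\" [YES, high confidence] Explanation: ...; "
    "Non-interacting: \"...\" [NO, high confidence] Explanation: ..."
)

def build_prompt(pmid: str, abstract: str, protein_1: str, protein_2: str) -> str:
    """Build the instruction prompt in the format the model was fine-tuned on."""
    return (
        f"[INST]Context: [PMID: {pmid} {abstract}] "
        f"Question: Based on the provided context, do the terms [{protein_1}] "
        f"and [{protein_2}] interact with each other? {NOTE}[/INST]"
    )
```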

Output:
The output is text that always begins with either YES or NO, followed by the model's confidence level and a description of the identified interaction (or the absence of one) based on the input context; the description always begins with 'Explanation:'.
Example:
YES, high confidence; Explanation: The context provided indicates that leptin directly regulates the expression of uncoupling protein-2 (UCP-2) in various tissues, including pancreatic islets and fat tissue. Specifically, it mentions that leptin overexpression increases UCP-2 mRNA levels by more than 10-fold in epididymal, retroperitoneal, and subcutaneous fat tissue of normal rats. This suggests that leptin and UCP-2 interact in a regulatory manner, where leptin influences the expression of UCP-2.
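
For reference, a hedged inference sketch using Hugging Face transformers is shown below. The generation settings and the regular expression used to parse the YES/NO answer, confidence level, and explanation are assumptions rather than part of the original card; the sketch reuses the `build_prompt` helper from the Input section above.

```python
# Hedged inference sketch using Hugging Face transformers; dtype, generation
# settings, and the output-parsing regex are assumptions, not the authors' code.
# Reuses the build_prompt helper sketched in the Input section above.
import re
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Timofey/Gemma-2-9b-it-Fused_PPI"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

abstract = "Induction by leptin of uncoupling protein-2 and enzymes of fatty acid oxidation. ..."
prompt = build_prompt("9177227", abstract, "leptin protein", "uncoupling protein-2 protein")

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
answer = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)

# Parse the expected "YES/NO, <level> confidence; Explanation: ..." structure.
match = re.match(
    r"\s*(YES|NO)\W+(low|medium|high)\s+confidence\W*(?:Explanation:)?\s*(.*)",
    answer,
    flags=re.IGNORECASE | re.DOTALL,
)
if match:
    label, confidence, explanation = match.groups()
    print(label.upper(), confidence.lower(), explanation)
else:
    print(answer)  # fall back to the raw generation if the format differs
```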

Citation

If you use this model in your research, please cite our paper:

Ivanisenko, T.V.; Demenkov, P.S.; Ivanisenko, V.A. An Accurate and Efficient Approach to Knowledge Extraction from Scientific Publications Using Structured Ontology Models, Graph Neural Networks, and Large Language Models. Int. J. Mol. Sci. 2024, 25(21), 11811. https://doi.org/10.3390/ijms252111811

Acknowledgments

This work was supported by a grant for research centers provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730324P540002) and the agreement with Novosibirsk State University No. 70-2023-001318 dated December 27, 2023.
