Research Paper Volume 15, Issue 11 pp 4649–4666

Precious1GPT: multimodal transformer-based transfer learning for aging clock development and feature importance analysis for aging and age-related disease target discovery

Anatoly Urban1, Denis Sidorenko1, Diana Zagirova1, Ekaterina Kozlova1, Aleksandr Kalashnikov2, Stefan Pushkov1, Vladimir Naumov1, Viktoria Sarkisova1, Geoffrey Ho Duen Leung1, Hoi Wing Leung1, Frank W. Pun1, Ivan V. Ozerov1, Alex Aliper1,2, Feng Ren3, Alex Zhavoronkov1,2

  • 1 Insilico Medicine, Pak Shek Kok, New Territories, Hong Kong
  • 2 Insilico Medicine, Masdar City, United Arab Emirates
  • 3 Insilico Medicine, Shanghai, China

Received: April 21, 2023       Accepted: May 24, 2023       Published: June 13, 2023

Copyright: © 2023 Urban et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Aging is a complex, multifactorial process that increases the risk of various age-related diseases. Many aging clocks can accurately predict chronological age, mortality, and health status, but these clocks are disconnected and are rarely fit for therapeutic target discovery. In this study, we propose a novel multimodal aging clock, which we call Precious1GPT, that uses methylation and transcriptomic data for interpretable age prediction and target discovery. It was developed using a transformer-based model and transfer learning for case-control classification. Although the multimodal transformer is less accurate within each individual data type than state-of-the-art specialized aging clocks based on methylation or transcriptomic data alone, it may have higher practical utility for target discovery. The method makes it possible to discover novel therapeutic targets that, hypothetically, may be able to reverse or accelerate biological age, providing a pathway for therapeutic drug discovery and validation using the aging clock. In addition, we provide a list of promising targets annotated using the PandaOmics industrial target discovery platform.


AI: Artificial Intelligence; COPD: Chronic Obstructive Pulmonary Disease; DNN: Deep Neural Network; DL: Deep Learning; iPSC: induced Pluripotent Stem Cell; ML: Machine Learning; PD: Parkinson’s Disease; SHAP: SHapley Additive exPlanations.