Research Paper Advance Articles
A natural language processing–driven map of the aging research landscape
- 1 Universidad Europea de Valencia, Faculty of Health Sciences, Department of Physiotherapy, Nutrition and Sports Sciences, Valencia 46010, Spain
- 2 Group of Physical Therapy in the Ageing Process: Social and Health Care Strategies, Department of Physical Therapy, Universitat de València, Valencia 46010, Spain
- 3 Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
Received: May 27, 2025 Accepted: November 7, 2025 Published: November 25, 2025
https://doi.org/10.18632/aging.206340
Copyright: © 2025 Perez-Maletzki and Sanz-Ros. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
Aging research has advanced significantly over the past century, from early studies on animal models to a current emphasis on clinical and translational applications. As research literature expands exponentially, traditional narrative reviews can no longer capture the field’s complexity, highlighting the need for new, unbiased synthesis tools. Here, we leverage advanced natural language processing (NLP) and machine learning (ML) techniques to analyze 461,789 abstracts related to aging published between 1925 and 2023. By integrating Latent Dirichlet Allocation (LDA), term frequency-inverse document frequency (TF-IDF) analysis, dimensionality reduction and clustering, we delineate a comprehensive thematic landscape of aging research. Our results show a clear shift: early decades focused on cellular and molecular mechanisms, while recent years emphasize clinical studies, especially neurodegenerative disorders. Notably, we identify a persistent divide between the biology of aging (BoA) and clinical research, with minimal conceptual overlap between them. Furthermore, we identify distinct clusters representing key biological processes, some of which may have previously been overlooked as cohesive research domains. Finally, we highlight both established and underexplored interconnections that could guide future research. This study outlines shifting priorities and translational gaps in aging research and offers a scalable, data-driven alternative to conventional reviews.
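To illustrate the TF-IDF weighting named above, here is a minimal sketch in plain Python. It assumes whitespace tokenization and the textbook weighting tf × log(N / df); the function name `tfidf` and the three toy abstracts are hypothetical stand-ins, not the study's code or corpus, and the authors' actual preprocessing and weighting variant may differ.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Assumption: lowercase whitespace tokenization and unsmoothed IDF.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Relative term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy stand-ins for abstracts (hypothetical, not taken from the study).
toy_abstracts = [
    "telomere shortening drives cellular senescence",
    "cellular senescence and mitochondrial dysfunction",
    "dementia risk in older adults",
]
scores = tfidf(toy_abstracts)
# Highest-weighted term per toy abstract.
top_terms = [max(s, key=s.get) for s in scores]
```

Terms that occur in fewer documents receive higher weights, which is why such scores can help characterize what distinguishes one topic or cluster from the rest of a corpus.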