Abstract
With the rapid advancement of large language model technology and the emergence of bioinformatics-specific language models (BioLMs), there is a growing need for a comprehensive analysis of their current landscape, computational characteristics, and diverse applications. This survey addresses that need with a thorough review of BioLMs, covering their evolution, classification, and distinguishing features, together with a detailed examination of training methodologies, datasets, and evaluation frameworks. We explore the applications of BioLMs in critical areas such as disease diagnosis, drug discovery, and vaccine development, highlighting their impact and transformative potential in bioinformatics. We then identify key challenges and limitations of BioLMs, including data privacy and security concerns, interpretability issues, biases in training data and model outputs, and the complexities of domain adaptation. Finally, we highlight emerging trends and future directions, offering insights to guide researchers and clinicians in advancing BioLMs for increasingly sophisticated biological and clinical applications.
| Original language | English (US) |
|---|---|
| Article number | e70014 |
| Journal | Quantitative Biology |
| Volume | 14 |
| Issue number | 1 |
| State | Published - Mar 2026 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Modeling and Simulation
- Biochemistry, Genetics and Molecular Biology (miscellaneous)
- Computer Science Applications
- Applied Mathematics
Keywords
- bioinformatics-specific language models
- biological systems
- biomedical AI
- large language models
- life active factors
Fingerprint
Dive into the research topics of 'Large language models for bioinformatics'. Together they form a unique fingerprint.