
Large language models for bioinformatics

Wei Ruan, Yanjun Lyu, Jing Zhang, Jiazhang Cai, Peng Shu, Yang Ge, Yao Lu, Shang Gao, Yue Wang, Peilong Wang, Lin Zhao, Tao Wang, Yufang Liu, Luyang Fang, Ziyu Liu, Zhengliang Liu, Yiwei Li, Zihao Wu, Junhao Chen, Hanqi Jiang, Yi Pan, Zhenyuan Yang, Jingyuan Chen, Shizhe Liang, Wei Zhang, Terry Ma, Yuan Dou, Jianli Zhang, Xinyu Gong, Qi Gan, Yusong Zou, Zebang Chen, Yuanxin Qian, Shuo Yu, Jin Lu, Kenan Song, Xianqiao Wang, Andrea Sikora, Gang Li, Xiang Li, Quanzheng Li, Yingfeng Wang, Lu Zhang, Yohannes Abate, Lifang He, Wenxuan Zhong, Rongjie Liu, Chao Huang, Wei Liu, Ye Shen, Ping Ma, Hongtu Zhu, Yajun Yan, Dajiang Zhu, Tianming Liu

Research output: Contribution to journal › Review article › peer-review

Abstract

With the rapid advancements in large language model technology and the emergence of bioinformatics-specific language models (BioLMs), there is a growing need for a comprehensive analysis of the current landscape, computational characteristics, and diverse applications. This survey aims to address this need by providing a thorough review of BioLMs, focusing on their evolution, classification, and distinguishing features, alongside a detailed examination of training methodologies, datasets, and evaluation frameworks. We explore the wide-ranging applications of BioLMs in critical areas such as disease diagnosis, drug discovery, and vaccine development, highlighting their impact and transformative potential in bioinformatics. We identify key challenges and limitations inherent in BioLMs, including data privacy and security concerns, interpretability issues, biases in training data and model outputs, and domain adaptation complexities. Finally, we highlight emerging trends and future directions, offering valuable insights to guide researchers and clinicians toward advancing BioLMs for increasingly sophisticated biological and clinical applications.

Original language: English (US)
Article number: e70014
Journal: Quantitative Biology
Volume: 14
Issue number: 1
State: Published - Mar 2026
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Modeling and Simulation
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)
  • Computer Science Applications
  • Applied Mathematics

Keywords

  • bioinformatics-specific language models
  • biological systems
  • biomedical AI
  • large language models
  • life active factors
