QLViT: A Lightweight Cell Classification Method for Microscope Images Based on MViTv2 and Linear Attention

Authors

  • Panpan Wu College of Computer and Information Engineering, Tianjin Normal University, Tianjin, China https://orcid.org/0000-0003-2915-2086
  • Zhangda Liu College of Computer and Information Engineering, Tianjin Normal University, Tianjin, China
  • Ziping Zhao College of Computer and Information Engineering, Tianjin Normal University, Tianjin, China https://orcid.org/0000-0002-8719-6389
  • Rui Guo School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  • Hengyong Yu Department of Electrical and Computer Engineering, University of Massachusetts Lowell, Lowell, USA

DOI:

https://doi.org/10.37256/cm.7120267713

Keywords:

cell classification, linear attention, quantitative methodology, Kolmogorov-Arnold network

Abstract

Accurate cell classification plays a vital role in the diagnosis and treatment of diseases. However, existing methods face challenges such as limited feature learning and excessive computational complexity, resulting in low classification accuracy, prolonged training, and slow inference. We propose a novel lightweight method, the Quantized Linear Vision Transformer (QLViT), based on Multiscale Vision Transformers (MViTv2) and linear attention mechanisms, for cell classification from microscope images. Specifically, QLViT employs a large-kernel convolutional layer and a well-designed feature extraction module called Conv-Linear Attention (CLA) to extract features. It optimizes self-attention with an activation function and uses a residual structure to facilitate feature reuse and mitigate gradient issues. The CLA learns local information efficiently via dynamic convolution and employs linear attention to capture global features comprehensively, remaining lightweight compared to traditional self-attention. By introducing the Kolmogorov-Arnold Network (KAN) structure, CLA significantly reduces computational complexity and parameter count. Extensive experiments on four public datasets demonstrate the effectiveness of QLViT: we achieve 97.19% accuracy on the BioMediTech dataset, 97.35% on the ICPR-HEp-2 dataset, 90.45% on the six-category blood malignancy bone marrow cytology expert-annotated dataset, and an impressive 99.84% on the white blood cell dataset. Furthermore, our method requires only 1.95 Giga Floating-point Operations (GFLOPs) and 9.07 million parameters. These results show that QLViT outperforms current state-of-the-art methods across multiple datasets, demonstrating superior inference speed, a lightweight design, strong feature extraction capabilities, and good generalizability. The proposed method offers a promising solution for medical image classification.
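The abstract's key efficiency claim rests on linear attention: computing the key-value summary first avoids the quadratic token-pair score matrix of standard self-attention. As a minimal NumPy sketch of generic kernelized linear attention, assuming the common elu(x)+1 positive feature map (the paper's CLA module, with its dynamic convolution and KAN components, is more elaborate and may use a different formulation):

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a common positive feature map for kernelized linear
    # attention (an assumption here; the paper's exact map may differ).
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized linear attention over N tokens of dimension d.

    Instead of forming the N x N score matrix of softmax attention
    (O(N^2 * d) time), compute phi(K)^T V first -- a d x d global
    summary -- giving O(N * d^2) time and O(d^2) extra memory.
    """
    Qp, Kp = feature_map(Q), feature_map(K)   # (N, d), all entries > 0
    kv = Kp.T @ V                             # (d, d) key-value summary
    z = Qp @ Kp.sum(axis=0) + eps             # (N,) per-query normalizer
    return (Qp @ kv) / z[:, None]             # (N, d) attended output

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Q = rng.standard_normal((6, 4))
    K = rng.standard_normal((6, 4))
    V = rng.standard_normal((6, 4))
    out = linear_attention(Q, K, V)
    print(out.shape)
```

Because the per-token weights are positive and normalized, each output row is (up to the eps term) a convex combination of the value rows, just as in softmax attention, while the cost scales linearly in the number of tokens.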

Published

2026-01-06

How to Cite

Wu P, Liu Z, Zhao Z, Guo R, Yu H. QLViT: A Lightweight Cell Classification Method for Microscope Images Based on MViTv2 and Linear Attention. Contemp. Math. [Internet]. 2026 Jan. 6 [cited 2026 Jan. 8];7(1):593-612. Available from: https://ojs.wiserpub.com/index.php/CM/article/view/7713