SRE-Ret: An Object Detection Method Based on Sparse Region Extraction
DOI:
https://doi.org/10.37256/cm.6520257227
Keywords:
object detection, transformer, attention mechanism, sparse region extraction, small object detection, embedded devices
Abstract
The continuous advancement of image acquisition technology and the subsequent proliferation of high-resolution images have introduced significant challenges to conventional object detection methodologies. While high-resolution feature maps offer a distinct advantage in detecting small objects due to their retention of detailed information, the concomitant increase in candidate regions and computational complexity substantially impedes real-time performance. Conversely, low-resolution feature maps, although computationally efficient, often lack the necessary precision for effective small object detection, failing to satisfy practical application demands. Consequently, optimizing the allocation of computational resources within high-resolution feature maps while preserving the accuracy of small object detection has emerged as a critical focus and ongoing challenge in contemporary research. To address these limitations, this paper introduces an object detection method based on Sparse Region Extraction (SRE), termed SRE-Ret. This method leverages the window-based and shifted-window self-attention mechanisms inherent in the Swin-Transformer architecture. By employing sparse region selection on high-resolution feature layers, it selectively filters feature windows likely to contain objects, thereby substantially reducing the number of candidate regions and redundant computations. Furthermore, a dedicated small object detection head is integrated into the high-resolution feature layers for precise prediction, while an efficient convolutional detection head is utilized on the low-resolution feature layers for rapid inference. The novelty of this approach lies in achieving sparse processing of feature regions via the SRE module, effectively balancing precision and efficiency in multi-scale feature detection.
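The sparse window selection described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how windows are scored or how many are kept, so the per-pixel `score_map`, the mean-score ranking, the `keep_ratio` parameter, and both function names are assumptions made purely for illustration. The sketch only shows the general idea of partitioning a high-resolution feature map into non-overlapping windows and keeping the few most likely to contain objects.

```python
import numpy as np


def partition_windows(feat, ws):
    """Split an (H, W, C) map into non-overlapping (ws, ws, C) windows.

    Windows are returned in row-major order, shape (num_windows, ws, ws, C).
    """
    H, W, C = feat.shape
    assert H % ws == 0 and W % ws == 0, "H and W must be multiples of ws"
    x = feat.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws, ws, C)


def select_sparse_windows(score_map, ws, keep_ratio=0.25):
    """Return sorted indices of the windows with the highest mean score.

    score_map is a hypothetical (H, W) objectness map; keep_ratio is an
    assumed hyperparameter, not a value taken from the paper.
    """
    windows = partition_windows(score_map[..., None], ws)
    window_scores = windows.mean(axis=(1, 2, 3))
    k = max(1, int(round(keep_ratio * window_scores.size)))
    top = np.argsort(-window_scores, kind="stable")[:k]
    return np.sort(top)


# Toy example: an 8x8 feature map with 4 channels, split into four 4x4 windows.
feat = np.zeros((8, 8, 4), dtype=np.float32)
score = np.zeros((8, 8), dtype=np.float32)
score[5, 6] = 1.0  # a single high-score pixel in the bottom-right window

kept = select_sparse_windows(score, ws=4, keep_ratio=0.25)
sparse_feats = partition_windows(feat, 4)[kept]  # only these go to the heavy head
```

In this toy case only one of the four windows is passed on, so a downstream detection head would process a quarter of the high-resolution positions; how much is saved in practice depends on the scoring and selection rule actually used.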
License
Copyright (c) 2025 Yanming Ye, et al.

This work is licensed under a Creative Commons Attribution 4.0 International License.
