An efficient low-shot class-agnostic counting framework with hybrid encoder and iterative exemplar feature learning

Bibliographic Details
Published in	PLoS ONE, Vol. 20, no. 6, p. e0322360
Main Authors	Yang, Qinghua; Liu, Bin; Tian, Yan; Shi, Yangming; Du, Xinxin; He, Fangyuan; Guo, Jikun
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 06.06.2025
Subjects	Adaptation; Algorithms; Artificial intelligence; Attention; Coders; Costs; Counting; Data visualization; Datasets; Embedding; Humans; Image Processing, Computer-Assisted - methods; Inference; Learning; Machine Learning; Medical imaging; Methods; Modules; Performance enhancement; Real time; China
ISSN	1932-6203
DOI	10.1371/journal.pone.0322360

Summary:	Few-shot learning techniques have enabled the rapid adaptation of a general AI model to various tasks using limited data. In this study, we focus on class-agnostic low-shot object counting, a challenging problem that aims to count objects accurately with only a few annotated samples (few-shot) or without any annotated data (zero-shot). Existing methods focus primarily on enhancing performance and give relatively little attention to inference time, an equally critical factor in many practical applications. We propose a model that achieves real-time inference without compromising performance. Specifically, we design a multi-scale hybrid encoder to enhance feature representation and optimize computational efficiency. This encoder applies self-attention exclusively to high-level features and uses cross-scale fusion modules to integrate adjacent features, reducing training costs. Additionally, we introduce a learnable shape embedding and an iterative exemplar feature learning module that progressively enriches exemplar features with class-level characteristics, learned from similar objects within the image, that are essential for improving subsequent matching performance. Extensive experiments on the FSC147, Val-COCO, Test-COCO, CARPK, and ShanghaiTech datasets demonstrate our model's effectiveness and generalizability compared to state-of-the-art methods.
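
The iterative exemplar feature learning idea in the summary can be illustrated with a toy sketch. This is not the authors' module: it assumes plain dot-product attention with no learnable weights and no shape embedding, and the names `refine_exemplar`, `num_iters`, and `step` are hypothetical. It only shows how an exemplar feature might be pulled, step by step, toward the features of similar objects in the same image.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def refine_exemplar(exemplar, image_features, num_iters=3, step=0.5):
    """Iteratively enrich an exemplar feature from similar image features.

    Assumption (not from the paper): each iteration computes scaled
    dot-product attention from the exemplar to every image feature, then
    blends the attended feature into the exemplar with weight `step`.
    """
    dim = len(exemplar)
    scale = 1.0 / math.sqrt(dim)
    current = list(exemplar)
    for _ in range(num_iters):
        # Similarity of the current exemplar to each image location.
        weights = softmax([dot(current, f) * scale for f in image_features])
        # Attention-weighted average of the image features.
        attended = [
            sum(w * f[d] for w, f in zip(weights, image_features))
            for d in range(dim)
        ]
        # Blend: keep part of the exemplar, absorb part of the attended feature.
        current = [(1 - step) * c + step * a for c, a in zip(current, attended)]
    return current


# Usage: two locations resemble the exemplar, one is background.
features = [[4.0, 2.0], [4.0, 2.0], [-4.0, 0.0]]
refined = refine_exemplar([1.0, 0.0], features)
print(refined)
```

In this toy setting the refined exemplar drifts toward the two similar features and largely ignores the dissimilar one, which is the intuition behind enriching exemplars with class-level characteristics before matching.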