A large-scale lychee image parallel classification algorithm based on Spark and deep learning


Bibliographic Details
Published in Computers and electronics in agriculture Vol. 230; p. 109952
Main Authors Xiao, Yiming; Wang, Jianhua; Xiong, Hongyi; Xiao, Fangjun; Huang, Renhuan; Hong, Licong; Wu, Bofei; Zhou, Jinfeng; Long, Yongbin; Lan, Yubin
Format Journal Article
Language English
Published Elsevier B.V. 01.03.2025
ISSN 0168-1699
DOI 10.1016/j.compag.2025.109952

Summary:
• Improved lychee classification accuracy with a reconstructed T_ECBAM_ResNetS-34 model using PyTorch.
• Designed a parallel classification algorithm for large-scale lychee images on the Spark framework.
• Validated the effectiveness of the proposed model and algorithm through extensive experiments.

Accurate and rapid classification of large-scale lychee images is crucial for collecting germplasm resources and studying the characteristics of different lychee varieties. It requires both accurate classification models and fast classification algorithms. However, current deep learning-based classification methods for lychee images cannot meet the accuracy and timeliness requirements of large-scale lychee image classification at the same time. To address this problem, this paper proposes a large-scale parallel classification algorithm for lychee images based on Spark and deep learning. First, the T_ECBAM_ResNetS-34 model architecture was designed and trained with the PyTorch deep learning framework on a self-built dataset covering ten types of lychee images, which improved classification accuracy. Second, the inference procedure of the PyTorch-trained model was restructured, using Apache Spark RDDs, broadcast variables, and data structures to implement data partitioning and parallel model computation across nodes. The experimental results show that the proposed method surpasses existing methods in both classification accuracy and the speed of large-scale lychee image classification.
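The record does not reproduce the paper's code, so the following is only a minimal sketch of the general pattern the summary names: partition the images as a Spark RDD, broadcast the trained weights once, and run inference partition by partition. Everything here is an assumption made for illustration: the function names, the toy linear classifier standing in for T_ECBAM_ResNetS-34, the plain dict standing in for a PyTorch `state_dict`, and the labels `variety_a` and `variety_b`. Only `SparkContext.broadcast`, `SparkContext.parallelize`, `RDD.mapPartitions`, and `Broadcast.value` are real PySpark calls.

```python
from typing import Dict, Iterable, Iterator, List, Sequence, Tuple

Weights = Dict[str, List[float]]      # hypothetical stand-in for a PyTorch state_dict
Record = Tuple[str, Sequence[float]]  # (image id, flattened feature vector)


def predict(weights: Weights, features: Sequence[float]) -> str:
    """Toy linear classifier: return the label whose weight vector scores highest."""
    def score(label: str) -> float:
        return sum(w * x for w, x in zip(weights[label], features))
    return max(weights, key=score)


def classify_partition(records: Iterable[Record], weights_bc) -> Iterator[Tuple[str, str]]:
    """Classify every record of one partition.

    `weights_bc` is any object exposing `.value`, as pyspark.Broadcast does.
    The weights are read once per partition, not once per image.
    """
    weights = weights_bc.value
    for image_id, features in records:
        yield image_id, predict(weights, features)


def classify_with_spark(sc, records: List[Record], weights: Weights,
                        num_partitions: int = 4) -> List[Tuple[str, str]]:
    """Split `records` into `num_partitions` and classify them in parallel.

    `sc` is an existing pyspark.SparkContext; it is not created here.
    """
    weights_bc = sc.broadcast(weights)                       # ship weights to each executor once
    rdd = sc.parallelize(records, numSlices=num_partitions)  # data partitioning
    return rdd.mapPartitions(
        lambda part: classify_partition(part, weights_bc)
    ).collect()
```

In a real deployment one would presumably broadcast the trained model's weights and rebuild the network inside `classify_partition`, so that each partition pays the model-loading cost once; whether the paper does exactly this is not stated in the summary.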