An image classification algorithm for football players’ activities using deep neural network

Bibliographic Details
Published in: Soft Computing (Berlin, Germany), Vol. 27, No. 24, pp. 19317–19337
Main Authors: Li, Xingyao; Ullah, Rizwan
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg (Springer Nature B.V.), 01.12.2023
ISSN: 1432-7643; 1433-7479
DOI: 10.1007/s00500-023-09321-3

More Information
Summary: Football (soccer) stands as the world’s most popular sport, enthralling millions of fans globally. Within the dynamic sphere of modern football, gaining comprehensive insight into the nuanced technical and physical demands placed on players is pivotal for performance optimization. This paper introduces a deep learning-powered image classification algorithm for recognizing football player activities directly from videos and images. The approach harnesses the complementary capabilities of convolutional neural networks (CNNs) and graph convolutional networks (GCNs) to decipher intricate spatial–temporal patterns in player poses and motions. The methodology employs a hybrid CNN–GCN architecture. The CNN uses consecutive convolutional and pooling layers to automatically extract discriminative visual features from input frames capturing player poses. The GCN models skeletal joints as graph nodes with bones as edges, performing graph convolutions to aggregate spatial and temporal information from neighboring body parts; this captures localized pose dynamics. The complementary CNN and GCN outputs are fused through fully connected layers to classify player activities based on both visual appearance and pose configuration. The model is trained end to end on richly annotated football video data using a multi-class cross-entropy loss, with data augmentation and regularization techniques enhancing robustness. Extensive experiments validate the proposed architecture's effectiveness, achieving 97.4% accuracy, 96% precision, 95.5% recall, 95.4% F1-score, 0.90 Matthews correlation coefficient, and 93% specificity in classifying 17 complex football activities, significantly higher than previous benchmarks. Detailed ablation studies confirm the contributions of the CNN, GCN, and fused model components. The work represents a major advance in leveraging deep neural networks for accurate and granular analysis of sports performance.
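The fusion idea in the abstract — pooled GCN pose features concatenated with CNN appearance features, then a fully connected softmax classifier — can be sketched minimally in NumPy. This is an illustrative sketch only: the joint count, feature dimensions, chain-shaped skeleton, and random weights are assumptions for demonstration, not details from the paper, and a real implementation would use a trained CNN and learned GCN weights.

```python
import numpy as np

rng = np.random.default_rng(0)

N_JOINTS, POSE_DIM = 18, 3      # hypothetical skeleton: 18 joints with (x, y, confidence)
CNN_DIM, GCN_DIM = 64, 32       # hypothetical feature sizes
N_CLASSES = 17                  # number of activity classes reported in the paper

def graph_conv(X, A, W):
    """One graph convolution: each joint aggregates features from itself
    and its bone-connected neighbours, followed by a ReLU."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d_inv = np.diag(1.0 / A_hat.sum(axis=1))     # row-normalize by degree
    return np.maximum(d_inv @ A_hat @ X @ W, 0.0)

def classify(cnn_feat, joints, A, Wg, Wf, bf):
    """Fuse CNN appearance features with GCN pose features and
    classify with a softmax over activity labels (late fusion)."""
    gcn_feat = graph_conv(joints, A, Wg).mean(axis=0)  # pool over joints
    fused = np.concatenate([cnn_feat, gcn_feat])       # concatenate the two streams
    logits = fused @ Wf + bf                           # fully connected layer
    exp = np.exp(logits - logits.max())                # numerically stable softmax
    return exp / exp.sum()

# Toy inputs standing in for real CNN output and pose-estimator output.
A = np.zeros((N_JOINTS, N_JOINTS))
for i in range(N_JOINTS - 1):                 # chain skeleton, purely illustrative
    A[i, i + 1] = A[i + 1, i] = 1.0

cnn_feat = rng.normal(size=CNN_DIM)
joints = rng.normal(size=(N_JOINTS, POSE_DIM))
Wg = rng.normal(size=(POSE_DIM, GCN_DIM))
Wf = rng.normal(size=(CNN_DIM + GCN_DIM, N_CLASSES))
bf = np.zeros(N_CLASSES)

probs = classify(cnn_feat, joints, A, Wg, Wf, bf)
print(probs.shape)  # one probability per activity class
```

The design choice sketched here is late fusion: the two streams are processed independently and only their final feature vectors are joined before classification, which is one straightforward reading of "complementary CNN and GCN outputs are fused through fully connected layers."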