1 Department of Biomedical Engineering, Meybod University, Meybod, Iran
2 Meybod University
10.30476/jhmi.2026.109616.1334
Abstract
Background: Video-based assessment of exercise technique can support coaching, injury prevention, and remote rehabilitation, yet many methods stop at action recognition or require laboratory motion capture.

Objective: To develop and evaluate a deep-learning framework that classifies execution quality (correct vs. incorrect form) from instructional videos.

Methods: We compiled 270 YouTube coaching videos spanning ten exercises (~20,500 labeled frames after removing ~15% non-movement content). Clip-level technique labels (correct/incorrect) were assigned based on coaching/biomechanical criteria and propagated to sampled frames. A markerless pose-estimation model (name/version reported) produced 2D keypoints and kinematic descriptors (e.g., joint angles and velocities). These signals were encoded with a ResNet-50 backbone, and a multi-head attention Transformer modeled temporal dependencies. Training used a video-stratified 80/20 split, with 5-fold cross-validation on the training portion only for model selection. A class-weighted loss mitigated class imbalance, and evaluation reports per-class precision, recall, and F1 together with confusion matrices. We also define a biomechanical complexity index (0–1) that combines joints engaged, angular sensitivity, range of motion, and balance demand to relate movement difficulty to performance.

Results: The full CNN–attention–Transformer achieved 97.56% accuracy and F1 = 0.956 on the held-out test set. Per-exercise performance remained high for low-to-intermediate complexity movements, while the most challenging exercises showed reduced F1; for example, deadlift and crunch exhibited the lowest scores (≈0.88–0.90), and per-exercise F1 ranged from 0.875 to 0.945.

Conclusions: This framework enables multi-exercise, video-based technique assessment (correct vs. incorrect) from consumer footage. Attention-weight patterns and error-case analyses provide practical insight into model decisions for intelligent coaching and tele-rehabilitation.
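The kinematic descriptors named in the Methods (joint angles and velocities derived from 2D keypoints) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `joint_angle` and `angular_velocity`, the hip/knee/ankle coordinates, and the 30 fps frame rate are all assumptions chosen for illustration.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at vertex b, formed by 2D keypoints a-b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        raise ValueError("degenerate keypoints: zero-length limb segment")
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_t))

def angular_velocity(angles, fps):
    """Finite-difference angular velocity (deg/s) between consecutive frames."""
    return [(a1 - a0) * fps for a0, a1 in zip(angles, angles[1:])]

# Hypothetical (hip, knee, ankle) keypoints for three consecutive frames.
frames = [
    ((0.0, 0.0), (0.0, 1.0), (0.0, 2.0)),  # straight leg
    ((1.0, 0.0), (0.0, 1.0), (0.0, 2.0)),  # partially flexed knee
    ((1.0, 1.0), (0.0, 1.0), (0.0, 2.0)),  # right angle at the knee
]

angles = [joint_angle(hip, knee, ankle) for hip, knee, ankle in frames]
velocities = angular_velocity(angles, fps=30)  # assumed frame rate

print([round(x, 1) for x in angles])
print([round(v, 1) for v in velocities])
```

In this toy sequence the knee angle closes from roughly 180° to 90°, so the finite-difference velocities are negative. Per-frame descriptors of this kind are what the abstract describes as inputs alongside the keypoints; how they are actually encoded by the ResNet-50 backbone is not specified here.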
Rezaee, K. and Monshizadeh, F. (2026). A deep learning framework for evaluating dynamic sports movements via video-based motion analysis for digital health. Health Management & Information Science. doi: 10.30476/jhmi.2026.109616.1334