Hybrid Deep Learning Model for Evaluating Subjective Answers Based on Semantic Textual Similarity
Department of Computer Engineering, Vidyalankar Institute of Technology, Mumbai, India
Abstract
Evaluating subjective answers requires accurately measuring the semantic similarity between a student's response and a reference response. This study introduces a Hybrid Deep Learning Model (HDLM) that integrates CNN, GRU, and LSTM architectures to assess Semantic Textual Similarity (STS) more effectively. The HDLM employs two parallel branches, CNN-GRU and CNN-LSTM, to capture both local syntactic features and long-range contextual dependencies; the Manhattan distance between the resulting representations is then used to compute semantic similarity. To address class imbalance and data sparsity, data augmentation and SMOTE resampling are applied. The model is trained on the Quora Question Pairs dataset because of its substantial size and extensive semantic diversity. In a comprehensive evaluation, HDLM (accuracy: 87.80%, F1-score: 0.88, AUC: 0.88) outperforms existing models such as Siamese LSTM and Multi-head Attention, as well as state-of-the-art models such as SBERT and Sentence-T5. Statistical significance was validated using Wilcoxon signed-rank tests and 95% confidence intervals. Case studies and an analysis of failure cases further highlight HDLM's strengths and shortcomings. Overall, HDLM provides a robust, interpretable, and computationally efficient framework for automated subjective assessment.
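The Manhattan-distance step can be illustrated with a minimal Python sketch. The abstract does not say how the distance is mapped to a similarity score, so the `exp(-L1)` mapping below is an assumption borrowed from earlier Siamese LSTM work. The function name `manhattan_similarity` and the three example vectors are hypothetical stand-ins for the encodings the two branches would produce.

```python
import math
from typing import Sequence


def manhattan_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Map the Manhattan (L1) distance between two encodings to a score in (0, 1].

    Assumption: similarity = exp(-L1 distance). Identical encodings score 1.0,
    and the score decays toward 0 as the encodings move apart.
    """
    if len(u) != len(v):
        raise ValueError("encodings must have the same length")
    l1_distance = sum(abs(a - b) for a, b in zip(u, v))
    return math.exp(-l1_distance)


# Hypothetical encodings standing in for the outputs of the model branches.
student = [0.20, 0.70, 0.10]
reference = [0.25, 0.60, 0.10]
unrelated = [0.90, 0.05, 0.80]

print(manhattan_similarity(student, reference))  # close encodings, higher score
print(manhattan_similarity(student, unrelated))  # distant encodings, lower score
```

Because the score is bounded in (0, 1], it can be thresholded for a binary similar/dissimilar decision or rescaled to a marking range. Either use is an illustration here, not a description of the authors' grading scheme.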
DOI: https://doi.org/10.3844/jcssp.2026.1189.1203
Copyright: © 2026 Siddhesh Kudtarkar and Kavita Shirsat. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- GRU
- HDLM
- LSTM
- Subjective Answer Evaluation
- Semantic Textual Similarity