Research Article Open Access

Few-shot Fine-tuning of BERT Multilingual for Hindi Word Sense Disambiguation 

Shailendra Kumar Patel¹, Rakesh Kumar¹ and Anuj Kumar Sirohi²
  • ¹ Department of Computer Science, Assam University, Silchar, India
  • ² Yardi School of Artificial Intelligence, Indian Institute of Technology, Delhi, India

Abstract

Word Sense Disambiguation (WSD) is a fundamental task in Natural Language Processing (NLP): identifying the correct meaning of a word in its context. The task is particularly difficult for morphologically rich, resource-limited languages such as Hindi, where significant lexical ambiguity is compounded by the limited availability of annotated corpora. To address these challenges, we propose a supervised approach that combines the multilingual BERT model (mBERT) with Hindi WordNet as a structured lexical resource. Using few-shot learning, we fine-tune mBERT on a dataset constructed from Hindi WordNet to disambiguate contextually ambiguous words across four parts of speech (POS): nouns, verbs, adjectives, and adverbs. Experiments on standard Hindi WSD benchmarks show that our method significantly outperforms traditional rule-based and embedding-based approaches, achieving 96.48% accuracy, an improvement of approximately 3% over the strongest baseline. These results support the effectiveness of integrating contextualized embeddings from pre-trained language models with structured lexical databases. They also highlight the promise of hybrid techniques for advancing WSD in low-resource languages and provide a framework applicable to other morphologically complex languages with similar resource constraints.

Journal of Computer Science
Volume 21 No. 11, 2025, 2631-2646

DOI: https://doi.org/10.3844/jcssp.2025.2631.2646

Submitted On: 13 February 2025 Published On: 26 December 2025

How to Cite: Patel, S. K., Kumar, R. & Sirohi, A. K. (2025). Few-shot Fine-tuning of BERT Multilingual for Hindi Word Sense Disambiguation. Journal of Computer Science, 21(11), 2631-2646. https://doi.org/10.3844/jcssp.2025.2631.2646

Keywords

  • Word Sense Disambiguation
  • Hindi Natural Language Processing
  • Multilingual BERT
  • Hindi WordNet
  • Low-Resource Languages
  • Few-Shot Learning
  • Contextualized Embeddings