An Entropy Based Method for Removing Web Query Ambiguity in Hindi Language
Abstract
Problem statement: Word Sense Disambiguation (WSD) is a core problem in many Natural Language Processing (NLP) tasks, and information retrieval is one of them. Information retrieval in the Hindi language faces the same WSD problem. Hindi is spoken by the majority of the population in India. Users from rural areas encounter several difficulties with Hindi-language information retrieval, and WSD is one of them. End users do not understand how an information retrieval system will remove the ambiguity in their queries, so an automatic disambiguation system is required. Various researchers have worked on this problem and proposed solutions, but none of them tried to detect ambiguity in the query before disambiguating it. Approach: We followed an entropy-based selective query disambiguation approach for Hindi-language information retrieval. The approach first identifies whether a query is ambiguous; only ambiguous queries are then disambiguated. The approach is also inspired by Google's "Did you mean…" feature for English queries. This study focuses on the ambiguity detection step, because detecting ambiguity beforehand conserves computation power. Results: We applied the selective query approach to a set of fifty queries. In our query set, 35% of the queries were unambiguous. The results show that a query is sometimes detected as unambiguous even when it contains a polysemous word. Conclusions/recommendation: The study concludes that ambiguity detection is quite important because it saves computational time. After ambiguity detection, final disambiguation can be done through human intervention, modeled on the Google feature.
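The selective step described in the Approach can be illustrated with a minimal sketch. The abstract does not specify how the entropy is computed, so the code below assumes Shannon entropy over the distribution of candidate senses observed for a query (for example, sense labels assigned to its top retrieved documents). The function names, the sense labels, and the threshold value are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in counts]
    return sum(-p * math.log2(p) for p in probs)


def is_ambiguous(sense_labels, threshold=0.5):
    """Flag a query as ambiguous when its sense distribution is spread out.

    sense_labels: one (hypothetical) sense label per piece of evidence,
                  e.g. per retrieved document.
    threshold:    assumed cut-off in bits; not a value from the paper.
    """
    counts = list(Counter(sense_labels).values())
    return shannon_entropy(counts) > threshold


# Evidence split evenly between two senses: high entropy, flagged ambiguous.
mixed = ["sense_a"] * 5 + ["sense_b"] * 5
# Evidence dominated by a single sense: zero entropy, treated as unambiguous.
single = ["sense_a"] * 10

print(is_ambiguous(mixed))
print(is_ambiguous(single))
```

Under these assumptions, only queries flagged by `is_ambiguous` would be passed on to the more expensive disambiguation step, which is where the computational saving comes from. The sketch also shows how a query containing a polysemous word can still be detected as unambiguous: if nearly all of the evidence points to one sense, the entropy stays below the threshold.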
DOI: https://doi.org/10.3844/jcssp.2008.762.767
Copyright: © 2008 S. K. Dwivedi and P. Rastogi. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Word sense disambiguation
- information retrieval
- sense ambiguity
- polysemous
- Hindi language
- natural language processing
