Normalized Web Distance Based Web Query Classification
S. Lovelyn Rose and K. R. Chandran
DOI: 10.3844/jcssp.2012.804.808
Journal of Computer Science
Volume 8, Issue 5
Problem statement: The problem is to classify a given web query into a set of 67 target categories. The target categories are ranked by their degree of similarity to the query. Approach: The feature set is the set of intermediate categories retrieved from a directory search engine for a given query. Using direct mapping and Normalized Web Distance (NWD), the intermediate categories are mapped to the required target categories. The target categories are then ranked using three parameters of the intermediate categories, namely position, frequency, and a combination of frequency and position. Results: The results showed that the third parameter, the combination of frequency and position, gave a better result than the other two, and that using a maximum of 40 search result pages ensures better results. Conclusion: With NWD as the similarity measure, precision and recall are found to increase by 10% over the previous methods.
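The approach described in the abstract can be illustrated with a minimal Python sketch. The `nwd` function uses the standard definition of Normalized Web Distance computed from page counts: NWD(x, y) = (max{log f(x), log f(y)} - log f(x, y)) / (log N - min{log f(x), log f(y)}), where f(.) is the number of pages containing a term and N is the total number of indexed pages. The `rank_categories` function is only an assumption about how the three ranking parameters might be scored; the abstract does not give the authors' formulas, so the reciprocal-rank weighting, the tie-breaking rule, and all function and parameter names here are hypothetical.

```python
import math


def nwd(fx, fy, fxy, n):
    """Normalized Web Distance from page counts.

    fx, fy: number of pages containing term x and term y
    fxy:    number of pages containing both terms
    n:      total number of pages indexed by the search engine
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Terms that never (co-)occur are treated as infinitely distant.
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def rank_categories(hits, mode="combined"):
    """Rank target categories from an ordered list of mapped search results.

    hits: target category of each search result, in result order.
    mode: "frequency", "position", or "combined".
    The scoring rules below are assumptions for illustration only.
    """
    scores, first_seen = {}, {}
    for pos, cat in enumerate(hits, start=1):
        first_seen.setdefault(cat, pos)
        if mode == "frequency":
            # Count how often the category appears.
            scores[cat] = scores.get(cat, 0) + 1
        elif mode == "position":
            # Score by the earliest position only.
            scores.setdefault(cat, 1.0 / pos)
        elif mode == "combined":
            # Sum reciprocal positions: frequent and early results score higher.
            scores[cat] = scores.get(cat, 0.0) + 1.0 / pos
        else:
            raise ValueError(f"unknown mode: {mode}")
    # Higher score first; ties are broken by earliest appearance.
    return sorted(scores, key=lambda c: (-scores[c], first_seen[c]))


if __name__ == "__main__":
    # Hypothetical page counts, not taken from the paper.
    print(nwd(fx=100, fy=100, fxy=10, n=10_000))
    hits = ["Sports", "Computers", "Computers", "Computers"]
    for mode in ("position", "frequency", "combined"):
        print(mode, rank_categories(hits, mode))
```

In this sketch, a smaller NWD value means two terms are more closely related, so an intermediate category with no direct mapping would be assigned to the target category at the smallest distance.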
© 2012 S. Lovelyn Rose and K. R. Chandran. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.