Research Article Open Access

Sketching-Din Elimination of Web Page

P. Sivakumar and R. M.S. Parvathi

Abstract

Problem statement: Web content mining processes large numbers of web pages with the aim of extracting useful information or knowledge. Approach: Several types of web content that can offer valuable information to users are available on the Web, for instance graphical data, Extensible Markup Language (XML) documents, Hypertext Markup Language (HTML) documents and plain text. Only part of this information is useful for a particular purpose; the rest is noise. Results: In this study, we propose an approach for removing noise from a given web page in order to improve the performance of web content mining. First, the web page content is divided into blocks. Conclusion: Duplicate blocks are then removed using sketching. The performance and results of the proposed approach indicate its effectiveness in classifying the main blocks.
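
The block-level duplicate removal described in the abstract can be illustrated with a minimal sketch. The abstract does not say which sketching scheme the authors use, so the example below assumes MinHash over word shingles, a common sketching technique for near-duplicate detection. All function names, the shingle size, the number of hash functions and the similarity threshold are hypothetical choices made for illustration, and the page is assumed to have already been divided into text blocks.

```python
import hashlib
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed, shingle):
    """Deterministic 64-bit hash of a shingle under a given seed."""
    digest = hashlib.md5(f"{seed}:{shingle}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def sketch(text, num_hashes=32, k=3):
    """MinHash sketch: the minimum hash value per seed, or None for empty text."""
    sh = shingles(text, k)
    if not sh:
        return None
    return tuple(min(_hash(seed, s) for s in sh) for seed in range(num_hashes))


def similarity(a, b):
    """Fraction of sketch positions that agree (estimates Jaccard similarity)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def remove_duplicate_blocks(blocks, threshold=0.9):
    """Keep the first occurrence of each block; drop empty and near-duplicate ones."""
    kept, kept_sketches = [], []
    for block in blocks:
        s = sketch(block)
        if s is None:
            continue  # assumption: empty blocks carry no content and are skipped
        if any(similarity(s, t) >= threshold for t in kept_sketches):
            continue  # sketch matches an earlier block closely enough: duplicate
        kept.append(block)
        kept_sketches.append(s)
    return kept


# Hypothetical blocks: a navigation bar that appears twice around the main text.
blocks = [
    "Home About Contact Login Register",
    "Sketching removes duplicate blocks from a web page before mining.",
    "Home About Contact Login Register",
    "",
]
main_blocks = remove_duplicate_blocks(blocks)
```

Comparing fixed-length sketches rather than full block text keeps each comparison cheap, which is the usual motivation for sketching; the pairwise loop here is kept only for clarity.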

Journal of Computer Science
Volume 7 No. 12, 2011, 1888-1893

DOI: https://doi.org/10.3844/jcssp.2011.1888.1893

Submitted On: 5 September 2011 Published On: 22 October 2011

How to Cite: Sivakumar, P. & Parvathi, R. M. (2011). Sketching-Din Elimination of Web Page. Journal of Computer Science, 7(12), 1888-1893. https://doi.org/10.3844/jcssp.2011.1888.1893

Keywords

  • Web mining
  • web content mining
  • web cleaning
  • duplicate blocks
  • web page information
  • graphical data
  • world wide web
  • Web Structural Mining (WSM)
  • Web Usage Mining (WUM)