TY  - JOUR
AU  - Jaganathan, P.
AU  - Karthikeyan, T.
PY  - 2014
TI  - Highly Efficient Architecture for Scalable Focused Crawling Using Incremental Parallel Web Crawler
JF  - Journal of Computer Science
VL  - 11
IS  - 1
DO  - 10.3844/jcssp.2015.120.126
UR  - https://thescipub.com/abstract/jcssp.2015.120.126
AB  - With its growing industrial impact in recent years, data mining has established itself as one of the most important disciplines in computer science. On the fast-growing Web, locating precise and relevant resources in a reasonable amount of time is a huge challenge for all-purpose single-process crawlers, which creates demand for enhanced and more effective algorithms. Large-scale search engines update their indexes frequently, yet they are still unable to present such information in a timely manner. This study proposes scalable focused crawling with an incremental parallel Web crawler, in which Web pages relevant to multiple pre-defined topics can be crawled concurrently. Furthermore, to solve the issue of URL distribution, a compound decision model based on a multi-objective decision-making method is introduced, which jointly considers multiple factors such as load balance and relevance; the update frequency issue is handled by the local repository decision. The results show that the proposed system efficiently delivers high quality, relevance and freshness with a significantly low memory requirement.