-------------------------------------------------------------------
A Research Paper
Presented to
The Faculty of the Discipline of Computer Science
University of the Philippines Baguio
In Partial Fulfillment
Of the Requirements of CMSC 199
Undergraduate Seminar
by Reginald S. Casela
Eloise Dorothy V. Gamit
Faculty Adviser
CMSC 199: Undergraduate Seminar
December 16, 2013
Date Submitted
Table of Contents
Introduction
Topic 1: Search Engines and How they Work
Topic 1.1: Crawling
Topic 1.2: Indexing
Topic 1.3: Searching
Topic 2: The Deep Web
Topic 2.1: How Do You Connect to the Deep Web
Topic 2.2: What’s Inside the Deep Web
Topic 2.2.1: The Silk Road
Topic 2.2.2: Hire a Hitman Sites
Topic 2.2.3: The Hidden Wiki
Conclusion
Areas for future study / Recommendation
Introduction
Since the World Wide Web began in 1991, the number of websites on the Internet has grown rapidly. According to Netcraft's November 2013 survey, the company received responses from about 18 million more sites than the 785,293,473 responses it recorded in October 2013 [1]. A 2005 study likewise reported more than 11.5 billion indexed pages [2]. Two sources for tracking the growth of the Web are http://searchengineshowdown.com/stats/ and http://searchenginewatch.com/article.php/2156481, although neither is updated on a regular basis. Estimating the size of the whole Web is not an easy task because of its dynamic nature. Nevertheless, it is possible to assess the size of the publicly indexable Web. The indexable Web [3] is
Citations:
[1] “November 2013 Web Server Survey.” Internet: http://news.netcraft.com/archives/category/web-server-survey/, November 1, 2013 [November 25, 2013].
[2] Gulli, Signorini. “The Indexable Web is more than 11.5 billion pages.” Internet: http://homepage.cs.uiowa.edu/~asignori/papers/the-indexable-web-is-more-than-11.5-billion-pages/, October 21, 2011 [November 25, 2013].
[3] E
[6] Manning, Raghavan, Schutze. “Introduction to Information Retrieval.” Cambridge University Press, 2008.
[7] “Mining the Deep Web.” Internet: http://www.learnthenet.com/how-to/search-the-deep-web/, 2010 [December 2, 2013].
[8] “Inside the Deep Web: My Journey through the New Underground.” Internet: http://thenewsjunkie.com/inside-the-deep-web-my-journey-through-the-new-underground/, 2013 [December 2, 2013].
[10] Sick Chirpse. “Ultimate Guide to the Deep Web.” Internet: http://www.sickchirpse.com/deep-web-guide/, September 10, 2013 [December 16, 2013].
[11] “Understanding the Deep Web in 10 Minutes.” Internet: http://bigdata.brightplanet.com/Portals/179268/docs/UnderstandingTheDeepWeb_20130311.pdf, March 2013 [December 16, 2013].