How Google Works
By: Mike • Essay • 436 Words • December 23, 2009 • 818 Views
If you aren't interested in learning how Google creates the index and the database of documents that it accesses when processing a query, skip this description. I adapted the following overview from Chris Sherman and Gary Price's wonderful description of How Search Engines Work in Chapter 2 of The Invisible Web (CyberAge Books, 2001).
Google runs on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing. Parallel processing is a method of computation in which many calculations can be performed simultaneously, significantly speeding up data processing. Google has three distinct parts:
• Googlebot, a web crawler that finds and fetches web pages.
• The indexer, which sorts every word on every page and stores the resulting index of words in a huge database.
• The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.
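To make the division of labor between the indexer and the query processor concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not Google's implementation: the names `tokenize`, `build_index`, and `search` are hypothetical, the "database" is an in-memory dictionary, and the "ranking" is just alphabetical order of page names.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words (a deliberately simple assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Toy indexer: map each word to the set of pages that contain it.

    `pages` is a dict of {page_name: page_text}, standing in for the
    documents a crawler would have fetched.
    """
    index = defaultdict(set)
    for name, text in pages.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Toy query processor: return pages containing every query word.

    Results are sorted by name only to keep the output deterministic;
    a real search engine ranks by relevance instead.
    """
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)
```

For example, after `index = build_index({"a": "Google crawls the web", "b": "The web is large"})`, calling `search(index, "google web")` would return only page `"a"`, since it is the only page containing both words.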
Let's take a closer look at each part.
Googlebot, Google's Web Crawler
Googlebot is Google's web crawling robot, which finds and retrieves pages on the web and hands them off to the Google indexer. It's easy to imagine Googlebot as a little spider scurrying across the strands of cyberspace, but in reality Googlebot doesn't traverse the web at all. It functions much like your web browser, by sending a request to a web server for a web page, downloading the entire page, then handing it off to Google's indexer.
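The request-and-download step described above can be sketched with Python's standard library. This is an illustration only: `fetch_page`, `crawl_one`, the `ExampleBot/0.1` user agent string, and the `index_page` callback are all hypothetical names, and nothing here reflects how Googlebot is actually built.

```python
from urllib.request import Request, urlopen


def fetch_page(url, user_agent="ExampleBot/0.1", timeout=10.0):
    """Request one page, as a browser would, and return its text.

    The user agent string is a made-up placeholder for this sketch.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl_one(url, index_page):
    """Fetch a single page and hand it off to an indexer callback.

    `index_page` is a hypothetical function taking (url, text).
    """
    text = fetch_page(url)
    index_page(url, text)
```

A caller might pass any function as the indexer, for instance `crawl_one(some_url, lambda url, text: print(url, len(text)))`, which mirrors the hand-off from crawler to indexer without assuming anything about the indexer itself.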
Googlebot consists of many computers requesting and fetching pages much