Wednesday, February 17, 2010

Google search

Google search is a web search engine owned by Google Inc. and is the most-used search engine on the Web. Google receives several hundred million queries each day through its various services. Google search was originally developed by Larry Page and Sergey Brin in 1997.

Google Search provides more than 22 special features beyond the original word-search capability. These include synonyms, weather forecasts, time zones, stock quotes, maps, earthquake data, movie showtimes, airports, home listings, and sports scores (see below: Special features). There are also special features for numbers, including prices, temperatures, money/unit conversions ("10.5 cm in inches"), calculations ("3*4+sqrt(6)-pi/2"), package tracking, patents, area codes, and rudimentary language translation of displayed pages.
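As a rough check on the two numeric examples quoted above, the same arithmetic can be reproduced in a few lines of Python. This only illustrates the values one would expect back; it says nothing about how Google evaluates such queries, and the function names are invented for this sketch.

```python
import math


def cm_to_inches(cm: float) -> float:
    """Convert centimetres to inches (one inch is defined as exactly 2.54 cm)."""
    return cm / 2.54


def calculator_example() -> float:
    """Evaluate the expression 3*4+sqrt(6)-pi/2 quoted in the text."""
    return 3 * 4 + math.sqrt(6) - math.pi / 2


if __name__ == "__main__":
    print(round(cm_to_inches(10.5), 4))    # the "10.5 cm in inches" example
    print(round(calculator_example(), 4))  # the calculator example
```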

A Google search-results page is ordered by a priority rank called "PageRank", the exact details of which are kept secret to prevent spammers from forcing their pages to the top. Google Search provides many options for customized search (see below: Search options), such as exclusion ("-xx"), inclusion ("+xx"), alternatives ("xx OR yy"), and wildcard ("x * x").
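The operators listed above are plain text typed into the search box, so composing them is simple string handling. The helper below is a hypothetical sketch (build_query and its parameters are not part of any Google API) that assembles a query string using the exclusion, inclusion, and alternatives syntax described in the text.

```python
def build_query(terms, require=(), exclude=(), alternatives=()):
    """Compose a query string from the operators named in the text.

    terms        -- plain words or phrases, passed through unchanged
                    (a wildcard phrase such as "x * x" can be given here)
    require      -- words prefixed with "+" (inclusion)
    exclude      -- words prefixed with "-" (exclusion)
    alternatives -- words joined with " OR "
    """
    parts = list(terms)
    parts += ["+" + word for word in require]
    parts += ["-" + word for word in exclude]
    if alternatives:
        parts.append(" OR ".join(alternatives))
    return " ".join(parts)


if __name__ == "__main__":
    print(build_query(["jaguar"], exclude=["car"], alternatives=["cat", "feline"]))
```

With the arguments shown, the helper produces the text jaguar -car cat OR feline, which a user could equally type by hand.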

The search engine

Google's algorithm uses a patented system called PageRank to help rank web pages that match a given search string. The PageRank algorithm computes a recursive score for each web page, based on the weighted sum of the PageRanks of the pages linking to it. PageRank derives from human-generated links, and is thought to correlate well with human concepts of importance. The exact percentage of all web pages that Google indexes is not known, as it is very hard to calculate. Earlier keyword-based methods of ranking search results, used by many search engines that were once more popular than Google, would rank pages by how often the search terms occurred in the page, or by how strongly associated the search terms were within each resulting page. In addition to PageRank, Google also uses other secret criteria for determining the ranking of pages on result lists, reported to number over 200 different indicators.
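The recursive definition above can be made concrete with a small iterative sketch. This is the textbook power-iteration form of PageRank on a toy link graph, not Google's production system; the damping factor of 0.85, the fixed iteration count, and the even redistribution of rank from pages without outgoing links are all assumptions of this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute PageRank scores for a small link graph.

    links maps each page to the list of pages it links to.
    Returns a dict mapping each page to its score; scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly
        # over all pages (an assumption of this sketch).
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {
            page: (1.0 - damping) / n + damping * dangling / n
            for page in pages
        }
        # Each page passes an equal share of its rank to every page it links to.
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(toy_web).items()):
        print(page, round(score, 3))
```

In the toy graph, page "c" is linked from both other pages and so ends up with a higher score than page "b", which receives only half of the rank of "a".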

Search results

Google not only indexes and caches web pages but also takes "snapshots" of other file types, which include PDF, Word documents, Excel spreadsheets, Flash SWF, plain text files, and so on. Except in the case of text and SWF files, the cached version is a conversion to (X)HTML, allowing those without the corresponding viewer application to read the file.

Users can customize the search engine by setting a default language, using the "SafeSearch" filtering technology, and setting the number of results shown on each page. Google has been criticized for placing long-term cookies on users' machines to store these preferences, a tactic which also enables the company to track a user's search terms and retain the data for more than a year. For any query, up to the first 1000 results can be shown, with a maximum of 100 displayed per page.
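The two limits in the last sentence determine how many result pages a user can actually step through. The sketch below works through that arithmetic; the function name and the total_matches parameter are hypothetical, and only the figures 1000 and 100 come from the text.

```python
import math

MAX_RESULTS = 1000  # most results shown for any one query (from the text)
MAX_PER_PAGE = 100  # largest number of results per page (from the text)


def page_count(per_page, total_matches):
    """Return how many result pages can be viewed for one query."""
    if not 1 <= per_page <= MAX_PER_PAGE:
        raise ValueError("per_page must be between 1 and %d" % MAX_PER_PAGE)
    # Matches beyond the cap are never displayed.
    reachable = min(total_matches, MAX_RESULTS)
    return math.ceil(reachable / per_page)


if __name__ == "__main__":
    print(page_count(10, 5_000_000))   # ten results per page
    print(page_count(100, 5_000_000))  # the maximum page size
```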

Indexable data

Despite Google's immense index, a considerable amount of data is available in online databases that are accessible by means of queries but not by links. This so-called invisible or deep Web is minimally covered by Google and other search engines. The deep Web contains library catalogs, official legislative documents of governments, phone books, and other content which is dynamically prepared to respond to a query.

Privacy law in some countries forbids the showing of some links. For instance, in Switzerland any private person can force Google Inc. to delete a link that contains his or her name.

Google optimization

Since Google is the most popular search engine, many webmasters have become eager to influence their websites' Google rankings. An industry of consultants has arisen to help websites increase their rankings on Google and on other search engines. This field, called search engine optimization, attempts to discern patterns in search engine listings and then to develop a methodology for improving rankings to draw more searchers to clients' sites.

Search engine optimization encompasses both "on page" factors (like body copy, title elements, H1 heading elements and image alt attribute values) and "off page" factors (like anchor text and PageRank). The general idea is to affect Google's relevance algorithm by incorporating the targeted keywords in various places "on page", in particular the title element and the body copy (note: the higher up in the page a keyword appears, presumably the better its prominence and thus the ranking). Too many occurrences of the keyword, however, cause the page to look suspect to Google's spam-checking algorithms.
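The "on page" factors mentioned above (presence in the title, prominence in the body, and number of occurrences) are easy to measure for a single page. The sketch below computes those three statistics for a one-word keyword. It is purely illustrative: the function name is invented, and how Google actually weighs these signals, or what density its spam checks consider excessive, is not public.

```python
import re


def keyword_report(title, body, keyword):
    """Return simple on-page statistics for a one-word keyword."""
    tokenize = lambda text: re.findall(r"[a-z0-9']+", text.lower())
    words = tokenize(body)
    key = keyword.lower()
    positions = [i for i, word in enumerate(words) if word == key]
    return {
        # True if the keyword appears as a word in the title element.
        "in_title": key in tokenize(title),
        "occurrences": len(positions),
        # Share of body words that are the keyword.
        "density": len(positions) / len(words) if words else 0.0,
        # 0.0 means the keyword is the first body word; None means absent.
        "first_position": positions[0] / len(words) if positions else None,
    }


if __name__ == "__main__":
    report = keyword_report(
        "Cheap flights to Rome",
        "Flights to Rome are cheap in winter. Book flights early.",
        "flights",
    )
    print(report)
```

A webmaster could run such a report before and after editing a page to see whether the keyword has moved higher up or become suspiciously frequent.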

Google has published guidelines for website owners who would like to raise their rankings when using legitimate optimization consultants.