
Search Engines

Page history last edited by Charles Forstbauer 14 years, 8 months ago

 

General discussion of search engines

 

Good link: http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/SearchEngines.html

 

How do Search Engines Work?

Search engines for the general web do not really search the World Wide Web directly. Each one searches a database of the full text of web pages automatically harvested from the billions of web pages residing on servers. When you search the web using a search engine, you are always searching a somewhat stale copy of the real web page. When you click on links provided in a search engine's search results, you retrieve the current version of the page from the server.
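To make the "stale copy" idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any real search engine is built: the URLs, the `live_web` dictionary, and the `build_index` and `search` functions are all made-up names for this example.

```python
import re
from collections import defaultdict

# Hypothetical "live web": URL -> the text currently on that page.
live_web = {
    "http://example.org/a": "spiders crawl the web",
    "http://example.org/b": "library catalogs and article databases",
}

def build_index(pages):
    """Build a word -> set-of-URLs index from a harvested copy of the pages."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return URLs whose harvested copy contains every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# The engine harvests a copy of the pages at one point in time...
index = build_index(dict(live_web))

# ...and the live page changes afterwards.
live_web["http://example.org/a"] = "this page is now about cooking"

stale_hits = search(index, "spiders")  # still matches the old harvested copy
fresh_hits = search(index, "cooking")  # the new text is not in the index yet
```

In this sketch the search only ever looks at `index`, never at `live_web`, which is why a query can match text that is no longer on the page and miss text that was added after the harvest.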

Search engine databases are selected and built by computer robot programs called spiders. These "crawl" the web, finding pages for potential inclusion by following the links in the pages they already have in their database (i.e., already "know about"). They cannot think or type a URL or use judgment to "decide" to go look something up and see what's on the web about it. (Computers are getting more sophisticated all the time, but they are still brainless.)

If a web page is never linked to in any other page, search engine spiders cannot find it. The only way a brand new page - one that no other page has ever linked to - can get into a search engine is for its URL to be sent by some human to the search engine companies as a request that the new page be included.
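The link-following behavior described above can be sketched as a simple breadth-first walk over a link graph. This is a toy model under stated assumptions: the page names and the `links` dictionary are invented for illustration, and a real spider would fetch pages over the network rather than read a dictionary.

```python
from collections import deque

# Hypothetical link graph: page -> the pages it links to.
links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["story"],
    "story": [],
    "orphan": ["home"],  # links out, but no other page links to it
}

def crawl(links, seeds):
    """Start from pages the spider already knows about and follow their links."""
    known = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in known:
                known.add(target)
                queue.append(target)
    return known

# Starting only from "home", the spider never reaches "orphan".
found = crawl(links, ["home"])

# If a human submits the URL, it becomes a starting point and gets included.
found_after_submission = crawl(links, ["home", "orphan"])
```

Here `found` contains every page reachable by links from "home", while "orphan" only appears once it is handed to the spider directly, which mirrors the URL submission step described above.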

 

Many web pages are excluded from most search engines by policy. The contents of most of the searchable databases mounted on the web, such as library catalogs and article databases, are excluded because search engine spiders cannot access them. All this material is referred to as the "Invisible Web" -- what you don't see in search engine results.

 

Think of it this way: the visible web contains all the free stuff that people want you to see: personal web pages, blogs, YouTube, tweets, and an enormous amount of crap. The invisible web, for the most part, holds information that took time and money to create and is not given away for free. The old adage "you get what you pay for" applies.

 

What is the "Invisible Web", a.k.a. the "Deep Web"?

The "visible web" is what you can find using general web search engines. It's also what you see in almost all subject directories. The "invisible web" is what you cannot retrieve ("see") using these types of tools.

 

More on the invisible web: http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/InvisibleWeb.html

 

Visible web search engines: http://websearch.about.com/library/searchengine/blsearchenginesatoz.htm

 

An example of a Deep Web searchable database is iCONN:

iCONN is part of the Connecticut Education Network. It provides all students, faculty and residents with online access to essential library and information resources. It is administered by the Connecticut State Library in conjunction with the Department of Higher Education. Through iCONN, a core level of information resources including secured access to licensed databases is available to every citizen in Connecticut. In addition, specialized research information is available to college students and faculty.

http://www.iconn.org/AboutIconn.aspx

 

With our research papers, it is essential to use databases such as iCONN in order to broaden our information horizon and retrieve primary documents that cannot be found through Google.
