Thursday, 9 June 2011
CASE 310 - The deep web
The deep web is usually defined as the content on the Web that is not accessible through a search on general search engines. This content is sometimes also referred to as the hidden or invisible web. There is a lot of shady stuff on there as well. I personally wouldn't ever need to go on there and run the risk of stepping into the wrong room. Some say it's all just full of science, but paedophiles, terrorists and some very strange people dive in these waters too.
The Web is a complex entity that contains information from a variety of source types and includes an evolving mix of different file types and media. It is much more than static, self-contained Web pages. In fact, the part of the Web that is not static, and is served dynamically "on the fly," is far larger than the static documents that many associate with the Web.
The concept of the deep Web is becoming more complex as search engines have found ways to integrate deep Web content into their central search function. This includes everything from airline flights to news to stock quotations to addresses to activities on Facebook accounts.
Content on the deep Web
When we refer to the deep Web, we are usually talking about the following:
The content of databases. Databases contain information stored in tables created by such programs as Access, Oracle, SQL Server, and MySQL. (There are other types of databases, but we will focus on database tables for the sake of simplicity.) Information stored in databases is accessible only by query. In other words, the database must somehow be searched and the data retrieved and then displayed on a Web page. This is distinct from static, self-contained Web pages, which can be accessed directly. A significant amount of valuable information on the Web is generated from databases.
Non-text files such as multimedia, images, software, and documents in formats such as Portable Document Format (PDF) and Microsoft Word. For example, see Digital Image Resources on the Deep Web for a good indication of what is out there for images.
Content available on sites protected by passwords or other restrictions. Some of this is fee-based content, such as subscription content paid for by libraries or private companies and available to their users based on various authentication schemes.
Special content not presented as Web pages, such as full text articles and books.
Dynamically changing, updated content, such as news and airline flights.
This is usually the basic, "traditional" list. In these days of the social Web, let's consider adding new content to our list of deep Web sources. For example:
Discussions and other communicative activities on social networking sites, for example Facebook
Bookmarks and citations stored on social bookmarking sites
As you can see, based on these few examples, the deep Web is expanding.
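To make the database item in the list above a bit more concrete, here is a rough sketch in Python of why database content is "accessible only by query". Everything in it is made up for illustration: the flights table, the sample rows and the function names build_flights_db and render_results_page are assumptions, not taken from any real site. The point is only that the results page does not exist as a file anywhere; it is built at the moment someone asks for a particular destination.

```python
import html
import sqlite3


def build_flights_db():
    """Create a small in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flights (origin TEXT, destination TEXT, departs TEXT)"
    )
    conn.executemany(
        "INSERT INTO flights VALUES (?, ?, ?)",
        [
            ("London", "Paris", "08:15"),
            ("Berlin", "Paris", "11:40"),
            ("London", "Rome", "09:30"),
        ],
    )
    conn.commit()
    return conn


def render_results_page(conn, destination):
    """Build an HTML page on the fly from a query for one destination."""
    rows = conn.execute(
        "SELECT origin, departs FROM flights "
        "WHERE destination = ? ORDER BY departs",
        (destination,),
    ).fetchall()
    # Each matching row becomes one list item; no rows means an empty list.
    items = "".join(
        f"<li>{html.escape(origin)} at {html.escape(departs)}</li>"
        for origin, departs in rows
    )
    title = html.escape(destination)
    return f"<html><body><h1>Flights to {title}</h1><ul>{items}</ul></body></html>"


if __name__ == "__main__":
    conn = build_flights_db()
    print(render_results_page(conn, "Paris"))
```

A crawler that only follows links never types "Paris" into a search form, so under these assumptions it never sees this page. That is roughly what keeps database content out of a general search engine unless the engine finds some other way in.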