A lot of attention has been given recently to the likelihood that major search engines censor the information that we view in our browser windows. Much of it has to do with China and other regimes, where political dissent is forbidden …
This author has always been one to condemn censorship in any form. Indy media readers are especially appalled by censorship, and for the most part I am in concurrence with them. A recent personal project has somewhat changed my perspective on the subject, however.
No Other Way
Search engines, especially the ones whose owners are keenly interested in the maintenance of a public profile that is family friendly (that certainly includes all of the major engines), absolutely must censor the results of generalized searches. Failure to do so would soon cast a pall over the search engine's name, and it would be relegated to the dark side of the Internet: Triple X, Bogus Pharmaceuticals, and Spam. Most people do not know just how difficult that task has become. This author, likewise, had no clue about its difficulty until a recent project brought it to the fore.
Project: The Author's Home Grown Engine
As computer programming has always been the forte of this author, it seemed that a search engine project should be a relatively simple thing to complete. As the results revealed, the technical challenge of constructing the search engine presented no significant difficulties, and very soon a hodge-podge of pieces and parts came to be glued together for the purpose. Some parts were created by the author, and some by others. In the end a working home grown search engine quickly became a reality.
The Toughest Part …
The toughest part, as it came to be, was not how to construct the engine. Instead, the most stringent difficulties related to the reconciliation of the ideology of non-censorship with the idea that the search engine site not be relegated to the "dark side". Search engines operate with several parts. One part constitutes the URL collection mechanism. This mechanism scoops up vast numbers of site URLs, categorizes them, indexes them, and loads them into storage. Another part of the engine extracts specific data from storage, based upon the keywords supplied by a user, and in relation to how well those keywords match the indexed data heap that the engine maintains. Testing of the engine had barely begun when the real challenge presented itself …
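The two parts described above can be illustrated with a minimal sketch. It is not the author's engine: the function names, the sample pages, and the example.org URLs are all assumptions made for illustration, and the "collection" step is reduced to a small dictionary of page text already in hand.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()

def build_index(pages):
    """Indexing side: map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Query side: rank URLs by how many of the query keywords they match."""
    scores = defaultdict(int)
    for word in set(tokenize(query)):
        for url in index.get(word, ()):
            scores[url] += 1
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))

# Hypothetical pages standing in for what a collector would gather.
pages = {
    "http://example.org/gardening": "Growing tomatoes in clay soil",
    "http://example.org/pottery": "Working with clay on a pottery wheel",
}
index = build_index(pages)
```

Calling search(index, "clay pottery") ranks the pottery page above the gardening page, because it matches both keywords. A real engine would weigh matches far more carefully, but the split between storing and retrieving is the same.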
Found: A Cesspool!
It was upon analyzing the great scoops of data ripped from the Internet turf by the collection machinery that the author realized the full extent of the enormous problem that confronts the administrators of the major search engines: the Internet is a stinking heap! There seems to be no way to pussyfoot around this appalling fact. As the great clawfuls were dumped into the info-containers that we had fashioned, we found that some scoops were worse than others, but many contained more sewage than clay. It was clear that without some form of censorship, no search engine could remain family friendly. All types of rot were found entwined within the raw mash of the Internet, including all of the things that any spam-abused email user would know about. The triple-X solicitations and those ever-promising miracle drug offers are sewn into the fabric of the net, and not just within the email in-box.
Most search engines have settled on a bit of a compromise, it seems. The compromise is simple: the results are always censored for the average user's keyword searches. The triple-X, the drugs, and the promises to extend the user's promiscuity are all culled from the results unless the engine detects a strong likelihood that the user is searching for such things. I believe this is done by dropping the ranking value of words like "naked". This form of censorship is more a means of making certain material less accessible, and I suppose that the hope is to avoid abject censorship while still protecting the average user from what that user would find offensive.
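How such a compromise might look in code can only be guessed at, since the major engines do not publish their methods. The sketch below is one possible reading, under stated assumptions: the word list holds only the one example named above, the penalty factor of 0.1 is arbitrary, and the sample pages and example.org URLs are invented for illustration.

```python
# Assumption: a tiny word list containing only the example from the text.
SENSITIVE_WORDS = {"naked"}
# Assumption: an arbitrary factor by which a sensitive page is pushed down.
PENALTY = 0.1

def rank(pages, query):
    """Rank pages by keyword matches, demoting sensitive pages
    unless the query itself asks for sensitive material."""
    query_words = set(query.lower().split())
    wants_sensitive = bool(query_words & SENSITIVE_WORDS)
    scored = []
    for url, text in pages.items():
        page_words = set(text.lower().split())
        score = float(len(query_words & page_words))
        if score == 0:
            continue  # the page matches none of the keywords
        if (page_words & SENSITIVE_WORDS) and not wants_sensitive:
            score *= PENALTY  # made less accessible, not removed outright
        scored.append((score, url))
    # Highest score first; ties are broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]

# Hypothetical pages for illustration only.
pages = {
    "http://example.org/art": "naked truth about modern art pictures",
    "http://example.org/museum": "modern art museum",
}
```

With the query "modern art pictures", the museum page comes first even though the other page matches more keywords. With the query "naked art", the penalty is lifted and the order reverses. The material is demoted in the first case but never deleted, which is the distinction the paragraph above draws.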
Tight Rope to Walk
The specter of censorship is one we must beware, and ideologically, this author remains a foe of it. However, it is certain that we will live with accidental censorship, that which is due solely to the need to extract wholesome material from the entwined morass that is the Internet today.