Empirical Analysis of Google SafeSearch

Google offers interested users a version of its search engine restricted by a service it calls SafeSearch, intended to omit references to sites with “pornography and explicit sexual content.” However, testing indicates that SafeSearch blocks at least tens of thousands of web pages without any sexually explicit content, whether graphical or textual. Blocked results include sites operated by educational institutions, non-profits, news media, and national and local governments. Among searches on sensitive topics such as reproductive health, SafeSearch blocks results in a way that seems essentially random; it is difficult to construct a rational, non-arbitrary basis for which pages are allowed and which are omitted. Full article.

Web Sites Sharing IP Addresses: Prevalence and Significance

Web Sites Sharing IP Addresses: Prevalence and Significance. (September 2013)

More than 87% of active domain names are found to share their IP addresses (i.e., their web servers) with one or more additional domains, and more than two-thirds of active domain names share their addresses with fifty or more additional domains. While this IP sharing is typically transparent to ordinary users, it causes complications for those who seek to filter the Internet by restricting users’ ability to access certain controversial content on the basis of the IP address used to host that content. With so many sites sharing IP addresses, IP-based filtering efforts are bound to produce “overblocking”: accidental and often unanticipated denial of access to web sites that abide by the stated filtering rules.
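
To make the overblocking mechanism concrete, the following is a minimal sketch, not the method used in the study. It assumes a hypothetical mapping from domain names to IP addresses (the `sample` data, the function names `sharing_stats` and `overblocked`, and the `threshold` parameter are all illustrative assumptions). It computes what fraction of domains share an address with others, and which untargeted domains become unreachable when a filter blocks by IP address.

```python
from collections import defaultdict


def sharing_stats(domain_to_ip, threshold=50):
    """Return (fraction sharing an IP with >= 1 other domain,
    fraction sharing an IP with >= `threshold` other domains)."""
    by_ip = defaultdict(set)
    for domain, ip in domain_to_ip.items():
        by_ip[ip].add(domain)

    total = len(domain_to_ip)
    if total == 0:
        return 0.0, 0.0

    # Number of *other* domains hosted on the same address as each domain.
    others = [len(by_ip[ip]) - 1 for ip in domain_to_ip.values()]
    shared = sum(1 for n in others if n >= 1)
    heavy = sum(1 for n in others if n >= threshold)
    return shared / total, heavy / total


def overblocked(domain_to_ip, targeted):
    """Domains that an IP-based block makes unreachable even though
    they were not on the list of targeted domains."""
    blocked_ips = {domain_to_ip[d] for d in targeted if d in domain_to_ip}
    return sorted(
        d for d, ip in domain_to_ip.items()
        if ip in blocked_ips and d not in targeted
    )


# Hypothetical data: reserved example domains and documentation addresses.
sample = {
    "news.example": "192.0.2.10",
    "school.example": "192.0.2.10",
    "banned.example": "192.0.2.10",
    "solo.example": "192.0.2.20",
}

print(sharing_stats(sample, threshold=2))
print(overblocked(sample, {"banned.example"}))
```

In this toy data, blocking the address that hosts `banned.example` also cuts off `news.example` and `school.example`, which is the overblocking effect described above.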

Large-Scale Registration of Domains with Typographical Errors

Large-Scale Registration of Domains with Typographical Errors. (January 2003)

The author reports more than eight thousand domains that consist of minor variations on the addresses of well-known web sites, reflecting typographical errors often made by Internet users manually typing these addresses into their web browsers. Although the majority of these domain names are variations of sites frequently used by children, and although their domain names do not suggest the presence of sexually explicit content, more than 90% offer extensive sexually explicit content. In addition, these domains are presented in a way that temporarily disables a browser’s Back and Exit commands, preventing users from exiting easily. Most or all of the domains are registered to an individual previously enjoined by the FTC from operating domains that are typographic variations on famous names, and these domains remain operational after an injunction ordering their suspension.
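
The kind of "minor variation" described above can be illustrated with a short sketch that enumerates single-keystroke typos of a domain name, for example so that a site operator can check which variants of its own name are registered. This is an assumption about how such variants might be generated, not the procedure used in the report; the function name `typo_variants` and the four error types chosen (omission, transposition, duplication, substitution) are illustrative.

```python
import string


def typo_variants(domain):
    """Enumerate single-error variants of the first label of `domain`.

    Covers four common typing mistakes: a dropped character, two adjacent
    characters swapped, a doubled character, and one wrong character.
    """
    name, dot, suffix = domain.partition(".")
    alphabet = string.ascii_lowercase + string.digits
    variants = set()

    for i in range(len(name)):
        # Omission: drop the character at position i.
        variants.add(name[:i] + name[i + 1:])
        # Duplication: type the character at position i twice.
        variants.add(name[:i] + name[i] + name[i:])
        # Substitution: replace the character at position i.
        for c in alphabet:
            variants.add(name[:i] + c + name[i + 1:])

    for i in range(len(name) - 1):
        # Transposition: swap the characters at positions i and i + 1.
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])

    # Remove the correctly typed name and any empty label.
    variants.discard(name)
    variants.discard("")
    return sorted(v + dot + suffix for v in variants)


candidates = typo_variants("abc.example")
print(len(candidates), candidates[:5])
```

A list like `candidates` would only be a starting point: determining which variants are actually registered, and what they serve, requires separate lookups that this sketch does not attempt.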