Remedies for Search Bias

Disclosure: I serve as a consultant to various companies that compete with Google. But I write on my own — not at the suggestion or request of any client, without approval or payment from any client.

In a forthcoming paper (update, November 2011: paper is available), I’ll survey the problem of search bias — search engines granting preferred placement and/or terms to their own links or to others’ links chosen for improper purposes. What purposes are improper? Given others’ work in that area, I’ll defer my thoughts on that subject to the paper. Today I’d like to focus on remedies — what tactics a dominant search engine ought not employ due to their detrimental effects on competition, and how prohibiting those tactics would help assure fair competition in search and related businesses.

The prospect of legal or regulatory oversight of search results has attracted skepticism. A search industry news site recently questioned the wisdom of investigating search bias by arguing that, even if bias were uncovered, “it’s not clear what any remedy would be.” James Grimmelmann last month critiqued the suggestion that search engines can be biased, and he argued that even if such bias exists, the legal system cannot usefully prevent it. Discomfort with the prospect of legal intervention extends even to those who ultimately see a need for oversight: For example, Pasquale and Bracha title a recent paper Federal Search Commission?, ending the title with a question mark to credit the immediate shortfalls of an overly bureaucratic approach. Meanwhile, Google’s caricature of regulation warns of government-mandated homogeneous results and unblockable web spam, offering a particularly pronounced view of search regulation as intrusive and undesirable.

I envision an alternative approach for policy intervention in this area — addressing the improprieties that various sites have alleged and stopping specific practices that ought not continue, while avoiding unnecessary restrictions on search engines’ activities.

Experience from Airline Reservation Systems: Avoiding Improper Ranking Factors

A first insight comes from recognizing that regulators have already — successfully! — addressed the problem of bias in information services. One key area of intervention was computer reservation systems (CRS’s), the computer networks that let travel agents see flight availability and pricing for various major airlines. Three decades ago, when CRS’s were largely owned by the various airlines, some airlines favored their own flights. For example, when a travel agent searched for flights through Apollo, a CRS then owned by United Airlines, United flights would come up first — even if other carriers offered lower prices or nonstop service. The Department of Justice intervened, culminating in rules prohibiting any CRS owned by an airline from ordering listings “us[ing] any factors directly or indirectly relating to carrier identity” (14 CFR 255.4). Certainly one could argue that these rules were an undue intrusion: A travel agent was always free to find a different CRS, and additional searches could have uncovered alternative flights. Yet most travel agents hesitated to switch CRS’s, and extra searches would be both time-consuming and error-prone. Prohibiting biased listings was the better approach.

The same principle applies in the context of web search. On this theory, Google ought not rank results by any metric that distinctively favors Google. I credit that web search considers myriad web sites — far more than the number of airlines, flights, or fares. And I credit that web search considers more attributes of each web page — not just airfare price, transit time, and number of stops. But these differences only grant a search engine more room to innovate. These differences don’t change the underlying reasoning, so compelling in the CRS context, that a system provider must not design its rules to systematically put itself first.

I credit that some metrics might incidentally favor Google even as they are, on their face, neutral. But periodic oversight by a special master (or similar arbiter) could accept allegations of such metrics; both in the US and in Europe, a similar approach oversaw disputes as to what documentation Microsoft made available to those wishing to interoperate with Microsoft software.

Evaluating Manual Ranking Adjustments through Compulsory Disclosures

An alternative approach to avoiding improper ranking factors would require disclosure of all manual adjustments to search results. Whenever Google adjusts individual results, rather than selecting results through algorithmic rules of general applicability, the fact of that adjustment would be reported to a special master or similar authority, along with the affected site, duration, reason, and specific person authorizing the change. The special master would review these notifications and, where warranted, seek further information from relevant staff as well as from affected sites.

Why the concern about ad hoc ranking adjustments? Manual modifications are a particularly clear area for abuse — a natural way for Google to penalize a competitor or critic. Discourage such penalties by increasing their complexity and difficulty for Google, and Google’s use of such penalties would decrease.

I credit that Google would respond to the proposed disclosure requirement by reducing the frequency of manual adjustments. But that’s exactly the point: Results that do not flow from an algorithmic rule of general applicability are, by hypothesis, ad hoc. Where Google elects to use such methods, its market power demands outside review.

Grimmelmann argues that these ad hoc result adjustments are a “distraction.” But if Google’s manual adjustments ultimately prove to be nothing more than penalties to spammers, then regulators will naturally turn their attention elsewhere. Meanwhile, if forced to impose penalties through general algorithms rather than quick manual adjustments, Google will face increased burdens in establishing such penalties — more code required and, crucially, greater likelihood of an email or meeting agenda revealing Google’s genuine intent.

Experience from Browser Choice: Swapping “Integrated” Components

Many complaints about search bias arise when longstanding innovative services are, or appear to be at risk of becoming, subsumed into Google’s own offerings. No ordinary algorithmic link to Mapquest can compete with an oversized multicolor miniature Google Maps display appearing inline within search results. (And, as Consumer Watchdog documented, Mapquest’s traffic dropped sharply when Google deployed inline maps.)

On one hand it is troubling to see established firms disappear in the face of a seemingly-insurmountable Google advantage. The concern is all the greater when Google’s advantage comes not from intrinsic product quality but from bundling and defaults. After all, if Google can use search to push users to its Maps product, Maps will gain market share even if competitors’ services are, on their merits, superior.

Yet it would be untenable to ask Google to disavow new businesses. It is hard to imagine a modern search engine without maps, news, or local search (among other functions largely absent from core search a decade ago). If legal intervention prevented Google from entering these fields, users might lose the useful functions that stem from integration between seemingly-disparate services.

What remedy could offer a fair chance of multiple surviving vendors (with attendant benefits to consumers), while still letting Google offer new vertical search services when it so chooses? E.C. antitrust litigation against Microsoft is squarely on point, requiring Microsoft to display a large choice screen that prompts users to pick a web browser. An initial listing presents the five market-leading options, while seven more are available if a user scrolls. But there is no default; a user must affirmatively choose one of the various options.

Taking the “browser choice” concept to search results, each vertical search service could, in principle, come from a different vendor. If a user prefers that her Google algorithmic search present embedded maps from Mapquest along with local search from Yelp and video search from Hulu, the user could configure browser preferences accordingly. Furthermore, a user could make such choices on a just-in-time basis. (A possible prompt: “We noticed you’re looking for a map, and there are five vendors to choose from. Please choose a logo below.”) Later, an unobtrusive drop-down could allow adjustments. The technical barriers are modest: External objects could be integrated through client-side JavaScript — just as so many sites already embed AdSense ads, YouTube players, and other widgets. Or Google and contributors might prefer server-to-server communications of the sort Google uses in its partnerships with AOL and with Yahoo Japan. Either way, technology need not stand in the way.
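The selection logic described above can be sketched in a few lines of Python. This is a minimal illustration only: the vendor names come from the examples in the text, while the `VENDORS` catalog, the `UserPreferences` class, and the `resolve_vertical` function are hypothetical names invented here and do not describe any actual Google interface. The property borrowed from the browser choice screen is that there is no default: until the user picks a vendor, the search page gets a prompt rather than any one vendor's widget.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical catalog of vendors per vertical; names taken from the
# examples in the text, not from any real configuration.
VENDORS: Dict[str, List[str]] = {
    "maps": ["Google Maps", "Mapquest"],
    "local": ["Google Places", "Yelp"],
    "video": ["YouTube", "Hulu"],
}


@dataclass
class UserPreferences:
    """Per-user choice of vendor for each vertical (hypothetical structure)."""

    chosen: Dict[str, str] = field(default_factory=dict)

    def choose(self, vertical: str, vendor: str) -> None:
        """Record the user's affirmative choice for one vertical."""
        if vendor not in VENDORS.get(vertical, []):
            raise ValueError(f"{vendor!r} is not offered for {vertical!r}")
        self.chosen[vertical] = vendor


def resolve_vertical(prefs: UserPreferences, vertical: str) -> dict:
    """Return the vendor to embed, or a just-in-time prompt if none is chosen.

    There is deliberately no default vendor.
    """
    vendor: Optional[str] = prefs.chosen.get(vertical)
    if vendor is None:
        return {"action": "prompt", "options": list(VENDORS[vertical])}
    return {"action": "embed", "vendor": vendor}
```

With an empty `UserPreferences`, `resolve_vertical(prefs, "maps")` returns a prompt listing the map vendors. After `prefs.choose("maps", "Mapquest")`, the same call returns an instruction to embed Mapquest, and a later drop-down could simply call `choose` again.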

I credit that many users may be content with most Google services. For example, Google Maps enjoyed instant success through its early offering of draggable maps. But in some areas, Google’s offerings have little traction. Google’s Places service aspires to assess quality of restaurants and local businesses — but Yelp and Angie’s List draw on specialized algorithms, deeper data, and longstanding expertise. So too for TripAdvisor as to hotel reviews, and myriad other sites in their respective sectors. A user might well prefer to get information in these areas from the respective specialized services, not from Google, were the user able to make that choice.

Google often argues that competition is one click away. But here too, the E.C.’s Microsoft litigation is on point. Users had ample ability to install other browsers if they so chose, but that general capability was not enough when the standard operating system made one choice a default. Furthermore, at least Windows let other browsers truly immerse themselves in the operating system — as the default viewer for .HTML files, the default application for hyperlinks in email messages, and so forth. But there is currently no analogue on Google — no way for a user, even one who seeks this function, to combine Google algorithmic search with a competitor’s maps, local results, or other specialized search services.

Banning Other Bad Behaviors: Tying

Using its market power over search, Google sometimes pushes sites to adopt technologies or services Google chooses. Sometimes, Google’s favored implementations may be competitively neutral — simply technical standards Google wants sites to adopt (for example, presenting an index of pages to Google’s crawlers in a particular format). But in other instances, Google uses its power in search to promote adoption of Google’s own services.

I first flagged this tactic as to Google Affiliate Network (GAN), Google’s affiliate marketing service. GAN competes in one of the few areas of Internet advertising where Google is not dominant, and to date Google has struggled to gain traction in this area. However, Google offers remarkable benefits to advertisers who agree to use GAN: GAN advertisers alone enjoy images in their AdWords advertisements on Google.com; their advertisements always appear in the top-right corner above all other right-side advertisements (never further down the page); they receive preferred payment terms (paying only if a user makes a purchase, not merely if a user clicks; paying nothing if a user returns merchandise, a credit card is declined, or a server malfunctions). Moreover, merchants tend to use only a single affiliate network; coordinating multiple networks entails additional complexity and risks paying duplicate commissions on a single purchase. So if Google can convince advertisers to use GAN, advertisers may well abandon competing affiliate platforms.

Google’s tying strategy portends a future where Google can force advertisers and sites to use almost any service Google envisions. Google could condition a top AdWords position not just on a high bid and a relevant listing, but on an advertiser agreeing to use Google Offers or Google Checkout. (Indeed, Checkout advertisers who also used AdWords initially received dramatic discounts on the bundle, and to this day Checkout advertisers enjoy a dramatic multicolor logo adjacent to their AdWords advertisements, a benefit unavailable to any other class of advertiser.) Google would get a major leg up in mobilizing whatever new services it envisions, but Google’s advantage would come at the expense of genuine innovation and competition.

Online Marketing at Big Skinny (teaching materials) with Scott Kominers

Edelman, Benjamin, and Scott Duke Kominers. “Online Marketing at Big Skinny.” Harvard Business School Case 911-033, February 2011. (Revised February 2012.) (educator access at HBP. request a courtesy copy.)

Describes a wallet maker’s application of seven Internet marketing technologies: display ads, algorithmic search, sponsored search, social media, interactive content, online distributors, and A/B testing. Provides concise introductions to the key features of each technology, and asks which forms of online marketing the company should prioritize in the future. Discusses similarities and differences between online and off-line marketing, as well as issues of marketing campaign evaluation.

Supplement:

Online Marketing at Big Skinny — slide supplement – PowerPoint Supplement (HBP 912006)

Teaching Materials:

Online Marketing at Big Skinny – Teaching Note (HBP 911034)

In Accusing Microsoft, Google Doth Protest Too Much

In Accusing Microsoft, Google Doth Protest Too Much. HBR Online. February 3, 2011.

Google this week sparked a media uproar by alleging that Microsoft Bing “copies” Google results. But is that actually the best characterization of what happened? In fact Google’s engineers intentionally clicked bogus listings they had previously inserted into Google’s results, and they did this on computers where they had specifically authorized Microsoft to examine their browsing in order to improve Bing.

Strikingly, Google’s own Matt Cutts previously endorsed the use of Toolbar and similar data to improve search results — calling this approach “a good idea.” And Google’s own Toolbar Privacy Policy allows Google to perform the same analysis Bing used. So I don’t have much sympathy for Google’s allegations of impropriety. Quite the contrary: With Bing’s small market share, this data is important in improving Bing search results and building a viable competitor to Google’s dominant search offering.

Details, including what exactly happened, Google’s prior statements, and Google’s widespread use of others’ intellectual property:

In Accusing Microsoft, Google Doth Protest Too Much

Measuring Bias in “Organic” Web Search with Benjamin Lockwood

By comparing results between leading search engines, we identify patterns in their algorithmic search listings. We find that each search engine favors its own services in that each search engine links to its own services more often than other search engines do. But some search engines promote their own services significantly more than others. We examine patterns in these differences, and we flag keywords where the problem is particularly widespread.

Even excluding “rich results” (whereby search engines feature their own images, videos, maps, etc.), we find that Google’s algorithmic search results link to Google’s own services more than three times as often as other search engines link to Google’s services.
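The kind of comparison behind a "three times as often" figure can be illustrated with a short Python sketch. The data layout, the function names, and the toy listings below are assumptions made for illustration, not the paper's dataset or code: for each engine, compute the share of its listings that point to a given company's services, then divide the company's own engine's share by the average share at the other engines.

```python
from typing import Dict, List, Set


def link_share(listings: List[str], owned: Set[str]) -> float:
    """Fraction of listings whose domain belongs to `owned`."""
    if not listings:
        return 0.0
    return sum(1 for domain in listings if domain in owned) / len(listings)


def own_service_ratio(
    listings_by_engine: Dict[str, List[str]],
    owned_by_engine: Dict[str, Set[str]],
    engine: str,
) -> float:
    """How many times more often `engine` links to its own services
    than the other engines link to those same services, on average."""
    owned = owned_by_engine[engine]
    own_share = link_share(listings_by_engine[engine], owned)
    other_shares = [
        link_share(listings, owned)
        for name, listings in listings_by_engine.items()
        if name != engine
    ]
    if not other_shares:
        raise ValueError("need at least one other engine for comparison")
    baseline = sum(other_shares) / len(other_shares)
    if baseline == 0:
        raise ValueError("ratio undefined: other engines never link to these services")
    return own_share / baseline


# Toy data, invented for illustration; the .example domains are placeholders.
toy_listings = {
    "EngineA": ["maps.a.example"] * 4 + ["other.example"] * 4,
    "EngineB": ["maps.a.example"] + ["another.example"] * 7,
}
toy_owned = {
    "EngineA": {"maps.a.example"},
    "EngineB": {"mail.b.example"},
}
toy_ratio = own_service_ratio(toy_listings, toy_owned, "EngineA")  # 4.0 for this toy data
```

In the toy data, EngineA links to its own service in 4 of 8 listings while EngineB links to that service in 1 of 8, so the ratio is 4.0. A real measurement would also need to decide which keywords to sample and how to treat "rich results", choices this sketch leaves out.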

For selected keywords, biased results advance search engines’ interests at users’ expense: We demonstrate that lower-ranked listings for other sites sometimes manage to obtain more clicks than Google and Yahoo’s own-site listings, even when Google and Yahoo put their own links first.

Details, including methodology, analysis, and policy implications:

Measuring Bias in “Organic” Web Search

Bias in Search Results?: Diagnosis and Response

Edelman, Benjamin. “Bias in Search Results?: Diagnosis and Response.” Indian Journal of Law and Technology 7 (2011): 16-32.

I explore allegations of search engine bias, including understanding a search engine’s incentives to bias results, identifying possible forms of bias, and evaluating methods of verifying whether bias in fact occurs. I then consider possible legal and policy responses, and I assess search engines’ likely defenses. I conclude that regulatory intervention is justified in light of the importance of search engines in referring users to all manner of other sites, and in light of striking market concentration among search engines.

Adverse Selection in Online ‘Trust’ Certifications and Search Results

Edelman, Benjamin. “Adverse Selection in Online ‘Trust’ Certifications and Search Results.” Electronic Commerce Research and Applications 10, no. 1 (January-February 2011): 17-25.

Widely used online “trust” authorities issue certifications without substantial verification of recipients’ actual trustworthiness. This lax approach gives rise to adverse selection: the sites that seek and obtain trust certifications are actually less trustworthy than others. Using an original dataset on web site safety, I demonstrate that sites certified by the best-known authority, TRUSTe, are more than twice as likely to be untrustworthy as uncertified sites. This difference remains statistically and economically significant when restricted to “complex” commercial sites. Meanwhile, search engines create an implied endorsement in their selection of ads for display, but I show that search engine advertisements tend to be less safe than the corresponding organic listings.

Knowing Certain Trademark Ads Were Confusing, Google Sold Them Anyway — for $100+ Million

Disclosure: I serve as a consultant to various companies that compete with Google. But I write on my own — not at the suggestion or request of any client, without approval or payment from any client.

When a user enters a search term that matches a company’s trademark, Google often shows results for the company’s competitors. To take a specific example: Searches for language software seller "Rosetta Stone" often yield links to competing sites — sometimes, sites that sell counterfeit software. Rosetta Stone thinks that’s rotten, and, as I’ve previously written, I agree: It’s a pure power-play, effectively compelling advertisers to pay Google if they want to reach users already trying to reach their sites; otherwise, Google will link to competitors instead. Furthermore, Google is reaping where others have sown: After an advertiser builds a brand (often by advertising in other media), Google lets competitors skim off that traffic — reducing the advertiser’s incentive to invest in the first place. So Google’s approach to trademarks definitely harms advertisers and trademark-holders. But it’s also confusing to consumers. How do we know? Because Google’s own documents admit as much.

Today Public Citizen posted an unredacted version of Rosetta Stone’s appellate brief in its ongoing litigation with Google. Google had sought to keep confidential the documents that ground district court and appellate adjudication of the dispute, but now some of the documents are available — giving an inside look at Google’s policies and objectives for trademark-triggered ads. Some highlights:

  • Through early 2004, Google let trademark holders request that ads be disabled if they used a trademark in keyword or ad text. But in early 2004, Google determined that it could achieve a "significant potential revenue impact" from selling trademarks as keywords. (ref)
  • In connection with Google’s 2004 policy change letting advertisers buy trademarks as keywords, Google conducted experiments to assess user confusion from trademarks appearing in search advertisements. Google concluded that showing a trademark anywhere in the text of an advertisement resulted in a "high" degree of consumer confusion. Google’s study concluded: "Overall very high rate of trademark confusion (30-40% on average per user) … 94% of users were confused at least once during the study." (ref)
  • Notwithstanding Google’s 2004 study, Google in 2009 changed its trademark policy to permit the use of trademarks in advertisement text. Google estimated that this policy change would result in at least $100 million of additional annual revenue, and potentially more than a billion dollars of additional annual revenue. Google implemented this change without any further studies or experiments as to consumer confusion. (ref)
  • Google possesses more than 100,000 pages of complaints from trademark holders, including at least 9,862 complaints from at least 5,024 trademark owners from 2004 to 2009. (ref)

Kudos to Public Citizen for obtaining these documents. That said, I believe Google should never have sought to limit distribution of these documents in the first place. In other litigation, I’ve found that Google’s standard practice is to attempt to seal all documents, even where applicable court rules require that documents be provided to the general public. That’s troubling, and that needs to change.

Hard-Coding Bias in Google “Algorithmic” Search Results

I present categories of searches for which available evidence indicates Google has “hard-coded” its own links to appear at the top of algorithmic search results, and I offer a methodology for detecting certain kinds of tampering by comparing Google results for similar searches. I compare Google’s hard-coded results with Google’s public statements and promises, including a dozen denials but at least one admission. I conclude by analyzing the impact of Google’s tampering on users and competition, and by proposing principles to block Google’s bias.
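One way to picture a comparison between similar searches is the following Python sketch. It is an illustration under stated assumptions rather than the article's actual methodology: the function names, the flagging rule, and the `top_n` cutoff are hypothetical. The idea is that if a search engine's own link holds first place for a query but vanishes from the results for a near-identical variant of that query, the first-place listing may not flow from a rule of general applicability.

```python
from typing import List, Set


def first_place_own_link(results: List[str], own_domains: Set[str]) -> bool:
    """True if the top-ranked result belongs to the search engine itself."""
    return bool(results) and results[0] in own_domains


def looks_hard_coded(
    base_results: List[str],
    variant_results: List[str],
    own_domains: Set[str],
    top_n: int = 10,
) -> bool:
    """Flag a query pair when the engine's own link is first for the base
    query yet absent from the top `top_n` results for the similar variant.

    The rule and the cutoff are assumptions chosen for illustration.
    """
    if not first_place_own_link(base_results, own_domains):
        return False
    return not any(domain in own_domains for domain in variant_results[:top_n])
```

Given ranked domain lists for a query and for a slightly altered version of it, `looks_hard_coded` returns `True` only when the own-service link tops the first list and appears nowhere in the top of the second. A flag of this kind is a prompt for closer inspection, not proof of tampering.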

Details, including screenshots, methodology, proposed regulatory response, and analogues in other industries:

Hard-Coding Bias in Google “Algorithmic” Search Results

A Closer Look at Google’s Advertisement Labels

Google’s tiny ‘Ads’ label

The FTC has called for “clear and conspicuous disclosures” in advertisement labels at search engines, and the FTC specifically emphasized the need for “terms and a format that are easy for consumers to understand.” Unfortunately, Google’s new advertisement labels fail this test: Google’s “Ads” label is the smallest text on the page, far too easily overlooked. (Indeed, as I show in the image at left, the “Ads” label substantially fits within an “o” in “Google.”) Meanwhile, Google now merges algorithmic and advertisement results within a single set of listings; Google’s “Help” explanations are inaccurate; and Google uses inconsistent labels mere inches apart within search results, as well as across services.

Details, including the shortfalls, screenshots, comparisons, and proposed alternatives:

A Closer Look at Google’s Advertisement Labels
