Google’s Snoops: Mining Our Data for Profit and Pleasure

Twenty-four hours a day, across more than sixty free product “platforms,” Google is storing, indexing, and cross-referencing information about the activities of a billion people. What are the 30,000 prodigies at Google, Inc. doing with all that data?

At the Google headquarters in Mountain View, CA (Denis Capellin/Flickr)

There was a running gag among fellow workers where we would walk by each other and whisper “Don’t be evil, pft!” and roll our eyes.
–Former Google employee

Google’s publicists have been working extra hard this year. Edward Snowden’s revelations have made the company look like a pawn of the NSA; Google Glass has been drawing the ire of privacy advocates around the world; and in two separate lawsuits, both of which are now moving forward over Google’s strong objections, the company has been accused of wiretapping. The first suit is testing the legality of the company’s practice (now supposedly discontinued) of collecting information from private Wi-Fi networks using its Street View vehicles, and the second is challenging its ongoing practice of analyzing the contents of all emails sent and received by Gmail users. The company defends its practice of scanning emails as a means of gathering information it uses to send people targeted ads.

Although wide-ranging, the revelations and lawsuits are overlooking an important aspect of Google’s activities that is especially worrisome—the human element. It’s not just Google’s computers that have access to those emails; employees do too, and that introduces troubling possibilities that go far beyond the mundane world of targeted advertising. Edward Snowden and Bradley Manning were just low-level workers when they got ideas. What kinds of crazy notions might be popping into the heads of the 30,000 prodigies at Google, Inc.?

I am singling out Google here, as opposed to other high-tech giants like Facebook, or, for that matter, the NSA itself, because Google has a unique business model that gives it unfettered access to and control over immensely rich individual “profiles” of information on a scale that is unprecedented. The model is exquisitely simple and sublimely deceptive: We give you free services that you are likely to use dozens of times a day while we invisibly track and record everything you do. Based on what we have learned about you, we then charge advertisers premium fees to reach exactly the right buyers for their products and services.

“Google’s dance,” as I have called it in previous writings, lies in masking the company’s business model behind the endless array of free services. On the surface, Google appears primarily to be an information provider, but it is actually a glorified advertising firm, with 97 percent of its revenues coming from advertisers. Users see only the surface, which they love, but would they be so amorous if they were more aware of what the surface was for?

Twenty-four hours a day, across more than sixty free product “platforms” such as Gmail, the Google search engine, YouTube, Google Plus, Google Maps, Street View, and Google Wallet, the company is storing, indexing, and cross-referencing information about the beliefs, tastes, and activities of a billion people (including you), and not just when you are online. If you have the Android system on your mobile device, Google can track you even if you are innocently reading your ePub version of The Motorcycle Diaries. If you use Google Voice, your calls are transcribed, analyzed, indexed, and added to your profile, just as if they were Gmail messages. And if, in the near future, you find yourself within spitting distance of someone wearing Google Glass, beware: what you do and say can be recorded. Think J. Edgar Hoover multiplied by, well, a googol (that’s a 1 followed by 100 zeros). No other company aggregates so much individual data so aggressively, competently, and invisibly.

Just how much data does Google actually have about you? If you are active in the digital world, it probably has the equivalent of filing cabinets full—but what, exactly, is in those drawers? Google won’t say. The content is private, and Google’s privacy, unlike yours, is sacrosanct.

Once you unknowingly give them information about yourself (even search terms you never typed fully, or draft emails you never actually sent), it is their property. According to their Terms of Service Agreement (to which you automatically assent when you use any Google product, even if you don’t know you’re using a Google product), they can share that information, at their discretion, with “those we work with,” or with just about any agency or individual if the company has a “good faith” belief that doing so is required by law or will “protect . . . the rights, property or safety of Google.”

But not with you. Unless regulation requires it—and so far, none does—the company will never share information about you with you. Needless to say, you also have no way to spot or remove incorrect information that may have found its way into your file drawers. Many Google employees, on the other hand—especially the whiz kids on the data mining teams—can look at your personal detritus all they want.

What could possibly go wrong?

Let’s think about this in general terms, or at least in psychological terms (that’s my field). How would people behave who had easy access to the private emails of hundreds of millions of people, including those written by their ex-spouses, childhood crushes, favorite celebrities, and least favorite politicians? Would they peek now and then? And if they found some juicy tidbits, would they sometimes joke about them with office mates? If they found some serious dirt on the jerk who bullied them in high school or the blowhard right-wing congressman who is ruining America, would they be tempted to leak some info to the Guardian or the FBI?

And what if they could also view, in real time, which search terms people were using and which websites people were visiting, along with a record of all the search terms people had ever used and all the websites people had ever visited?

What if, with a few keystrokes, they could alter the contents of a dossier to make a competitor look like a pedophile or push their friends’ websites onto the treasured first page of search rankings—maybe even influence the outcome of a close election by making search rankings favor one candidate?

What if you were sitting at that desk and controlling that keyboard? Would you occasionally peek, if only to take a break from the daily grind? Would you do harmless favors for friends and family members? Would you try now and then to help the world in some small way?

Of course you would.

Pertinent here is a little-known 2011 book that presents a fictionalized account of the lives of Google software engineers. Written by Shumeet Baluja, a senior staff research scientist at Google’s headquarters in Mountain View, California, The Silicon Jungle tells an unsettling story about Stephen Thorpe, an over-the-hill programmer who competes with thousands of college-age geniuses to land one of those coveted summer internships at . . . well, Google, really, even though Baluja pretends it’s not. I’ll call it “eGoogle,” for “ersatz Google.”

Unlike Dave Eggers’s recent Google-bashing novel, The Circle, Jungle is by a knowledgeable insider. Baluja has a Ph.D. in computer science from Carnegie Mellon and used to be chief scientist at Lycos. He has also worked for Google for more than ten years, and, he tells me, has been prohibited by the company from talking about his book publicly, which is probably why you’ve never heard of it.

Here are a few important features of the culture at eGoogle, some almost dizzying in their implications:

Interns on the data mining teams necessarily have access to all eGoogle data, as do many other eGoogle employees. They need this access because their job is to find hundreds of thousands of new customers for high-rolling eGoogle customers who sell everything from ulcer medications to umbrellas. To guarantee the sale of just about anything, the interns write programs that scan the emails, search histories, and purchase histories of a billion people.

In other words, they are really mining for gold, although Baluja emphasizes that they see the challenges they are given as academic exercises; they are completely oblivious to the millions of dollars eGoogle rakes in with every new program they write. eGoogle is careful, Baluja says, to house the programmers and the accountants in separate buildings and to make sure they never meet.

When the programmers are not filling coffers, they are scanning eGoogle’s massive databases for racy emails and pining over the private emails of ex-girlfriends, especially the emails with nude photos attached. When an especially titillating pic turns up, they display it on massive overhead screens for coworkers to admire. At eGoogle, everything is spectacular, even the perversions.

When, one evening, an intern is monitoring his ex-girlfriend’s online chats, he learns of a steamy party not far from the eGoogle campus, and four of the most lecherous of the interns are off and running. (We later learn who got lucky.)

Back on “campus,” one intern is made a full-time employee on the spot when he develops an app that allows eGoogle employees to zoom in on neighborhoods using eGoogle’s version of Google Earth and view, house by house, what kinds of activities people are engaged in online. When people are emailing, the houses light up brown; when they’re viewing porn, the houses light up pink. The app is an instant sensation among eGoogle employees, who are eager to see, godlike, which of their neighbors are being naughty.

Stephen’s girlfriend, meanwhile, is trying to get her doctorate at Brown University by monitoring conversations of radical Islamists on a website she’s created, but she can’t get any traffic. After Stephen mentions her dilemma to a fellow intern—one on a search-engine team—her website suddenly jumps up a gazillion slots in eGoogle search rankings, and thousands of prospective suicide bombers sign on.

The plot thickens when Stephen starts mining data for an executive at a nonprofit organization. The executive asks him to search eGoogle’s data for innocent people who are likely to turn up mistakenly on government watch lists, claiming his organization is going to help them protect themselves from overzealous government bureaucrats. Using the same techniques he uses to find widget buyers, Stephen quickly generates a list of 5,000 ideal watch-list candidates, which the scheming executive promptly sells for a seven-figure sum to Arab terrorists.

When government spooks figure out what Stephen has been doing, they make him an offer he can’t refuse: a lifetime of indentured servitude at a secret government research facility where there is no free food, the computers are clunky, and the databases are pathetically small. It’s either that or prison. How, he muses at the end of the book, can the United States ever win the war on terror when the government’s data processing resources are so paltry?

That’s the only thing Baluja gets wrong. Snowden’s revelations about the NSA’s access to the databases of Google, Microsoft, and other companies were inconceivable when The Silicon Jungle was published. Baluja tells us about individual eGoogle employees who routinely feed data to the feds, but he insists that large-scale data sharing would never be allowed by eGoogle executives. Snowden’s disclosures, which apparently are still not complete, remind us that no digital data are ever really private—that data are always vulnerable to the wiles of determined individuals or organizations.

There is also one aspect of Baluja’s tale that is ludicrous on its face, and that is his disclaimer in the book’s introduction that eGoogle (which he actually calls “Ubatoo”) isn’t really Google. In both form and function, it certainly looks like Google, and Baluja also acknowledges that the “temptations, . . . ability, brains, and computational power” necessary to do the kinds of mischief he describes are real.

Google’s privacy violations vary from the petty and mundane to the truly spectacular. On the mundane side, in July 2010, a careless twenty-seven-year-old software engineer named David Barksdale was fired by Google for spying on at least four underage teens through their various Google accounts. While still employed, according to a September 2010 report by Gawker.com, he showed a friend the power he had over private information by pulling up his friend’s “email account, contact list, chat transcripts, Google Voice call logs—even a list of other Gmail addresses that the friend had registered but didn’t think were linked to his main account—in seconds.”

Why isn’t your massive personal profile heavily encrypted so prying eyes can’t see it—or at least not “in seconds”? Public statements in recent years by senior Google employees explain why. According to Vint Cerf, Google’s Chief Internet Evangelist, “we couldn’t run our system if everything in it were encrypted because then we wouldn’t know which ads to show you.” In other words, their business model depends on lightning-quick connections between the insecure data in your digital filing cabinet and the ads supplied by their paying customers, which makes your data easy pickings for rutting David Barksdales.

I have been a programmer most of my life and a research psychologist for more than thirty years. I can tell you with certainty that the kinds of questionable activities Baluja describes are not only plausible but inevitable in the hyper-casual high-tech environment Google maintains, no matter what internal rules may be in place. Google openly takes pride in hiring independent thinkers and letting them frolic; software engineers are officially allowed to play for a whopping 20 percent of their work time. In that kind of world, anyone with sufficient password authority or technical expertise can do exactly the kinds of things Baluja depicts, and worse.

When it was revealed in 2010 that Google Street View vehicles had been secretly collecting personal information from personal Wi-Fi networks in more than thirty countries for several years, the company claimed that this was a pet project of a single software engineer—Marius Milner. Although outed in 2012, Milner, who identifies his profession as “hacker” on LinkedIn, is still employed by Google.

Its public denials notwithstanding, Google has, from the top down, consistently shown little respect for privacy. In 2009, Google’s CEO, Eric Schmidt, expressed the philosophy that drives the enterprise: “If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place.” In other words, all information is fair game.

This helps explain projects like Buzz, a social network service that Google unveiled in 2010. Without anyone’s permission, the company instantly created an online community intended to overwhelm Facebook, just as its Gmail rollout had quickly overwhelmed Hotmail and Yahoo. The company created Facebook-like pages, already complete with friends, based on who its Gmail users—175 million of them at the time—emailed most frequently. People were so outraged by this blatant incursion into their personal lives that the platform was shut down after eighteen months.

Perhaps more outrageous, it was revealed in 2012 that Google engineers had for several years been hacking into Apple’s Safari browser, allowing them to surreptitiously monitor the search activities of millions of iPhone, iPad, and Mac users. For that little caper, Google was fined $22.5 million by the FTC—the largest fine in the agency’s history.

Comments like Schmidt’s and projects like Buzz are indicative of a kind of culture that both nurtures and encourages daring exploits like the ones in Baluja’s book, with each swashbuckling employee acting out mischievous impulses, large and small, every day. Google offers its employees an unconstrained world that is rich in resources in order to maximize creativity, and research on the creative process that I have been conducting since the 1980s shows that Google is doing things exactly right in this regard. But that kind of culture also magnifies just about every human tendency you can think of, including voyeurism, grandiosity, and greed.

How many David Barksdales and Marius Milners—individuals with the power to alter the course of a life or an industry in seconds—are, at this very moment, toying with Google’s massive, unregulated databases in ways we cannot even imagine? How many of them are getting ideas?

Computers are programmed and controlled by people, and nothing will ever change that simple fact. As soon as a Google computer “scans” an email—an obfuscating word for “reads”—any Google employee with sufficient password authority or technical savvy can too. When Google’s computers track and monitor people’s internet activities, so can its employees.

How much private information does Google have about you? They know, but they won’t say.

How is that information being used by the brash young techies the company takes such pride in hiring? Most likely, exactly as they please.

And how might their use of that information affect our lives in the future? No one knows—not even Google’s top executives—but Murphy’s law probably applies.

Robert Epstein is Senior Research Psychologist at the American Institute for Behavioral Research Technology in Vista, CA, and the former editor-in-chief of Psychology Today magazine. He holds a Ph.D. from Harvard University and has published fifteen books and more than 250 articles on artificial intelligence, creativity, and other topics.