Kalev Leetaru, Contributor. I write about the broad intersection of data and society. Opinions expressed by Forbes Contributors are their own.
U.S. President Barack Obama and Mark Zuckerberg. (David Paul Morris/Bloomberg)
As I wrote earlier today, the story of Cambridge Analytica that the press, public and elected officials seem to have fixated on is that of a rogue company run amok with breached data that manipulated unwitting Americans into electing the candidate of the company’s choice (the company denies all of the allegations). A key thread of the narrative over the last three days has centered on the alleged impact of the company’s data analysis on the 2016 presidential election with the undertone that if true, the company’s actions somehow represent something new and unsettling in using data to advance a political campaign. To add a bit of perspective to this debate, it is worth looking back at two key ways in which the Obama campaign pioneered the modern data-driven campaign that is at the center of the Cambridge Analytica debate.
At the time of his election and reelection, Obama’s data analytics researchers were heralded as technology heroes for the way they modernized how political campaigns wrangle data in the pursuit of votes. Outlets sang their praises as “digital masterminds” and lauded their “unorthodox” approaches.
One highly publicized innovation was the construction of precision television viewership models that allowed the Obama campaign to precisely model private viewership habits of Americans: “The team bought detailed data on TV viewing by millions of cable subscribers, showing which channels they were watching, sometimes on a second-by-second basis. The information — which is collected from set-top cable boxes and sold by a company called Rentrak — doesn’t show who was watching, but the campaign used a third-party company to match viewing data to its own internal list of voters and poll responses.”
In short, the campaign was able to heavily optimize its advertising efforts by quite literally reaching into the privacy of Americans’ living rooms and understanding what they were watching second by second. While the data didn’t offer address-level resolution, as the Post description above notes, it was sufficient for the campaign to generate exquisitely high-resolution advertising models that achieved up to 20% greater efficiency.
Yet, perhaps of greatest relevance to the controversy surrounding Cambridge Analytica is how the Obama campaign leveraged Facebook. As Carol Davidsen, former Director of Integration of Media Analytics for Obama for America put it last night in a series of tweets reflecting back on the 2012 campaign: “Facebook was surprised we were able to suck out the whole social graph, but they didn’t stop us once they realized that was what we were doing. They came to office in the days following election recruiting & were very candid that they allowed us to do things they wouldn’t have allowed someone else to do because they were on our side.” Yet, she caveated the campaign’s use of the data noting that the project “felt creepy” but that they “played by the rules.”
A New York Times Magazine profile of the time offers a bit more detail on how the Obama campaign’s platform worked, and it is strikingly similar to the system Facebook claims was used by Cambridge Analytica. As the Times describes it, the campaign “started with a list that grew to a million people who had signed into the campaign Web site through Facebook. When people opted to do so, they were met with a prompt asking to grant the campaign permission to scan their Facebook friends lists, their photos and other personal information. In another prompt, the campaign asked for access to the users’ Facebook news feeds” which 75% permitted and “once permission was granted, the campaign had access to millions of names and faces they could match against their lists of persuadable voters, potential donors, unregistered voters and so on.”
According to one staffer who was involved with the project, next “it would take us 5 to 10 seconds to get a friends list and match it against the voter list … [next] we would grab the top 50 you were most active with and then crawl their wall … we asked to see photos but really we were looking for who were tagged in photos with you, which was a really great way to dredge up old college friends — and ex-girlfriends.”
As the Times put it, the massive exporting of private user data repeatedly triggered alarms at Facebook due to the volume of profile data going out the door, but “in each case the company was satisfied the campaign was not violating its privacy and data standards.” In all, through its data efforts, the campaign ended up with a database of 15 million persuadable voters.
In short, according to the Times’ reporting, which is borne out by many other reports of the time, the Obama campaign engaged in nearly identical activity to what Cambridge Analytica is claimed to have done: they took a set of users who willingly contributed their data to a cause and quietly mined their friend lists, downloading immense volumes of private material from unwitting individuals who never authorized the collection and had no idea that a political campaign was harvesting their information from Facebook simply because a person they were connected with had given the organization permission to do so.
In Obama’s case, the original contributors at least explicitly knew they were contributing to a campaign effort, even if their millions of unwitting friends had no idea their private information was being harvested to attempt to sway their voting behavior. In Cambridge Analytica’s case, users knew only that they were contributing to an academic research project, but the line between academia and the corporate world is ever more blurred in the data world, and it is routine for academic institutions to engage in corporate-supported research using data owned by the institution in support of commercial agendas. Indeed, the claims that Facebook data was collected for academic research and then made available to a commercial enterprise are hardly surprising to anyone familiar with the processes and procedures at most top US research universities, especially their corporate-funded research and their licensing and commercialization programs.
Putting this all together, both Cambridge Analytica and the Obama campaign are claimed to have harvested information about millions of users from Facebook by starting with an initial seed list of users who granted permission to harvest their friends lists, which were then used to mass export available information on many millions of unwitting users who had never authorized access to their data and were not even aware of its export. The Obama campaign even appears to have mined wall photographs to identify who each user was tagged with to understand who were close friends and who were merely casual acquaintances, looking for “old college friends and ex-girlfriends.” The only difference appears to be that in the Cambridge Analytica case, Facebook claims that the data was gathered for academic research and then made available for campaigning, while in the Obama case the campaign was in charge of data collection from the start. Given the academic tradition, at least in the US, of corporate-funded research, it is likely Cambridge Analytica could easily have funded the necessary research at a university, ensuring that all usage was still considered academic in nature, and avoided the whole controversy to begin with. After all, in the case of the myPersonality project all users are required to register as “collaborators” and at least one Facebook data scientist is listed, suggesting the company has not historically been averse to the sharing of bulk extracted datasets for research (though myPersonality appears to exclusively contain user-submitted data, rather than bulk friend harvesting).
Facebook did not respond to requests for comment on why it saw the Obama campaign’s use of its data as acceptable while it believes Cambridge Analytica’s use was not, or on what specific distinctions it sees between the two use cases. Nor did it respond to a request for comment on whether it would ask other researchers and organizations with large holdings of Facebook data to restrict access to them or delete them entirely. In the end, will this mark business as usual, with a burst of negative press and no change once the dust settles, or will this somehow change the social media landscape? At the very least, it will be interesting to see whether we still hold up our social media data miners as “digital masterminds” and heroes modernizing campaigning, or whether, like the three-letter agencies that similarly mine social media, they will go underground and in future quietly ply their trade in the shadows.