Explainer: What do political databases know about you?

American citizens are inundated with political messages—on social networks, in their news feeds, through email, text messages, and phone calls. It’s not an accident that people get bombarded: political groups prefer a “multimodal” voter contact strategy, where they use many platforms and multiple attempts to persuade a citizen to engage with their cause or candidate. An ad is followed by an email, which is followed by a text message—all designed to reinforce the message.

These strategies are employed by political campaigns, political action committees, advocacy groups, and nonprofits alike. These different groups are subject to very different rules and regulations, but they all rely on capturing and devouring data about millions of people in America. 

Who is in these data sets?

Almost everyone. Most campaigns get their voter information from a handful of data vendors, either nonpartisan or partisan. These companies try to provide data on all US adults, regardless of whether they are registered voters. It’s unlikely that an individual vendor has comprehensive files on all eligible US voters, but the Pew Research Center, which released a report on commercial voter files in 2018, found that over 90% of people in its own sample of US adults could be found on at least one registry.

What data is collected and where does it come from?

The main source of voter data is public voting records, which include a voter’s name, address, and party affiliation. But voter data is very patchy and decentralized: each state holds its own database, and the databases often record different attributes. So vendors supplement it with other sources, like phone books and credit data.

It’s hard to get a full picture of everything that is fed into the vendors’ databases: the recipe each one uses is usually considered a trade secret. Pew’s study explained that the registries are “an amalgamation of administrative data from states about registration and voting, modeled data about partisanship, political engagement and political support provided by vendors; and demographic, financial and lifestyle data culled from a wide range of sources.” 

Data vendors attempt to match up and reconcile these different data sets, using key identifiers like name, address, gender, and date of birth, to create one comprehensive record for each person in the US.
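To make the matching step concrete, here is a minimal sketch in Python of how records from two sources might be reconciled on those identifiers. It is an illustration only: the field names, the sample records, and the exact-match rule are assumptions, and real vendors, whose methods are trade secrets, likely use fuzzier and more elaborate matching.

```python
def match_key(record: dict) -> tuple:
    """Build a normalized key from name, address, gender, and date of birth."""
    return (
        " ".join(record["name"].lower().split()),
        " ".join(record["address"].lower().replace(".", "").split()),
        record["gender"].lower(),
        record["date_of_birth"],
    )


def merge_sources(*sources: list) -> dict:
    """Fold records from several sources into one record per matched person."""
    merged: dict = {}
    for source in sources:
        for record in source:
            person = merged.setdefault(match_key(record), {})
            for attribute, value in record.items():
                # Keep the first value seen; later sources only fill gaps.
                person.setdefault(attribute, value)
    return merged


# Hypothetical sample records, invented for illustration.
voter_file = [{"name": "Pat  Ryan", "address": "12 Oak St.", "gender": "F",
               "date_of_birth": "1980-05-01", "party": "Independent"}]
commercial = [{"name": "pat ryan", "address": "12 oak st", "gender": "f",
               "date_of_birth": "1980-05-01", "email": "pat@example.com"}]

records = merge_sources(voter_file, commercial)
```

Here the two entries collapse into a single record that carries both the party from the voter file and the email from the commercial list. A small typo in any identifier would prevent the match, which is one reason reconciled files can contain duplicates and gaps.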

L2 is one of the largest companies trading in this information, and it claims to have more than 600 data attributes pulled from census data, emails from commercial sources, donor data sets, and more. Experts say that most vendors provide hundreds of data points about each voter. 

How accurate are these voter databases? 

It’s up for debate. Some data points are very accurate, but others are really just predictions or guesses. Party and race, for example, are often inferred on the basis of someone’s name and location. Somebody with the last name Ryan is assumed to be white, while somebody in a heavily Republican district is assumed to be a Republican voter. 
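The kind of rule-based guessing described above can be sketched in a few lines of Python. Everything here is hypothetical: the lookup tables, the district names, and the field names are invented for illustration, and actual vendors typically rely on statistical models rather than fixed rules.

```python
# Hypothetical lookup tables; the entries are illustrative, not real data.
SURNAME_GUESS = {"ryan": "white"}
DISTRICT_LEAN = {"district-a": "Republican", "district-b": "Democratic"}


def fill_inferred(record: dict) -> dict:
    """Return a copy with missing race and party filled in by guesses."""
    out = dict(record)
    inferred = []
    if out.get("race") is None:
        guess = SURNAME_GUESS.get(out["last_name"].lower())
        if guess is not None:
            out["race"] = guess
            inferred.append("race")
    if out.get("party") is None:
        guess = DISTRICT_LEAN.get(out["district"])
        if guess is not None:
            out["party"] = guess
            inferred.append("party")
    # Record which fields are guesses rather than observed facts.
    out["inferred_fields"] = inferred
    return out


voter = fill_inferred({"last_name": "Ryan", "district": "district-a"})
```

Tracking which fields were inferred matters because a guess stored alongside observed data looks just as authoritative as a fact, even when it is wrong for the individual.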

The accuracy of specific attributes varies a lot: Pew found that race was accurate 79% of the time, education 51%, and religion 52%. Household income, meanwhile, was accurate just 37% of the time. There was also measurable bias, with higher error rates for younger, highly mobile, unregistered, and Hispanic voters.