Response to the CFPB's Request for Information on Data Brokers
JUNE 8, 2023 | TEAM COWORKER
Coworker welcomes this public consultation by the Consumer Financial Protection Bureau (CFPB) on data brokers and the practice of collecting and selling consumer information. Coworker.org is a laboratory for workers to experiment with power-building strategies and win meaningful changes in the 21st-century economy. For the past four years, Coworker has been conducting research and providing analysis for the field on the impact of technological changes in the workplace and, specifically, the intersections of labor and tech policy. We’ve released a framework for understanding how data-mining techniques innovated in the consumer realm have moved into the workplace. Over the past year, we have been investigating and documenting the increasing number of tech products and tech companies interfacing with every step of the labor process: hiring and recruitment, workplace safety and productivity, benefit provision, workforce development, and more. Dubbing this tech ecosystem “Little Tech”, we launched a public database to highlight what was a quickly proliferating and unregulated marketplace of tech products increasingly collecting, aggregating, and analyzing sensitive data about workers.
We'd like to comment specifically on market-level dynamics that affect the collection of workers' employment data and its combination with consumer data, which can be used for background checking, identity verification, and hiring and recruitment. A pattern that ad brokers normalized with the broad collection and aggregation of consumer data in the late 1990s and 2000s is now happening with labor and employment data. Below is a sample of problematic vendor practices that we have uncovered through our research and conversations with workers.
1 - Aggregating consumer information to be integrated into workforce management platforms:
In our research and analysis we have identified a set of vendors who aggregate consumer data from public records and integrate it into workforce management platforms. We have also found that some of these vendors have been acquired by larger data brokers such as Equifax or Thomson Reuters. Below are some of the vendors we’ve found to have employed this business model:
- Appriss Insights (acquired by Equifax): Prior to its acquisition in 2021, Appriss Insights administered the nation's most comprehensive source of person-based incarceration, justice, and risk intelligence data. After its acquisition by Equifax, it was integrated into the Equifax Total Verify platform, which includes, among other solutions, a workforce management solution for “Workplace Safety” screening and employment verification.
- CLARO: Claro describes itself as a “global labor market intelligence platform” collecting and aggregating billions of data points to benchmark worker attrition risk and worker engagement. They seem to be affiliated with the Human Data Interaction Project at MIT. Beyond that, their process for aggregating employment data from public records, and how that data is modeled into predictive tools for employers, is unclear.
- Veriato Cerebral (acquired by Awareness Technologies): Veriato Cerebral is a user behavior analytics and insider threat management solution that’s powered by machine learning algorithms. It monitors employee chats, emails, web surfing, and file transfers and uses other data to develop a Risk Score profile for each worker that is updated daily. We are not sure what other data may be integrated into their Risk Profile of workers. More here on their proprietary Risk Profile: https://www.veriato.com/products/veriato-cerebral-insider-threat-detection-software.
- eLoomina: It is unclear if this company is still operating. At one point, however, they were aggregating a variety of data points from workers, such as job autonomy, along with consumer-related data points, such as financial pressure and family issues, to develop a risk model that can predict HR risk for employers. They claim the deployment of their data is secured and anonymized. More here about their risk modeling: https://www.eloomina.com/how-it-works.
2 - Data brokers scraping public records to build sourcing, hiring, and recruitment solutions.
In our research we have also identified vendor solutions that aggregate public data on workers in order to help employers conduct reputation and risk assessments on potential candidates. The vendors we’ve identified that employ these practices are:
- Verensics: Their Human Resource Solution includes the use of a “Visual Risk Index” to help employers weed out potential new employees in their “areas of concern” at the screening stage. Their proprietary algorithm appears to create a unique profile for each candidate and multiple data points are analyzed in real-time. It is unclear whether public records data is being used in the modeling.
- Truework: Truework provides employment verification services and has a database of employment records for 40 million U.S. workers. If a worker is not in its database, Truework obtains verified employment information from former employers. More information is needed to understand how this platform collects and aggregates public data.
- Checkr: Checkr claims to use "AI-powered technology to help our customers run their background checks so companies—big or small—can make safer, more informed hiring decisions in less time". It allows employers to verify important candidate data such as education, employment, and licensing history and get insight into a candidate’s accounts including late payments, charge-offs, and collections.
- EightfoldAI: Eightfold advertises itself as a recruitment and talent retention platform that uses deep learning AI to help organizations retain talent, recruit efficiently, and boost diversity. Their website states, “It is impossible to accurately convey a candidate’s experience on a resume, resulting in candidate profiles that lack true assessments of potential. Powered by the world’s largest source of talent data, Eightfold fills in the gaps by identifying Validated Skills, Likely Skills, Skills to Be Validated, and Missing Skills.” They add that their “Equal Opportunity Algorithms” ensure that each prediction does not create bias against personal or demographic characteristics.
3 - Employment data brokers and labor intelligence products.
Beyond the major consumer data brokers (e.g. Experian, TransUnion, Equifax), there is a new generation of data vendors that aggregate public records and online information on consumers and workers in order to provide a variety of solutions, ranging from sales acquisition and marketing to labor market intelligence products.
- Apollo.io: Apollo.io is a sales intelligence platform that hosts a database of more than 220 million contacts from 29 million companies, accessed by over 9,000 paying customers, which include startups like Lyft, Peloton, and Gympass as well as Fortune 500 giants. The company says it uses advanced algorithms and data acquisition methods to provide business attributes and contact information on prospects; display this information automatically when visiting LinkedIn profiles; enrich CRM databases with more than 200 unique business attributes; and flag new contact information in real time if prospects change jobs or get promoted. In a 2021 article, the CEO Tim Zheng was quoted as saying: “We crawl the public web and index and synthesize tons of information like technologies used, different keywords, websites, etc. Plus, we also have a contributory network of users who opt in to share their data in exchange for using our free product. Our algorithm scrapes PR news releases, news sites, and information from corporate websites. Profiles are then compiled and continually updated by our software” (1).
- Narrative.io: Narrative.io is a standardized data provider that also provides solutions for employers to buy data from other data brokers through their Datastreams Marketplace: https://www.narrative.io/data-marketplace. They are themselves a data broker, currently collecting "trillions of raw data points from 40+ data providers with just one integration". Some of that data includes business and behavioral data points (see Claritas, one of their data partners).
- AnalyticsIQ: AnalyticsIQ is an offline marketing data creator and predictive analytics innovator. In addition to being a traditional consumer data broker, they also aggregate business data that combines professional and individual-level attributes and can be purchased by anyone. For example, for their professional data points, they collect information on workers' title, role, drivers, and messaging preferences and combine that with individual-level data points such as behaviors, demographics, finances, and motivations.
- Stirista: Stirista is an offline compiler of consumer and business data. The Stirista consumer database consists of essentially every U.S. adult. Deterministic compilation results in precise postal data (99.7% CASS score), opt-in email coverage on 120MM+, and social media handles on over 190MM. This data is consistently validated via transaction activity. The Stirista business database consists of 30MM professionals and is constantly validated against social media and corporate profiles.
- Emsi (now called Lightcast): Emsi is a labor market analytics firm that harvests “professional profiles” from social media and “traditional labor market information” in order to offer labor market analytics such as compensation data, job posting data, and market data.
4 - Combat Fraud
- Pondera Solutions: It was acquired by Thomson Reuters in 2020 and is now called the Fraud Detect solution for Thomson Reuters. This new solution under Thomson Reuters is now used to detect unemployment scams. More information is needed regarding its machine learning modeling and data collection.
In summary, we believe these trends are just the tip of the iceberg in an increasing blurring of the lines between the aggregation and processing of consumer and employment data. A growing focus on artificial intelligence and machine learning technologies has intensified the search for increasingly sophisticated and untested predictive models that, in most cases, process tens or hundreds of data points. Accordingly, we request that the Consumer Financial Protection Bureau review what enforcement authority it may have under the Fair Credit Reporting Act (FCRA).
Thank you for the opportunity to provide these comments.
Wilneida Negrón, PhD
Director of Research and Policy