Facebook on Thursday released a dataset for AI researchers that features a diverse group of people who were paid to provide the company with information about their age and gender.
A total of 3,011 people participated, making 45,186 video recordings with an average length of one minute. The data is being provided as open access.
In an effort to eliminate bias, the open-source dataset aims to help researchers judge whether AI systems work well for people of different ages, genders, and skin tones, and under various lighting conditions. AI technology is often criticized for being biased.
Studies show that people submitting job resumes with slight cues of older age such as old-fashioned names get fewer callbacks, and educated Black men are sometimes mistakenly remembered as having lighter skin.
IBM and others have said for years that these biases can make their way into data used to train AI systems, amplifying unfair stereotypes and leading to potentially harmful consequences.
Facebook’s dataset, Casual Conversations, includes information on age and gender as well as skin tone, the latter provided by trained annotators using the Fitzpatrick scale.
It is designed to help researchers evaluate their computer vision and audio models for accuracy across these attributes. The trained annotators also labeled videos with ambient lighting conditions to help measure how models treat people with various skin tones in low light.
The Facebook team worked with the company’s Responsible AI team to build on the original video recordings that Facebook created for the Deepfake Detection Challenge (DFDC).
The Casual Conversations dataset is composed of the same group of paid participants Facebook previously used when it commissioned the creation of deepfake videos for another open-source dataset.
The annotators used the standardized Fitzpatrick scale to label the apparent skin tone of each participant. Uniform distributions of the labels help identify unevenly distributed errors in measurements and allow researchers to surface potential algorithmic biases.
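To make the idea of measuring accuracy across attributes concrete, here is a minimal Python sketch of how a researcher might break a model's accuracy down by skin type or lighting label and report the gap between the best and worst group. It is an illustration only: the record fields (`skin_type`, `lighting`, `label`, `prediction`), the helper names, and the toy data are all hypothetical and do not reflect the actual format of the Casual Conversations annotations or any Facebook tooling.

```python
from collections import defaultdict


def accuracy_by_group(records, attribute):
    """Return {group value: accuracy} for one annotated attribute."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[attribute]
        total[group] += 1
        if rec["prediction"] == rec["label"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(per_group):
    """Difference between the best and worst per-group accuracy."""
    return max(per_group.values()) - min(per_group.values())


# Hypothetical toy records: Fitzpatrick skin type (I to VI), a lighting
# label, the ground-truth label, and a model's prediction.
records = [
    {"skin_type": "II", "lighting": "bright", "label": 1, "prediction": 1},
    {"skin_type": "II", "lighting": "bright", "label": 0, "prediction": 0},
    {"skin_type": "II", "lighting": "dark", "label": 1, "prediction": 1},
    {"skin_type": "II", "lighting": "dark", "label": 0, "prediction": 0},
    {"skin_type": "VI", "lighting": "bright", "label": 1, "prediction": 1},
    {"skin_type": "VI", "lighting": "bright", "label": 0, "prediction": 0},
    {"skin_type": "VI", "lighting": "dark", "label": 1, "prediction": 0},
    {"skin_type": "VI", "lighting": "dark", "label": 0, "prediction": 1},
]

by_skin = accuracy_by_group(records, "skin_type")
by_light = accuracy_by_group(records, "lighting")
print(by_skin, accuracy_gap(by_skin))
print(by_light, accuracy_gap(by_light))
```

In this toy data the overall accuracy is 75 percent, which hides the fact that every error falls on one skin type under dark lighting. A per-group breakdown of this kind is what evenly distributed labels are meant to make possible.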