
Find out if your image is in this AI training dataset

Face recognition systems are everywhere, from security cameras that try to catch criminals to the way Snapchat finds your face so it can put rabbit ears on it. Computers need a lot of data to learn how to recognize faces, and some of it comes from Flickr.

IBM released a dataset called "Diversity in Faces" earlier this year, which sounds like a good thing: many early face recognition algorithms were trained on images of thin white celebrities, because it's easy to find lots of celebrity photos. Your data source affects what your algorithm can do and understand, so there are plenty of racist and sexist algorithms out there. This dataset is intended to help by providing images of faces along with annotations, such as skin color.

But most people who uploaded their personal snapshots to Flickr certainly didn't know that, years down the road, their faces and the faces of their friends and family could be used to train the next big face recognition algorithm. If you used a Creative Commons license for your photos, even a "non-commercial" one, you may be in this dataset.

NBC reports that IBM says it will remove images from the dataset at the photographer's or subject's request, but the dataset hasn't been made public, so there's no way to browse it and see whether you're actually in there. Getting a picture removed won't be easy, but if you want to know whether any of your photos have been used, enter your Flickr username into NBC's tool here. This isn't necessarily the only dataset out there that may contain your image, but at least it's a way to find out whether your photos were being used.
