IBM extracted nearly one million photos from a dataset of Flickr images originally compiled by Yahoo.
But many people pictured were probably unaware of how their data had been used, according to an NBC News report.
IBM said in a statement that it had taken great care to comply with privacy principles.
But some of the photographers whose images were included in IBM’s dataset were surprised when NBC News told them that their photographs had been annotated with details including facial geometry and skin tone, and could be used to develop facial recognition algorithms.
Despite IBM’s assurances that Flickr users can opt out of the database, NBC News discovered that it’s almost impossible to get photos removed.
IBM requires photographers to email links to photos they want removed, but the company has not publicly shared the list of Flickr users and photos included in the dataset, so there is no easy way of finding out whose photos are included. IBM did not respond to questions about this process.
Photos selected by IBM were listed under a Creative Commons licence, which generally means the images can be widely used with only a small number of restrictions.
In a paper published online about the work, IBM researchers describe in detail the steps taken to analyse people’s faces, including taking measurements of the distance between individuals’ facial features.
“Many of these measures can be reliably estimated from photos of frontal faces, using 47 landmark points of the head and face,” the researchers wrote.
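As a rough illustration of what such a measurement involves, the sketch below computes distances between a handful of landmark points and scales them by the distance between the eyes, so the result does not depend on image size. The landmark names, the coordinates and the choice of ratio are assumptions made for this example. They are not taken from IBM's paper, which uses 47 landmark points.

```python
import math


def distance(p, q):
    """Straight-line distance between two (x, y) points, in pixels."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def facial_measurements(landmarks):
    """Return a few illustrative measurements from named landmark points.

    `landmarks` maps hypothetical landmark names to (x, y) pixel
    coordinates. The names are assumptions made for this sketch.
    """
    eye_span = distance(landmarks["left_eye"], landmarks["right_eye"])
    if eye_span == 0:
        raise ValueError("eye landmarks coincide; cannot scale measurements")

    nose_to_mouth = distance(landmarks["nose_tip"], landmarks["mouth_centre"])
    return {
        "eye_span_px": eye_span,
        # Dividing by the eye span makes the ratio independent of image size.
        "nose_to_mouth_ratio": nose_to_mouth / eye_span,
    }


# Made-up coordinates for a frontal face, for illustration only.
example = {
    "left_eye": (30.0, 40.0),
    "right_eye": (70.0, 40.0),
    "nose_tip": (50.0, 60.0),
    "mouth_centre": (50.0, 80.0),
}

measurements = facial_measurements(example)
print(measurements)
```

Ratios like this stay the same if a photo is enlarged or shrunk, which is one reason measurements of this kind are usually taken from frontal faces and expressed relative to one another.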
This analysis helps artificial neural networks to learn how to distinguish between faces, so that individuals can be recognised in different images.